Last Commit: Jun. 16, 2018
Created: May. 7, 2017

PyGDF

PyGDF is the Python interface for accessing and manipulating the GPU DataFrame (GDF) of the GPU Open Analytics Initiative (GoAi). We aim to provide a simple interface that is similar to the Pandas DataFrame and that hides the details of GPU programming.

Read more about GoAi and the GDF

Setup

Conda

You can get a minimal conda installation with Miniconda or get the full installation with Anaconda.

You can install and update PyGDF using the conda command:

conda install -c numba -c conda-forge -c gpuopenanalytics/label/dev -c defaults pygdf=0.1.0a2

You can create and activate a development environment using the conda command:

conda env create --name pygdf_dev --file conda_environments/testing_py35.yml
source activate pygdf_dev

Install from Source

To install PyGDF from source, clone the repository, change into its directory, and run the python install command:

git clone https://github.com/gpuopenanalytics/pygdf.git
cd pygdf
python setup.py install

Note: This does not install dependencies automatically, so we recommend installing from within the conda environment.

Pip

We don't support pip install yet. Please use conda for the time being.

Testing

This project uses py.test.

In the source root directory and with the development environment activated, run:

py.test
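
For readers new to py.test, the following is a minimal sketch of the kind of file it collects. By default py.test picks up files named test_*.py and runs every function whose name starts with test_, treating a failed plain assert as a test failure. The file name test_example.py and the helper column_mean are hypothetical and are not part of PyGDF; they only illustrate the convention.

```python
# Contents of a hypothetical file named test_example.py.
# Neither the file nor column_mean exists in PyGDF; both are illustrative.


def column_mean(values):
    """Hypothetical helper: mean of the non-missing entries of a column."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


def test_column_mean_skips_missing_values():
    # py.test reports a failure if this plain assert does not hold.
    assert column_mean([1.0, None, 3.0]) == 2.0
```

Running py.test in the directory that contains such a file would execute the test function without any extra registration.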

Getting Started

Please see the Demo Docker Repository for example notebooks on how you can utilize the GPU DataFrame.

GPU Open Analytics Initiative

The GPU Open Analytics Initiative (GoAi) seeks to foster and develop open collaboration between GPU analytics projects and products to enable data scientists to efficiently combine the best tools for their workflows. The first project of GoAi is the GPU DataFrame (GDF), which enables tabular data to be directly exchanged between libraries and applications on the GPU.

GPU DataFrame

The GPU DataFrame is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for the high-performance analytics common in artificial intelligence workloads. The GPU DataFrame uses the Apache Arrow columnar data format on the GPU. Currently, only a subset of Arrow's features is supported.
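
To make "columnar data format" concrete, here is a minimal CPU-only sketch in plain Python. It is not PyGDF or Arrow code and uses no GPU. It only illustrates the general idea that each column is stored as one contiguous, typed buffer, with a separate validity bitmap marking missing values (least-significant bit first, 1 meaning present, which follows Arrow's convention). The names to_column and is_valid and the sample rows are assumptions made up for this illustration.

```python
from array import array

# Row-oriented records; None marks a missing value.
rows = [
    {"id": 1, "score": 0.5},
    {"id": 2, "score": None},
    {"id": 3, "score": 2.25},
]


def to_column(values, typecode):
    """Pack values into one contiguous buffer plus a validity bitmap.

    Bit i of the bitmap (least-significant bit first within each byte)
    is 1 when values[i] is present and 0 when it is missing.
    """
    data = array(typecode, [0] * len(values))
    validity = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v is not None:
            data[i] = v
            validity[i // 8] |= 1 << (i % 8)
    return data, validity


def is_valid(validity, i):
    """Return True when slot i of the column holds a real value."""
    return bool(validity[i // 8] & (1 << (i % 8)))


# One typed buffer per column instead of one object per row.
ids, ids_valid = to_column([r["id"] for r in rows], "q")          # 64-bit ints
scores, scores_valid = to_column([r["score"] for r in rows], "d")  # 64-bit floats

print(list(ids), bin(ids_valid[0]))
print(list(scores), bin(scores_valid[0]))
```

Because every column is a flat buffer of a single type, it can be handed to another library as a pointer and a length rather than converted row by row, which is the property that lets tabular data be exchanged without copying.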

Latest Releases

v0.1.0a2 (Aug. 28, 2017)
v0.1.0a1 (May. 7, 2017)