dask.dataframe.from_array
dask.dataframe.from_array(arr, chunksize=50000, columns=None, meta=None)
Read any sliceable array into a Dask Dataframe
Uses getitem syntax to pull slices out of the array. The array need not be a NumPy array, but it must support slicing syntax:
x[50000:100000]
and either have 2 dimensions:
x.ndim == 2
or have a record dtype:
x.dtype == [('name', 'O'), ('balance', 'i8')]
Parameters:
arr : array_like
The array to read; the description above refers to it as x.
chunksize : int, optional
The number of rows per partition to use.
columns : list or string, optional
List of column names if the result is a DataFrame, or a single string if the result is a Series.
meta : object, optional
An optional object that tells Dask which concrete dataframe type to use for the partitions of the Dask DataFrame. By default, a pandas DataFrame is used.
Returns:
dask.DataFrame or dask.Series
A dask DataFrame/Series