pandas.DataFrame — pandas 3.0.0rc0+31.g944c527c0a documentation
Two-dimensional, size-mutable, potentially heterogeneous tabular data.
Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
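As a small illustration of label alignment, the sketch below adds two frames whose row and column labels only partly overlap. The frame names and values are made up for the example; `DataFrame.add` with `fill_value` is the method listed in the table further down.

```python
import pandas as pd

# Two illustrative frames with partly overlapping labels
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])
right = pd.DataFrame({"b": [10, 20], "c": [30, 40]}, index=["y", "z"])

# The + operator aligns on both row and column labels. The result covers
# the union of labels, and a cell is NaN unless both operands have a value.
total = left + right

# add() with fill_value treats a value missing on one side as 0.
# Cells missing on both sides remain NaN.
filled = left.add(right, fill_value=0)

print(total)
print(filled)
```

Only the cell at row `y`, column `b` exists in both frames, so it is the only non-missing cell in `total`.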
Constructing DataFrame from a dictionary.
Notice that the inferred dtype is int64.
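A minimal sketch of the construction described above, assuming a small dictionary of integer lists (the column names `col1` and `col2` are illustrative):

```python
import pandas as pd

# Each dictionary key becomes a column label; each list becomes that column's values
d = {"col1": [1, 2], "col2": [3, 4]}
df = pd.DataFrame(data=d)
print(df)

# Python integers are inferred as int64 for every column
print(df.dtypes)

# Passing dtype enforces a single dtype instead of relying on inference
df_small = pd.DataFrame(data=d, dtype="int8")
print(df_small.dtypes)
```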
| Attribute | Description |
|---|---|
| T | The transpose of the DataFrame. |
| at | Access a single value for a row/column label pair. |
| attrs | Dictionary of global attributes of this dataset. |
| axes | Return a list representing the axes of the DataFrame. |
| columns | The column labels of the DataFrame. |
| dtypes | Return the dtypes in the DataFrame. |
| empty | Indicator whether Series/DataFrame is empty. |
| flags | Get the properties associated with this pandas object. |
| iat | Access a single value for a row/column pair by integer position. |
| iloc | Purely integer-location based indexing for selection by position. |
| index | The index (row labels) of the DataFrame. |
| loc | Access a group of rows and columns by label(s) or a boolean array. |
| ndim | Return an int representing the number of axes / array dimensions. |
| shape | Return a tuple representing the dimensionality of the DataFrame. |
| size | Return an int representing the number of elements in this object. |
| style | Returns a Styler object. |
| values | Return a NumPy representation of the DataFrame. |
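
A few of the attributes above can be exercised as follows. This is only a sketch; the frame, its labels, and the `attrs` key are invented for the example.

```python
import pandas as pd

# Illustrative frame with string row labels
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [10, 20, 30]},
    index=["a", "b", "c"],
)

shape = df.shape    # (rows, columns)
ndim = df.ndim      # number of axes
size = df.size      # total number of elements
empty = df.empty    # True only when the frame has no elements

# Label-based versus position-based access to the same cell
by_label = df.loc["b", "qty"]
by_pos = df.iloc[1, 1]

# at / iat access a single scalar by label or by integer position
single_label = df.at["c", "price"]
single_pos = df.iat[0, 0]

# T swaps the row and column labels
transposed = df.T

# attrs is a plain dictionary for dataset-level metadata
df.attrs["source"] = "example"

print(shape, ndim, size, empty)
print(transposed)
```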

| Method | Description |
|---|---|
| abs() | Return a Series/DataFrame with absolute numeric value of each element. |
| add(other[, axis, level, fill_value]) | Get Addition of dataframe and other, element-wise (binary operator add). |
| add_prefix(prefix[, axis]) | Prefix labels with string prefix. |
| add_suffix(suffix[, axis]) | Suffix labels with string suffix. |
| agg([func, axis]) | Aggregate using one or more operations over the specified axis. |
| aggregate([func, axis]) | Aggregate using one or more operations over the specified axis. |
| align(other[, join, axis, level, copy, ...]) | Align two objects on their axes with the specified join method. |
| all(*[, axis, bool_only, skipna]) | Return whether all elements are True, potentially over an axis. |
| any(*[, axis, bool_only, skipna]) | Return whether any element is True, potentially over an axis. |
| apply(func[, axis, raw, result_type, args, ...]) | Apply a function along an axis of the DataFrame. |
| asfreq(freq[, method, how, normalize, ...]) | Convert time series to specified frequency. |
| asof(where[, subset]) | Return the last row(s) without any NaNs before where. |
| assign(**kwargs) | Assign new columns to a DataFrame. |
| astype(dtype[, copy, errors]) | Cast a pandas object to a specified dtype. |
| at_time(time[, asof, axis]) | Select values at particular time of day (e.g., 9:30AM). |
| between_time(start_time, end_time[, ...]) | Select values between particular times of the day (e.g., 9:00-9:30 AM). |
| bfill(*[, axis, inplace, limit, limit_area]) | Fill NA/NaN values by using the next valid observation to fill the gap. |
| boxplot([column, by, ax, fontsize, rot, ...]) | Make a box plot from DataFrame columns. |
| clip([lower, upper, axis, inplace]) | Trim values at input threshold(s). |
| combine(other, func[, fill_value, overwrite]) | Perform column-wise combine with another DataFrame. |
| combine_first(other) | Update null elements with value in the same location in other. |
| compare(other[, align_axis, keep_shape, ...]) | Compare to another DataFrame and show the differences. |
| convert_dtypes([infer_objects, ...]) | Convert columns from numpy dtypes to the best dtypes that support pd.NA. |
| copy([deep]) | Make a copy of this object's indices and data. |
| corr([method, min_periods, numeric_only]) | Compute pairwise correlation of columns, excluding NA/null values. |
| corrwith(other[, axis, drop, method, ...]) | Compute pairwise correlation. |
| count([axis, numeric_only]) | Count non-NA cells for each column or row. |
| cov([min_periods, ddof, numeric_only]) | Compute pairwise covariance of columns, excluding NA/null values. |
| cummax([axis, skipna, numeric_only]) | Return cumulative maximum over a DataFrame or Series axis. |
| cummin([axis, skipna, numeric_only]) | Return cumulative minimum over a DataFrame or Series axis. |
| cumprod([axis, skipna, numeric_only]) | Return cumulative product over a DataFrame or Series axis. |
| cumsum([axis, skipna, numeric_only]) | Return cumulative sum over a DataFrame or Series axis. |
| describe([percentiles, include, exclude]) | Generate descriptive statistics. |
| diff([periods, axis]) | First discrete difference of element. |
| div(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv). |
| divide(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv). |
| dot(other) | Compute the matrix multiplication between the DataFrame and other. |
| drop([labels, axis, index, columns, level, ...]) | Drop specified labels from rows or columns. |
| drop_duplicates([subset, keep, inplace, ...]) | Return DataFrame with duplicate rows removed. |
| droplevel(level[, axis]) | Return Series/DataFrame with requested index / column level(s) removed. |
| dropna(*[, axis, how, thresh, subset, ...]) | Remove missing values. |
| duplicated([subset, keep]) | Return boolean Series denoting duplicate rows. |
| eq(other[, axis, level]) | Get Equal to of dataframe and other, element-wise (binary operator eq). |
| equals(other) | Test whether two objects contain the same elements. |
| eval(expr, *[, inplace]) | Evaluate a string describing operations on DataFrame columns. |
| ewm([com, span, halflife, alpha, ...]) | Provide exponentially weighted (EW) calculations. |
| expanding([min_periods, method]) | Provide expanding window calculations. |
| explode(column[, ignore_index]) | Transform each element of a list-like to a row, replicating index values. |
| ffill(*[, axis, inplace, limit, limit_area]) | Fill NA/NaN values by propagating the last valid observation to next valid. |
| fillna(value, *[, axis, inplace, limit]) | Fill NA/NaN values with value. |
| filter([items, like, regex, axis]) | Subset the DataFrame or Series according to the specified index labels. |
| first_valid_index() | Return index for first non-missing value or None, if no value is found. |
| floordiv(other[, axis, level, fill_value]) | Get Integer division of dataframe and other, element-wise (binary operator floordiv). |
| from_arrow(data) | Construct a DataFrame from a tabular Arrow object. |
| from_dict(data[, orient, dtype, columns]) | Construct DataFrame from dict of array-like or dicts. |
| from_records(data[, index, exclude, ...]) | Convert structured or record ndarray to DataFrame. |
| ge(other[, axis, level]) | Get Greater than or equal to of dataframe and other, element-wise (binary operator ge). |
| get(key[, default]) | Get item from object for given key (ex: DataFrame column). |
| groupby([by, level, as_index, sort, ...]) | Group DataFrame using a mapper or by a Series of columns. |
| gt(other[, axis, level]) | Get Greater than of dataframe and other, element-wise (binary operator gt). |
| head([n]) | Return the first n rows. |
| hist([column, by, grid, xlabelsize, xrot, ...]) | Make a histogram of the DataFrame's columns. |
| idxmax([axis, skipna, numeric_only]) | Return index of first occurrence of maximum over requested axis. |
| idxmin([axis, skipna, numeric_only]) | Return index of first occurrence of minimum over requested axis. |
| infer_objects([copy]) | Attempt to infer better dtypes for object columns. |
| info([verbose, buf, max_cols, memory_usage, ...]) | Print a concise summary of a DataFrame. |
| insert(loc, column, value[, allow_duplicates]) | Insert column into DataFrame at specified location. |
| interpolate([method, axis, limit, inplace, ...]) | Fill NaN values using an interpolation method. |
| isetitem(loc, value) | Set the given value in the column with position loc. |
| isin(values) | Whether each element in the DataFrame is contained in values. |
| isna() | Detect missing values. |
| isnull() | DataFrame.isnull is an alias for DataFrame.isna. |
| items() | Iterate over (column name, Series) pairs. |
| iterrows() | Iterate over DataFrame rows as (index, Series) pairs. |
| itertuples([index, name]) | Iterate over DataFrame rows as namedtuples. |
| join(other[, on, how, lsuffix, rsuffix, ...]) | Join columns of another DataFrame. |
| keys() | Get the 'info axis' (see Indexing for more). |
| kurt(*[, axis, skipna, numeric_only]) | Return unbiased kurtosis over requested axis. |
| kurtosis(*[, axis, skipna, numeric_only]) | Return unbiased kurtosis over requested axis. |
| last_valid_index() | Return index for last non-missing value or None, if no value is found. |
| le(other[, axis, level]) | Get Less than or equal to of dataframe and other, element-wise (binary operator le). |
| lt(other[, axis, level]) | Get Less than of dataframe and other, element-wise (binary operator lt). |
| map(func[, na_action]) | Apply a function to a DataFrame elementwise. |
| mask(cond[, other, inplace, axis, level]) | Replace values where the condition is True. |
| max(*[, axis, skipna, numeric_only]) | Return the maximum of the values over the requested axis. |
| mean(*[, axis, skipna, numeric_only]) | Return the mean of the values over the requested axis. |
| median(*[, axis, skipna, numeric_only]) | Return the median of the values over the requested axis. |
| melt([id_vars, value_vars, var_name, ...]) | Unpivot DataFrame from wide to long format, optionally leaving identifiers set. |
| memory_usage([index, deep]) | Return the memory usage of each column in bytes. |
| merge(right[, how, on, left_on, right_on, ...]) | Merge DataFrame or named Series objects with a database-style join. |
| min(*[, axis, skipna, numeric_only]) | Return the minimum of the values over the requested axis. |
| mod(other[, axis, level, fill_value]) | Get Modulo of dataframe and other, element-wise (binary operator mod). |
| mode([axis, numeric_only, dropna]) | Get the mode(s) of each element along the selected axis. |
| mul(other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator mul). |
| multiply(other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator mul). |
| ne(other[, axis, level]) | Get Not equal to of dataframe and other, element-wise (binary operator ne). |
| nlargest(n, columns[, keep]) | Return the first n rows ordered by columns in descending order. |
| notna() | Detect existing (non-missing) values. |
| notnull() | DataFrame.notnull is an alias for DataFrame.notna. |
| nsmallest(n, columns[, keep]) | Return the first n rows ordered by columns in ascending order. |
| nunique([axis, dropna]) | Count number of distinct elements in specified axis. |
| pct_change([periods, fill_method, freq]) | Fractional change between the current and a prior element. |
| pipe(func, *args, **kwargs) | Apply chainable functions that expect Series or DataFrames. |
| pivot(*, columns[, index, values]) | Return reshaped DataFrame organized by given index / column values. |
| pivot_table([values, index, columns, ...]) | Create a spreadsheet-style pivot table as a DataFrame. |
| pop(item) | Return item and drop it from DataFrame. |
| pow(other[, axis, level, fill_value]) | Get Exponential power of dataframe and other, element-wise (binary operator pow). |
| prod(*[, axis, skipna, numeric_only, min_count]) | Return the product of the values over the requested axis. |
| product(*[, axis, skipna, numeric_only, ...]) | Return the product of the values over the requested axis. |
| quantile([q, axis, numeric_only, ...]) | Return values at the given quantile over requested axis. |
| query(expr, *[, parser, engine, local_dict, ...]) | Query the columns of a DataFrame with a boolean expression. |
| radd(other[, axis, level, fill_value]) | Get Addition of dataframe and other, element-wise (binary operator radd). |
| rank([axis, method, numeric_only, ...]) | Compute numerical data ranks (1 through n) along axis. |
| rdiv(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
| reindex([labels, index, columns, axis, ...]) | Conform DataFrame to new index with optional filling logic. |
| reindex_like(other[, method, copy, limit, ...]) | Return an object with matching indices as other object. |
| rename([mapper, index, columns, axis, copy, ...]) | Rename columns or index labels. |
| rename_axis([mapper, index, columns, axis, ...]) | Set the name of the axis for the index or columns. |
| reorder_levels(order[, axis]) | Rearrange index or column levels using input order. |
| replace([to_replace, value, inplace, regex]) | Replace values given in to_replace with value. |
| resample(rule[, closed, label, convention, ...]) | Resample time-series data. |
| reset_index([level, drop, inplace, ...]) | Reset the index, or a level of it. |
| rfloordiv(other[, axis, level, fill_value]) | Get Integer division of dataframe and other, element-wise (binary operator rfloordiv). |
| rmod(other[, axis, level, fill_value]) | Get Modulo of dataframe and other, element-wise (binary operator rmod). |
| rmul(other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator rmul). |
| rolling(window[, min_periods, center, ...]) | Provide rolling window calculations. |
| round([decimals]) | Round numeric columns in a DataFrame to a variable number of decimal places. |
| rpow(other[, axis, level, fill_value]) | Get Exponential power of dataframe and other, element-wise (binary operator rpow). |
| rsub(other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator rsub). |
| rtruediv(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
| sample([n, frac, replace, weights, ...]) | Return a random sample of items from an axis of object. |
| select_dtypes([include, exclude]) | Return a subset of the DataFrame's columns based on the column dtypes. |
| sem(*[, axis, skipna, ddof, numeric_only]) | Return unbiased standard error of the mean over requested axis. |
| set_axis(labels, *[, axis, copy]) | Assign desired index to given axis. |
| set_flags(*[, copy, allows_duplicate_labels]) | Return a new object with updated flags. |
| set_index(keys, *[, drop, append, inplace, ...]) | Set the DataFrame index using existing columns. |
| shift([periods, freq, axis, fill_value, suffix]) | Shift index by desired number of periods with an optional time freq. |
| skew(*[, axis, skipna, numeric_only]) | Return unbiased skew over requested axis. |
| sort_index(*[, axis, level, ascending, ...]) | Sort object by labels (along an axis). |
| sort_values(by, *[, axis, ascending, ...]) | Sort by the values along either axis. |
| squeeze([axis]) | Squeeze 1 dimensional axis objects into scalars. |
| stack([level, dropna, sort, future_stack]) | Stack the prescribed level(s) from columns to index. |
| std(*[, axis, skipna, ddof, numeric_only]) | Return sample standard deviation over requested axis. |
| sub(other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator sub). |
| subtract(other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator sub). |
| sum(*[, axis, skipna, numeric_only, min_count]) | Return the sum of the values over the requested axis. |
| swaplevel([i, j, axis]) | Swap levels i and j in a MultiIndex. |
| tail([n]) | Return the last n rows. |
| take(indices[, axis]) | Return the elements in the given positional indices along an axis. |
| to_clipboard(*[, excel, sep]) | Copy object to the system clipboard. |
| to_csv([path_or_buf, sep, na_rep, ...]) | Write object to a comma-separated values (csv) file. |
| to_dict([orient, into, index]) | Convert the DataFrame to a dictionary. |
| to_excel(excel_writer, *[, sheet_name, ...]) | Write object to an Excel sheet. |
| to_feather(path, **kwargs) | Write a DataFrame to the binary Feather format. |
| to_hdf(path_or_buf, *, key[, mode, ...]) | Write the contained data to an HDF5 file using HDFStore. |
| to_html([buf, columns, col_space, header, ...]) | Render a DataFrame as an HTML table. |
| to_iceberg(table_identifier[, catalog_name, ...]) | Write a DataFrame to an Apache Iceberg table. |
| to_json([path_or_buf, orient, date_format, ...]) | Convert the object to a JSON string. |
| to_latex([buf, columns, header, index, ...]) | Render object to a LaTeX tabular, longtable, or nested table. |
| to_markdown([buf, mode, index, storage_options]) | Print DataFrame in Markdown-friendly format. |
| to_numpy([dtype, copy, na_value]) | Convert the DataFrame to a NumPy array. |
| to_orc([path, engine, index, engine_kwargs]) | Write a DataFrame to the Optimized Row Columnar (ORC) format. |
| to_parquet([path, engine, compression, ...]) | Write a DataFrame to the binary parquet format. |
| to_period([freq, axis, copy]) | Convert DataFrame from DatetimeIndex to PeriodIndex. |
| to_pickle(path, *[, compression, protocol, ...]) | Pickle (serialize) object to file. |
| to_records([index, column_dtypes, index_dtypes]) | Convert DataFrame to a NumPy record array. |
| to_sql(name, con, *[, schema, if_exists, ...]) | Write records stored in a DataFrame to a SQL database. |
| to_stata(path, *[, convert_dates, ...]) | Export DataFrame object to Stata dta format. |
| to_string([buf, columns, col_space, header, ...]) | Render a DataFrame to a console-friendly tabular output. |
| to_timestamp([freq, how, axis, copy]) | Cast PeriodIndex to DatetimeIndex of timestamps, at beginning of period. |
| to_xarray() | Return an xarray object from the pandas object. |
| to_xml([path_or_buffer, index, root_name, ...]) | Render a DataFrame to an XML document. |
| transform(func[, axis]) | Call func on self producing a DataFrame with the same axis shape as self. |
| transpose(*args[, copy]) | Transpose index and columns. |
| truediv(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv). |
| truncate([before, after, axis, copy]) | Truncate a Series or DataFrame before and after some index value. |
| tz_convert(tz[, axis, level, copy]) | Convert tz-aware axis to target time zone. |
| tz_localize(tz[, axis, level, copy, ...]) | Localize time zone naive index of a Series or DataFrame to target time zone. |
| unstack([level, fill_value, sort]) | Pivot a level of the (necessarily hierarchical) index labels. |
| update(other[, join, overwrite, ...]) | Modify in place using non-NA values from another DataFrame. |
| value_counts([subset, normalize, sort, ...]) | Return a Series containing the frequency of each distinct row in the DataFrame. |
| var(*[, axis, skipna, ddof, numeric_only]) | Return unbiased variance over requested axis. |
| where(cond[, other, inplace, axis, level]) | Replace values where the condition is False. |
| xs(key[, axis, level, drop_level]) | Return cross-section from the Series/DataFrame. |
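
The comparison methods and the `where` / `mask` pair are easy to confuse, so a short sketch may help. The frame and the thresholds are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 5, 3], "b": [4, 2, 6]})

# Element-wise comparisons; each returns a boolean DataFrame
is_equal = df.eq(3)   # equivalent to df == 3
at_most = df.le(3)    # equivalent to df <= 3
below = df.lt(3)      # equivalent to df < 3

# where keeps values where the condition is True and replaces the rest
kept = df.where(df > 2, other=0)

# mask is the inverse: it replaces values where the condition is True
hidden = df.mask(df > 2, other=0)

print(kept)
print(hidden)
```

Every cell keeps its original value in exactly one of `kept` and `hidden`, and is replaced by `other` in the remaining one.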