pandas.DataFrame (pandas 3.0.0.dev0+2102.g839747c3f6 documentation)
Two-dimensional, size-mutable, potentially heterogeneous tabular data.
Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
Constructing DataFrame from a dictionary.
Notice that the inferred dtype is int64.
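
The code these two sentences describe is not shown here, so the following is a minimal sketch of it. The column names `col1` and `col2` and the integer values are illustrative assumptions, not taken from the text above.

```python
import pandas as pd

# Hypothetical input: each dict key becomes a column label,
# each list becomes that column's values.
d = {"col1": [1, 2], "col2": [3, 4]}
df = pd.DataFrame(data=d)

print(df)

# Lists of plain Python integers are inferred as int64 columns.
print(df.dtypes)
```

If a different integer width is wanted, the constructor's `dtype` argument (or a later `astype` call) can be used instead of relying on inference.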
Attribute | Description |
---|---|
T | The transpose of the DataFrame. |
at | Access a single value for a row/column label pair. |
attrs | Dictionary of global attributes of this dataset. |
axes | Return a list representing the axes of the DataFrame. |
columns | The column labels of the DataFrame. |
dtypes | Return the dtypes in the DataFrame. |
empty | Indicator whether Series/DataFrame is empty. |
flags | Get the properties associated with this pandas object. |
iat | Access a single value for a row/column pair by integer position. |
iloc | Purely integer-location based indexing for selection by position. |
index | The index (row labels) of the DataFrame. |
loc | Access a group of rows and columns by label(s) or a boolean array. |
ndim | Return an int representing the number of axes / array dimensions. |
shape | Return a tuple representing the dimensionality of the DataFrame. |
size | Return an int representing the number of elements in this object. |
style | Returns a Styler object. |
values | Return a Numpy representation of the DataFrame. |
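
A short sketch of how some of the attributes listed above behave on a small frame. The frame itself (labels `a`/`b`, columns `col1`/`col2`) is a made-up example, used only to contrast label-based access (`at`, `loc`) with position-based access (`iat`, `iloc`).

```python
import pandas as pd

# Hypothetical 2x2 frame with string row labels.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]}, index=["a", "b"])

print(df.shape)  # (2, 2)
print(df.ndim)   # 2
print(df.size)   # 4
print(df.empty)  # False

# Single-value access: by label pair, then by integer position.
by_label = df.at["b", "col1"]
by_position = df.iat[1, 0]

# Group access: one row by label, one column by position.
row_a = df.loc["a"]
first_column = df.iloc[:, 0]

# Transposing swaps the row and column labels.
transposed = df.T
print(list(transposed.index), list(transposed.columns))
```

Here `by_label` and `by_position` refer to the same cell, because label `b` is the second row and `col1` is the first column.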

Method | Description |
---|---|
abs() | Return a Series/DataFrame with absolute numeric value of each element. |
add(other[, axis, level, fill_value]) | Get Addition of dataframe and other, element-wise (binary operator add). |
add_prefix(prefix[, axis]) | Prefix labels with string prefix. |
add_suffix(suffix[, axis]) | Suffix labels with string suffix. |
agg([func, axis]) | Aggregate using one or more operations over the specified axis. |
aggregate([func, axis]) | Aggregate using one or more operations over the specified axis. |
align(other[, join, axis, level, copy, ...]) | Align two objects on their axes with the specified join method. |
all(*[, axis, bool_only, skipna]) | Return whether all elements are True, potentially over an axis. |
any(*[, axis, bool_only, skipna]) | Return whether any element is True, potentially over an axis. |
apply(func[, axis, raw, result_type, args, ...]) | Apply a function along an axis of the DataFrame. |
asfreq(freq[, method, how, normalize, ...]) | Convert time series to specified frequency. |
asof(where[, subset]) | Return the last row(s) without any NaNs before where. |
assign(**kwargs) | Assign new columns to a DataFrame. |
astype(dtype[, copy, errors]) | Cast a pandas object to the specified dtype. |
at_time(time[, asof, axis]) | Select values at particular time of day (e.g., 9:30AM). |
between_time(start_time, end_time[, ...]) | Select values between particular times of the day (e.g., 9:00-9:30 AM). |
bfill(*[, axis, inplace, limit, limit_area]) | Fill NA/NaN values by using the next valid observation to fill the gap. |
boxplot([column, by, ax, fontsize, rot, ...]) | Make a box plot from DataFrame columns. |
clip([lower, upper, axis, inplace]) | Trim values at input threshold(s). |
combine(other, func[, fill_value, overwrite]) | Perform column-wise combine with another DataFrame. |
combine_first(other) | Update null elements with value in the same location in other. |
compare(other[, align_axis, keep_shape, ...]) | Compare to another DataFrame and show the differences. |
convert_dtypes([infer_objects, ...]) | Convert columns from numpy dtypes to the best dtypes that support pd.NA. |
copy([deep]) | Make a copy of this object's indices and data. |
corr([method, min_periods, numeric_only]) | Compute pairwise correlation of columns, excluding NA/null values. |
corrwith(other[, axis, drop, method, ...]) | Compute pairwise correlation. |
count([axis, numeric_only]) | Count non-NA cells for each column or row. |
cov([min_periods, ddof, numeric_only]) | Compute pairwise covariance of columns, excluding NA/null values. |
cummax([axis, skipna, numeric_only]) | Return cumulative maximum over a DataFrame or Series axis. |
cummin([axis, skipna, numeric_only]) | Return cumulative minimum over a DataFrame or Series axis. |
cumprod([axis, skipna, numeric_only]) | Return cumulative product over a DataFrame or Series axis. |
cumsum([axis, skipna, numeric_only]) | Return cumulative sum over a DataFrame or Series axis. |
describe([percentiles, include, exclude]) | Generate descriptive statistics. |
diff([periods, axis]) | First discrete difference of element. |
div(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv). |
divide(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv). |
dot(other) | Compute the matrix multiplication between the DataFrame and other. |
drop([labels, axis, index, columns, level, ...]) | Drop specified labels from rows or columns. |
drop_duplicates([subset, keep, inplace, ...]) | Return DataFrame with duplicate rows removed. |
droplevel(level[, axis]) | Return Series/DataFrame with requested index / column level(s) removed. |
dropna(*[, axis, how, thresh, subset, ...]) | Remove missing values. |
duplicated([subset, keep]) | Return boolean Series denoting duplicate rows. |
eq(other[, axis, level]) | Get Equal to of dataframe and other, element-wise (binary operator eq). |
equals(other) | Test whether two objects contain the same elements. |
eval(expr, *[, inplace]) | Evaluate a string describing operations on DataFrame columns. |
ewm([com, span, halflife, alpha, ...]) | Provide exponentially weighted (EW) calculations. |
expanding([min_periods, method]) | Provide expanding window calculations. |
explode(column[, ignore_index]) | Transform each element of a list-like to a row, replicating index values. |
ffill(*[, axis, inplace, limit, limit_area]) | Fill NA/NaN values by propagating the last valid observation to next valid. |
fillna(value, *[, axis, inplace, limit]) | Fill NA/NaN values with value. |
filter([items, like, regex, axis]) | Subset the DataFrame or Series according to the specified index labels. |
first_valid_index() | Return index for first non-missing value or None, if no value is found. |
floordiv(other[, axis, level, fill_value]) | Get Integer division of dataframe and other, element-wise (binary operator floordiv). |
from_dict(data[, orient, dtype, columns]) | Construct DataFrame from dict of array-like or dicts. |
from_records(data[, index, exclude, ...]) | Convert structured or record ndarray to DataFrame. |
ge(other[, axis, level]) | Get Greater than or equal to of dataframe and other, element-wise (binary operator ge). |
get(key[, default]) | Get item from object for given key (ex: DataFrame column). |
groupby([by, level, as_index, sort, ...]) | Group DataFrame using a mapper or by a Series of columns. |
gt(other[, axis, level]) | Get Greater than of dataframe and other, element-wise (binary operator gt). |
head([n]) | Return the first n rows. |
hist([column, by, grid, xlabelsize, xrot, ...]) | Make a histogram of the DataFrame's columns. |
idxmax([axis, skipna, numeric_only]) | Return index of first occurrence of maximum over requested axis. |
idxmin([axis, skipna, numeric_only]) | Return index of first occurrence of minimum over requested axis. |
infer_objects([copy]) | Attempt to infer better dtypes for object columns. |
info([verbose, buf, max_cols, memory_usage, ...]) | Print a concise summary of a DataFrame. |
insert(loc, column, value[, allow_duplicates]) | Insert column into DataFrame at specified location. |
interpolate([method, axis, limit, inplace, ...]) | Fill NaN values using an interpolation method. |
isetitem(loc, value) | Set the given value in the column with position loc. |
isin(values) | Whether each element in the DataFrame is contained in values. |
isna() | Detect missing values. |
isnull() | DataFrame.isnull is an alias for DataFrame.isna. |
items() | Iterate over (column name, Series) pairs. |
iterrows() | Iterate over DataFrame rows as (index, Series) pairs. |
itertuples([index, name]) | Iterate over DataFrame rows as namedtuples. |
join(other[, on, how, lsuffix, rsuffix, ...]) | Join columns of another DataFrame. |
keys() | Get the 'info axis' (see Indexing for more). |
kurt(*[, axis, skipna, numeric_only]) | Return unbiased kurtosis over requested axis. |
kurtosis(*[, axis, skipna, numeric_only]) | Return unbiased kurtosis over requested axis. |
last_valid_index() | Return index for last non-missing value or None, if no value is found. |
le(other[, axis, level]) | Get Less than or equal to of dataframe and other, element-wise (binary operator le). |
lt(other[, axis, level]) | Get Less than of dataframe and other, element-wise (binary operator lt). |
map(func[, na_action]) | Apply a function to a Dataframe elementwise. |
mask(cond[, other, inplace, axis, level]) | Replace values where the condition is True. |
max(*[, axis, skipna, numeric_only]) | Return the maximum of the values over the requested axis. |
mean(*[, axis, skipna, numeric_only]) | Return the mean of the values over the requested axis. |
median(*[, axis, skipna, numeric_only]) | Return the median of the values over the requested axis. |
melt([id_vars, value_vars, var_name, ...]) | Unpivot DataFrame from wide to long format, optionally leaving identifiers set. |
memory_usage([index, deep]) | Return the memory usage of each column in bytes. |
merge(right[, how, on, left_on, right_on, ...]) | Merge DataFrame or named Series objects with a database-style join. |
min(*[, axis, skipna, numeric_only]) | Return the minimum of the values over the requested axis. |
mod(other[, axis, level, fill_value]) | Get Modulo of dataframe and other, element-wise (binary operator mod). |
mode([axis, numeric_only, dropna]) | Get the mode(s) of each element along the selected axis. |
mul(other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator mul). |
multiply(other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator mul). |
ne(other[, axis, level]) | Get Not equal to of dataframe and other, element-wise (binary operator ne). |
nlargest(n, columns[, keep]) | Return the first n rows ordered by columns in descending order. |
notna() | Detect existing (non-missing) values. |
notnull() | DataFrame.notnull is an alias for DataFrame.notna. |
nsmallest(n, columns[, keep]) | Return the first n rows ordered by columns in ascending order. |
nunique([axis, dropna]) | Count number of distinct elements in specified axis. |
pct_change([periods, fill_method, freq]) | Fractional change between the current and a prior element. |
pipe(func, *args, **kwargs) | Apply chainable functions that expect Series or DataFrames. |
pivot(*, columns[, index, values]) | Return reshaped DataFrame organized by given index / column values. |
pivot_table([values, index, columns, ...]) | Create a spreadsheet-style pivot table as a DataFrame. |
pop(item) | Return item and drop it from DataFrame. |
pow(other[, axis, level, fill_value]) | Get Exponential power of dataframe and other, element-wise (binary operator pow). |
prod(*[, axis, skipna, numeric_only, min_count]) | Return the product of the values over the requested axis. |
product(*[, axis, skipna, numeric_only, ...]) | Return the product of the values over the requested axis. |
quantile([q, axis, numeric_only, ...]) | Return values at the given quantile over requested axis. |
query(expr, *[, inplace]) | Query the columns of a DataFrame with a boolean expression. |
radd(other[, axis, level, fill_value]) | Get Addition of dataframe and other, element-wise (binary operator radd). |
rank([axis, method, numeric_only, ...]) | Compute numerical data ranks (1 through n) along axis. |
rdiv(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
reindex([labels, index, columns, axis, ...]) | Conform DataFrame to new index with optional filling logic. |
reindex_like(other[, method, copy, limit, ...]) | Return an object with matching indices as other object. |
rename([mapper, index, columns, axis, copy, ...]) | Rename columns or index labels. |
rename_axis([mapper, index, columns, axis, ...]) | Set the name of the axis for the index or columns. |
reorder_levels(order[, axis]) | Rearrange index or column levels using input order. |
replace([to_replace, value, inplace, regex]) | Replace values given in to_replace with value. |
resample(rule[, closed, label, convention, ...]) | Resample time-series data. |
reset_index([level, drop, inplace, ...]) | Reset the index, or a level of it. |
rfloordiv(other[, axis, level, fill_value]) | Get Integer division of dataframe and other, element-wise (binary operator rfloordiv). |
rmod(other[, axis, level, fill_value]) | Get Modulo of dataframe and other, element-wise (binary operator rmod). |
rmul(other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator rmul). |
rolling(window[, min_periods, center, ...]) | Provide rolling window calculations. |
round([decimals]) | Round numeric columns in a DataFrame to a variable number of decimal places. |
rpow(other[, axis, level, fill_value]) | Get Exponential power of dataframe and other, element-wise (binary operator rpow). |
rsub(other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator rsub). |
rtruediv(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
sample([n, frac, replace, weights, ...]) | Return a random sample of items from an axis of object. |
select_dtypes([include, exclude]) | Return a subset of the DataFrame's columns based on the column dtypes. |
sem(*[, axis, skipna, ddof, numeric_only]) | Return unbiased standard error of the mean over requested axis. |
set_axis(labels, *[, axis, copy]) | Assign desired index to given axis. |
set_flags(*[, copy, allows_duplicate_labels]) | Return a new object with updated flags. |
set_index(keys, *[, drop, append, inplace, ...]) | Set the DataFrame index using existing columns. |
shift([periods, freq, axis, fill_value, suffix]) | Shift index by desired number of periods with an optional time freq. |
skew(*[, axis, skipna, numeric_only]) | Return unbiased skew over requested axis. |
sort_index(*[, axis, level, ascending, ...]) | Sort object by labels (along an axis). |
sort_values(by, *[, axis, ascending, ...]) | Sort by the values along either axis. |
squeeze([axis]) | Squeeze 1 dimensional axis objects into scalars. |
stack([level, dropna, sort, future_stack]) | Stack the prescribed level(s) from columns to index. |
std(*[, axis, skipna, ddof, numeric_only]) | Return sample standard deviation over requested axis. |
sub(other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator sub). |
subtract(other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator sub). |
sum(*[, axis, skipna, numeric_only, min_count]) | Return the sum of the values over the requested axis. |
swaplevel([i, j, axis]) | Swap levels i and j in a MultiIndex. |
tail([n]) | Return the last n rows. |
take(indices[, axis]) | Return the elements in the given positional indices along an axis. |
to_clipboard(*[, excel, sep]) | Copy object to the system clipboard. |
to_csv([path_or_buf, sep, na_rep, ...]) | Write object to a comma-separated values (csv) file. |
to_dict([orient, into, index]) | Convert the DataFrame to a dictionary. |
to_excel(excel_writer, *[, sheet_name, ...]) | Write object to an Excel sheet. |
to_feather(path, **kwargs) | Write a DataFrame to the binary Feather format. |
to_hdf(path_or_buf, *, key[, mode, ...]) | Write the contained data to an HDF5 file using HDFStore. |
to_html([buf, columns, col_space, header, ...]) | Render a DataFrame as an HTML table. |
to_json([path_or_buf, orient, date_format, ...]) | Convert the object to a JSON string. |
to_latex([buf, columns, header, index, ...]) | Render object to a LaTeX tabular, longtable, or nested table. |
to_markdown([buf, mode, index, storage_options]) | Print DataFrame in Markdown-friendly format. |
to_numpy([dtype, copy, na_value]) | Convert the DataFrame to a NumPy array. |
to_orc([path, engine, index, engine_kwargs]) | Write a DataFrame to the Optimized Row Columnar (ORC) format. |
to_parquet([path, engine, compression, ...]) | Write a DataFrame to the binary parquet format. |
to_period([freq, axis, copy]) | Convert DataFrame from DatetimeIndex to PeriodIndex. |
to_pickle(path, *[, compression, protocol, ...]) | Pickle (serialize) object to file. |
to_records([index, column_dtypes, index_dtypes]) | Convert DataFrame to a NumPy record array. |
to_sql(name, con, *[, schema, if_exists, ...]) | Write records stored in a DataFrame to a SQL database. |
to_stata(path, *[, convert_dates, ...]) | Export DataFrame object to Stata dta format. |
to_string([buf, columns, col_space, header, ...]) | Render a DataFrame to a console-friendly tabular output. |
to_timestamp([freq, how, axis, copy]) | Cast PeriodIndex to DatetimeIndex of timestamps, at beginning of period. |
to_xarray() | Return an xarray object from the pandas object. |
to_xml([path_or_buffer, index, root_name, ...]) | Render a DataFrame to an XML document. |
transform(func[, axis]) | Call func on self producing a DataFrame with the same axis shape as self. |
transpose(*args[, copy]) | Transpose index and columns. |
truediv(other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv). |
truncate([before, after, axis, copy]) | Truncate a Series or DataFrame before and after some index value. |
tz_convert(tz[, axis, level, copy]) | Convert tz-aware axis to target time zone. |
tz_localize(tz[, axis, level, copy, ...]) | Localize time zone naive index of a Series or DataFrame to target time zone. |
unstack([level, fill_value, sort]) | Pivot a level of the (necessarily hierarchical) index labels. |
update(other[, join, overwrite, ...]) | Modify in place using non-NA values from another DataFrame. |
value_counts([subset, normalize, sort, ...]) | Return a Series containing the frequency of each distinct row in the DataFrame. |
var(*[, axis, skipna, ddof, numeric_only]) | Return unbiased variance over requested axis. |
where(cond[, other, inplace, axis, level]) | Replace values where the condition is False. |
xs(key[, axis, level, drop_level]) | Return cross-section from the Series/DataFrame. |
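
The sketch below illustrates three groups of methods from the table: label-aligned arithmetic (`add` with `fill_value`), the flexible comparisons (`eq`, `le`, `lt`), and the complementary pair `where` / `mask`. All frames and values are hypothetical and chosen only to keep the results easy to check by hand.

```python
import pandas as pd

# --- Arithmetic aligns on both row and column labels ---
left = pd.DataFrame({"x": [1, 2], "y": [10, 20]}, index=["a", "b"])
right = pd.DataFrame({"x": [100, 200]}, index=["b", "c"])

# With the + operator, any cell missing on either side becomes NaN.
plain_sum = left + right

# add() aligns the same way, but fill_value stands in for a cell that is
# missing on exactly one side. Cells missing on both sides stay NaN.
filled_sum = left.add(right, fill_value=0)

# --- Flexible comparisons return boolean frames ---
scores = pd.DataFrame({"col1": [1, 2, 3], "col2": [3, 2, 1]})
is_two = scores.eq(2)       # element-wise ==
at_most_two = scores.le(2)  # element-wise <=
below_two = scores.lt(2)    # element-wise <

# --- where() and mask() are complements ---
# where keeps values where the condition is True and replaces the rest;
# mask replaces values where the condition is True and keeps the rest.
kept = scores.where(scores > 1)
hidden = scores.mask(scores > 1)

print(plain_sum)
print(filled_sum)
print(is_two)
print(kept)
print(hidden)
```

Because no replacement value is passed as `other`, both `where` and `mask` fall back to NaN for the replaced cells, which also turns the affected integer columns into floating-point columns.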