GitHub - narwhals-dev/narwhals: Lightweight and extensible compatibility layer between dataframe libraries!

Narwhals

Extremely lightweight and extensible compatibility layer between dataframe libraries!

Seamlessly support all, without depending on any!

Get started!

Installation

conda install -c conda-forge narwhals  

Usage

There are three steps to writing dataframe-agnostic code using Narwhals:

  1. use narwhals.from_native to wrap a pandas/Polars/Modin/cuDF/PyArrow DataFrame/LazyFrame in a Narwhals class
  2. use the subset of the Polars API supported by Narwhals
  3. use narwhals.to_native to return an object to the user in its original dataframe flavour. For example:
    • if you started with pandas, you'll get pandas back
    • if you started with Polars, you'll get Polars back
    • if you started with Modin, you'll get Modin back (and compute will be distributed)
    • if you started with cuDF, you'll get cuDF back (and compute will happen on GPU)
    • if you started with PyArrow, you'll get PyArrow back

Example

Narwhals allows you to define dataframe-agnostic functions. For example:

import narwhals as nw
from narwhals.typing import IntoFrameT


def agnostic_function(
    df_native: IntoFrameT,
    date_column: str,
    price_column: str,
) -> IntoFrameT:
    return (
        nw.from_native(df_native)
        .group_by(nw.col(date_column).dt.truncate("1mo"))
        .agg(nw.col(price_column).mean())
        .sort(date_column)
        .to_native()
    )

You can then pass pandas.DataFrame, polars.DataFrame, polars.LazyFrame, duckdb.DuckDBPyRelation, pyspark.sql.DataFrame, pyarrow.Table, and more, to agnostic_function. In each case, no additional dependencies will be required, and computation will stay native to the input library:

import pandas as pd
import polars as pl
from datetime import datetime

data = {
    "date": [datetime(2020, 1, 1), datetime(2020, 1, 8), datetime(2020, 2, 3)],
    "price": [1, 4, 3],
}
print("pandas result:")
print(agnostic_function(pd.DataFrame(data), "date", "price"))
print()
print("Polars result:")
print(agnostic_function(pl.DataFrame(data), "date", "price"))

pandas result:
        date  price
0 2020-01-01    2.5
1 2020-02-01    3.0

Polars result:
shape: (2, 2)
┌─────────────────────┬───────┐
│ date                ┆ price │
│ ---                 ┆ ---   │
│ datetime[μs]        ┆ f64   │
╞═════════════════════╪═══════╡
│ 2020-01-01 00:00:00 ┆ 2.5   │
│ 2020-02-01 00:00:00 ┆ 3.0   │
└─────────────────────┴───────┘

See the tutorial for several examples!

Scope

If you said yes to both, we'd love to hear from you!

Roadmap

See the roadmap discussion on GitHub for an up-to-date plan of future work.

Used by

Join the party!

Feel free to add your project to the list if it's missing, and/or chat with us on Discord if you'd like any support.

Sponsors and institutional partners

Narwhals is 100% independent, community-driven, and community-owned. We are extremely grateful to the following organisations for having provided some funding / development time:

If you contribute to Narwhals on your organization's time, please let us know. We'd be happy to add your employer to this list!

Support

If you'd like to say "thank you", please give us a ⭐ star ⭐.

Please contact hello_narwhals@proton.me if you would like to:

Appears on

Narwhals has been featured in several talks, podcasts, and blog posts:

Why "Narwhals"?

Coz they are so awesome.

Thanks to Olha Urdeichuk for the illustration!