torch_frame.datasets — pytorch-frame documentation

AdultCensusIncome

The Adult Census Income dataset from Kaggle.

AmazonFineFoodReviews

The Amazon Fine Food Reviews dataset.

Amphibians

The Amphibians dataset.

BankMarketing

The Bank Marketing dataset.

DataFrameBenchmark

A collection of standardized datasets for tabular learning, covering categorical and numerical features.
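As a rough sketch of how one dataset from this collection might be selected, the helper below picks a benchmark by task type, scale and index. The helper name `load_benchmark` is hypothetical, and the constructor arguments (`task_type`, `scale`, `idx`), the three scale names, and the `split()` call are assumptions based on typical pytorch-frame usage rather than anything stated in this listing; check them against the class reference.

```python
# Assumed scale names; verify against the DataFrameBenchmark reference.
ASSUMED_SCALES = ("small", "medium", "large")


def load_benchmark(root, task_type_name="BINARY_CLASSIFICATION",
                   scale="small", idx=0):
    """Hypothetical helper: load one DataFrameBenchmark dataset and split it."""
    if scale not in ASSUMED_SCALES:
        raise ValueError(f"scale must be one of {ASSUMED_SCALES}, got {scale!r}")

    # Imported lazily so this sketch can be imported without pytorch-frame.
    from torch_frame.datasets import DataFrameBenchmark
    from torch_frame.typing import TaskType

    dataset = DataFrameBenchmark(
        root=root,
        task_type=TaskType[task_type_name],  # look up the enum member by name
        scale=scale,
        idx=idx,
    )
    dataset.materialize()  # build the tensor representation of the table
    # Assumed to return (train, val, test) using the pre-defined split.
    return dataset.split()
```

Calling `load_benchmark("/tmp/benchmark")` would download the selected dataset on first use, so it needs network access and a writable `root`.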

DataFrameTextBenchmark

A collection of datasets for tabular learning with text columns, covering categorical, numerical, multi-categorical and timestamp features.

Dota2

The Dota2 Game Results dataset.

Titanic

The Titanic dataset from the Titanic Kaggle competition.
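Most of the single-table datasets on this page are used in the same way: construct the class with a `root` directory, then materialize it. The sketch below illustrates that pattern for the Titanic dataset. The helper names `load_titanic` and `split_80_20` and the default path are hypothetical, and the fractional slicing is assumed to behave as in the pytorch-frame introductory examples.

```python
def load_titanic(root="/tmp/titanic"):
    """Hypothetical helper: download (if needed) and materialize Titanic."""
    # Imported lazily so this sketch can be imported without pytorch-frame.
    from torch_frame.datasets import Titanic

    dataset = Titanic(root=root)
    dataset.materialize()  # converts the underlying DataFrame to tensors
    return dataset


def split_80_20(dataset):
    """Shuffle a dataset and split it 80/20 using fractional slices."""
    dataset = dataset.shuffle()
    return dataset[:0.8], dataset[0.8:]
```

A typical call would be `train, test = split_80_20(load_titanic())`, which requires pytorch-frame to be installed and network access for the first download.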

ForestCoverType

The Forest Cover Type dataset from Kaggle.

HuggingFaceDatasetDict

Loads a Hugging Face datasets.DatasetDict into a torch_frame.data.Dataset with pre-defined split information.

KDDCensusIncome

The KDD Census Income dataset.

Mercari

The Mercari Price Suggestion Challenge dataset from Kaggle.

Movielens1M

The MovieLens 1M rating dataset, assembled by GroupLens Research from the MovieLens website, consisting of movies (3,883 nodes) and users (6,040 nodes) with approximately 1 million ratings between them.

MultimodalTextBenchmark

The benchmark datasets of tabular data with text columns used by "Benchmarking Multimodal AutoML for Tabular Data with Text Fields".

Mushroom

The Mushroom classification Kaggle competition dataset.

PokerHand

The Poker Hand dataset.

TabularBenchmark

A collection of tabular benchmark datasets introduced in "Why do tree-based models still outperform deep learning on tabular data?".

Yandex

The Yandex dataset collections used by "Revisiting Deep Learning Models for Tabular Data".

DiamondImages

The Diamond Images dataset from Kaggle.