torch_frame.datasets

AdultCensusIncome

The Adult Census Income dataset from Kaggle.

AmazonFineFoodReviews

The Amazon Fine Food Reviews dataset.

Amphibians

The Amphibians dataset.

BankMarketing

The Bank Marketing dataset.

DataFrameBenchmark

A collection of standardized datasets for tabular learning, covering categorical and numerical features.

DataFrameTextBenchmark

A collection of datasets for tabular learning with text columns, covering categorical, numerical, multi-categorical and timestamp features.

Dota2

The Dota2 Game Results dataset.

Titanic

The Titanic dataset from the Titanic Kaggle competition.

ForestCoverType

The Forest Cover Type dataset from Kaggle.

HuggingFaceDatasetDict

Loads a Hugging Face datasets.DatasetDict dataset into a torch_frame.data.Dataset with pre-defined split information.

KDDCensusIncome

The KDD Census Income dataset.

Mercari

The Mercari Price Suggestion Challenge dataset from Kaggle.

Movielens1M

The MovieLens 1M rating dataset, assembled by GroupLens Research from the MovieLens web site, consisting of movies (3,883 nodes) and users (6,040 nodes) with approximately 1 million ratings between them.

MultimodalTextBenchmark

The benchmark datasets of tabular data with text columns used by "Benchmarking Multimodal AutoML for Tabular Data with Text Fields".

Mushroom

The Mushroom classification Kaggle competition dataset.

PokerHand

The Poker Hand dataset.

TabularBenchmark

A collection of tabular benchmark datasets introduced in "Why do tree-based models still outperform deep learning on tabular data?".

Yandex

The Yandex dataset collections used by "Revisiting Deep Learning Models for Tabular Data".

DiamondImages

The Diamond Images dataset from Kaggle.
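
Usage sketch

The classes listed above are typically instantiated with a local root directory and then materialized into tensors. The snippet below is a minimal sketch using Titanic as the example. The helper name load_titanic and the path "/tmp/titanic" are illustrative assumptions, not part of the library. Other datasets may need extra constructor arguments, so check each class's own reference entry.

```python
def load_titanic(root: str = "/tmp/titanic"):
    """Download (if needed) and materialize the Titanic dataset.

    `load_titanic` and the default `root` path are hypothetical names
    chosen for this sketch; only `Titanic` and `materialize()` come
    from torch_frame.
    """
    # Imported lazily so that defining this helper does not require
    # torch_frame to be installed.
    from torch_frame.datasets import Titanic

    dataset = Titanic(root=root)  # raw files are stored under `root`
    dataset.materialize()         # builds the TensorFrame and column stats
    return dataset
```

Calling load_titanic() requires pytorch-frame to be installed and, on first use, network access to fetch the data. After the call, dataset.tensor_frame holds the materialized features.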