Graphormer (original) (raw)

PyTorch

This model is in maintenance mode only, we don’t accept any new PRs changing its code. If you run into any issues running this model, please reinstall the last version that supported this model: v4.40.2. You can do so by running the following command: pip install -U transformers==4.40.2.

Overview

The Graphormer model was proposed in Do Transformers Really Perform Bad for Graph Representation? by Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen and Tie-Yan Liu. It is a Graph Transformer model, modified to allow computations on graphs instead of text sequences by generating embeddings and features of interest during preprocessing and collation, then using a modified attention.

The abstract from the paper is the following:

The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Besides, we mathematically characterize the expressive power of Graphormer and exhibit that with our ways of encoding the structural information of graphs, many popular GNN variants could be covered as the special cases of Graphormer.

This model was contributed by clefourrier. The original code can be found here.

Usage tips

This model will not work well on large graphs (more than 100 nodes/edges), as it will make the memory explode. You can reduce the batch size, increase your RAM, or decrease the UNREACHABLE_NODE_DISTANCE parameter in algos_graphormer.pyx, but it will be hard to go above 700 nodes/edges.

This model does not use a tokenizer, but instead a special collator during training.

GraphormerConfig

class transformers.GraphormerConfig

< source >

( num_classes: int = 1 num_atoms: int = 4608 num_edges: int = 1536 num_in_degree: int = 512 num_out_degree: int = 512 num_spatial: int = 512 num_edge_dis: int = 128 multi_hop_max_dist: int = 5 spatial_pos_max: int = 1024 edge_type: str = 'multi_hop' max_nodes: int = 512 share_input_output_embed: bool = False num_hidden_layers: int = 12 embedding_dim: int = 768 ffn_embedding_dim: int = 768 num_attention_heads: int = 32 dropout: float = 0.1 attention_dropout: float = 0.1 activation_dropout: float = 0.1 layerdrop: float = 0.0 encoder_normalize_before: bool = False pre_layernorm: bool = False apply_graphormer_init: bool = False activation_fn: str = 'gelu' embed_scale: typing.Optional[float] = None freeze_embeddings: bool = False num_trans_layers_to_freeze: int = 0 traceable: bool = False q_noise: float = 0.0 qn_block_size: int = 8 kdim: typing.Optional[int] = None vdim: typing.Optional[int] = None bias: bool = True self_attention: bool = True pad_token_id = 0 bos_token_id = 1 eos_token_id = 2 **kwargs )

Parameters

This is the configuration class to store the configuration of a ~GraphormerModel. It is used to instantiate an Graphormer model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of the Graphormergraphormer-base-pcqm4mv1 architecture.

Configuration objects inherit from PretrainedConfig and can be used to control the model outputs. Read the documentation from PretrainedConfig for more information.

GraphormerModel

class transformers.GraphormerModel

< source >

( config: GraphormerConfig )

The Graphormer model is a graph-encoder model.

It goes from a graph to its representation. If you want to use the model for a downstream classification task, use GraphormerForGraphClassification instead. For any other downstream task, feel free to add a new class, or combine this model with a downstream model of your choice, following the example in GraphormerForGraphClassification.

forward

< source >

( input_nodes: LongTensor input_edges: LongTensor attn_bias: Tensor in_degree: LongTensor out_degree: LongTensor spatial_pos: LongTensor attn_edge_type: LongTensor perturb: typing.Optional[torch.FloatTensor] = None masked_tokens: None = None return_dict: typing.Optional[bool] = None **unused )

GraphormerForGraphClassification

class transformers.GraphormerForGraphClassification

< source >

( config: GraphormerConfig )

This model can be used for graph-level classification or regression tasks.

It can be trained on

forward

< source >

( input_nodes: LongTensor input_edges: LongTensor attn_bias: Tensor in_degree: LongTensor out_degree: LongTensor spatial_pos: LongTensor attn_edge_type: LongTensor labels: typing.Optional[torch.LongTensor] = None return_dict: typing.Optional[bool] = None **unused )

< > Update on GitHub