CPMAnt

PyTorch

Overview

CPM-Ant is an open-source Chinese pre-trained language model (PLM) with 10B parameters. It is also the first milestone of the live training process of CPM-Live. The training process is cost-effective and environmentally friendly. CPM-Ant also achieves promising results with delta tuning on the CUGE benchmark. Besides the full model, we also provide various compressed versions to meet the requirements of different hardware configurations.

This model was contributed by OpenBMB. The original code can be found here.

Resources

CpmAntConfig

class transformers.CpmAntConfig

( vocab_size: int = 30720 hidden_size: int = 4096 num_attention_heads: int = 32 dim_head: int = 128 dim_ff: int = 10240 num_hidden_layers: int = 48 dropout_p: float = 0.0 position_bias_num_buckets: int = 512 position_bias_max_distance: int = 2048 eps: float = 1e-06 init_std: float = 1.0 prompt_types: int = 32 prompt_length: int = 32 segment_types: int = 32 use_cache: bool = True **kwargs )

This is the configuration class to store the configuration of a CpmAntModel. It is used to instantiate a CPM-Ant model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the CPM-Ant openbmb/cpm-ant-10b architecture.

Configuration objects inherit from PretrainedConfig and can be used to control the model outputs. Read the documentation from PretrainedConfig for more information.

Example:

```python
from transformers import CpmAntModel, CpmAntConfig

# Initializing a CPM-Ant configuration
configuration = CpmAntConfig()

# Initializing a model from the configuration
model = CpmAntModel(configuration)

# Accessing the model configuration
configuration = model.config
```

CpmAntTokenizer

class transformers.CpmAntTokenizer

( vocab_file bod_token = '&lt;d&gt;' eod_token = '&lt;/d&gt;' bos_token = '&lt;s&gt;' eos_token = '&lt;/s&gt;' pad_token = '&lt;pad&gt;' unk_token = '&lt;unk&gt;' line_token = '&lt;/n&gt;' space_token = '&lt;/_&gt;' padding_side = 'left' **kwargs )

Construct a CPMAnt tokenizer. Based on byte-level Byte-Pair-Encoding.
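
A minimal usage sketch (assuming the openbmb/cpm-ant-10b checkpoint; the tokenizer relies on jieba for Chinese word segmentation, which must be installed):

```python
from transformers import CpmAntTokenizer

tokenizer = CpmAntTokenizer.from_pretrained("openbmb/cpm-ant-10b")

# Encode Chinese text and round-trip it back to a string
ids = tokenizer("今天天气不错")["input_ids"]
print(tokenizer.decode(ids))
```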

build_inputs_with_special_tokens

( token_ids_0: list token_ids_1: typing.Optional[list[int]] = None ) → list[int]

Returns

list[int]: The model input with special tokens.

Build model inputs from a sequence or a pair of sequences for sequence classification tasks by concatenating and adding special tokens. A CPM-Ant sequence has the following format (a sketch below illustrates the single-sequence case):

- single sequence: [BOS] Sequence
- pair of sequences: [BOS] Sequence_A [BOS] Sequence_B
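
A small sketch of this behavior (assuming a tokenizer loaded from openbmb/cpm-ant-10b; the assertion reflects the single-sequence format above):

```python
from transformers import CpmAntTokenizer

tokenizer = CpmAntTokenizer.from_pretrained("openbmb/cpm-ant-10b")

token_ids = tokenizer.encode("你好", add_special_tokens=False)
with_special = tokenizer.build_inputs_with_special_tokens(token_ids)

# With the single-sequence format [BOS] Sequence, the BOS id leads the output
assert with_special == [tokenizer.bos_token_id] + token_ids
```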

get_special_tokens_mask

( token_ids_0: list token_ids_1: typing.Optional[list[int]] = None already_has_special_tokens: bool = False ) → list[int]

Returns

list[int]: A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.

Retrieve sequence ids from a token list that has no special tokens added. This method is called when adding special tokens using the tokenizer prepare_for_model method.
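
A short sketch of retrieving the mask (assuming the same openbmb/cpm-ant-10b tokenizer):

```python
from transformers import CpmAntTokenizer

tokenizer = CpmAntTokenizer.from_pretrained("openbmb/cpm-ant-10b")

ids = tokenizer.encode("你好")  # already includes the special tokens
mask = tokenizer.get_special_tokens_mask(ids, already_has_special_tokens=True)
print(mask)  # 1 marks special tokens such as BOS, 0 marks ordinary tokens
```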

CpmAntModel

class transformers.CpmAntModel

( config: CpmAntConfig )

Parameters

config (CpmAntConfig): Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

The bare CPMAnt Model outputting raw hidden-states without any specific head on top.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.Tensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None past_key_values: typing.Optional[tuple[tuple[torch.Tensor]]] = None use_cache: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None **kwargs ) → transformers.modeling_outputs.BaseModelOutputWithPast or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.BaseModelOutputWithPast or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False), comprising various elements depending on the configuration (CpmAntConfig) and inputs.

The CpmAntModel forward method, overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
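
A minimal usage sketch for extracting hidden states (note that the 10B checkpoint requires substantial memory to load):

```python
import torch
from transformers import AutoTokenizer, CpmAntModel

tokenizer = AutoTokenizer.from_pretrained("openbmb/cpm-ant-10b")
model = CpmAntModel.from_pretrained("openbmb/cpm-ant-10b")

inputs = tokenizer("今天天气不错,", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Final-layer representations, shape (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```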

CpmAntForCausalLM

class transformers.CpmAntForCausalLM

( config: CpmAntConfig )

Parameters

config (CpmAntConfig): Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

The CPMAnt Model with a language modeling head on top (linear layer with weights tied to the input embeddings).

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.Tensor] = None past_key_values: typing.Optional[list[tuple[torch.Tensor, torch.Tensor]]] = None use_cache: typing.Optional[bool] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None labels: typing.Optional[torch.Tensor] = None return_dict: typing.Optional[bool] = None attention_mask: typing.Optional[torch.Tensor] = None **kwargs ) → transformers.modeling_outputs.CausalLMOutputWithPast or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.CausalLMOutputWithPast or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False), comprising various elements depending on the configuration (CpmAntConfig) and inputs.

The CpmAntForCausalLM forward method, overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.

Example:

Text Generation with CpmAntForCausalLM.

```python
from transformers import CpmAntTokenizer, CpmAntForCausalLM

texts = "今天天气不错,"
model = CpmAntForCausalLM.from_pretrained("openbmb/cpm-ant-10b")
tokenizer = CpmAntTokenizer.from_pretrained("openbmb/cpm-ant-10b")
input_ids = tokenizer(texts, return_tensors="pt")
outputs = model.generate(**input_ids)
output_texts = tokenizer.batch_decode(outputs)
print(output_texts)
# ['今天天气不错,阳光明媚,我和妈妈一起去超市买东西。\n在超市里,我看到了一个很好玩的玩具,它的名字叫“机器人”。它有一个圆圆的脑袋,两只圆圆的眼睛,还有一个圆圆的']
```
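
Since forward() also accepts labels, here is a minimal sketch of computing a language-modeling loss by reusing the inputs as labels (the exact loss convention follows the model's own implementation):

```python
from transformers import CpmAntTokenizer, CpmAntForCausalLM

tokenizer = CpmAntTokenizer.from_pretrained("openbmb/cpm-ant-10b")
model = CpmAntForCausalLM.from_pretrained("openbmb/cpm-ant-10b")

inputs = tokenizer("今天天气不错,", return_tensors="pt")
# Passing labels populates CausalLMOutputWithPast.loss
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)
```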
