LLaMA Implementation by zphang · Pull Request #21955 · huggingface/transformers


Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: edbeeching edbeeching@users.noreply.github.com


Co-authored-by: Matt Rocketknight1@users.noreply.github.com

Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Arthur 48595927+ArthurZucker@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Arthur 48595927+ArthurZucker@users.noreply.github.com Co-authored-by: sanchit-gandhi sanchit@huggingface.co Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Fix resume_from_checkpoint for deepspeed, by ensuring that the deepspeed engine is the one to load the checkpoint.

Removed deepspeed skipping inside the _load_from_checkpoint function, as it is obsolete


Co-authored-by: ydshieh ydshieh@users.noreply.github.com Co-authored-by: Stas Bekman stas@stason.org

Fix docstring gpt2 config

make concrete_args from outside available

Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com

Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com


Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com

Co-authored-by: sgugger sylvain.gugger@gmail.com


Co-authored-by: sgugger sylvain.gugger@gmail.com

fix nn.init.trunc_normal_ call on half data

fix quality with ruff 0.0.253

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Let's give TF a bit more love ❤️ 🙏

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: saswatmeher saswatmeher@cse.iitb.ac.in

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com Co-authored-by: Alara Dirik 8944735+alaradirik@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com Co-authored-by: Alara Dirik 8944735+alaradirik@users.noreply.github.com Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: akkikiki akkikiki@users.noreply.github.com


Co-authored-by: Tiep Le 97980157+tileintel@users.noreply.github.com Co-authored-by: Tiep Le tiep.le@intel.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: saswatmeher saswatmeher@cse.iitb.ac.in


Co-authored-by: Stas Bekman stas@stason.org

Italian translation of community.mdx gh-17459

fix blip doctest

removed BLIP mention from the troubleshooting guide


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

apply normal_ after assigning weight as nn.Parameter to avoid unnecessary initialization computation

Co-authored-by: Joao Gante joaofranciscocardosogante@gmail.com


Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.

Co-authored-by: saswatmeher saswatmeher@cse.iitb.ac.in


Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

add correct revision after model was overwritten

Co-authored-by: Sanchit Gandhi 93869735+sanchit-gandhi@users.noreply.github.com

Co-authored-by: bofeng huang bofenghuang7@gmail.com


Co-authored-by: Sanchit Gandhi 93869735+sanchit-gandhi@users.noreply.github.com Co-authored-by: bofeng huang bofenghuang7@gmail.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Yih-Dar 2521628+ydshieh@users.noreply.github.com


Co-authored-by: Yih-Dar 2521628+ydshieh@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Lots of details everywhere.

Co-authored-by: Arthur 48595927+ArthurZucker@users.noreply.github.com

Co-authored-by: Arthur 48595927+ArthurZucker@users.noreply.github.com


Co-authored-by: Arthur 48595927+ArthurZucker@users.noreply.github.com


Co-authored-by: NielsRogge NielsRogge@users.noreply.github.com Co-authored-by: ydshieh ydshieh@users.noreply.github.com

skip for now

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

The pipeline makes a separate call to the model for each candidate label. This commit combines all labels into one call. The original code takes more than 60 seconds to process one image and 1000 candidate labels. The updated code takes less than 2 seconds.
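
As a rough illustration of the change described here, the sketch below contrasts one model call per candidate label with a single call that receives all labels. CountingScorer is a hypothetical stand-in, not the pipeline or CLIP: its scoring rule is arbitrary, and it exists only to count calls.

```python
from typing import List


class CountingScorer:
    """Hypothetical stand-in for a model; it only counts forward calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, image: str, labels: List[str]) -> List[float]:
        self.calls += 1
        # Arbitrary toy score: distinct characters shared by image name and label.
        return [float(len(set(image) & set(label))) for label in labels]


def classify_per_label(model, image, labels):
    # Behaviour before the change, as described: one call per candidate label.
    return [model(image, [label])[0] for label in labels]


def classify_batched(model, image, labels):
    # Behaviour after the change, as described: all labels in a single call.
    return model(image, labels)


labels = [f"label {i}" for i in range(1000)]

slow = CountingScorer()
fast = CountingScorer()
scores_slow = classify_per_label(slow, "cat.png", labels)
scores_fast = classify_batched(fast, "cat.png", labels)

print(slow.calls, fast.calls)  # 1000 1
```

Both paths return the same scores in this toy setup; only the number of calls differs, which is where a real model would save time.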

Unfortunately super tailored towards CLIP.

Co-Authored-By: Yessen Kanapin yessen@deepinfra.com


Co-authored-by: Yessen Kanapin yessen@deepinfra.com

Co-authored-by: NielsRogge 48327001+NielsRogge@users.noreply.github.com


Co-authored-by: NielsRogge 48327001+NielsRogge@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

upgrade to large VM

Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: younesbelakda younesbelkada@gmail.com Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com

Fix feature normalization in WhisperFeatureExtractor

update values

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Disable DDP for neuron

Co-authored-by: EC2 Default User ec2-user@ip-172-31-42-72.us-west-2.compute.internal

Co-authored-by: saswatmeher saswatmeher@cse.iitb.ac.in

Step 1 - Change use_cache fix

Four parameters in the LayoutLM config were missing definitions. Added their definitions (copied from BertConfig).

Use larger atol

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

update expected values for xglm

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Make Format


Co-authored-by: pdhall99


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Remove cast to Bool


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

skip test_multi_gpu_data_parallel_forward for some model tests

Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: Niels Rogge nielsrogge@Nielss-MacBook-Pro.local

This reverts commit 11a081e09e92771e51a5d2758d53a9afb59547f0.

Co-authored-by: NielsRogge 48327001+NielsRogge@users.noreply.github.com


This reverts commit b4cbddfa05e3bd739b79569cd3c3b89e316f2451.


Co-authored-by: Kashif Rasul kashif.rasul@gmail.com Co-authored-by: NielsRogge 48327001+NielsRogge@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

cur_len is 1 token shorter than the length of the sequence whose best_sum_logprobs is the numerator.
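
To see why an off-by-one in the length matters, here is a minimal sketch that assumes the common length-normalized beam score, sum_logprobs / length ** length_penalty. The function name and the numbers are made up for illustration; this is not the library's code.

```python
def normalized_score(sum_logprobs: float, length: int, length_penalty: float = 1.0) -> float:
    """Length-normalized beam score (assumed form: sum / length ** penalty)."""
    return sum_logprobs / (length ** length_penalty)


best_sum_logprobs = -6.0  # illustrative value for a sequence of 4 tokens
cur_len = 3               # one token shorter than that sequence

# Dividing by cur_len uses a length that does not match the numerator.
mismatched = normalized_score(best_sum_logprobs, cur_len)
# Dividing by cur_len + 1 matches the sequence the log-probabilities came from.
matched = normalized_score(best_sum_logprobs, cur_len + 1)

print(mismatched, matched)  # -2.0 -1.5
```

With the shorter length the score looks worse than it should, which can change any comparison made against it.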


Co-authored-by: Chiming chiming@biomap.com

Use valid dummy pixel values

fix

Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: Tiep Le 97980157+tileintel@users.noreply.github.com Co-authored-by: Tiep Le tiep.le@intel.com

add tokenize_kwargs doc in the FeatureExtractionPipeline

Co-authored-by: Sanchit Gandhi 93869735+sanchit-gandhi@users.noreply.github.com

Co-authored-by: Sanchit Gandhi 93869735+sanchit-gandhi@users.noreply.github.com


Co-authored-by: Sanchit Gandhi 93869735+sanchit-gandhi@users.noreply.github.com


Co-authored-by: Joao Gante joaofranciscocardosogante@gmail.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

fix slow tokenizers with passing offset_mapping


Co-authored-by: njindal njindal@adobe.com Co-authored-by: Joao Gante joaofranciscocardosogante@gmail.com

In ZSH, not using ' ' around pip install fails

Running

pip install transformers[torch]

in the default ZSH terminal will fail with the error zsh: no matches found: transformers[torch]

The solution is to wrap the package specifier in single quotes:

pip install 'transformers[torch]'

Relevant StackOverflow: https://stackoverflow.com/questions/30539798/zsh-no-matches-found-requestssecurity
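
The failure comes from the square brackets being read as a glob character class. The sketch below uses Python's fnmatch, whose bracket rules approximate shell globbing but are not zsh itself, to show what the unquoted pattern would match, and shlex.quote to produce the quoted form.

```python
import fnmatch
import shlex

spec = "transformers[torch]"

# Unquoted, the brackets form a character class: "transformers" followed by
# exactly one of the characters t, o, r, c, h.
matches_glob = fnmatch.fnmatch("transformerst", spec)
# The literal string itself does not match its own pattern.
matches_itself = fnmatch.fnmatch(spec, spec)

# shlex.quote adds single quotes when a string contains shell-special characters.
quoted = shlex.quote(spec)
command = f"pip install {quoted}"

print(matches_glob, matches_itself)  # True False
print(command)  # pip install 'transformers[torch]'
```

Since no file in the working directory usually matches that pattern, zsh reports "no matches found" instead of passing the text through to pip.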


Co-authored-by: testbot lucainp@hf.co


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

This reverts commit cd5179070930e03020d96d98eb51dec3eb21ef75.

rm $ symbol from code block

Removed the $ symbol from the code block to make copy-pasting easier.


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Update the script

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: EC2 Default User ec2-user@ip-172-31-42-72.us-west-2.compute.internal

fix broken links


Co-authored-by: Joao Gante joaofranciscocardosogante@gmail.com

fix hint

Revert "[GPT2] Propose fix for #21080 (#21853)" to avoid CI failure

This reverts commit a3fef89b2694fac4dd642a3f77d3e96d0c3df82a.

Adds AutoModelForZeroShotImageClassification to transformers


Co-authored-by: yue kun yuekun.wp@alibaba-inc.com

skip accelerate test

Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com


Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com

Co-authored-by: Alara Dirik 8944735+alaradirik@users.noreply.github.com


Co-authored-by: Alara Dirik 8944735+alaradirik@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Update configuration_align.py

Updated projected_dim from 512 to 640 in the arguments of AlignConfig

Co-authored-by: Arthur 48595927+ArthurZucker@users.noreply.github.com


Co-authored-by: Arthur 48595927+ArthurZucker@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Stas Bekman stas00@users.noreply.github.com

Co-authored-by: Stas Bekman stas00@users.noreply.github.com


Co-authored-by: Stas Bekman stas00@users.noreply.github.com

Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com


Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com


Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com

Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com


Co-authored-by: Steven Liu 59462357+stevhliu@users.noreply.github.com Co-authored-by: Sylvain Gugger 35901082+sgugger@users.noreply.github.com

Co-authored-by: Sanchit Gandhi 93869735+sanchit-gandhi@users.noreply.github.com


Co-authored-by: Sanchit Gandhi 93869735+sanchit-gandhi@users.noreply.github.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

update values

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

added perf_train_cpu and perf_train_cpu_many

Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com


Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com

Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)"

This reverts commit 1c801d65eb42a71ea52db797af760bd96c8b113f.

Fix: unfinished_sequences with correct device

The original code was causing errors when running torch.jit.trace due to the tensor options being incorrect. I fixed this by using torch.ones to create a tensor with the correct device and dtype. This should resolve the issue with running torch.jit.trace.

Revert changes


Co-authored-by: Tiep Le 97980157+tileintel@users.noreply.github.com Co-authored-by: Tiep Le tiep.le@intel.com Co-authored-by: Yih-Dar 2521628+ydshieh@users.noreply.github.com Co-authored-by: ydshieh ydshieh@users.noreply.github.com


Co-authored-by: Prathik Rao prathikrao@microsoft.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Fix align docs typo

Update values

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

minor fixes


Co-authored-by: Stella Biderman stellabiderman@gmail.com


Co-authored-by: Stella Biderman stellabiderman@gmail.com


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

This reverts commit 6e95a108042118d204da447729f3834affa354fc.

This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8.


Co-authored-by: ydshieh ydshieh@users.noreply.github.com

fix

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Co-authored-by: yue kun yuekun.wp@alibaba-inc.com

Use dash 2.8.1 for now

Co-authored-by: ydshieh ydshieh@users.noreply.github.com

Signed-off-by: Wang, Yi A yi.a.wang@intel.com

Signed-off-by: Wang, Yi A yi.a.wang@intel.com


Signed-off-by: Wang, Yi A yi.a.wang@intel.com

Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com

Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com


Co-authored-by: Younes Belkada 49240599+younesbelkada@users.noreply.github.com