Add support for BERT embedding models by iamlemec · Pull Request #5423 · ggml-org/llama.cpp
and others added 7 commits
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request
- BERT model graph construction (build_bert)
- WordPiece tokenizer (llm_tokenize_wpm)
- Add flag for non-causal attention models
- Allow for models that only output embeddings
- Support conversion of BERT models to GGUF
- Based on prior work by @xyzhang626 and @skeskinen
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
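
For readers unfamiliar with the WordPiece item in the commit message above, here is a minimal sketch of the greedy longest-match-first splitting that BERT-style tokenizers are generally described as using. It is not the code from this pull request: the function name `wordpiece_tokenize`, the `##` continuation prefix, the `[UNK]` token, and the toy vocabulary are all assumptions made for illustration, and `llm_tokenize_wpm` may differ in normalization, prefix conventions, and unknown-token handling.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first WordPiece split of a single word.

    Illustrative only: names and conventions are assumptions, not the
    implementation added by the pull request.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # Nothing matches at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary, invented for this example.
vocab = {"un", "##aff", "##able", "embed", "##ding", "##s"}

print(wordpiece_tokenize("unaffable", vocab))   # ['un', '##aff', '##able']
print(wordpiece_tokenize("embeddings", vocab))  # ['embed', '##ding', '##s']
print(wordpiece_tokenize("xyz", vocab))         # ['[UNK]']
```

A full tokenizer would first normalize the text and split it on whitespace and punctuation, then apply this per-word step and map each piece to its vocabulary id.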
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request
AlexiAlp pushed a commit to minghaop/llama.cpp that referenced this pull request