Bridging the Gap for Tokenizer-Free Language Models
Related papers
Character-Level Language Modeling with Deeper Self-Attention
Proceedings of the AAAI Conference on Artificial Intelligence, 2019
Improving language models by retrieving from trillions of tokens
arXiv, 2021
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Overview of the progression of state-of-the-art language models
TELKOMNIKA Telecommunication Computing Electronics and Control, 2024
Language Modeling with Deep Transformers
Interspeech 2019
Comparative Analysis of Transformer Based Language Models
Computer Science & Information Technology (CS & IT) Computer Science Conference Proceedings (CSCP)
Improving N-gram Language Models with Pre-trained Deep Transformer
2019
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
arXiv, 2024
Large Margin Neural Language Model
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Enhancing recurrent neural network-based language models by word tokenization
Human-centric Computing and Information Sciences, 2018
Language-Independent Text Tokenization Using Unsupervised Deep Learning
Intelligent Automation & Soft Computing
Scaling recurrent neural network language models
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015
On Losses for Modern Language Models
Stéphane Aroca-Ouellette, Frank Rudzicz
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
Pre-training Polish Transformer-Based Language Models at Scale
Artificial Intelligence and Soft Computing, 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
arXiv, 2019
End-to-End Transformer-Based Models in Textual-Based NLP
AI, 2022
arXiv, 2020
Improving the training and evaluation efficiency of recurrent neural network language models
Information
Overview of the Transformer-based Models for NLP Tasks
Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, 2020
Survey of Neural Text Representation Models
Information, 2020
A Tensorized Transformer for Language Modeling
arXiv, 2019
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
arXiv, 2021
Proceedings of the 3rd Workshop on Neural Generation and Translation
One billion word benchmark for measuring progress in statistical language modeling
Sequence-to-Sequence Lexical Normalization with Multilingual Transformers
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)
Word-Phrase-Entity Recurrent Neural Networks for Language Modeling
Interspeech 2016
Long-span language modeling for speech recognition
arXiv, 2019
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
arXiv, 2021
DeepNorm - A Deep Learning Approach to Text Normalization
arXiv, 2017
AttViz: Online exploration of self-attention for transparent neural language modeling
arXiv, 2020