Factorization-Aware Training of Transformers for Natural Language Understanding on the Edge
Related papers
Transformer Model Compression for End-to-End Speech Recognition on Mobile Devices
2022 30th European Signal Processing Conference (EUSIPCO)
Lightweight Convolutional Representations for On-Device Natural Language Processing
arXiv, 2020
A Short Study on Compressing Decoder-Based Language Models
arXiv, 2021
Benchmarking Transformers-based models on French Spoken Language Understanding tasks
2022
Accelerating Natural Language Understanding in Task-Oriented Dialog
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI
End-to-End Neural Transformer Based Spoken Language Understanding
2020
Training end-to-end speech-to-text models on mobile phones
arXiv, 2021
Language Modeling with Deep Transformers
Interspeech 2019
CS 224S Final Report: Compression of Deep Speech Recognition Networks
2017
End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios
Interspeech 2019
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
PaLM: Scaling Language Modeling with Pathways
arXiv, 2022
Enabling On-Device Training of Speech Recognition Models With Federated Dropout
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Comparative Analysis of Transformer Based Language Models
Computer Science & Information Technology (CS & IT) Computer Science Conference Proceedings (CSCP)
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
arXiv, 2024
Overview of the progression of state-of-the-art language models
TELKOMNIKA Telecommunication Computing Electronics and Control, 2024
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
arXiv, 2019
SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions
arXiv, 2021
Multi-Task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
End-to-End Transformer-Based Models in Textual-Based NLP
AI (MDPI), 2023
Variable Scale Pruning for Transformer Model Compression in End-to-End Speech Recognition
Algorithms, 2023
Improving N-gram Language Models with Pre-trained Deep Transformer
2019
Lightweight and Efficient End-To-End Speech Recognition Using Low-Rank Transformer
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
Overview of the Transformer-based Models for NLP Tasks
Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, 2020
Run-Time Efficient RNN Compression for Inference on Edge Devices
2019 2nd Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications (EMC2), 2019
Interspeech 2021
Deep Speech: Scaling up end-to-end speech recognition
2014
TangoBERT: Reducing Inference Cost by using Cascaded Architecture
arXiv, 2022
Attention Based On-Device Streaming Speech Recognition with Large Speech Corpus
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
Speech Model Pre-training for End-to-End Spoken Language Understanding
Interspeech 2019
Efficient pre-training objectives for Transformers
arXiv, 2021