Factorization-Aware Training of Transformers for Natural Language Understanding on the Edge

Leïla Ben Letaifa. "Transformer Model Compression for End-to-End Speech Recognition on Mobile Devices." 30th European Signal Processing Conference (EUSIPCO), 2022.

Geoffrey Goh. "Lightweight Convolutional Representations for On-Device Natural Language Processing." arXiv, 2020.

Sanchari Sen. "AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models." arXiv, 2020.

Nithin Anchuri. "A Short Study on Compressing Decoder-Based Language Models." arXiv, 2021.

Oralie Cattan. "Benchmarking Transformers-based models on French Spoken Language Understanding tasks." 2022.

Ojas Ahuja. "Accelerating Natural Language Understanding in Task-Oriented Dialog." Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI.

Martin Radfar. "End-to-End Neural Transformer Based Spoken Language Understanding." 2020.

Pooja Rao. "Training end-to-end speech-to-text models on mobile phones." arXiv, 2021.

Albert Zeyer. "Language Modeling with Deep Transformers." Interspeech, 2019.

Stephen Jie Hoon Koo. "CS 224S Final Report: Compression of Deep Speech Recognition Networks." 2017.

Imran Sheikh. "End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios." Interspeech, 2019.

Huy Vu. "Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality." Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021.

Shivani Agrawal. "PaLM: Scaling Language Modeling with Pathways." arXiv, 2022.

Dhruv Guliani. "Enabling On-Device Training of Speech Recognition Models With Federated Dropout." IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.

"Comparative Analysis of Transformer Based Language Models." Computer Science & Information Technology (CS & IT) Computer Science Conference Proceedings (CSCP).

Marco Antonio Diaz Donato. "EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference." MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021.

Mohammad Hossein Sekhavat. "OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework." arXiv, 2024.

"Overview of the progression of state-of-the-art language models." TELKOMNIKA Telecommunication Computing Electronics and Control, 2024.

Raul Puri. "Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism." arXiv, 2019.

Vinod Ganesan. "SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions." arXiv, 2021.

Amit Chaulwar. "Extreme compression of sentence-transformer ranker models: faster inference, longer battery life, and less storage on edge devices." 2022.

Martin Radfar. "Multi-Task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding." IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.

Abir Rahali. "End-to-End Transformer-Based Models in Textual-Based NLP." AI.

Leïla Ben Letaifa. "Variable Scale Pruning for Transformer Model Compression in End-to-End Speech Recognition." Algorithms, 2023.

Yutong Pang. "Improving N-gram Language Models with Pre-trained Deep Transformer." 2019.

Samuel Cahyawijaya. "Lightweight and Efficient End-To-End Speech Recognition Using Low-Rank Transformer." IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020.

Anthony Gillioz. "Overview of the Transformer-based Models for NLP Tasks." Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, 2020.

Jesse Beu. "Run-Time Efficient RNN Compression for Inference on Edge Devices." 2nd Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications (EMC2), 2019.

Diamantino Caseiro. "An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling." Interspeech, 2021.

Greg Diamos. "Deep Speech: Scaling up end-to-end speech recognition." 2014.

Moshe Wasserblat. "TangoBERT: Reducing Inference Cost by using Cascaded Architecture." arXiv, 2022.

Jinsu Yeo. "Attention Based On-Device Streaming Speech Recognition with Large Speech Corpus." IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019.

Mirco Ravanelli. "Speech Model Pre-training for End-to-End Spoken Language Understanding."

Luca Di Liello. "Efficient pre-training objectives for Transformers." arXiv, 2021.