Memory Requirement Reduction of Deep Neural Networks for Field Programmable Gate Arrays Using Low-Bit Quantization of Parameters

Niccolò Nicodemo

2020 28th European Signal Processing Conference (EUSIPCO)

A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks

Dianle Zhou

Computational Intelligence and Neuroscience, 2020

Bit Efficient Quantization for Deep Neural Networks

Sek Chai

2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS), 2019

Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition

Tariq Afzal

Interspeech 2022

Low-bit Quantization of Neural Networks for Efficient Inference

Pavel Kisilev

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019

CompactNet: High Accuracy Deep Neural Network Optimized for On-Chip Implementation

Mariana GarcĂ­a

2018 IEEE International Conference on Big Data (Big Data), 2018

AdaQAT: Adaptive Bit-Width Quantization-Aware Training

Silviu-Ioan Filip

2024

Power-of-Two Quantization for Low Bitwidth and Hardware Compliant Neural Networks

Barbara De Salvo

arXiv, 2022

Accurate and Efficient 2-bit Quantized Neural Networks

Kailash Gopalakrishnan

2019

A White Paper on Neural Network Quantization

Marios Fournarakis

2021

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths

Naigang Wang

2021

Exploring the Potential of Low-bit Training of Convolutional Neural Networks

Xuefei Ning

2020

Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks

Joseph Pal

arXiv, 2019

Pruning vs XNOR-Net: A Comprehensive Study on Deep Learning for Audio Classification in Microcontrollers

Bernd Meyer

2021

4-Bit Quantization of LSTM-Based Speech Recognition Models

Mauricio Serrano

Interspeech 2021

BMPQ: Bit-Gradient Sensitivity-Driven Mixed-Precision Quantization of DNNs from Scratch

Qirui Sun

2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Adaptive Precision Training (AdaPT): A dynamic quantized training approach for DNNs

Wilfried Gansterer

Society for Industrial and Applied Mathematics eBooks, 2023

Scalable Methods for 8-bit Training of Neural Networks

Ron Banner

2018

Ultra-Low Precision 4-bit Training of Deep Neural Networks

Kaoutar El Maghraoui

2020

A Comprehensive Survey on Model Quantization for Deep Neural Networks

Ali Azarpeyvand

arXiv, 2022

DoubleQExt: Hardware and Memory Efficient CNN Through Two Levels of Quantization

Hui-Fuang Ng

IEEE Access, 2021

On Pruned, Quantized and Compact CNN Architectures for Vision Applications: An Empirical Study

Peter Yamak, Kwao Gadosey

A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate Compression for Split DNN Computing

Parual Datta

2022

Dither NN: Hardware/Algorithm Co-Design for Accurate Quantized Neural Networks

Masayuki Ikebe

IEICE Transactions on Information and Systems, 2019

Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)

Marios Fournarakis

2022

Hardware-Based Real-Time Deep Neural Network Lossless Weights Compression

Shlomo Greenberg

IEEE Access, 2020

Adaptive Precision Training (AdaPT): A dynamic fixed point quantized training approach for DNNs

Wilfried Gansterer

2021

Deep Neural Network Compression With Single and Multiple Level Quantization

Aojun Zhou

Proceedings of the AAAI Conference on Artificial Intelligence

Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware

Avi Mendelson

arXiv, 2018

A Study on Speech Enhancement Using Exponent-Only Floating Point Quantized Neural Network (EOFP-QNN)

Yu Tsao

2018 IEEE Spoken Language Technology Workshop (SLT), 2018

A Resource-Efficient Convolutional Neural Network Accelerator Using Fine-Grained Logarithmic Quantization

Yasar Becerikli

Intelligent Automation & Soft Computing, 2022

Pruning vs XNOR-Net: A Comprehensive Study of Deep Learning for Audio Classification on Edge-devices

Bernd Meyer

IEEE Access

Single-Bit-per-Weight Deep Convolutional Neural Networks without Batch-Normalization Layers for Embedded Systems

Hesham Mostafa

2019 4th Asia-Pacific Conference on Intelligent Robot Systems (ACIRS), 2019

Post training 4-bit quantization of convolutional networks for rapid-deployment

Ron Banner

2019

BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization

Milos Nikolic

arXiv, 2020
