Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters
Related papers
2020 28th European Signal Processing Conference (EUSIPCO)
A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
Computational Intelligence and Neuroscience, 2020
Bit Efficient Quantization for Deep Neural Networks
2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS), 2019
Interspeech 2022
Low-bit Quantization of Neural Networks for Efficient Inference
2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019
CompactNet: High Accuracy Deep Neural Network Optimized for On-Chip Implementation
2018 IEEE International Conference on Big Data (Big Data), 2018
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
2024
Power-of-Two Quantization for Low Bitwidth and Hardware Compliant Neural Networks
ArXiv, 2022
Accurate and Efficient 2-bit Quantized Neural Networks
2019
A White Paper on Neural Network Quantization
2021
Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths
2021
Exploring the Potential of Low-bit Training of Convolutional Neural Networks
2020
Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks
ArXiv, 2019
2021
4-Bit Quantization of LSTM-Based Speech Recognition Models
Interspeech 2021, 2021
BMPQ: Bit-Gradient Sensitivity-Driven Mixed-Precision Quantization of DNNs from Scratch
2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Adaptive Precision Training (AdaPT): A dynamic quantized training approach for DNNs
Society for Industrial and Applied Mathematics eBooks, 2023
Scalable Methods for 8-bit Training of Neural Networks
2018
Ultra-Low Precision 4-bit Training of Deep Neural Networks
2020
A Comprehensive Survey on Model Quantization for Deep Neural Networks
arXiv (Cornell University), 2022
DoubleQExt: Hardware and Memory Efficient CNN Through Two Levels of Quantization
IEEE Access, 2021
On Pruned, Quantized and Compact CNN Architectures for Vision Applications: An Empirical Study
2022
Dither NN: Hardware/Algorithm Co-Design for Accurate Quantized Neural Networks
IEICE Transactions on Information and Systems, 2019
Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)
2022
Hardware-Based Real-Time Deep Neural Network Lossless Weights Compression
IEEE Access, 2020
Adaptive Precision Training (AdaPT): A dynamic fixed point quantized training approach for DNNs
2021
Deep Neural Network Compression With Single and Multiple Level Quantization
Proceedings of the AAAI Conference on Artificial Intelligence
Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware
ArXiv, 2018
A Study on Speech Enhancement Using Exponent-Only Floating Point Quantized Neural Network (EOFP-QNN)
2018 IEEE Spoken Language Technology Workshop (SLT), 2018
Intelligent Automation & Soft Computing, 2022
Pruning vs XNOR-Net: A Comprehensive Study of Deep Learning for Audio Classification on Edge-devices
IEEE Access
2019 4th Asia-Pacific Conference on Intelligent Robot Systems (ACIRS), 2019
Post training 4-bit quantization of convolutional networks for rapid-deployment
2019
BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization
Cornell University - arXiv, 2020