Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks
Related papers
Bit Efficient Quantization for Deep Neural Networks
2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS), 2019
Low-bit Quantization of Neural Networks for Efficient Inference
2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019
A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
Computational Intelligence and Neuroscience, 2020
Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters
arXiv, 2019
Accurate and Efficient 2-bit Quantized Neural Networks
2019
Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines
arXiv, 2018
Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework
2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2021
DoubleQExt: Hardware and Memory Efficient CNN Through Two Levels of Quantization
IEEE Access, 2021
Power-of-Two Quantization for Low Bitwidth and Hardware Compliant Neural Networks
arXiv, 2022
BMPQ: Bit-Gradient Sensitivity-Driven Mixed-Precision Quantization of DNNs from Scratch
2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Post training 4-bit quantization of convolutional networks for rapid-deployment
2019
A Fixed-Point Quantization Technique for Convolutional Neural Networks Based on Weight Scaling
2019 IEEE International Conference on Image Processing (ICIP)
A Comprehensive Survey on Model Quantization for Deep Neural Networks
arXiv, 2022
A White Paper on Neural Network Quantization
2021
Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform
2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths
2021
SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
2024
Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic
2017 IEEE International Conference on Computer Design (ICCD), 2017
CompactNet: High Accuracy Deep Neural Network Optimized for On-Chip Implementation
2018 IEEE International Conference on Big Data (Big Data), 2018
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights
arXiv, 2017
Scalable Methods for 8-bit Training of Neural Networks
2018
Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware
arXiv, 2018
Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)
2022
Using Dataflow to Optimize Energy Efficiency of Deep Neural Network Accelerators
IEEE Micro, 2017
Efficient Execution of Quantized Deep Learning Models: A Compiler Approach
arXiv, 2020
AutoQ: Automated Kernel-Wise Neural Network Quantization
arXiv, 2019
Towards a Stable Quantized Convolutional Neural Networks: An Embedded Perspective
Proceedings of the 10th International Conference on Agents and Artificial Intelligence, 2018
Accuracy to Throughput Trade-Offs for Reduced Precision Neural Networks on Reconfigurable Logic
Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2018
Low-Precision Floating-Point for Efficient On-Board Deep Neural Network Processing
2023
Dither NN: Hardware/Algorithm Co-Design for Accurate Quantized Neural Networks
IEICE Transactions on Information and Systems, 2019
BoolNet: Minimizing The Energy Consumption of Binary Neural Networks
arXiv, 2021