Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks

Related papers:

Bit Efficient Quantization for Deep Neural Networks

Sek Chai

2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS), 2019

Low-bit Quantization of Neural Networks for Efficient Inference

Pavel Kisilev

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019

A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks

Dianle Zhou

Computational Intelligence and Neuroscience, 2020

Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters

Niccolò Nicodemo

arXiv, 2019

Accurate and Efficient 2-bit Quantized Neural Networks

Kailash Gopalakrishnan

2019

Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines

Nicholas Fraser

arXiv, 2018

Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework

Hayden K. H. So

2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2021

DoubleQExt: Hardware and Memory Efficient CNN Through Two Levels of Quantization

Hui-Fuang Ng

IEEE Access, 2021

Power-of-Two Quantization for Low Bitwidth and Hardware Compliant Neural Networks

Barbara De Salvo

arXiv, 2022

Memory Requirement Reduction of Deep Neural Networks for Field Programmable Gate Arrays Using Low-Bit Quantization of Parameters

Niccolò Nicodemo

2020 28th European Signal Processing Conference (EUSIPCO)

BMPQ: Bit-Gradient Sensitivity-Driven Mixed-Precision Quantization of DNNs from Scratch

Qirui Sun

2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Post training 4-bit quantization of convolutional networks for rapid-deployment

Ron Banner

2019

A Fixed-Point Quantization Technique for Convolutional Neural Networks Based on Weight Scaling

Ralf Wittmann

2019 IEEE International Conference on Image Processing (ICIP)

A Comprehensive Survey on Model Quantization for Deep Neural Networks

Ali Azarpeyvand

arXiv, 2022

A White Paper on Neural Network Quantization

Marios Fournarakis

2021

Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform

Avi Mendelson

2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

A Resource-Efficient Convolutional Neural Network Accelerator Using Fine-Grained Logarithmic Quantization

Yasar Becerikli

Intelligent Automation & Soft Computing, 2022

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths

Naigang Wang

2021

SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks

Nicholas Fraser

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018

AdaQAT: Adaptive Bit-Width Quantization-Aware Training

Silviu-Ioan Filip

2024

Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic

Michaela Blott

2017 IEEE International Conference on Computer Design (ICCD), 2017

CompactNet: High Accuracy Deep Neural Network Optimized for On-Chip Implementation

Mariana García

2018 IEEE International Conference on Big Data (Big Data), 2018

Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights

Aojun Zhou

arXiv, 2017

Scalable Methods for 8-bit Training of Neural Networks

Ron Banner

2018

Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware

Avi Mendelson

arXiv, 2018

Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)

Marios Fournarakis

2022

Using Dataflow to Optimize Energy Efficiency of Deep Neural Network Accelerators

Joel Emer

IEEE Micro, 2017

Efficient Execution of Quantized Deep Learning Models: A Compiler Approach

Vin Sharma

arXiv, 2020

AutoQ: Automated Kernel-Wise Neural Network Quantization

Qian Lou

arXiv, 2019

Towards a Stable Quantized Convolutional Neural Networks: An Embedded Perspective

Piyush Kaul

Proceedings of the 10th International Conference on Agents and Artificial Intelligence, 2018

Accuracy to Throughput Trade-Offs for Reduced Precision Neural Networks on Reconfigurable Logic

Michaela Blott

Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2018

Low-Precision Floating-Point for Efficient On-Board Deep Neural Network Processing

Silviu-Ioan Filip

2023

Dither NN: Hardware/Algorithm Co-Design for Accurate Quantized Neural Networks

Masayuki Ikebe

IEICE Transactions on Information and Systems, 2019

Understanding the Limitations of Existing Energy-Efficient Design Approaches for Deep Neural Networks

Joel Emer

2018

BoolNet: Minimizing The Energy Consumption of Binary Neural Networks

Xuefei Ning

arXiv, 2021
