[Paper Review] Entropy-Driven Mixed-Precision Quantization for Deep Network Design


If the overhead of QAT could be reduced, would it be fine for QAT to replace PTQ?

One-Line Summary ✔

A one-stage solution that optimizes both the architecture and the corresponding quantization jointly and automatically. The key idea of our approach is to cast the joint architecture design and quantization as an Entropy Maximization process.

Its representation capacity, measured by entropy, is maximized under the given computational budget.
1. Quantization Entropy Score (QE-Score) with calibrated initialization to measure the expressiveness of the system.
2. Quantization Bits Refinement within an evolutionary algorithm to adjust mixed-precision quantization.
Each layer is assigned a proper quantization precision.
1. The entropy-based ranking strategy for mixed-precision quantization networks.
The overall design loop can run on the CPU; no GPU is required.
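The search described above can be pictured with a small CPU-only sketch. It is not the paper's implementation: `toy_score` is a hypothetical stand-in for the QE-Score (the real formula is not reproduced in this post), and the dense-layer search space, `BIT_CHOICES`, `WIDTH_CHOICES`, `refine_bits`, and the budget are all assumptions made for illustration. A candidate is a tuple of `(width, bits)` pairs, one per layer.

```python
import math
import random

BIT_CHOICES = (2, 4, 8)          # assumed candidate precisions
WIDTH_CHOICES = (8, 16, 32, 64)  # assumed candidate layer widths
IN_FEATURES = 16                 # assumed input width of the toy network

def size_bits(cand):
    """Weight storage in bits for a chain of dense layers."""
    total, fan_in = 0, IN_FEATURES
    for width, bits in cand:
        total += fan_in * width * bits
        fan_in = width
    return total

def toy_score(cand):
    """Hypothetical entropy-style proxy, NOT the paper's QE-Score."""
    score, fan_in = 0.0, IN_FEATURES
    for width, bits in cand:
        score += math.log(fan_in) + math.log(bits)
        fan_in = width
    return score

def refine_bits(cand, budget_bits):
    """Lower the precision of the costliest layer until the budget is met."""
    cand = list(cand)

    def cost(i):
        fan_in = IN_FEATURES if i == 0 else cand[i - 1][0]
        return fan_in * cand[i][0] * cand[i][1]

    while size_bits(cand) > budget_bits:
        lowerable = [i for i, (_, b) in enumerate(cand) if b > BIT_CHOICES[0]]
        if not lowerable:
            return None  # cannot fit the budget even at the lowest precision
        i = max(lowerable, key=cost)
        width, bits = cand[i]
        cand[i] = (width, BIT_CHOICES[BIT_CHOICES.index(bits) - 1])
    return tuple(cand)

def mutate(cand, rng):
    """Randomly change either the width or the precision of one layer."""
    cand = list(cand)
    i = rng.randrange(len(cand))
    if rng.random() < 0.5:
        cand[i] = (rng.choice(WIDTH_CHOICES), cand[i][1])
    else:
        cand[i] = (cand[i][0], rng.choice(BIT_CHOICES))
    return tuple(cand)

def evolve(seed_cand, budget_bits, steps=200, pop_size=16, seed=0):
    """Keep the top-scoring candidates that fit the budget; no training involved."""
    rng = random.Random(seed)
    population = [seed_cand]
    for _ in range(steps):
        child = refine_bits(mutate(rng.choice(population), rng), budget_bits)
        if child is not None:
            population.append(child)
            population = sorted(set(population), key=toy_score, reverse=True)[:pop_size]
    return population[0]

seed_cand = ((8, 2), (8, 2), (8, 2))
budget = 16_000
best = evolve(seed_cand, budget)
print(best, size_bits(best), round(toy_score(best), 3))
```

Every step here is plain arithmetic on the candidate description, which is why a loop of this kind needs no GPU: no candidate is ever trained, only scored and checked against the budget.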

Introduction 🙌

Why Is Quantization Needed?

Most IoT devices have very limited on-chip memory.
Deploying deep CNNs on Internet-of-Things (IoT) devices is challenging because computational resources, such as SRAM memory and Flash storage, are limited.

The key is to control the peak memory during inference.
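As a rough illustration of what "peak memory" means here, the sketch below estimates it for a toy three-layer network. It assumes that a layer's input and output activation buffers must be resident at the same time and that both are stored at that layer's activation precision; `Layer`, `layer_peak_bytes`, `peak_memory_bytes`, and the shapes are hypothetical names and values, not taken from the paper.

```python
from dataclasses import dataclass, replace
from math import ceil

@dataclass(frozen=True)
class Layer:
    name: str
    in_elems: int   # number of input activation elements
    out_elems: int  # number of output activation elements
    act_bits: int   # activation precision in bits (assumed shared by input and output)

def layer_peak_bytes(layer: Layer) -> int:
    """Bytes needed while this layer runs: input and output buffers together."""
    return ceil((layer.in_elems + layer.out_elems) * layer.act_bits / 8)

def peak_memory_bytes(layers) -> int:
    """Peak activation memory is set by the single most demanding layer."""
    return max(layer_peak_bytes(layer) for layer in layers)

layers = [
    Layer("conv1", 3 * 96 * 96, 16 * 48 * 48, 8),
    Layer("conv2", 16 * 48 * 48, 32 * 24 * 24, 8),
    Layer("conv3", 32 * 24 * 24, 64 * 12 * 12, 4),
]
peak_8bit = peak_memory_bytes(layers)

# Lowering only the first layer to 4 bits moves the bottleneck to conv2.
mixed = [replace(layers[0], act_bits=4)] + layers[1:]
peak_mixed = peak_memory_bytes(mixed)
print(peak_8bit, peak_mixed)
```

In this toy setup the 8-bit first layer dominates at 64512 bytes; dropping just that layer to 4 bits lowers the peak to 55296 bytes, which is the kind of per-layer trade-off mixed precision makes possible.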

Trends in Traditional Lightweight CNN

The traditional pipeline has two stages: (1) re-design a small network for IoT devices, then (2) compress the network size by mixed-precision quantization.

Limitations

The incoherence of such a two-stage design procedure leads to the inadequate utilization of resources, therefore producing sub-optimal models within tight resource requirements for IoT devices.

Related Work 😉

Training-free NAS methods

Training-free NAS accelerates model design by ranking candidates with a proxy mechanism instead of a training-based accuracy indicator.

Limitation:

Training-free NAS still lacks key techniques for incorporating mixed-precision quantization.

Challenges and Main Idea💣

C1) Designing models under limited resources remains a challenging issue.

C2) Low precision offers only a narrow range of representable values, which causes persistent accuracy degradation.
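To make C2 concrete, the sketch below quantizes Gaussian samples with a plain symmetric uniform quantizer and measures the empirical entropy of the resulting codes. The quantizer, the clipping range of 3 standard deviations, and the names `quantize_uniform` and `empirical_entropy_bits` are assumptions for illustration; this is not the paper's quantization entropy, only a demonstration that a b-bit code can carry at most b bits per value.

```python
import math
import random
from collections import Counter

def quantize_uniform(x, bits, clip=3.0):
    """Clamp x to [-clip, clip] and map it to one of 2**bits integer codes."""
    levels = 2 ** bits
    step = 2 * clip / (levels - 1)
    x = max(-clip, min(clip, x))
    return round((x + clip) / step)

def empirical_entropy_bits(codes):
    """Shannon entropy (in bits) of the observed code distribution."""
    counts = Counter(codes)
    n = len(codes)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

entropy_by_bits = {}
levels_used = {}
for bits in (1, 2, 4, 8):
    codes = [quantize_uniform(x, bits) for x in samples]
    entropy_by_bits[bits] = empirical_entropy_bits(codes)
    levels_used[bits] = len(set(codes))
    print(bits, levels_used[bits], round(entropy_by_bits[bits], 3))
```

The entropy of the codes can never exceed the bit width, so a layer pushed to very low precision has a hard ceiling on how much information its values can express. That ceiling is the intuition behind treating entropy as a capacity measure when choosing per-layer precision.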

Idea) Build a training-free NAS on mixed-precision quantization for selected IoT devices.

Proposed Method 🧿

Quantization Entropy

Maximum Entropy for Full-precision Models

Quantization Entropy for Mixed-Precision Models

Gaussian Initialization Calibration

Resource Maximization for IoT Devices

Experiment 👀

Mixed-Precision Comparison

Random Correlation Study

Comparison with SOTA Models

Tiny Image Classification

Large-scale Classification on ImageNet

Low-energy Application on Visual Wake Words

Resource Maximization

Tiny Object Detection on WIDER FACE

Open Review 💗

Major Takeaways 😃

Conclusion ✨

Strength

Weakness

Reference
