LANA: Latency Aware Network Acceleration

Molchanov, Pavlo; Hall, Jimmy; Yin, Hongxu; Kautz, Jan; Fusi, Nicolo; Vahdat, Arash

arXiv:2107.10624 (cs)

[Submitted on 12 Jul 2021 (v1), last revised 18 Nov 2021 (this version, v2)]

Abstract: We introduce latency-aware network acceleration (LANA), an approach that builds on neural architecture search (NAS) techniques and teacher-student distillation to accelerate neural networks. LANA consists of two phases. In the first phase, it trains many alternative operations for every layer of the teacher network using layer-wise feature map distillation. In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization (ILP) approach. ILP brings unique properties: it (i) performs NAS within a few seconds to minutes, (ii) easily satisfies budget constraints, (iii) works at layer granularity, and (iv) supports a huge search space of $O(10^{100})$, surpassing prior search approaches in efficacy and efficiency. In extensive experiments, we show that LANA yields efficient and accurate models constrained by a target latency budget, while being significantly faster than other techniques. We analyze three popular network architectures, EfficientNetV1, EfficientNetV2 and ResNeSt, and achieve accuracy improvements for all models (up to $3.0\%$) when compressing larger models to the latency level of smaller models. LANA achieves significant speed-ups (up to $5\times$) with minor to no accuracy drop on GPU and CPU. The code will be shared soon.
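
The second phase described in the abstract, picking one operation per layer under a latency budget, can be illustrated with a toy sketch. This is not the authors' implementation: it replaces the ILP solver with exhaustive enumeration, which is only practical for a handful of layers. It also assumes the objective is a sum of per-layer error scores, which the abstract does not state. The function name `select_operations` and all numbers are hypothetical.

```python
from itertools import product


def select_operations(errors, latencies, budget):
    """Choose one candidate operation per layer under a latency budget.

    errors[i][j] and latencies[i][j] are hypothetical per-layer scores for
    candidate j at layer i. The additive objective is an assumption made
    for illustration. Returns (choice, total_error, total_latency), or
    None when no combination fits the budget.
    """
    best = None
    # Brute force over every combination; an ILP solver would replace this loop.
    for choice in product(*(range(len(layer)) for layer in errors)):
        total_latency = sum(latencies[i][j] for i, j in enumerate(choice))
        if total_latency > budget:
            continue  # violates the latency constraint
        total_error = sum(errors[i][j] for i, j in enumerate(choice))
        if best is None or total_error < best[1]:
            best = (choice, total_error, total_latency)
    return best


# Invented toy data: 3 layers, 3 candidates each.
# Candidate 0 stands for the original teacher operation (zero error, slowest).
errors = [[0, 3, 9], [0, 2, 6], [0, 5, 10]]
latencies = [[40, 25, 10], [30, 20, 10], [50, 20, 10]]

result = select_operations(errors, latencies, budget=70)
print(result)  # ((1, 1, 1), 10, 65)
```

With these invented numbers, the unmodified teacher (total latency 120) exceeds the budget of 70, and the cheapest feasible trade-off swaps every layer for its middle candidate. Enumeration grows exponentially with the number of layers, which is why a search space on the order of $O(10^{100})$ calls for a dedicated solver such as the ILP formulation the abstract refers to.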

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2107.10624 [cs.CV]
	(or arXiv:2107.10624v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2107.10624

Submission history

From: Pavlo Molchanov
[v1] Mon, 12 Jul 2021 18:46:34 UTC (1,441 KB)
[v2] Thu, 18 Nov 2021 18:55:13 UTC (4,065 KB)
