HW-Flow: A Multi-Abstraction Level HW-CNN Codesign Pruning Methodology

Authors: Manoj-Rohit Vemparala, Nael Fasfous, Alexander Frickenstein, Emanuele Valpreda, Manfredi Camalleri, Qi Zhao, Christian Unger, Naveen-Shankar Nagaraja, Maurizio Martina, Walter Stechele




File

LITES.8.1.3.pdf
  • Filesize: 8.15 MB
  • 30 pages

Document Identifiers
  • DOI: 10.4230/LITES.8.1.3

Author Details

Manoj-Rohit Vemparala
  • BMW Autonomous Driving, Munich, Germany
Nael Fasfous
  • Technical University of Munich, Munich, Germany
Alexander Frickenstein
  • BMW Autonomous Driving, Munich, Germany
Emanuele Valpreda
  • Politecnico di Torino, Turin, Italy
Manfredi Camalleri
  • BMW Autonomous Driving, Munich, Germany
Qi Zhao
  • BMW Autonomous Driving, Munich, Germany
Christian Unger
  • BMW Autonomous Driving, Munich, Germany
Naveen-Shankar Nagaraja
  • BMW Autonomous Driving, Munich, Germany
Maurizio Martina
  • Politecnico di Torino, Turin, Italy
Walter Stechele
  • Technical University of Munich, Munich, Germany

Cite As

Manoj-Rohit Vemparala, Nael Fasfous, Alexander Frickenstein, Emanuele Valpreda, Manfredi Camalleri, Qi Zhao, Christian Unger, Naveen-Shankar Nagaraja, Maurizio Martina, and Walter Stechele. HW-Flow: A Multi-Abstraction Level HW-CNN Codesign Pruning Methodology. In Leibniz Transactions on Embedded Systems (LITES), Volume 8, Issue 1 (2022): Special Issue on Embedded Systems for Computer Vision, pp. 03:1-03:30. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022.
https://doi.org/10.4230/LITES.8.1.3

Abstract

Convolutional neural networks (CNNs) have achieved unprecedented accuracy on many computer vision problems in recent years. On power- and compute-constrained embedded platforms, however, deploying modern CNNs presents many challenges: most CNN architectures do not run in real-time due to the high number of computational operations involved in the inference phase. This emphasizes the role of CNN optimization techniques in early design space exploration. To estimate their efficacy in satisfying the target constraints, existing techniques are either hardware (HW) agnostic, pseudo-HW-aware by considering parameter and operation counts, or HW-aware through inflexible hardware-in-the-loop (HIL) setups. In this work, we introduce HW-Flow, a framework for optimizing and exploring CNN models based on three levels of hardware abstraction: Coarse, Mid and Fine. Through these levels, CNN design and optimization can be iteratively refined towards efficient execution on the target hardware platform. We present HW-Flow in the context of CNN pruning by augmenting a reinforcement learning agent with key metrics to understand the influence of its pruning actions on the inference hardware. With a 2× reduction in energy and latency, we prune ResNet56, ResNet50, and DeepLabv3 with minimal accuracy degradation on the CIFAR-10, ImageNet, and CityScapes datasets, respectively.
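To make the idea of hardware-aware pruning concrete, below is a minimal, hypothetical sketch (not the authors' released code) of the kind of pruning action and reward signal a reinforcement learning agent like the one described here could optimize. The L1-norm channel-saliency criterion, the function names (prune_channels_l1, reward), and the fixed energy/latency budgets are all illustrative assumptions; in HW-Flow, the energy and latency estimates would instead come from the Coarse, Mid or Fine hardware models.

```python
import numpy as np

def prune_channels_l1(weights: np.ndarray, ratio: float) -> np.ndarray:
    """Zero out the output channels with the smallest L1 norms.

    weights: conv kernel of shape (out_ch, in_ch, kH, kW).
    ratio:   fraction of output channels to prune (the agent's action).
    """
    norms = np.abs(weights).sum(axis=(1, 2, 3))  # per-output-channel L1 norm
    n_prune = int(ratio * weights.shape[0])
    pruned = weights.copy()
    if n_prune > 0:
        drop = np.argsort(norms)[:n_prune]       # indices of weakest channels
        pruned[drop] = 0.0
    return pruned

def reward(accuracy: float, energy: float, latency: float,
           energy_budget: float, latency_budget: float) -> float:
    """Accuracy-driven reward, penalized when the estimated energy or
    latency exceeds its target budget (illustrative penalty shape)."""
    penalty = max(0.0, energy / energy_budget - 1.0) \
            + max(0.0, latency / latency_budget - 1.0)
    return accuracy - penalty

# Usage: one pruning step on a random 3x3 convolution layer.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32, 3, 3))
w_pruned = prune_channels_l1(w, ratio=0.5)
print("zeroed channels:",
      int((np.abs(w_pruned).sum(axis=(1, 2, 3)) == 0).sum()))
```

In this toy setup the agent would pick a pruning ratio per layer, query a hardware model for the resulting energy and latency, and receive the budget-penalized reward; the paper's agent operates analogously but with the multi-abstraction hardware models in the loop.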

Subject Classification

ACM Subject Classification
  • Computing methodologies → Artificial intelligence
Keywords
  • Convolutional Neural Networks
  • Optimization
  • Hardware Modeling
  • Pruning

