XNOR-Popcount, an Alternative Solution to the Accumulation Multiplication Method for Approximate Computations, to Improve Latency and Power Efficiency
Corresponding author email: khoapv@hcmute.edu.vn
DOI: https://doi.org/10.54644/jte.2025.1537
Keywords: Multiply–accumulate operation, XNOR-popcount, Adder, Latency, Power consumption
Abstract
Convolution operations in neural networks are computationally intensive and require significant processing time because they rely on multiplier circuits. In binarized neural networks, XNOR-popcount is a hardware solution designed to replace the conventional multiply-accumulate (MAC) method, which uses complex multipliers. XNOR-popcount helps reduce design area, lower power consumption, and increase processing speed. This study implements and evaluates the XNOR-popcount design at the transistor level in Cadence circuit design software using 90 nm CMOS technology. Based on the simulation results, for the same computational function, a MAC operation implemented with XNOR-popcount reduces power consumption, processing time, and design complexity by up to 69%, 50%, and 48%, respectively, compared with the method using conventional multipliers. The XNOR-popcount design is therefore well suited to edge-computing platforms with minimal hardware, small memory, and a limited power supply.
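The equivalence the abstract relies on can be illustrated in software. The sketch below is not the transistor-level design evaluated in the study; it is a minimal Python illustration that assumes the common binarized-network encoding in which bit 1 stands for +1 and bit 0 stands for -1. Under that assumption, a dot product of n binary values equals 2 * popcount(XNOR(a, w)) - n, so no multiplier is needed. The function names (`mac_dot`, `pack_bits`, `xnor_popcount_dot`) and the sample vectors are hypothetical and chosen only for this example.

```python
def mac_dot(a, w):
    """Reference multiply-accumulate over vectors with entries in {-1, +1}."""
    return sum(x * y for x, y in zip(a, w))


def pack_bits(v):
    """Pack a {-1, +1} vector into an int, one bit per element.

    Assumed encoding: +1 -> bit 1, -1 -> bit 0. Element i goes to bit i.
    """
    word = 0
    for i, x in enumerate(v):
        if x == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, w_bits, n):
    """Dot product of two packed n-element binary vectors without multiplying.

    XNOR marks the positions where the two operands agree (product +1);
    the remaining positions disagree (product -1). With m agreements,
    the sum is m - (n - m) = 2 * m - n.
    """
    mask = (1 << n) - 1                  # keep only the n valid bits
    agree = ~(a_bits ^ w_bits) & mask    # bitwise XNOR
    matches = bin(agree).count("1")      # popcount
    return 2 * matches - n


# Hypothetical 8-element activation and weight vectors.
a = [1, -1, 1, 1, -1, -1, 1, -1]
w = [1, 1, -1, 1, -1, 1, 1, -1]

reference = mac_dot(a, w)
binary = xnor_popcount_dot(pack_bits(a), pack_bits(w), len(a))
print(reference, binary)
```

Both calls return the same value for any pair of equal-length {-1, +1} vectors. In hardware, the same identity lets an XNOR gate array followed by a popcount adder tree stand in for the multipliers of a conventional MAC unit.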
License
Copyright (c) 2025 Journal of Technical Education Science (Tạp chí Khoa học Giáo dục Kỹ Thuật)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright belongs to JTE.


