FreeMatch-PG: A Lightweight Semi-supervised Learning Framework for IoT Device Identification

ShuYi Song

doi:10.63313/JCSFT.9044

Authors

ShuYi Song, School of Computer Science and Technology, Qingdao University, Qingdao 266071, China

Keywords:

IoT Security, Device Identification, Semi-Supervised Learning, Class Imbalance, Adaptive Threshold, Lightweight Model

Abstract

IoT device identification is a critical component of cybersecurity, yet its practical deployment faces two challenges: scarce labeled data and an imbalanced distribution of device classes. To address both, this paper proposes a lightweight identification framework based on semi-supervised learning. Its core contributions are: 1) a prior-guided adaptive threshold mechanism (FreeMatch-PG), which sets a different learning threshold for each class by simulating initial cognitive states, mitigating class imbalance at its source and improving pseudo-label quality; 2) domain-customized data augmentation and a weighted focal loss function, which jointly improve model robustness in noisy environments; 3) a lightweight architecture based on an improved ResNet-18, which substantially reduces model complexity. Experiments on two public datasets show that, using only 8% labeled data, the proposed method achieves recognition accuracies of 98.25% and 97.72% on the UNSW and CICIOT datasets, respectively. It outperforms multiple baseline models and balances accuracy, efficiency, and deployment feasibility, offering a practical solution for resource-constrained real-world IoT environments.
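The abstract does not give the exact FreeMatch-PG update rule, so the following is only a minimal sketch of the two ingredients it names. The threshold update follows the self-adaptive thresholding of the original FreeMatch method (an exponential moving average of batch confidence, scaled per class by max-normalization). The "prior-guided" part is an assumption made here for illustration: the per-class estimate is initialized from a class prior rather than uniformly. The function names, the `ema` default, and the example prior are all hypothetical and are not taken from the paper.

```python
import math


def update_thresholds(probs, tau, p_tilde, ema=0.999):
    """One FreeMatch-style EMA update from a batch of unlabeled predictions.

    probs   : list of per-sample class-probability lists (each sums to 1)
    tau     : current global confidence threshold
    p_tilde : current per-class confidence estimate
    Returns (new_tau, new_p_tilde, per_class_thresholds).
    """
    n, num_classes = len(probs), len(p_tilde)
    # Global signal: mean of the top probability over the batch.
    mean_max = sum(max(p) for p in probs) / n
    # Local signal: mean predicted probability for each class.
    mean_per_class = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    new_tau = ema * tau + (1 - ema) * mean_max
    new_p = [ema * p_tilde[c] + (1 - ema) * mean_per_class[c]
             for c in range(num_classes)]
    # Max-normalize so the most confident class keeps the global threshold
    # and less confident classes receive a lower one.
    peak = max(new_p)
    class_tau = [new_tau * v / peak for v in new_p]
    return new_tau, new_p, class_tau


def accept_pseudo_label(prob, class_tau):
    """Return the pseudo-label if it clears its class threshold, else None."""
    label = max(range(len(prob)), key=lambda c: prob[c])
    return label if prob[label] >= class_tau[label] else None


def weighted_focal_loss(prob, label, class_weights, gamma=2.0):
    """Class-weighted focal loss for a single sample (Lin et al., 2017 form)."""
    p_t = max(prob[label], 1e-12)  # clamp to avoid log(0)
    return -class_weights[label] * (1 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # ASSUMPTION: the prior is the class frequency in the labeled subset.
    prior = [0.7, 0.2, 0.1]
    batch = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]
    tau, p_tilde, class_tau = update_thresholds(batch, 1 / 3, prior, ema=0.5)
    print(class_tau)
    print(accept_pseudo_label([0.2, 0.7, 0.1], class_tau))
    print(weighted_focal_loss([0.5, 0.5], 0, [2.0, 1.0]))
```

Under this reading, a rare class starts with a low threshold, so more of its pseudo-labels are accepted early in training, while the focal term down-weights samples the model already classifies confidently.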
