CloudPayGuard: Hardware-Aware Real-Time Fraud Detection for Cloud-Native Credit Systems with OoO CPU Microarchitecture Optimization

Hanqing Yao; Jixiang Ding; Zifan Wang

doi:10.63313/JCSFT.9079

Authors

Hanqing Yao, Stanford University, Stanford, CA, USA
Jixiang Ding, University of Michigan, Ann Arbor, MI, USA
Zifan Wang, Shanghai University, Shanghai, China

Keywords:

Cloud-native, Credit payment systems, Real-time fraud detection, Temporal Heterogeneous Graph Neural Network, LLM-driven security policy, Out-of-order CPU, Microarchitecture optimization, Hardware-aware deep learning

Abstract

Real-time fraud detection in cloud-native credit payment systems is challenging because transaction networks are growing more complex, fraudulent behaviors evolve rapidly, and modern deep learning models are computationally demanding. To address these challenges, we propose CloudPayGuard, a hardware-aware framework that integrates Temporal Heterogeneous Graph Neural Networks (TH-GNN) for dynamic transaction modeling, large language models (LLMs) for automated security policy generation, and out-of-order (OoO) CPU microarchitecture performance prediction for hardware-aware inference. CloudPayGuard constructs multi-modal transaction graphs that incorporate user behavior sequences, device fingerprints, and geolocation information, enabling real-time identification of suspicious activities with millisecond-level latency. The framework dynamically generates risk policies through LLM-based reasoning and verifies them through constraint checking, supporting trustworthy and adaptive deployment in cloud-native environments. To optimize inference performance, a deep learning-based CPU microarchitecture predictor estimates instructions per cycle (IPC) and identifies potential bottlenecks in the reorder buffer (ROB), issue queue (IQ), and load-store queue (LSQ), allowing dynamic adjustment of CPU parameters and task scheduling. Experiments on a large-scale financial transaction dataset show that CloudPayGuard achieves an F1-score of 0.91 and an average inference latency of 6 milliseconds, outperforming a baseline TH-GNN and the other compared models. The OoO CPU microarchitecture optimization reduces latency by 34–40%, while LLM-driven policy generation and TH-GNN-based graph modeling support accurate fraud detection. These results demonstrate CloudPayGuard’s efficiency, scalability, and effectiveness for real-time fraud detection in cloud-native credit systems.
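
To make the graph-construction step more concrete, the following is a minimal sketch of a temporal heterogeneous transaction graph with user, device, and region nodes connected through timestamped transactions. It is an illustration under stated assumptions, not the implementation used in CloudPayGuard: the class names (Transaction, TemporalHeteroGraph), the relation names, and the users_sharing_device query are all hypothetical and chosen only to show how typed, time-stamped edges can be stored and queried.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One payment event; all field names are illustrative assumptions."""
    tx_id: str
    user: str
    device: str       # stands in for a device fingerprint
    region: str       # stands in for coarse geolocation
    timestamp: float  # seconds since an arbitrary epoch
    amount: float


class TemporalHeteroGraph:
    """Typed nodes plus typed, timestamped edges (hypothetical structure)."""

    def __init__(self):
        # node type -> set of node ids
        self.nodes = defaultdict(set)
        # (src_type, relation, dst_type) -> list of (src, dst, timestamp)
        self.edges = defaultdict(list)

    def add_transaction(self, tx):
        """Insert one transaction and its user, device, and region links."""
        self.nodes["user"].add(tx.user)
        self.nodes["transaction"].add(tx.tx_id)
        self.nodes["device"].add(tx.device)
        self.nodes["region"].add(tx.region)
        self.edges[("user", "initiates", "transaction")].append(
            (tx.user, tx.tx_id, tx.timestamp))
        self.edges[("transaction", "uses", "device")].append(
            (tx.tx_id, tx.device, tx.timestamp))
        self.edges[("transaction", "located_in", "region")].append(
            (tx.tx_id, tx.region, tx.timestamp))

    def users_sharing_device(self, device, since):
        """Return users with a transaction on `device` at or after `since`."""
        tx_ids = {
            src
            for src, dst, t in self.edges[("transaction", "uses", "device")]
            if dst == device and t >= since
        }
        return {
            src
            for src, dst, _ in self.edges[("user", "initiates", "transaction")]
            if dst in tx_ids
        }


graph = TemporalHeteroGraph()
graph.add_transaction(Transaction("t1", "alice", "dev1", "US", 100.0, 20.0))
graph.add_transaction(Transaction("t2", "bob", "dev1", "US", 200.0, 35.0))
graph.add_transaction(Transaction("t3", "carol", "dev2", "CN", 300.0, 5.0))
shared = graph.users_sharing_device("dev1", since=0.0)
```

In this toy example, two users transacting through the same device within a time window is the kind of cross-entity signal a graph model could learn from; a real system would attach feature vectors to nodes and edges and pass them to the GNN rather than querying the structure directly.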
