Design and Implementation of an LLM-Driven Intelligent Risk Identification System for Railway Cybersecurity
-
Graphical Abstract
-
Abstract
The growing cybersecurity risks facing railways demand efficient analysis of vulnerability reports, yet manual processing remains time-consuming and error-prone. This paper presents an end-to-end approach, based on a large language model (LLM), for automated intelligence extraction in railway cybersecurity. Built upon Qwen1.5-14B, the approach uses prefix tuning and instruction templates for parameter-efficient domain adaptation without modifying the base model's parameters. To mitigate hallucinations, we construct an expert-curated canonical risk list and map model outputs to its standardized entries via set-based text similarity, which keeps outputs consistent and verifiable. A knowledge base is built through multi-source text acquisition and structuring. Experimental results show that the information-extraction agent outperforms representative general-purpose LLMs on key elements such as threat groups, IP addresses, hashes, email addresses, URLs, and YARA rules. For end-to-end risk and hazard identification, the method achieves F1-scores of 82.2% and 83.6%, respectively, demonstrating strong practicality and robustness.
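The mapping of model outputs to a canonical risk list can be illustrated with a minimal sketch. It assumes Jaccard similarity over lowercase word-token sets as the set-based measure and a cutoff of 0.5; the abstract specifies neither. The function names, the threshold, and the entries in `CANONICAL_RISKS` are hypothetical and are not taken from the paper.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def map_to_canonical(
    output: str, canonical: list[str], threshold: float = 0.5
) -> tuple[str | None, float]:
    """Return the best-matching canonical entry and its score.

    If no entry reaches the (assumed) threshold, return None so that an
    unsupported model output is rejected rather than passed through.
    """
    out_tokens = tokens(output)
    best_entry, best_score = None, 0.0
    for entry in canonical:
        score = jaccard(out_tokens, tokens(entry))
        if score > best_score:
            best_entry, best_score = entry, score
    if best_score >= threshold:
        return best_entry, best_score
    return None, best_score


# Hypothetical canonical entries, for illustration only.
CANONICAL_RISKS = [
    "unauthorized access to signalling system",
    "denial of service on train control network",
    "malware infection of ticketing servers",
]

if __name__ == "__main__":
    print(map_to_canonical("Unauthorized access to the signalling system", CANONICAL_RISKS))
    print(map_to_canonical("phishing email campaign", CANONICAL_RISKS))
```

In this sketch, a free-form output that shares most of its tokens with a canonical entry is replaced by that entry, while an output with no sufficiently similar entry is returned as `None`. This is one way such a mapping can constrain hallucinated risks to a verifiable vocabulary.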
-
-