Incremental Hacker Forum Exploit Collection and Classification for Proactive Cyber Threat Intelligence: An Exploratory Study
2018, 2018 IEEE International Conference on Intelligence and Security Informatics (ISI)
Cyber threats have emerged as a key societal concern. To counter the growing threat of cyber-attacks, organizations have in recent years begun investing heavily in developing Cyber Threat Intelligence (CTI). CTI is fundamentally a data-driven process, and many organizations have traditionally collected and analyzed data from internal log files, resulting in reactive CTI. The online hacker community can offer significant proactive CTI value by alerting organizations to threats they were not previously aware of. Among the various hacker community platforms, forums provide the richest metadata, data permanence, and tens of thousands of freely available Tools, Techniques, and Procedures (TTP). However, forums often employ anti-crawling measures such as authentication, throttling, and obfuscation. These measures have restricted many researchers to batch collections. This exploratory study aims to (1) design a novel web crawler augmented with numerous anti-crawling countermeasures to collect hacker exploits on an ongoing basis, (2) employ a state-of-the-art deep learning approach, a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN), to automatically classify exploits into pre-defined categories on the fly, and (3) develop interactive visualizations enabling CTI practitioners and researchers to explore collected exploits for proactive, timely CTI. The results of this study indicate, among other findings, that system and network exploits are shared significantly more than other exploit types.
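The difference between batch collection and collection "on an ongoing basis" can be illustrated with a minimal sketch. It is not the crawler described in the study: the function name `collect_new_posts`, its parameters, and the fixed-delay throttle are all assumptions made for illustration, and the forum is abstracted behind two caller-supplied callables so that no real site or library API is implied.

```python
import time
from typing import Callable, Hashable, Iterable, List, Set


def collect_new_posts(
    list_ids: Callable[[], Iterable[Hashable]],  # hypothetical: returns IDs currently listed
    fetch_post: Callable[[Hashable], object],    # hypothetical: downloads one post by ID
    seen_ids: Set[Hashable],                     # IDs collected in earlier runs (mutated)
    delay_s: float = 1.0,                        # assumed pause between fetches
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Fetch only posts that earlier runs have not collected yet."""
    collected = []
    for post_id in list_ids():
        if post_id in seen_ids:
            continue  # skip anything gathered in a previous run
        collected.append(fetch_post(post_id))
        seen_ids.add(post_id)
        sleep(delay_s)  # space out requests to stay under a rate limit
    return collected


if __name__ == "__main__":
    # In-memory stand-in for a forum; the contents are made up.
    forum = {101: "exploit A", 102: "exploit B"}
    seen: Set[Hashable] = set()
    print(collect_new_posts(lambda: list(forum), forum.get, seen, delay_s=0))
    forum[103] = "exploit C"  # a new post appears between runs
    print(collect_new_posts(lambda: list(forum), forum.get, seen, delay_s=0))
```

Calling the function repeatedly with the same `seen_ids` set returns only what appeared since the previous run, which is the property that distinguishes incremental from batch collection. A real crawler would also need to persist `seen_ids` between runs and handle authentication and obfuscation, none of which is covered here.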