(NSFOCUS Technologies Group Co., Ltd., Beijing 100089, China)
Abstract: With the rapid growth and spread of malware, cybersecurity systems have faced serious threats in recent years. Meanwhile, continuously evolving attack techniques can bypass the threat analysis and detection of security systems, which poses new challenges to security analysts. Because traditional manual malware analysis is resource-limited, it struggles to uncover the potential attack vectors and techniques of malware even with the help of automated analysis tools, and it has difficulty finding commonalities among malware samples. This paper designs a malware association analysis system called SimMal, which uses a heterogeneous network graph to clearly show the relationships among multiple dimensions of malware, such as malware instances, malicious behaviors, attack techniques, and exploits. Furthermore, based on heterogeneous graph representation learning, SimMal can predict the malware families and APT (Advanced Persistent Threat) groups potentially associated with a malware sample, and can thus assist analysts in discovering malware-related risks and intentions in advance and in preparing defenses ahead of time. SimMal has been applied to real malware datasets, and the experimental results verify its effectiveness in malware family classification and APT group traceability analysis.
Key words: malware; automated analysis; association analysis; heterogeneous graph learning