中圖分類(lèi)號(hào):TP391.1 文獻(xiàn)標(biāo)志碼:A DOI: 10.16157/j.issn.0258-7998.245114 中文引用格式: 楊嘉佳,,關(guān)健,,李正,等. ?FA:一種基于向量指令集的高性能數(shù)據(jù)處理算法[J]. 電子技術(shù)應(yīng)用,,2024,,50(11):85-88. 英文引用格式: Yang Jiajia,Guan Jian,,Li Zheng,,et al. ?FA: a high-performance data processing algorithm based on vector instruction set[J]. Application of Electronic Technique,2024,,50(11):85-88.
ßFA: a high-performance data processing algorithm based on vector instruction set
Yang Jiajia,,Guan Jian,,Li Zheng,Yu Zengming,,Yao Wangjun
The Sixth Research Institute of China Electronics Corporation
Abstract: Regular expression matching technology plays a significant role in data processing tasks such as data cleaning, parsing, and extraction. However, due to issues such as strong data dependency and unpredictable memory access in the matching process, the matching performance is relatively low. In response to this problem, this paper proposes a high-performance regular expression data processing algorithm based on vector instruction set, which is called ßFA. By using vector instructions to read a sequence of consecutive characters at once, and performing vector matching with the non-trusted character set corresponding to the most frequently accessed state, built-in functions can be utilized to find the position of the first non-trusted character, thus obtaining the number of characters that can be skipped directly, thereby accelerating the matching performance. Experimental results show that the throughput of the ßFA algorithm is superior to the original DFA algorithm and the αFA algorithm, being 4.67~60 times faster than the original DFA algorithm and 4.37~7.82 times faster than the αFA algorithm.
Key words : regular expression matching,;vector instruction set;high-performance data processing