Document Type
Article
Source of Publication
Forensic Science International: Digital Investigation
Publication Date
7-1-2024
Abstract
Extracting compiler-provenance information (e.g., the source of a compiler, its version, its optimization settings, and compiler-related functions) is crucial for binary-analysis tasks such as function fingerprinting, code-clone detection, and authorship attribution. However, obfuscation techniques complicate efforts to automate this extraction. In this paper, we propose an efficient and resilient approach to provenance identification in obfuscated binaries using advanced pre-trained computer-vision models. To achieve this, we transform the program binaries into images and apply a two-layer approach for compiler and optimization prediction. Extensive experiments on a large-scale dataset show that the proposed method achieves an accuracy of over 98% for both obfuscated and deobfuscated binaries.
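The abstract does not say how binaries are turned into images. As an illustration only, the sketch below uses a common byte-to-pixel convention: each byte becomes one grayscale pixel (0-255) and the byte stream is wrapped into rows of a fixed width. The function name `bytes_to_grayscale`, the default width of 256, and the zero-padding of the last row are assumptions made for this example, not details taken from the paper.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte to one pixel (0-255) and wrap rows at `width`.

    Hypothetical helper: the encoding used in the paper may differ.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # At least one row, so an empty input still yields a valid image.
    height = max(1, math.ceil(len(data) / width))
    # Zero-pad so the final row has exactly `width` pixels (an assumption).
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Usage: five bytes at width 4 produce two rows, the second zero-padded.
image = bytes_to_grayscale(b"\x7fELF\x02", width=4)
```

A row-major pixel grid like this can then be resized and passed to an image classifier. A two-layer setup as described in the abstract would presumably predict the compiler first and the optimization setting second, but that ordering is likewise an assumption here.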
DOI Link
ISSN
Publisher
Elsevier BV
Volume
49
Disciplines
Computer Sciences
Keywords
Binary code analysis, Compiler provenance, Malware analysis, Reverse engineering
Scopus ID
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Recommended Citation
Khan, Wasif; Alrabaee, Saed; Al-kfairy, Mousa; Tang, Jie; and Choo, Kim-Kwang Raymond, "Compiler-provenance identification in obfuscated binaries using vision transformers" (2024). All Works. 6635.
https://zuscholars.zu.ac.ae/works/6635
Indexed in Scopus
yes
Open Access
yes
Open Access Type
Hybrid: This publication is openly available in a subscription-based journal/series