Posts by Collection

portfolio

publications

Building Adversarial Defense with Non-invertible Data Transformations

Published in PRICAI (CCF-C), 2019

This paper uses non-invertible data transformations to defend against adversarial machine learning attacks. Empirical results indicate that our framework provides better robustness than state-of-the-art solutions, with negligible degradation in generalization accuracy.

Recommended citation: Guo W, Mu D, Chen L, et al. Building Adversarial Defense with Non-invertible Data Transformations[C]//Pacific Rim International Conference on Artificial Intelligence. Springer, Cham, 2019: 593-606. http://chenligeng.github.io/files/PRICAI2019.pdf

CATI: Context-Assisted Type Inference from Stripped Binaries

Published in DSN (CCF-B), acceptance rate 16.5% (48/291), 2020

This paper leverages the context of assembly code to identify and infer the types of variables in stripped binaries, achieving high accuracy and outperforming the state-of-the-art method.

Recommended citation: Chen L, He Z, Mao B. CATI: Context-Assisted Type Inference from Stripped Binaries[C]//2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 2020: 88-98. http://chenligeng.github.io/files/DSN2020.pdf

What Exactly Determines the Type? Inferring Types with Context

Published in DSN-S (CCF-B), 2020

Closed-source programs lack information vital for code analysis. We locate variables in stripped binaries and infer 19 types for them. Experiments show that our approach infers variable types with 71.2% accuracy on unseen binaries.

Recommended citation: Chen L. What Exactly Determines the Type? Inferring Types with Context[C]//2020 50th Annual IEEE-IFIP International Conference on Dependable Systems and Networks-Supplemental Volume (DSN-S). IEEE, 2020: 71-72. http://chenligeng.github.io/files/DSN-S2020.pdf

RoBin: Facilitating the Reproduction of Configuration-Related Vulnerability

Published in TrustCom (CCF-C), acceptance rate 30%, 2021

In this paper, we address the problem of reproducing configuration-related vulnerabilities. We propose a binary similarity-based method that infers the specific build configuration from the binary in a crash report. The main challenges are precise compilation option inference, program configuration inference, and source-code-to-binary matching.

Recommended citation: Chen, Ligeng, et al. "RoBin: Facilitating the Reproduction of Configuration-Related Vulnerability." arXiv preprint arXiv:2110.12989 (2021). http://chenligeng.github.io/files/ROBIN_TrustCom2021.pdf

DIComP: Lightweight Data-Driven Inference of Binary Compiler Provenance with High Accuracy

Published in SANER (CCF-B), ORAL acceptance rate 24.1% (48/199), 2022

In this paper, we conduct a thorough empirical study of how binaries appear under various compilation settings. Based on these observations, we propose DIComP, a lightweight binary analysis tool built on the simplest machine learning method, which infers the compiler and optimization level from the most relevant features.

Recommended citation: To appear. http://chenligeng.github.io/files/SANER2022_DIComP.pdf

AVMiner: Expansible and Semantic-Preserving Anti-Virus Labels Mining Method

Published in TrustCom (CCF-C), acceptance rate 25% (101/404), 2022

In this paper, we present a fully automatic method for mining the vital tokens (e.g., family, format, behavior) from anti-virus labels. Unlike previous methods, our method employs unsupervised learning and NLP re-ranking to extract the tokens most related to the target malware. Notably, AVMiner can be easily adapted to new malware families.

Recommended citation: Chen, Ligeng, et al. "AVMiner: Expansible and Semantic-Preserving Anti-Virus Labels Mining Method." arXiv preprint arXiv:2208.14221 (2022). http://chenligeng.github.io/files/AVMiner_2022TrustCom.pdf

Nimbus: Toward Speed Up Function Signature Recovery via Input Resizing and Multi-Task Learning

Published in QRS (CCF-C), acceptance rate 27.47% (75/273), 2022

In this paper, we propose a lightweight framework for recovering function signatures, i.e., the number and types of parameters, which benefits from input resizing and multi-task learning.

Recommended citation: Qian, Y., Chen, L., Wang, Y., & Mao, B. (2022). Nimbus: Toward Speed Up Function Signature Recovery via Input Resizing and Multi-Task Learning. arXiv preprint arXiv:2211.04219. http://chenligeng.github.io/files/QRS2022_camera_ready_version.pdf

RecMaL: Rectify the Malware Family Label via Hybrid Analysis

Published in Computers & Security (CCF-B), 2023

In this work, we conduct an in-depth analysis of the severity of the malware mislabeling issue and try to rectify the malware descriptions generated by anti-virus engines. We first propose a malware label correction tool called RecMaL, which employs hybrid analysis to rectify malware labels. Based on a thorough exploratory analysis, we identify the core reasons for mislabeling and summarize them into three types.

Recommended citation: Yang, W., Gao, M., Chen, L., Liu, Z., & Ying, L. (2023). RecMaL: Rectify the malware family label via hybrid analysis. Computers & Security, 128, 103177. http://chenligeng.github.io/files/RECMAL-publishVersion.pdf

talks

teaching

Teaching Assistant for Software Security

Graduate Student Course, Nanjing University, Department of Computer Science, 2019

I was a teaching assistant for Software Security, taught by Professor Bing Mao.

Teaching Assistant for Artificial Intelligence

Undergraduate Student Course, Nanjing University, Department of Computer Science, 2019

I was a teaching assistant for Artificial Intelligence, taught by Professor Yubin Yang.

Chen Ligeng