
SERP Project 8 - Evaluating Cross-Domain Vulnerability Detection Methods

This project evaluates deep learning-based cross-project vulnerability detection methods. Methods from several research papers are replicated to provide baselines for our evaluation framework. The framework is reproducible and can be adopted by future research to determine the best-performing vulnerability detection method. The research papers (scope), source code, and feature extraction tools are provided within this repository.

Papers Replicated/Scope

POSTER: Vulnerability Discovery with Function Representation Learning from Unlabeled Projects

This paper explores function-level vulnerability discovery in a cross-project setting. Abstract syntax tree (AST) representations of functions serve as the training data for a bidirectional LSTM (BLSTM) neural network. Typical recurrent neural networks have difficulty capturing the long-term dependencies among the contiguous and fragmented code elements associated with a vulnerability, so the method combines an RNN with LSTM cells to handle vulnerabilities whose dependencies span multiple lines of code. The resulting function-level representation model has demonstrated significant performance gains.
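To make the architecture concrete, the sketch below builds a small bidirectional LSTM classifier over sequences of AST-token IDs. It is a minimal PyTorch illustration, not the paper's implementation; the vocabulary size, embedding width, and hidden size are illustrative assumptions.

    import torch
    import torch.nn as nn

    class BLSTMClassifier(nn.Module):
        """Bidirectional LSTM over AST-token sequences (sizes are illustrative)."""
        def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim,
                                batch_first=True, bidirectional=True)
            self.head = nn.Linear(2 * hidden_dim, 1)  # vulnerable vs. non-vulnerable

        def forward(self, token_ids):
            x = self.embed(token_ids)            # (batch, seq_len, embed_dim)
            _, (h, _) = self.lstm(x)             # h: (2, batch, hidden_dim)
            h = torch.cat([h[0], h[1]], dim=1)   # join forward and backward states
            return self.head(h).squeeze(1)       # logits; apply sigmoid for probability

Training would minimize a binary cross-entropy loss over the labelled functions, exactly as for any binary classifier.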

Tools:

Understand by SciTools

Understand is a commercial static analysis tool used here to extract function-level code metrics.
Source: https://www.scitools.com/

CodeSensor (version 2.0)

CodeSensor is a robust code-to-Abstract Syntax Tree (AST) parser implemented based on the concept of island grammars.
Source: https://github.com/fabsx00/codesensor

Results:

[screenshot: replication results]


Dual-Component Deep Domain Adaptation: A New Approach for Cross-Project Software Vulnerability Detection

To address the scarcity of labelled vulnerabilities in the data sets used for software vulnerability detection, a deep domain adaptation solution is proposed. Using deep domain adaptation, labelled vulnerability representations from a source dataset can be transferred to an unlabelled target dataset. The paper proposes a Dual Generator-Discriminator Deep Code Domain Adaptation Network (Dual-GD-DDAN) architecture for handling transfer learning from a labelled source to an unlabelled target dataset.
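For intuition, the sketch below shows one training step of a plain domain-adversarial setup in the spirit of DDAN (the dual variant adds a second generator-discriminator pair). The network sizes, learning rates, and 300-dimensional code embeddings are assumptions for illustration, not the paper's configuration.

    import torch
    import torch.nn as nn

    feat = nn.Sequential(nn.Linear(300, 128), nn.ReLU())   # generator: code embedding -> shared space
    clf = nn.Linear(128, 1)                                # vulnerability classifier (source labels only)
    disc = nn.Sequential(nn.Linear(128, 64), nn.ReLU(),
                         nn.Linear(64, 1))                 # domain discriminator

    bce = nn.BCEWithLogitsLoss()
    opt_g = torch.optim.Adam(list(feat.parameters()) + list(clf.parameters()), lr=1e-3)
    opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

    def train_step(x_src, y_src, x_tgt):
        # 1) Discriminator learns to tell source features from target features.
        with torch.no_grad():
            f_src, f_tgt = feat(x_src), feat(x_tgt)
        d_loss = (bce(disc(f_src), torch.ones(len(x_src), 1)) +
                  bce(disc(f_tgt), torch.zeros(len(x_tgt), 1)))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # 2) Generator learns to classify labelled source samples AND to fool the
        #    discriminator on target samples, aligning the two domains.
        f_src, f_tgt = feat(x_src), feat(x_tgt)
        g_loss = (bce(clf(f_src), y_src) +                        # y_src: float, shape (batch, 1)
                  bce(disc(f_tgt), torch.ones(len(x_tgt), 1)))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()

Once the representations are aligned, a classifier trained only on the labelled source data can be applied to the unlabelled target project.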

Tools:

Joern - Data pre-processing tool for DDAN & Dual-DDAN

Joern is used to analyze the source code and extract user-defined variables and functions.
Source: https://joern.readthedocs.io/en/latest/

Results:

[screenshots: replication results]

Validation Data

SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities

Source code exhibiting 126 types of vulnerabilities was collected from the National Vulnerability Database (NVD) and the Software Assurance Reference Dataset (SARD). The NVD data set contains vulnerabilities from 29 open-source software projects. The SARD dataset contains 13,906 vulnerable C/C++ programs out of a total of 14,000. The vulnerable data representations aim to accommodate both syntax and semantic information by introducing the notions of:

  • Syntax-based Vulnerability Candidates (SyVCs)
  • Semantics-based Vulnerability Candidates (SeVCs)

A program is divided into statements that correspond to “region proposals” and exhibit the syntax and semantics characteristics of vulnerabilities. SyVCs capture the syntax characteristics of vulnerabilities; SeVCs extend them to accommodate the semantic information carried by data dependency and control dependency.
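As a toy illustration of the syntax-matching step only, the sketch below flags statements matching two assumed syntax characteristics (risky API calls and pointer dereferences). The real framework matches its 126 characteristics on the AST, not with regexes, and then slices along data and control dependencies to build SeVCs.

    import re

    # Two illustrative syntax characteristics (the real set has 126, matched on the AST).
    SYVC_PATTERNS = {
        "api-call": re.compile(r"\b(strcpy|memcpy|sprintf)\s*\("),  # risky library calls
        "pointer-deref": re.compile(r"\*\s*\w+"),                   # pointer usage
    }

    def extract_syvcs(source_lines):
        """Return (line_no, kind, statement) for each statement matching a pattern."""
        hits = []
        for no, line in enumerate(source_lines, start=1):
            for kind, pattern in SYVC_PATTERNS.items():
                if pattern.search(line):
                    hits.append((no, kind, line.strip()))
        return hits

    code = ["char buf[8];", "strcpy(buf, input);", "int v = *ptr;"]
    print(extract_syvcs(code))
    # [(2, 'api-call', 'strcpy(buf, input);'), (3, 'pointer-deref', 'int v = *ptr;')]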



System Design/Requirements

[screenshot: system design and requirements diagram]

Architecture Design

[screenshot: architecture diagram]

Performance Metrics

To compare the models against one another, we use five performance metrics: FNR, FPR, Precision, Recall, and F1-score. All of them can be calculated from the model's confusion matrix (TP, FP, FN, TN), taking vulnerable functions as the positive class.

FNR (False Negative Rate, best = 0): FNR is the fraction of positive-class samples that are wrongly predicted as negative. In this project, the FNR is the rate at which vulnerable functions are classified as non-vulnerable, i.e., missed vulnerabilities.

FNR = FN / (FN + TP)

FPR (False Positive Rate, best = 0): FPR is the fraction of negative-class samples that are wrongly predicted as positive. In this project, FPR is the rate at which non-vulnerable functions are flagged as vulnerable, so it directly reflects how many false alarms a model raises. FPR is one of our key performance indicators.

FPR = FP / (FP + TN)

Precision (best = 1): Precision is the fraction of positive predictions that are actually positive, i.e., how many of the functions flagged as vulnerable truly are vulnerable.

Precision = TP / (TP + FP)

Recall (best = 1): Recall is the fraction of actual positives that the model correctly predicts. The higher the recall, the fewer vulnerable functions the model misses.

Recall = TP / (TP + FN)

F1-score (best = 1): The F1-score is the harmonic mean of Precision and Recall, giving equal weight to both. It is often used as a single value that summarizes the quality of a model's output, which makes F1-score another of our key performance indicators.

F1-score = 2 × Precision × Recall / (Precision + Recall)
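The helper below computes all five metrics from raw confusion-matrix counts; it simply mirrors the formulas above and is handy as a cross-check when comparing models.

    def metrics(tp, fp, fn, tn):
        """Compute the five comparison metrics from confusion-matrix counts."""
        fnr = fn / (fn + tp) if (fn + tp) else 0.0   # missed vulnerable functions
        fpr = fp / (fp + tn) if (fp + tn) else 0.0   # false alarms on clean functions
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        return {"FNR": fnr, "FPR": fpr, "Precision": precision,
                "Recall": recall, "F1": f1}

    # Example: 80 vulnerable functions found, 20 missed, 10 false alarms, 90 true negatives.
    print(metrics(tp=80, fp=10, fn=20, tn=90))
    # {'FNR': 0.2, 'FPR': 0.1, 'Precision': 0.888..., 'Recall': 0.8, 'F1': 0.842...}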

LSTM, RF, DDAN and Dual-DDAN performance results

[screenshots: performance results for LSTM, RF, DDAN and Dual-DDAN]
