- DefectGuard is a Python package
- Basic functionalities:
- Mining commits from Git repositories
- Post-processing data, then training and running inference with JITDP (Just-in-Time Defect Prediction) models, via the CLI or as an imported library
- DefectGuard has been integrated into VS Code (as an extension), Jenkins, and GitHub Actions (via command)
Please check out #2 in TROUBLESHOOT.md if your machine does not have a GPU
docker compose up --build -d
docker exec -it defectguard /bin/bash
Inside docker container:
# This sets up PySZZ and DefectGuard
bash scripts/setup.sh
Note: download this outside of the container.
Install the nvidia-container-toolkit package as described in the official documentation on GitHub.
We also provide a quick-run script for Debian-based operating systems.
- SrcML
DefectGuard relies on PySZZ for its data-mining functionality, and PySZZ in turn requires SrcML. Please install SrcML before mining data.
# Install libarchive13 libcurl4 libxml2
sudo apt-get install libarchive13 libcurl4 libxml2
# Install libssl
wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2_amd64.deb && \
sudo dpkg -i libssl1.1_1.1.1f-1ubuntu2_amd64.deb && \
rm -f libssl1.1_1.1.1f-1ubuntu2_amd64.deb
# Install SrcML
wget http://131.123.42.38/lmcrs/v1.0.0/srcml_1.0.0-1_ubuntu20.04.deb && \
sudo dpkg -i srcml_1.0.0-1_ubuntu20.04.deb && \
rm -f srcml_1.0.0-1_ubuntu20.04.deb
- PySZZ
DefectGuard requires this external tool for its data-mining functionality. Please install it before mining data.
git clone https://github.com/grosa1/pyszz_v2.git
- Dependencies
Please check out #2 in TROUBLESHOOT.md if your machine does not have a GPU
Install from requirements.txt, or from cpu-only-requirements.txt as an alternative
pip install -r requirements.txt
- Setup DefectGuard
python setup.py develop
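After the setup steps above, it can be useful to confirm that the external tools are reachable before mining data. The following is a minimal sketch, not part of DefectGuard itself: it assumes that `git`, `srcml`, and the `defectguard` CLI are expected on `PATH`, and the helper name `missing_tools` is hypothetical.

```python
import shutil

# Executables this README relies on. "git" and "srcml" are the usual command
# names for those tools; "defectguard" is the CLI name used in this README.
# Treating all three as PATH requirements is an assumption of this sketch.
REQUIRED_TOOLS = ("git", "srcml", "defectguard")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

An empty result only means the executables were found; it does not verify their versions.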
defectguard mining \
-repo_name <project_name> \
-repo_path <path/to/project> \
-repo_language <main_language_of_project> \
-pyszz_path <path/to/project/pyszz_v2>
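When the mining step is driven from a script, the same command can be assembled as an argument list. This is a sketch only: the flags are the ones shown in the mining command above, while the helper name `build_mining_command` and all argument values are placeholders chosen for illustration.

```python
import shlex


def build_mining_command(repo_name, repo_path, repo_language, pyszz_path):
    """Assemble the argument list for `defectguard mining`.

    Only the flags shown in this README are used.
    """
    return [
        "defectguard", "mining",
        "-repo_name", repo_name,
        "-repo_path", repo_path,
        "-repo_language", repo_language,
        "-pyszz_path", pyszz_path,
    ]


# Placeholder values; replace them with your own project details.
cmd = build_mining_command("my_project", "/data/repos", "Python", "/opt/pyszz_v2")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when paths contain spaces.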
defectguard training \
-model <model_name> \
-repo_name <project_name> \
-repo_language <main_language_of_project> \
-epochs <epochs>
defectguard training \
-model <model_name> \
-from_pretrain \
-repo_name <project_name> \
-repo_language <main_language_of_project> \
-epochs <epochs>
defectguard evaluating \
-model <model_name> \
-repo_name <project_name> \
-repo_language <main_language_of_project>
defectguard evaluating \
-model <model_name> \
-from_pretrain \
-repo_name <project_name> \
-repo_language <main_language_of_project>
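The training and evaluating commands above differ only in the subcommand, the optional `-from_pretrain` flag, and `-epochs`. A small helper can make that explicit. This is a sketch under assumptions: the helper name `build_model_command` is hypothetical, the model name `deepjit` is used only as an example value, and `-epochs` is restricted to training because that is the only place this README shows it.

```python
def build_model_command(action, model, repo_name, repo_language,
                        epochs=None, from_pretrain=False):
    """Assemble `defectguard training` or `defectguard evaluating` arguments."""
    if action not in ("training", "evaluating"):
        raise ValueError("action must be 'training' or 'evaluating'")
    if epochs is not None and action != "training":
        # The README only shows -epochs for the training subcommand.
        raise ValueError("-epochs is only shown for the training subcommand")

    cmd = ["defectguard", action, "-model", model]
    if from_pretrain:
        cmd.append("-from_pretrain")
    cmd += ["-repo_name", repo_name, "-repo_language", repo_language]
    if epochs is not None:
        cmd += ["-epochs", str(epochs)]
    return cmd


# Example values are placeholders.
train_cmd = build_model_command("training", "deepjit", "my_project", "Python",
                                epochs=10, from_pretrain=True)
eval_cmd = build_model_command("evaluating", "deepjit", "my_project", "Python")
print(" ".join(train_cmd))
print(" ".join(eval_cmd))
```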
Coming Soon
Coming Soon
.
├── dg_cache
│ ├── dataset // default folder for saving dataset
│ ├── save // default folder for saving extracted data
│ ├── repo // default folder for cloning github repository
A sample structure of extracted data:
.
├── save
| ├── repo_name
| | ├── commit_ids.pkl
| | ├── extracted_info.json // the config for Extractor
| | ├── repo_bug_fix.json // the bug_fix file for running PySZZ
| | ├── repo_commits_{num}.pkl // files storing commits information
| | ├── repo_features.pkl // files storing commits features
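To see which parts of this layout exist for a given repository, the directory can be inspected with `pathlib`. This is a minimal sketch: the function name `summarize_extracted` is hypothetical, and it assumes the `repo_` prefix in the file names is literal, which the listing above does not confirm.

```python
from pathlib import Path


def summarize_extracted(save_dir, repo_name):
    """Report which extracted-data files are present under save_dir/repo_name."""
    repo_dir = Path(save_dir) / repo_name
    return {
        "commit_ids": (repo_dir / "commit_ids.pkl").is_file(),
        "bug_fix": (repo_dir / "repo_bug_fix.json").is_file(),
        "features": (repo_dir / "repo_features.pkl").is_file(),
        # repo_commits_{num}.pkl may be split into several numbered files.
        "commit_chunks": sorted(p.name for p in repo_dir.glob("repo_commits_*.pkl")),
    }


if __name__ == "__main__":
    # "dg_cache/save" is the default save folder named in this README;
    # "my_project" is a placeholder repository name.
    print(summarize_extracted("dg_cache/save", "my_project"))
```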
A sample structure of processed data:
.
├── dataset
| ├── repo_name
| | ├── commits
| | | ├── cc2vec.pkl
| | | ├── deepjit.pkl
| | | ├── simcom.pkl
| | | ├── dict.pkl
| | ├── features
| | | ├── feature.csv
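A quick way to check the processed features is to read the first row of `feature.csv`. The sketch below uses only the standard `csv` module. It assumes the file starts with a header row, which this README does not state, and the function name `feature_columns` is hypothetical.

```python
import csv
from pathlib import Path


def feature_columns(dataset_dir, repo_name):
    """Return the first row of dataset_dir/repo_name/features/feature.csv.

    Assumption: the first row is a header listing the column names.
    """
    csv_path = Path(dataset_dir) / repo_name / "features" / "feature.csv"
    with csv_path.open(newline="") as handle:
        return next(csv.reader(handle), [])


if __name__ == "__main__":
    # "dg_cache/dataset" is the default dataset folder named in this README;
    # "my_project" is a placeholder repository name.
    print(feature_columns("dg_cache/dataset", "my_project"))
```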
If this tool is run with mode="local", please arrange your repository paths according to the following structure:
.
├── repo_path
| ├── repo_owner
| | ├── repo_name
| | | ├── .git
| | | ├── other repo content
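Before running in local mode, the expected `repo_path/repo_owner/repo_name/.git` layout can be checked up front. This is a sketch with hypothetical helper names; it only tests that the `.git` entry exists and does not validate the repository.

```python
from pathlib import Path


def local_repo_dir(repo_path, repo_owner, repo_name):
    """Build the repository directory expected by the layout above."""
    return Path(repo_path) / repo_owner / repo_name


def is_valid_local_layout(repo_path, repo_owner, repo_name):
    """Return True if repo_path/repo_owner/repo_name contains a .git entry."""
    # exists() rather than is_dir(): in Git worktrees, .git can be a file.
    return (local_repo_dir(repo_path, repo_owner, repo_name) / ".git").exists()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(is_valid_local_layout("/data/repos", "some_owner", "some_repo"))
```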
See the troubleshooting guide: https://github.com/manhtdd/DefectGuard-the-Package/blob/main/TROUBLESHOOT.md