@@ -4,7 +4,7 @@ GitHub Crawler is an application that has been developed within the context of t
The GitHub Crawler is responsible for (i) downloading a large number of open-source repositories from GitHub based on a user's query, (ii) compiling the downloaded repositories, and (iii) analyzing them with the CKJM Extended (link) and CCCC (link) static code analyzers in order to compute popular software metrics. The GitHub Crawler is useful for researchers and practitioners who would like to generate benchmark repositories of open-source software applications for further analysis and processing, which are particularly important for conducting empirical studies. Currently, the GitHub Crawler can download applications written in any programming language; however, it can compute software metrics only for repositories written in Java, C, and C++.

The GitHub Crawler was used by the SDK4ED project for (i) constructing the Benchmark Repository that was used to calibrate the Software Security Assessment Model (SAM) ([link](dependability-toolbox-usage#quantitative-security-assessment-service)) and (ii) constructing the software metrics-based Vulnerability Prediction Models (VPMs) ([link](dependability-toolbox-usage#vulnerability-prediction)) that are part of the Dependability Toolbox ([link](dependability-toolbox)). It is also used by the Dependability Toolbox for constructing security benchmarks.
## Overview
@@ -22,7 +22,7 @@ A high-level overview of the GitHub Crawler application is depicted in the figur
## Utilization of the GitHub Crawler
The GitHub Crawler can be used indirectly through the API provided by the Dependability Toolbox. For more information on how to use the Dependability Toolbox, please see its dedicated wiki page ([link](dependability-toolbox)).

Apart from the Dependability Toolbox, the GitHub Crawler can also be executed from the terminal. The general structure of the command is provided below:
@@ -39,3 +39,18 @@ In the table below, a description of the parameters is provided:
| max_num | The maximum number of software repositories to be gathered (i.e., when the process should terminate) (default value: 100) |
| message | An additional search string in order to narrow down the scope of the search. It can be any string deemed necessary by the user. |
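
The crawler's own implementation is not shown on this page, so the following is only a rough sketch of how parameters such as `max_num` and `message` could be turned into paginated requests against GitHub's public repository search endpoint. The function name `build_search_urls` and the `language` argument are hypothetical and introduced purely for illustration; only the endpoint and its `q`, `per_page`, and `page` query parameters come from the GitHub REST API.

```python
from urllib.parse import urlencode

# Repository search endpoint of the GitHub REST API.
SEARCH_ENDPOINT = "https://api.github.com/search/repositories"
# The search API returns at most 100 results per page.
MAX_PAGE_SIZE = 100


def build_search_urls(language, max_num=100, message=""):
    """Build the search URLs needed to collect up to `max_num` repositories.

    `max_num` and `message` mirror the parameters described in the table;
    `language` is an assumed extra filter (hypothetical, for illustration).
    """
    if max_num <= 0:
        return []

    # Combine the optional free-text message with a language qualifier.
    query = f"{message} language:{language}".strip()

    # Keep the page size constant so that page offsets stay consistent;
    # the caller is expected to truncate the final page to `max_num`.
    per_page = min(MAX_PAGE_SIZE, max_num)
    num_pages = (max_num + per_page - 1) // per_page  # ceiling division

    return [
        f"{SEARCH_ENDPOINT}?{urlencode({'q': query, 'per_page': per_page, 'page': page})}"
        for page in range(1, num_pages + 1)
    ]


if __name__ == "__main__":
    # Print the requests that would be issued for 250 Java repositories.
    for url in build_search_urls("java", max_num=250, message="security"):
        print(url)
```

The sketch only constructs URLs and performs no network calls. A real crawler would additionally need authentication, rate-limit handling, and a cloning step, none of which is covered here.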

- [SDK4ED Dashboard]()
  - [Description]()
  - [User and Project Management](User-Project-Management)
  - [Installation](Frontend-Installation)
  - [Walkthrough](tutorial)
- [SDK4ED Toolboxes]()
  - [Technical Debt Toolbox](Technical-Debt-Toolbox)
  - [Energy Optimization Toolbox](Energy-Toolbox)
  - [Dependability Toolbox](dependability-toolbox)
  - [Forecaster Toolbox](Forecaster Toolbox Home)
  - [Decision Support Toolbox](Decision Support Toolbox)