GitHub Crawler · Changes

Update GitHub Crawler, authored May 08, 2024 by Miltos
GitHub-Crawler.md
View page @ 72524a5c
@@ -2,7 +2,7 @@
GitHub Crawler is an application that has been developed within the context of the SDK4ED project, as part of the Dependability Toolbox, in order to enable the automatic collection and static analysis of a large number of open-source software repositories from GitHub.
-The GitHub Crawler is responsible for (i) downloading a large number of open-source repositories from GitHub based on a user's query, (ii) compiling the downloaded repositories, and (iii) analyzing the downloaded repositories with CKJM Extended (link) and CCCC (link) static code analyzers, in order to compute popular software metrics. GitHub crawler is useful for researchers and practicionairs that would like to generate benchmark repositories of open-source software applications for further analysis and processing, which are particularly important for the conduction of empirical studies. Currently, the GitHub Crawler can download applications written in any programming language; however, it can compute software metrics only for software repositories written in Java, C, and C++.
+The GitHub Crawler is responsible for (i) downloading a large number of open-source repositories from GitHub based on a user's query, (ii) compiling the downloaded repositories, and (iii) analyzing the downloaded repositories with CKJM Extended (link) and CCCC (link) static code analyzers, in order to compute popular software metrics. GitHub crawler is useful for researchers and <span dir="">practitioners</span> that would like to generate benchmark repositories of open-source software applications for further analysis and processing, which are particularly important for the conduction of empirical studies. Currently, the GitHub Crawler can download applications written in any programming language; however, it can compute software metrics only for software repositories written in Java, C, and C++.
The GitHub Crawler was utilized by the SDK4ED project for (i) constructing the Benchmark Repository that was utilized for calibrating the Software Security Assessment Model (SAM) (link) and (ii) for constructing software metrics-based Vulnerability Prediction Models (VPMs) (link) that are part of the Dependability Toolbox (link). It is also utilized by the Dependability Toolbox for constructing security benchmarks.
......