A Comparative Study of Software Secrets Reporting by Secret Detection Tools

Abstract:

Background: According to GitGuardian's monitoring of public GitHub repositories, secrets sprawl continued to accelerate in 2022, growing by 67% compared to 2021 and exposing over 10 million secrets (API keys and other credentials). Though many open-source and proprietary secret detection tools are available, these tools output many false positives, making it difficult for developers to take action and for teams to choose one tool out of many. To our knowledge, secret detection tools have not yet been compared and evaluated. Aims: The goal of our study is to aid developers in choosing a secret detection tool to reduce the exposure of secrets through an empirical investigation of existing secret detection tools. Method: We present an evaluation of five open-source and four proprietary tools against a benchmark dataset. Results: The top three tools based on precision are GitHub Secret Scanner (75%), Gitleaks (46%), and Commercial X (25%); based on recall, they are Gitleaks (88%), SpectralOps (67%), and TruffleHog (52%). Our manual analysis of reported secrets reveals that false positives are due to generic regular expressions and ineffective entropy calculation. In contrast, false negatives are due to faulty regular expressions, skipped file types, and insufficient rulesets. Conclusions: We recommend that developers choose tools based on the secret types present in their projects to avoid missing secrets. In addition, we recommend that tool vendors update detection rules periodically and correctly employ secret verification mechanisms by collaborating with API vendors to improve accuracy.
Date of Conference: 26-27 October 2023
Date Added to IEEE Xplore: 08 November 2023
Conference Location: New Orleans, LA, USA

I. Introduction

GitGuardian measured the exposure of secrets in GitHub repositories over the last three years and reported in March 2023 that secrets sprawl continued to accelerate in 2022, growing by 67% compared to 2021 and exposing more than 10 million secrets [1]. In addition, they discovered that one out of 10 GitHub code authors exposed at least one secret in 2022. Secrets (such as API keys and access tokens) are indispensable for software because they are needed for third-party service integration, such as payment systems. However, developers leak secrets in plain text in version control systems (VCS) and application packages [2], [3]. In September 2022, an attacker took over Uber's internal tools and applications by leveraging hard-coded admin credentials in their PowerShell scripts [4].
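
To make the terms used in the abstract concrete, the following is a minimal Python sketch of two detection techniques it names (regular expression matching and entropy calculation) together with the precision and recall metrics used to rank the tools. It is an illustration only and is not taken from the study or from any of the evaluated tools: the pattern `GENERIC_ASSIGNMENT`, the 3.5 bits-per-character threshold, and all function names are assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately generic pattern (an assumption for illustration):
# a key-like name, then ':' or '=', then a quoted value of at least 8 characters.
GENERIC_ASSIGNMENT = re.compile(
    r"""(?i)(?:api_?key|secret|token|password)\s*[:=]\s*["']([^"']{8,})["']"""
)


def shannon_entropy(value: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidates(text: str, min_entropy: float = 3.5) -> list[str]:
    """Return quoted values that match the pattern and meet the entropy threshold.

    The threshold is an assumed value, not one reported by the study.
    """
    return [
        match.group(1)
        for match in GENERIC_ASSIGNMENT.finditer(text)
        if shannon_entropy(match.group(1)) >= min_entropy
    ]


def precision_recall(reported: set[str], actual: set[str]) -> tuple[float, float]:
    """Precision = TP / reported, recall = TP / actual (0.0 when a set is empty)."""
    true_positives = len(reported & actual)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up sample input; neither value is a real credential.
    sample = 'api_key = "q7Zx2LmP9vKd4RtB"\npassword = "changeme"\n'
    print(find_candidates(sample))
    print(precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"}))
```

In this sketch the low-entropy value `changeme` matches the pattern but is dropped by the entropy filter, which hints at how a threshold can trade false positives for missed secrets; real tools combine many more rules than this single pattern.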
