Forging New Paths in Cybersecurity Doctoral Research with Open Datasets and Synthetic Data Generation | IEEE Conference Publication | IEEE Xplore


Abstract:

This research-to-practice full paper addresses the important need for relevant and comprehensive datasets to advance cybersecurity research by proposing methods for curating open datasets and generating synthetic datasets. Cybersecurity research is a rapidly evolving scientific field, making robust datasets crucial for empirical analysis. Unfortunately, current doctoral research is hindered by the scarcity, limited accessibility, and outdated or irrelevant nature of existing open-source datasets. This paper tackles these challenges by focusing on two main initiatives: (1) curating a pilot collection of open datasets aligned with the National Initiative for Cybersecurity Education (NICE) Cybersecurity Workforce Framework, and (2) using Generative Adversarial Networks (GANs) to generate synthetic datasets. Our research highlights the obstacles faced by doctoral students due to fragmented, outdated data and underscores the importance of accessible datasets for rigorous scientific inquiry. We also demonstrate how synthetic data can ease privacy concerns while still offering researchers realistic data. By incorporating these approaches into doctoral curricula, we aim to equip future cybersecurity researchers with the skills and resources for impactful research. The authors will continue to expand their dataset curation efforts and study how discoverable, high-quality datasets can influence doctoral research, particularly empirical studies and their outcomes.
Date of Conference: 13-16 October 2024
Date Added to IEEE Xplore: 26 February 2025
Conference Location: Washington, DC, USA

I. Introduction

The field of cybersecurity continues to evolve rapidly, reflecting a shift towards a more rigorous scientific approach that emphasizes empirical research and data-driven analysis. For example, a bibliometric study by Furstena et al. [1] maps out two decades of cybersecurity research, revealing an expanding scope of themes from intrusion detection to complex issues like privacy and smart grids. The study highlights the growing reliance on quantitative methodologies and the increasing sophistication of cybersecurity research, underscoring the field's evolution from practical countermeasures to a structured scientific discipline. Complementing this perspective, Xu [2] introduces the concept of “cybersecurity dynamics”, further establishing the field's foundation by advocating for a systemic and scientific approach to understanding and modeling cybersecurity phenomena. This framework reinforces the necessity for a scientific discipline that can adapt to and anticipate evolving cybersecurity challenges. Adding to this foundation, the authors of [3] highlight the critical role of mathematical approaches in elevating cybersecurity to a scientific discipline, arguing that these methodologies provide the precision and replicability needed to transform cybersecurity from a protoscience to a fully developed science. This stream of literature corroborates the pressing need to transform cybersecurity from practical, ad hoc solutions into a more structured and empirical field. However, the current research landscape is marred by outdated and fragmented datasets that fail to capture the evolving dynamics of cyber threats, limiting both the scope and depth of the research they can support and their applicability to modern cybersecurity challenges [4].
