| Abstract | The dynamic nature of cloud environments has amplified the challenge of defending against Advanced Persistent Threats (APTs), which exploit the complexity and interconnectedness of modern infrastructures. Existing datasets do not adequately address the requirements of cloud-native systems, particularly those that rely on system provenance graphs for comprehensive analysis. In this work, we present CloudAPT, the first dataset to use system provenance graphs to capture APT behaviors in Kubernetes-based cloud environments. The dataset spans eight days and covers the complete APT lifecycle (reconnaissance, initial compromise, privilege escalation, lateral movement, data exfiltration, and covering tracks) while integrating realistic, human-driven user interactions. Centralizing activities on a single worker VM keeps data collection granular and transparent, avoids fragmentation, and provides a holistic view of attacker strategies and system responses. CloudAPT includes provenance graph data, cluster logs, and application-level data, offering detailed insight into interactions within cloud-native systems. The dataset serves as a foundational resource for developing and benchmarking detection mechanisms and security solutions tailored to cloud environments. |
|---|---|