Data Swamps to Data Lakes – Visualizing the Void of Security Audit Data

November 18, 2020 – 2:15PM EST

In this talk, we will discuss the detection of three well-defined security problems (adversarial user behavior, lateral movement, and insider threats) using a relatively untapped dataset: shell and session commands. We’ll discuss the machine learning (ML) techniques needed to analyze this data, present key research findings, and describe the effects of bias and how to mitigate it to achieve higher accuracy. Additionally, we will explore techniques for safeguarding ML models based on this data.

The presentation will also outline a number of the tools used to develop these findings, including methods for analyzing and visualizing massive datasets of billions of Linux audit events. Finally, we’ll cover advances in ML that can be leveraged to extract meaning from the data thrown into the lake.

Learning Objectives

  • Understand how command-line data can be used for threat detection within cloud-scale Linux environments.
  • Apply data science and machine learning against Linux audit data to assist in a variety of security analytics tasks.
  • Understand and account for various kinds of bias in security data.

Jake King

CEO & Cofounder, Cmd

You need to purchase your pass and register for Security Congress before you can save your spot to attend this session.