This workshop covers real-time data analysis with Spark Streaming. We'll look at how Spark Streaming works and how it can be used in machine learning systems, with a focus on building models for classification and clustering. Most of our time will go to the main application: analyzing network traffic to detect threats in computer networks.
Target Audience
Data analysts and data scientists interested in real-time data processing using Spark for application to machine learning systems.
Requirements
Some experience with Python and basic knowledge of cloud computing and machine learning concepts. You need a laptop with internet access. We will work in the Databricks cloud environment.
Participant’s ROI
Practical knowledge of building data processing systems using Spark Streaming.
Practical experience building machine learning models for classification and clustering.
Application of the techniques learned to network packet analysis, with the goal of improving network security.
Training Materials
All participants will receive training materials as PDF files: slides covering the theory and an exercise manual with a detailed description of every exercise. During the workshop, the exercises can be performed on the Databricks platform.
Time Box
This is a one-day event (9:00 AM - 4:00 PM). We will schedule breaks between sessions.
Agenda
Instructor: