DataMass Gdańsk Summit 2022
phone: +48 570 272 723
e-mail: kamil.piotrowski@evention.pl

DATAMASS
GDANSK
SUMMIT

29.09 – Workshop Day

30.09 – Conference Day

CLOUD
AGAINST DATA

Let’s discover Big Data and Machine Learning in the context of Cloud Solutions! We are coming back to Gdansk! Note that the event is on-site only and will not be streamed!

DataMass Summit is not just another conference – it is an event created with passion. The Summit is aimed at people who use the cloud in their daily work to solve Big Data, Data Science, Machine Learning and AI problems. The main idea of the conference is to promote knowledge and experience in designing and implementing tools for solving difficult and interesting challenges.

This year we are coming back after a long break forced by the pandemic, in a classic on-site form. And we have great news for you! We joined forces with GetInData and Evention (kudos to Adam Kawa for making it possible), who for years have been organizing the Big Data Technology Warsaw Summit, the largest Big Data conference in Poland. So for you, our event will be even more interesting and even stronger in terms of content.

We are planning two days. The first day is devoted to technical workshops conducted by practitioners. The second day includes conference presentations by renowned experts in Big Data, Data Science, Machine Learning and AI, all in the context of cloud solutions.

Do come and join us!

Check the previous edition of DataMass Gdansk Summit

Selection committee DataMass Gdansk Summit 2022:

[prelegenci name="2022-selection-comittee"]

speakers

[prelegenci name="2022-speaker"]

agenda

29.09.2022 – WORKSHOP DAY & EVENING MEETING

Independent Workshops

During DATAMASS there will be 3 parallel workshops (paid separately), each for a maximum of 25 participants

When/Where? September 29, Conference Center of the Museum of the Second World War, Gdańsk

From 9 a.m. to 4 p.m. (including lunch and coffee breaks), onsite

Build reliable data pipelines using Modern Data Stack in the cloud


DETAILS:

In this one-day workshop, you will learn how to create modern data transformation pipelines managed by dbt and orchestrated with Apache Airflow.


SESSION LEADERS:

Data Analyst / Analytics Engineer
GetInData
Data Engineer
GetInData

(Near) Real-time data processing in the cloud using Spark Structured Streaming and SparkML


DETAILS:

The subject of this workshop is real-time data analysis using Spark Streaming. We’ll cover how Spark streaming works and how it can be used in machine learning systems.


SESSION LEADERS:

Head of Data Science
DataMass

Machine Learning Operations (MLOps)


DETAILS:

In this one-day workshop, you will learn how to operationalize Machine Learning models using popular open-source tools, like Kedro and Kubeflow, and deploy them using cloud computing.


SESSION LEADERS:

Machine Learning Engineer
GetInData

EVENING MEETING
18:30 – 22:30

 


ZORBA RoofTop

Szafarnia 10 street, 80-755 Gdansk

The DATAMASS conference is not only about workshops and substantive presentations. It’s also a great time for networking.
That is why we invite you to an EVENING PARTY, during which, with good Greek food and drink, we will have the opportunity to get to know each other and talk to the group of participants, speakers and conference partners.
The perfect place for socializing will be the ZORBA RoofTop restaurant in Gdansk – a concept full of energy, music, good food and Greek aroma. The RoofTop is located on the 10th floor of the Marina Hotel in Gdansk’s Old Town. It is a cocktail bar with great views of the whole city of Gdansk, with drinks and food inspired by Greek cuisine. ZORBA RoofTop is the perfect place to organize unusual meetings, which are a great opportunity for networking.

MAP


30.09.2022 – CONFERENCE DAY

PLENARY SESSION

9:00 – 9:10
Plenary session
Opening
Head of Data Science
DataMass
CEO and Co-founder
GetInData
CEO, Meeting Designer
Evention

9:10 – 9:40
Plenary session
Data Infrastructure in a Multi-Cloud environment MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

In this talk we'll discuss the Data Engineering Platform Datadog has built in multiple clouds: why and how we built it, how it's used, and the challenges we've faced.

[/su_expand]
Director Of Engineering
Datadog

9:40 – 10:05
Plenary session
TBA

10:05 – 10:30
Plenary session
Bank Analytics in the Cloud Google Cloud
[su_expand more_text="Show more" less_text="Show less"]

If you're a bank and you move your analytics to the cloud, there are a few things to consider. We'll show you how we are doing this and why cloud and MLOps are the only way to go.

[/su_expand]
Director of SME Risk & Analytics
PKO Bank Polski

10:30 – 10:50
Plenary session
TBA

10:50 – 11:10

COFFEE BREAK

PARALLEL SESSION 1

PARALLEL SESSION 2

11:10 – 11:30
Parallel session
Accelerating Data Solution Migration to the Cloud MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

With the increasing adoption of cloud-native technologies, companies are struggling to migrate their legacy solutions due to insufficient resources, lack of time and knowledge. See how our revolutionary metadata-based platform, ADELE, helps customers overcome these obstacles with automated metadata harvesting and solution re-platforming capabilities.

[/su_expand]
Division Director
Adastra Slovakia
Parallel session
Cloud Data Ingestion Simplified MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

With the growing demand for real-time data in business, organizations need solutions that seamlessly extract data from many sources and ingest it into the cloud for further processing. Creating and managing data ingestion pipelines is a time-consuming and resource-intensive task. In this presentation you will learn about typical challenges and how they can be solved with DataLark, a modern data management platform developed by LeverX that combines batch and streaming data processing with data transformation capabilities.

[/su_expand]
Director, Software Solutions and Product Development
LeverX Group

11:35 – 11:55
Parallel session
How to process 33 billion events from set-top boxes in under 4 minutes Google Cloud
[su_expand more_text="Show more" less_text="Show less"]

We will talk about lessons learned from advanced analytics in a traditional enterprise:
- how to talk with C-level executives about the cloud
- how to ingest data for further processing
- how to run processing on hundreds of cores and spend only a few thousand a month
- which tools to use for data engineering in Google Cloud
- how to present results to the users

[/su_expand]
Enterprise Architect
Vectra
Parallel session
Data engineering at the scale of PepsiCo eCommerce, 3 years of experience MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

During the talk we will present the approach to data engineering at PepsiCo eCommerce, which scales to hundreds of DAGs/workflows and terabytes of data processed daily. This approach has progressed iteratively over the past 3 years, is used by 100+ engineers at PepsiCo and leverages 3 major cloud providers and several SaaS vendors. In particular, we will discuss the following points: data acquisition, modeling, transformation, data quality and lineage, and orchestration and scheduling.

[/su_expand]
Data Engineering Director
PepsiCo
eCommerce Engineering Head
PepsiCo

12:00 – 12:20
Parallel session
TBA
Parallel session
TBA

12:25 – 12:45
Parallel session
Don’t go with the flow. How did Ringier Axel Springer move its data to the cloud? AWS
[su_expand more_text="Show more" less_text="Show less"]

For a modern publisher, data is like... water. This is the story of how we moved our whole waterworks to the cloud: a story about moving our intakes, pipelines, pumps and sinks to a completely new environment, and about using cloud-native services to optimize product development while keeping costs under control. We migrated a Hadoop cluster of over 120 data nodes and a massive data stream processing infrastructure to native data services provided by AWS. Join me if you want to hear how such a task can be performed in a way other than just lift-and-shift.

[/su_expand]
Solution Product Manager
Ringier Axel Springer Poland
Parallel session
Medical Image Analysis using Auto ML Azure
[su_expand more_text="Show more" less_text="Show less"]

In this session, you will learn how Automated ML can be combined with transfer learning to boost data scientist productivity when building computer vision models trained on medical image data. We will see new capabilities in Azure Machine Learning’s AutoML related to image classification, object detection and segmentation.

[/su_expand]
Freelancer Data Scientist
WSB University

12:50 – 13:10
Plenary session
TBA
Plenary session
TBA

13:10 – 14:00

LUNCH

PLENARY SESSION

14:00 – 14:20
Plenary session
TBA

PARALLEL SESSION 1

PARALLEL SESSION 2

14:20 – 14:40
Parallel session
IoT as a data pipe between the physical world and the cloud AWS
[su_expand more_text="Show more" less_text="Show less"]

Important considerations when designing an IoT data pipe.

Data without metadata is noise without value.

How the nature of input data impacts the solution design.

Maximizing the throughput vs saving the battery life.

How to leverage the pre-processing and post-processing in data migration projects.

[/su_expand]
Consultant
Amazon Web Services
Parallel session
Peer beneath the surface of the Earth in the clouds MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

- Pros & cons of multicloud solution
- How to do online inference of Deep Learning models?
- Why should we use CDK in our MLOps solution?
- Should we avoid manual steps in the automatic Machine Learning pipeline?
- Where is the end of Machine Learning Pipeline?

[/su_expand]
Senior Data Scientist, CI/CD best practices evangelist, and a trainer.
SGPR.TECH

14:45 – 15:05
Parallel session
Running Apache Flink in any cloud environment MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

Apache Flink is a distributed stream processing engine with well-known large-scale users including Alibaba, Amazon, Netflix and Uber, among others. Running Flink workloads in cloud environments is gaining popularity, not only for proofs of concept but also for large-scale production environments.

This talk discusses a cloud-provider-agnostic approach for running Flink workloads using Kubernetes for the deployment and management of Flink jobs, Kafka as a message broker and the object store provided by your cloud for long-term storage. The talk also demonstrates running Flink jobs in a cloud environment via the Flink Kubernetes Operator.

[/su_expand]
Flink Tech Lead
Apple
Parallel session
Anomaly Detection in Network with the use of ML Pipeline in Vertex AI Google Cloud
[su_expand more_text="Show more" less_text="Show less"]

Have you ever wondered how AI technologies are used in the network area?

What kind of machine learning models do we use in anomaly detection?

How do we work with pipelines and Vertex AI?

Orange, as a telco operator, has a very large and wide telecommunication network based on servers, routers, and wires or antennas. Each day, millions of bytes flow through the network to enable customers to call somebody, use the internet or send a file. How do you maintain such a network? We will take you on a unique tour through our Predictive Network Maintenance project, starting from raw data and ending with an automated solution.

[/su_expand]
Data Engineer
Orange
Machine Learning Engineer
Orange

15:10 – 15:30
Plenary session
Data Platform - what does it take to be called a modern one? A new stack with well-known best practices MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

In this talk you'll learn how we (as GetInData) combined all of these state-of-the-art components and software engineering best practices in our Modern Data Platform to provide a scalable and user-friendly environment for data scientists and analytics engineers in AWS and GCP clouds.

[/su_expand]
Data Analyst / Architect
GetInData
Parallel session
4 ways to deploy an ML model on Amazon SageMaker AWS
[su_expand more_text="Show more" less_text="Show less"]

There are four distinct ways to deploy a machine learning model on AWS. Amazon SageMaker offers Batch Transform, Asynchronous Inference, Serverless Inference and Real-time Inference. Each method has its own use case and its own limits. In our talk we compare them, outline their pros and cons, and help you decide which one is the right fit for you.

[/su_expand]
Solutions Architect
Chaos Gears

15:30 – 16:15

COFFEE BREAK

PARALLEL SESSION 1

PARALLEL SESSION 2

16:15 – 16:35
Parallel session
Data Mesh concept, executed by Trino MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

Data Mesh enables every person in an organization to read and consume data produced by its software, greatly improving the cycle of discovering the data, learning about it, and envisioning new ways of utilizing it. All of that without the overhead of fragile ETL processes, monolithic data warehouses, or even highly sophisticated data lakes. Join me to hear about what Data Mesh is and how it can be implemented by Trino. Let's talk about how to finally enable people to easily discover, read and reason about data in your organization.

[/su_expand]
Agile programmer, leader, mentor
Starburst
Parallel session
From a machine learning competition to an enterprise analytics framework Google Cloud
[su_expand more_text="Show more" less_text="Show less"]

Not long ago, we had a chance to participate in a machine learning competition on the Kaggle platform. Usually, the goal of competing is to win, but hey - there’s only one winner among thousands of participants, so we tried to be smarter than that. We set up our own goals, just in case we somehow didn’t manage to be the best. And guess what - we weren’t the best, but we learned a ton of things about different data analysis and machine learning approaches, useful MLOps and cloud tools and data science project team management. Accomplishing these research goals led us to a rough piece of machinery that combined Google Cloud Platform, Kedro, MLFlow and various analytical algorithms to solve a specific business problem. After we realized that with just a little bit of polishing and structuring we could forge it into a really robust framework for tackling a wide range of other, generic problems, we rolled up our sleeves and got to work. In this presentation we would like to show you our idea for such a framework, which allows you to take a data sample from your client even before making an official commercial offer, pick some bricks that match the use case, adjust them a little, quickly prototype the solution and get back to your client with an empirically proven estimate of the analytical potential hidden in their data.

[/su_expand]
Senior Data Scientist
GetInData
Senior Data Scientist
GetInData

16:40 – 17:00
Plenary session
Engineering Risk Assessment - how to measure the risk in enterprise IT infrastructure MultiCloud
[su_expand more_text="Show more" less_text="Show less"]

Engineering risk assessment is a process of analysing potential threats and vulnerabilities to enterprise IT systems to establish what loss an organization might expect to incur if certain events happen. Its objective is to help achieve optimal security at a reasonable cost. This is especially important when introducing new software, such as cloud solutions, into an organization's infrastructure.

During this short talk, I will present measurement methods that are used to indicate risk in IT infrastructure.

Based on examples from the areas of cloud security, data management & governance, or incident management, I will show the challenges of establishing a risk profile.

[/su_expand]
Associate, IAM Secrets and Encryption Svc
Investment Banking Sector
Plenary session
Breaking the stormy antipattern - from bad design to cloud native on Beam Google Cloud

This is a story about a customer's implementation of a complex event processing system running in a very bad setup on Storm, and how we managed to build a clean design on Dataflow while maintaining the core requirements.

Strategic Cloud Engineer
Google

17:05 – 17:25
Plenary session
The road to world-class observability MultiCloud

In this talk, you will learn about the road to world-class observability of distributed and scalable systems. You will:
- Learn about various stages of monitoring maturity
- Discover at what stage you are and learn how to get to the next level
- Learn about the monitoring prioritization pyramid
- Get best practices and tools recommendations to help you on your observability improvement journey
- Understand how to scale your observability in sync with your company

Director of Engineering
Xapo
Plenary session
Spark vs. BigQuery vs. Trino: Shopify’s journey of SQL transformations at scale Google Cloud
[su_expand more_text="Show more" less_text="Show less"]

Cloud data warehousing technologies, such as BigQuery, have allowed companies to scale their analytics operations. But at what point does it make sense to buy vs. build? BigQuery has been a tremendous asset to Shopify, but we have had to reassess our relationship. We will walk through a case study of how Shopify is balancing the ability to move fast with our needs for cost optimization, transparency, access controls, and customization.

[/su_expand]
Data Developer
Shopify

17:25 – 17:35

COFFEE BREAK

PLENARY SESSION

17:35 – 18:05
Plenary session
Introduction to Causal Inference in the Ride-Hailing Business AWS
[su_expand more_text="Show more" less_text="Show less"]

Designing an experiment shows us how many constraints and limitations we have to deal with. Every business requires a different way of setting up experiments, and therefore different techniques for testing our solutions. Standard approaches in statistics, such as regression analysis, are concerned with quantifying how changes in X are associated with changes in Y. Unlike methods that are concerned with associations only, causal inference approaches can answer the question of why Y changes.

[/su_expand]
Data Scientist
Free Now

18:05 – 18:15
Plenary session
Summary and closing the meeting
Head of Data Science
DataMass
CEO and Co-founder
GetInData
CEO, Meeting Designer
Evention

Elevator Pitch

During DATAMASS there will also be short presentations in a "case study" format, held during a special coffee break designed to encourage networking. Work with us in line with Evention's motto, "time engaged": take part in the ELEVATOR PITCH and, in the spirit of "maximum substance in minimum time", convince participants of your idea / solution / project / technology in 7 minutes.
Stand out from other companies and solutions!

sponsors

PLATINUM

[organizacje_logotypy name="2022-platinum"]

GOLD

[organizacje_logotypy name="2022-gold"]

SILVER

[organizacje_logotypy name="2022-silver"]

PARTNERS

[organizacje_logotypy name="2022-silver"]

technologies we use

Organizers

General Platinum Partner

[organizacje_logotypy name="platinum-general-partner"]

Gold Strategic Partner

[organizacje_logotypy name="2022-gold"]

Silver Content Partners

[organizacje_logotypy name="silver-content-partners"]

Supporting Partners

[organizacje_logotypy name="supporting-partners"]

Patronage

[organizacje_logotypy name="patronage"]

Media Patronage

[organizacje_logotypy name="media-patronage-2022"]

Community Patronage

[organizacje_logotypy name="2022-community-patronage"]

tickets

Early Bird
Standard Pass
Last Call

ADDRESS:

Conference Center of the Museum of the Second World War

pl. Bartoszewskiego 1
80-862 Gdańsk

contact

DataMass strives to provide the best service possible with every contact!

BECOME A DATAMASS GDANSK SUMMIT PARTNER!

Mariola Rauzer

m: +48 509 622 541
e: mariola.rauzer@evention.pl

Dominika Opoka

m: +48 604 112 883
e: dominika.opoka@evention.pl

CONTACT FOR ORGANIZATIONAL MATTERS

Kamil Piotrowski

+48 570 272 723

kamil.piotrowski@evention.pl

CONTACT FOR PARTICIPANTS

Weronika Warpas

+48 570 611 811

weronika.warpas@evention.pl