29.09 - Workshop Day
30.09 - Conference Day
Fourth edition of the conference
– First time together with Evention and GetInData
Conference for people who use the cloud in their daily work to solve Big Data,
Data Science, Machine Learning and AI problems
Community of dedicated experts will help us share knowledge and exchange
experience in shaping scalable and distributed computing solutions
DataMass Summit is not just another conference – it is an event created with passion. The Summit is aimed at people who use the cloud in their daily work to solve Big Data, Data Science, Machine Learning and AI problems. The main idea of the conference is to promote knowledge and experience in designing and implementing tools for solving difficult and interesting challenges.
This year we came back in a classic on-site form after a long break forced by the pandemic. And we have great news for you! We have joined forces with GetInData and Evention (kudos to Adam Kawa for making it possible), who for years have been organizing the Big Data Technology Warsaw Summit, the largest Big Data conference in Poland. So for you, our event will be even more interesting and even stronger in terms of content.
We planned two days. The first day was devoted to technical workshops conducted by practitioners. The second day included conference presentations by renowned experts in Big Data, Data Science, Machine Learning and AI, all in the context of cloud solutions.
Do come and join us!
Selection committee DataMass Gdansk Summit 2022:
During DATAMASS there will be 3 parallel workshops (additionally payable), each for a maximum of 25 participants
When/Where? September 29, Conference Center of Museum of The Second World War, Gdańsk
From 9 a.m. to 4 p.m. (including lunch and coffee breaks), onsite
DETAILS:
In this one-day workshop, you will learn how to create modern data transformation pipelines managed by dbt and orchestrated with Apache Airflow.
SESSION LEADER:
DETAILS:
The subject of this workshop is real-time data analysis using Spark Streaming. We'll cover how Spark Streaming works and how it can be used in machine learning systems.
SESSION LEADERS:
DETAILS:
In this one-day workshop, you will learn how to operationalize Machine Learning models using popular open-source tools, like Kedro and Kubeflow, and deploy them using cloud computing.
SESSION LEADERS:
ZORBA RoofTop
Szafarnia 10 street, 80-755 Gdansk
The DATAMASS conference is not only about workshops and substantive presentations. It's also a great time for networking. That is why we invite you to an EVENING PARTY, during which, with good Greek food and drinks, we will have the opportunity to get to know each other and talk to the group of participants, speakers and conference partners.
The perfect place for integration will be the ZORBA RoofTop restaurant in Gdansk - a concept full of energy, music, good food and Greek aroma.
Opening | CEO DataMass Head of Data Science DataMass CEO and Co-founder GetInData CEO, Meeting Designer Evention |
Data Infrastructure in a Multi-Cloud environment MultiCloud | In this talk we'll present the Data Engineering Platform that Datadog has built across multiple clouds. We'll discuss why and how it was built, how it's used, and the challenges we've faced. | Director Of Engineering Datadog |
PepsiCo eCommerce and the Global impact of data MultiCloud | Data Engineering Director PepsiCo |
Bank Analytics in the Cloud Google Cloud | Director of SME Risk & Analytics PKO Bank Polski |
Let’s build our own Cloud Data Platform MultiCloud | We love to analyze data. Often, we have to go to many different places to do one analysis that will help us make the correct business decision. Many sources exist because we match technology to varying types of data, different processing speeds, and costs. In addition, our organizations are changing rapidly and need more computing power and data storage space. How do you keep up with all this? Let's try to build an innovative, scalable data platform. | Senior Sales Engineer Snowflake |
Accelerating Data Solution Migration to the Cloud MultiCloud | With the increasing adoption of cloud-native technologies, companies are struggling to migrate their legacy solutions due to insufficient resources and a lack of time and knowledge. See how our revolutionary metadata-based platform – ADELE – helps customers overcome these obstacles with automated metadata harvesting and solution re-platforming capabilities. | Division Director Adastra Slovakia |
Cloud Data Ingestion Simplified MultiCloud | With the growing demand for real-time data in business, organizations need solutions that seamlessly extract data from many sources and ingest it into the cloud for further processing. Creating and managing data ingestion pipelines is a time-consuming and resource-intensive task. In this presentation you will learn about typical challenges and how they can be solved with DataLark, a modern data management platform developed by LeverX that effectively combines batch and streaming data processing with data transformation capabilities. | Director, Software Solutions and Product Development LeverX Group |
How to process 33bln events from set-top boxes in under 4 minutes Google Cloud | We will talk about lessons learned from advanced analytics in a traditional enterprise. | Enterprise Architect Vectra |
Data engineering at the scale of PepsiCo eCommerce, 3 years of experience MultiCloud | During the talk we will present the approach to data engineering at PepsiCo eCommerce, which scales to hundreds of DAGs/workflows and terabytes of data processed daily. This approach has evolved iteratively over the past 3 years, is used by 100+ engineers at PepsiCo, and leverages 3 major cloud providers and several SaaS vendors. In particular, we will discuss the following points: data acquisition, modeling, transformation, data quality and lineage, and orchestration and scheduling. | Data Engineering Director PepsiCo eCommerce Engineering Head PepsiCo |
Don’t go with the flow. How did Ringier Axel Springer move its data to the cloud? AWS | For a modern publisher, data is like... water. This is the story of how we moved our whole waterworks to the cloud. A story about moving our intakes, pipelines, pumps and sinks to a completely new environment. A story about using cloud-native services to optimize product development while keeping costs under control. We migrated a Hadoop cluster of over 120 datanodes and a massive data stream processing infrastructure to native data services provided by AWS. Join me if you want to hear how such a task can be performed in a way other than just lift-and-shift. | Solution Product Manager Ringier Axel Springer Poland |
Medical Image Analysis using Auto ML Azure | In this session, you will learn how Automated ML can be combined with transfer learning to boost data scientist productivity when building computer vision models trained on medical image data. We will see new capabilities in Azure Machine Learning’s AutoML related to image classification, object detection and segmentation. | Freelancer Data Scientist WSB University |
IoT as a data pipe between the physical world and the cloud AWS | Important considerations when designing an IoT data pipe. Data without metadata is noise without value. How the nature of input data impacts the solution design. Maximizing throughput vs. saving battery life. How to leverage pre-processing and post-processing in data migration projects. | Consultant Amazon Web Services |
Peer beneath the surface of the Earth in the clouds MultiCloud | Senior Data Scientist, CI/CD best practices evangelist, and a trainer. SGPR.TECH |
Running Apache Flink in any cloud environment MultiCloud | Apache Flink is a distributed stream processing engine whose large-scale users include Alibaba, Amazon, Netflix and Uber, among others. Running Flink workloads in cloud environments is gaining popularity, not only for proofs of concept but also for large-scale production environments. This talk discusses a cloud-provider-agnostic approach for running Flink workloads, using Kubernetes for the deployment and management of Flink jobs, Kafka as a message broker and the object store provided by your cloud for long-term storage. The talk also demonstrates running Flink jobs in a cloud environment via the Flink Kubernetes Operator. | Flink Tech Lead Apple |
Anomaly Detection in Network with the use of ML Pipeline in Vertex AI Google Cloud | Have you ever wondered how AI technologies are used in the network area? What kind of machine learning models do we use in anomaly detection? How do we work with pipelines and Vertex AI? Orange, as a telco operator, has a very large and extensive telecommunication network based on servers, routers, and wires or antennas. Each day millions of bytes flow through the network to enable customers to call somebody, use the internet or send a file. How do you maintain such a network? We will take you on a unique tour through our Predictive Network Maintenance project, starting from raw data and ending with an automated solution. | Data Engineer Orange Machine Learning Engineer Orange |
From first contact to a full charge... How we built a Modern Data Platform in 4 months for a FinTech scale-up. MultiCloud | During this talk you'll learn how Volt.io, together with GetInData and DataEdge, architected a batteries-included Modern Data Platform using a combination of state-of-the-art managed cloud services, R&D plugins and software engineering best practices to provide a scalable and self-service environment for analytics engineers, data analysts and business users. | Entrepreneur, Developer, Designer Volt Chief Data Architect GetInData Senior Partner Data Edge |
4 ways to deploy an ML model on Amazon SageMaker AWS | There are four distinct ways to deploy a machine learning model in AWS. Amazon SageMaker offers Batch Transform, Asynchronous Inference, Serverless Inference and Real-time Inference. Each method has its own use case and its own limits. In our talk we compare them, outline their pros and cons, and help you decide which one is the right fit for you. | Solutions Architect Chaos Gears |
Data Mesh concept, executed by Trino MultiCloud | Data Mesh enables every person in an organization to read and consume data produced by its software, greatly improving the cycle of discovering the data, learning about it, and envisioning new ways of utilizing it. All of that without the overhead of fragile ETL processes, monolithic data warehouses, or even highly sophisticated data lakes. Join me to hear about what Data Mesh is and how it can be implemented with Trino. Let's talk about how to finally enable people to easily discover, read and reason about data in your organization. | Agile programmer, leader, mentor Starburst |
From a machine learning competition to an enterprise analytics framework Google Cloud | Not long ago, we had a chance to participate in a machine learning competition on the Kaggle platform. Usually, the goal of competing is to win, but hey - there’s only one winner among thousands of participants, so we tried to be smarter than that. We set our own goals, just in case we somehow didn’t manage to be the best. And guess what - we weren’t the best, but we learned a ton of things about different data analysis and machine learning approaches, useful MLOps and cloud tools, and data science project team management. Accomplishing these research goals led us to a rough piece of machinery that combined Google Cloud Platform, Kedro, MLFlow and various analytical algorithms to solve a specific business problem. After we realized that with just a little bit of polishing and structuring we could forge it into a really robust framework for tackling a wide range of other, generic problems, we rolled up our sleeves and got to work. In this presentation we would like to show you our idea for such a framework, which allows you to take a data sample from your client even before making an official commercial offer, pick some bricks that match the use case, adjust them a little, quickly prototype the solution and get back to your client with an empirically proven estimate of the analytical potential hidden in their data. | Senior Data Scientist GetInData Senior Data Scientist GetInData |
Engineering Risk Assessment - how to measure the risk in enterprise IT infrastructure MultiCloud | Engineering risk assessment is the process of analysing potential threats and vulnerabilities to enterprise IT systems to establish what loss an organization might expect to incur if certain events happen. Its objective is to help achieve optimal security at a reasonable cost. This is especially important when introducing new software, such as cloud solutions, into an organization's infrastructure. During this short talk, I will present measurement methods that are used to indicate risk in IT infrastructure. Based on examples in the areas of cloud security, data management & governance, and incident management, I will show the challenges of establishing a risk profile. | Associate, IAM Secrets and Encryption Svc Investment Banking Sector |
Breaking the stormy antipattern - from bad design to cloud native on Beam Google Cloud | This is a story about a customer's implementation of a complex event processing system running in a very bad setup on Storm, and how we managed to build a clean design on Dataflow while maintaining the core requirements. | Strategic Cloud Engineer |
The road to world-class observability MultiCloud | In this talk, you will learn about the road to world-class observability of distributed and scalable systems. | Director of Engineering Xapo |
Spark vs. BigQuery vs. Trino: Shopify’s journey of SQL transformations at scale Google Cloud | Cloud data warehousing technologies, such as BigQuery, have allowed companies to scale their analytics operations. But at what point does it make sense to buy versus build? BigQuery has been a tremendous asset to Shopify, but we have had to reassess our relationship. We will walk through a case study of how Shopify is balancing the ability to move fast with our needs for cost optimization, transparency, access controls, and customization. | Data Developer Shopify |
Introduction to Causal Inference in the Ride-Hailing Business AWS | Designing an experiment shows us how many constraints and limitations we have to deal with. Every business requires a different way of setting up experiments, and therefore different techniques to test our solutions. Standard approaches in statistics, such as regression analysis, are concerned with quantifying how changes in X are associated with changes in Y. Unlike methods that are concerned with associations only, causal inference approaches can answer the question of why Y changes. | Data Scientist Free Now |
Summary and closing the meeting | CEO DataMass Head of Data Science DataMass CEO and Co-founder GetInData CEO, Meeting Designer Evention |
DataMass strives to provide the best service possible with every contact!
BECOME A DATAMASS GDANSK SUMMIT PARTNER!
m: +48 509 622 541
e: mariola.rauzer@evention.pl
m: +48 604 112 883
e: dominika.opoka@evention.pl
CONTACT FOR ORGANIZATIONAL MATTERS
Kamil Piotrowski
+48 570 272 723
kamil.piotrowski@evention.pl
CONTACT FOR PARTICIPANTS
Weronika Warpas
+48 570 611 811
weronika.warpas@evention.pl
ADDRESS:
Conference Center of Museum of The Second World War
pl. Bartoszewskiego 1
80-862 Gdańsk