Big Data Track Overview
Big data and open source remain closely intertwined, and this year’s track will look at the processes, tools and frameworks making the analysis and management of large datasets possible. This track is absolutely packed year after year!
*Note: this track takes place on both Monday, October 28, and Tuesday, October 29, in room 302B (third floor) of the Convention Center.
See who’s speaking and what’s being covered below.
Thanks to the wonderful team at MongoDB for making this track possible.
Monday, October 28

Open-source alternatives to the Cloud Data Warehouse
Tanya Bragin, VP Product, ClickHouse

Big Data on a Small Budget: Scalable Data Visualization for the Rest of Us
Robert Gove, Senior Data Visualization Engineer, CrowdStrike

Elasticsearch Essentials: Data Loading with Python for Interplanetary Insights
Jessica Garson, Senior Developer Advocate, Elastic


Open Source Privacy-Preserving Metrics
Sarah Gran, VP of Brand & Donor Development & Brandon Pitman, Engineer and Technical Lead, Divvi Up


Equipping easy-to-use and scalable stream processing technologies on Kubernetes
Sidhant Kohli, Senior Software Engineer & Juanlu Yu, Senior Software Engineer, Intuit
Tuesday, October 29

Understanding Vector Databases
Nyah Macklin, Senior Developer Advocate, Couchbase

KIP-714: Keep your Kafka Clusters Close, and your Kafka Clients Closer
Ricardo Ferreira, Principal Developer Advocate, AWS

Columnar Storage: Redefining Data Management for the Modern Era
Zoe Steinkamp, Senior Developer Advocate, ClickHouse

Machine Learning Pipelines at Scale with Apache Beam
Danny McCormick, Senior Software Engineer, Google