Manage episode 229788369 series 1433319
Data engineering touches every area of an organization.
Engineers need a data platform to build search indexes and microservices. Data scientists need data pipelines to build machine learning models. Business analysts need flexible dashboards to understand the trends and customer use for a product.
Max Beauchemin is a data engineer who has worked at Airbnb, Lyft, and Facebook. He’s the creator of two successful open source projects: Apache Airflow and Apache Superset. In a previous show, Max discussed data engineering at Airbnb, and the usage of Airflow. In today’s show, Max discusses the engineering of Apache Superset.
Superset is an open source business intelligence web application. Superset allows users to create visualizations, slice and dice their data, and query it. Superset integrates with Druid, a database that supports exploratory, OLAP-style workloads.
One reason Superset is distinctive is that it is a full open source application. Many open source projects are tools like databases, command line tools, and web frameworks. Superset is an open source application that can be used by individuals who are not developers–so the audience is wider than the typical open source tool built for engineers.
Max joins the show to talk about his experience as a data engineer at Airbnb and Lyft, and the open source projects he has started.