Astronomer CTO Julian LaNeve unpacks the future of data orchestration, democratization, and Airflow’s role in driving organizational innovation.
In an exclusive interview, Julian LaNeve, Chief Technology Officer at Astronomer, delves into the delicate equilibrium between innovation and stability in technology decision-making. As the driving force behind Astronomer’s approach, LaNeve shares insights into navigating data orchestration trends, democratizing data, and the pivotal role played by Apache Airflow in driving innovation within organizations.
Explore the intricate dance of technological choices and learn how Astronomer strikes a balance between stability and innovation. The company’s open-source model offers a glimpse into the challenges it faces and the strategies it uses to address them. Join us in unraveling the future trends in data orchestration and workflow automation that are set to reshape the landscape of data-driven businesses.
Excerpts from the interview:
How do you navigate the balance between innovation and stability in technology decision-making?
Stability must underpin any technology we adopt or build and is our top priority. Our customers trust Astronomer to run mission-critical data pipelines, and maintaining that trust requires building a stable service. That said, creating and maintaining stability doesn’t mean you must sacrifice innovation. By conducting diligence upfront before introducing new features or technologies, we can understand the potential impact of the feature or technology on stability.
We conduct due diligence in several ways. First, we have an architecture forum comprising a few senior members of the R&D team who are responsible for reviewing new components that teams want to introduce to our stack. The forum provides guidance on how to think about stability so that we don’t regress. Second, we’ve built a culture around careful tool evaluation.
When there are proposals for new technologies or features, we always evaluate a few alternatives and explicitly call those out in design documents (including a description of why the proposed solution is better than the alternatives). Finally, we consistently reserve part of our R&D budget to pay down technology debt. If a component causes stability issues, we ensure there is a plan to address it and a time frame for executing that plan.
What future trends in data orchestration and workflow automation will impact data-driven businesses?
The biggest trend we see is a shift from data powering internal tooling and dashboards to powering customer-facing applications and business-critical processes. A few years ago, if your pipelines didn’t run successfully or on time, the data powering your internal dashboards wouldn’t be delivered, which was annoying but not mission-critical. Today, businesses rely on the data being delivered on time, every time. Whether it’s to power machine learning models that drive in-product experience or regulatory reporting that comes with hefty fines, the pipelines we see our customers running are becoming increasingly mission-critical.
As per Astronomer, how will democratizing data evolve and contribute to fostering innovation within organizations?
Now that companies have become very skilled at collecting and delivering data on time, they’re looking for more ways to extract the full value of that data. One way to approach this is to “democratize access to data,” which means businesses find ways to give less technical members of their teams access to use and analyze data.
As organizations become even more data-driven, democratizing access to the data will become more important and easier to accomplish. Generative AI and large language models (LLMs) will help significantly here. The technology acts as a natural bridge between technical interfaces and non-technical users. Rather than going directly to technical interfaces, a business user can work in natural language with an LLM to access and work with data.
How does Astronomer align its goals with utilizing Apache Airflow for data workflow management, and what are the advantages and challenges of this choice?
Astronomer offers our customers a product named Astro, a unified data platform built on top of Apache Airflow. Customers can run their existing Airflow pipelines on Astro at a larger scale, more efficiently, and with a seamless user experience. For that reason, Apache Airflow is critical to Astronomer and Astro. We have invested quite a bit in the open-source project; we were heavily involved in the Airflow 2.0 release, and at this point, the majority of the code in the project was written by Astronomer staff.
Any business with an “open source core” model faces unique challenges. At Astronomer, Apache Airflow’s continued success is critical: without it, we have a smaller market for our product.
How does Astronomer adjust to edge computing demands, and what’s the expected impact on the data ecosystem?
The Astro platform is built on top of Apache Airflow, making it easy for users to adopt and making our compute options very flexible. Customers can centralize their orchestration logic (i.e., the order in which pipelines and tasks should be run) on Astro and choose the right compute model for their use case, including edge computing. Instead of running workloads directly on our compute, customers can set up edge computing technologies and let Airflow interact with those.
Given its open-source approach, how does Astronomer balance innovation, agility, stability, and security in its critical research infrastructure?
Innovation and agility are key pillars of the Astro platform, but stability and security underpin everything we do and are at the center of our business. The Astro architecture is secure by default, using encryption in transit, encryption at rest, strong cryptographic protocols, authentication, and role-based access control for authorization to data pipelines, with a host of flexible and secure connectivity options for critical data sources.
How do you measure the success of Astronomer’s technology?
Technology success can be difficult to measure, but I think about it from a few different angles.
We measure the success of our technology primarily by our ability to support our customers. The product we build (and the technology behind it) is ultimately designed to give our customers a great experience, and there are very defined ways to measure customer success. We also look at our ability to forecast and meet our customers’ future demands. For example, we support our customers as they explore and adopt generative AI technologies. As the data landscape becomes increasingly complex, organizations and data teams rely on the Astro platform to manage and modernize data pipelines, a critical element of an effective AI strategy. We’ve released a set of integrations between Airflow and more than five different generative AI tools to ensure our customers successfully adopt and scale new use cases. As of September 2023, Astro’s year-over-year platform usage increased by 1,400%.
Outside of our ability to support our customers, we also measure success in the open-source project Apache Airflow. We’re extremely motivated to continue establishing Airflow as a modern solution to data orchestration challenges because it’s the basis of our platform, and we’re proud to be such a large driving force behind the project. We look at community adoption and sentiment towards the project to measure the success of our contributions there, and we’re excited about the momentum. Apache Airflow has over 73 million downloads and more than 2,500 contributors. Last week, the community released Airflow 2.8, which contains over 20 new features, 60 improvements, and 50 bug fixes.
Finally, our ability to attract and retain talent also helps us measure the success of our technology. Engineers ultimately want to work on interesting projects with innovative technology, so our ability to offer that environment, and thus attract and retain talent, is another good proxy.