Tuesday, July 16, 2024

Databricks Summit 2024: AI Takes Center Stage with Focus on Democratizing Data


Databricks Summit highlights AI innovation, open-sourcing Unity Catalog, and Databricks AI/BI for self-service analytics. Learn more about key announcements!

Databricks’ annual summit has always been a party for data ecosystem stakeholders. The company shares new technologies, partnerships, and developments that make working with data assets – whether structured or unstructured – easier than ever. This year, the summit saw the same party continue, albeit with one major (and expected) shift: a focus on AI.

In his keynote, CEO Ali Ghodsi shared several innovations at the intersection of data and AI as part of the company’s broader effort to help teams make the most of their governed datasets on the Databricks Data Intelligence Platform. This included upgrades to Mosaic AI, the company’s platform for AI development, a new model for image generation, and a generative AI-driven offering for better and faster data analytics.


Below is a rundown of all major announcements:

Unity Catalog goes open-source

Taking on Snowflake’s Polaris Catalog, Databricks open-sourced Unity Catalog under an Apache 2.0 license, releasing an OpenAPI specification, a server, and clients. Other firms can now use the underlying architecture and code to set up their own catalogs, which support data in any format, including Iceberg and Delta/Hudi via UniForm, and interoperate with all major cloud platforms and compute engines. The code for the catalog was published live on stage, while Polaris Catalog is expected to go open source over the next 90 days.
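Because the catalog ships with an OpenAPI specification and a server, a client can in principle be a few lines of HTTP. The sketch below is illustrative only: it assumes a locally running open-source Unity Catalog server, and the host, port, endpoint path, and response shape are assumptions that should be checked against the project's published specification. The helper names are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: a local open-source Unity Catalog server. The host, port, and
# path below are illustrative; verify them against the OpenAPI specification.
BASE_URL = "http://localhost:8080"
CATALOGS_PATH = "/api/2.1/unity-catalog/catalogs"


def catalogs_url(base_url: str = BASE_URL) -> str:
    """Build the URL of the (assumed) list-catalogs endpoint."""
    return urljoin(base_url, CATALOGS_PATH)


def parse_catalog_names(payload: str) -> list[str]:
    """Extract names from a JSON body shaped like {"catalogs": [{"name": ...}]}."""
    body = json.loads(payload)
    return [entry["name"] for entry in body.get("catalogs", [])]


def list_catalogs(base_url: str = BASE_URL) -> list[str]:
    """Fetch catalog names over HTTP; this call needs a running server."""
    with urlopen(catalogs_url(base_url)) as response:
        return parse_catalog_names(response.read().decode("utf-8"))
```

Keeping URL construction and response parsing separate from the network call means the same parsing logic can be reused with any compute engine or tool that speaks the catalog's REST interface.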

Mosaic AI gets new tools for production-grade compound AI systems

Mosaic AI, the company’s suite of tools for building AI applications, got a major upgrade to help teams build trusted, production-grade compound AI systems. The upgrade includes a new Mosaic AI Model Training product, an AI Agent framework, an Evaluation framework, an AI Tools Catalog, and an AI Gateway for governance and trust. All offerings except the AI Tools Catalog are in public preview starting today.

New text-to-image model for enterprises

Databricks also announced the private preview launch of Shutterstock ImageAI, a text-to-image generative AI model that provides enterprises with high-fidelity, trusted images for different business use cases. The model was pre-trained with Mosaic AI, using Shutterstock’s trusted image collection.

It is live on Shutterstock’s image generator and will be available for fine-tuning via Mosaic AI and for application integration via API.

Databricks AI/BI for intelligent analytics

For enterprises looking to democratize access to analytics and insights, Databricks announced Databricks AI/BI, a compound AI system that sits atop the Databricks Data Intelligence Platform. Its two experiences, Dashboards and Genie, draw on an ensemble of AI agents to reason about business questions and generate useful natural language answers and visualizations.

Each agent is responsible for a narrow but important task, such as planning, SQL generation, explanation, visualization, and result certification. Other components, such as a response ranking subsystem and a vector index, further support them. The offering is for all Databricks SQL Pro and Serverless customers, with Dashboards generally available and Genie in public preview starting today.
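The division of labor among narrow agents can be pictured with a toy pipeline. This is not Databricks' implementation: every function, field, and table name below is hypothetical, and each "agent" is a plain function standing in for a model call. It only shows how planning, SQL generation, result certification, and explanation might be chained.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    """Shared state passed from one hypothetical agent to the next."""
    question: str
    table: str
    plan: list[str] = field(default_factory=list)
    sql: str = ""
    certified: bool = False
    explanation: str = ""


def planning_agent(answer: Answer) -> Answer:
    # Stand-in for a model that breaks the question into steps.
    answer.plan = ["pick table", "aggregate by region", "present totals"]
    return answer


def sql_agent(answer: Answer) -> Answer:
    # Stand-in for SQL generation; the query shape is fixed for illustration.
    answer.sql = (
        f"SELECT region, SUM(amount) AS total FROM {answer.table} GROUP BY region"
    )
    return answer


def certification_agent(answer: Answer, trusted_tables: set[str]) -> Answer:
    # Mark the result as certified only if it reads from a trusted table.
    answer.certified = answer.table in trusted_tables
    return answer


def explanation_agent(answer: Answer) -> Answer:
    status = "certified" if answer.certified else "uncertified"
    answer.explanation = (
        f"Answered '{answer.question}' in {len(answer.plan)} steps ({status})."
    )
    return answer


def run_pipeline(question: str, table: str, trusted_tables: set[str]) -> Answer:
    """Run the agents in a fixed order and return the accumulated answer."""
    answer = Answer(question=question, table=table)
    answer = planning_agent(answer)
    answer = sql_agent(answer)
    answer = certification_agent(answer, trusted_tables)
    return explanation_agent(answer)
```

Calling `run_pipeline("Total sales by region?", "sales", {"sales"})` yields an answer whose `certified` flag is true, while the same question against an untrusted table would come back uncertified. A real system would add the ranking and vector-index components the article mentions.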

Databricks LakeFlow for simplified data engineering

In addition to AI/BI, Databricks debuted LakeFlow, a single experience built atop its Data Intelligence Platform to unify and simplify all aspects of data engineering, from data ingestion to transformation and orchestration.


Building and maintaining data pipelines has long required stitching together complex tools and integrations, and LakeFlow aims to remove that burden. The offering ingests data from different sources and then automates pipeline deployment, operation, and monitoring, with built-in support for CI/CD and quality checks at scale.

It is yet to enter preview, although Databricks has opened a waitlist where users can sign up for early access.

Partnerships with Nvidia and Gretel

Finally, Databricks announced major partnerships with Nvidia and Gretel.

The partnership with Nvidia focuses on adding native support for CUDA-accelerated computing in Databricks’ next-generation vectorized query engine, Photon, to deliver improved speed and efficiency when handling data warehousing and analytics workloads. Meanwhile, the engagement with Gretel makes Gretel an ISV technology partner that provides high-quality synthetic datasets for building and customizing machine learning models on Databricks’ platform.
