Mistral AI and Snowflake

Snowflake Leverages Mistral AI's LLMs to Enhance Self-Service Analytics

Beyond basic queries: Achieving 2x greater accuracy with text-to-SQL on Snowflake

Snowflake is the AI Data Cloud company that helps companies get more value from their data, processing billions of queries each day across thousands of customers to drive important business insights. Central to the success of Cortex Analyst, its AI-powered assistant for structured data, was a partnership with Mistral AI and the integration of two LLMs, Mistral Large and Codestral, selected for their excellent natural language understanding and SQL generation ability.

By leveraging Mistral AI's LLMs, Snowflake Cortex Analyst achieved:

2x higher accuracy compared to state-of-the-art LLMs

14% higher accuracy than another competing solution

90% or higher accuracy consistently across customer evaluations

This collaboration has set a new standard for user experience in generative AI SQL products, making it easier for non-technical team members to tap into self-service analytics fast.

Background

The need for LLMs to help interpret data is growing as customers look for access to insights in both structured and unstructured data. The Snowflake development team needed a solution for Cortex Analyst that could generate SQL with expertise, while also providing conversational skills for high-quality natural language interactions. Given the importance of accuracy to data teams and business users who rely on query results to inform critical business decisions, the solution needed to be state-of-the-art and exceed existing benchmark results.

“With Mistral Large and Codestral, Snowflake is delivering cutting-edge text-to-SQL capabilities for complex and nuanced enterprise data,” said Yusuf Ozuysal, Director of Engineering, Snowflake.

SQL: A barrier to quick insight

For anyone frequently writing SQL queries in an enterprise environment, the learning curve of finding the right schema and understanding the nuance of complex datasets can be steep. To further complicate matters, SQL queries can be subtle and ambiguous, and the “correct” answer may be written more than one way. Writing and optimizing SQL queries can be complex and time-consuming for non-technical users, making a seamless and intuitive user experience crucial to driving adoption and satisfaction among a diverse user base.

To provide faster access to insights, Snowflake wanted to enhance Cortex Analyst with highly accurate text-to-SQL capabilities that could handle complex datasets, simplify the learning curve for schema identification, and provide an intuitive experience for non-technical users. Most critically, Snowflake needed to ensure state-of-the-art accuracy to support critical business decisions.

Codestral and Mistral Large: Delivering accurate, conversational responses

Snowflake needed to generate a SQL query, evaluate that query for accuracy and fix any potential issues, and then deliver the query and relevant information conversationally to business users. Snowflake turned to two Mistral models as potential answers: Codestral, an open-weight model explicitly tailored for code generation tasks, and Mistral Large, for reasoning, language generation, and code generation.
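The generate, evaluate, and fix cycle described above can be sketched as a small retry loop. This is a minimal illustration, not Snowflake's implementation: the `generate` callable is a hypothetical stand-in for an LLM call, and SQLite's `EXPLAIN` is assumed here as a cheap way to check that a query compiles against the schema without running it.

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical signature: (question, previous_sql, error_message) -> new SQL.
Generator = Callable[[str, Optional[str], Optional[str]], str]


def check_sql(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return None if the query compiles against the schema, else the error text."""
    try:
        # EXPLAIN prepares the statement without executing the query itself.
        conn.execute(f"EXPLAIN {sql}")
        return None
    except sqlite3.Error as exc:
        return str(exc)


def generate_with_repair(
    conn: sqlite3.Connection,
    question: str,
    generate: Generator,
    max_attempts: int = 3,
) -> str:
    """Generate SQL, evaluate it, and feed any error back for another attempt."""
    sql: Optional[str] = None
    error: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate(question, sql, error)
        error = check_sql(conn, sql)
        if error is None:
            return sql
    raise ValueError(f"no valid SQL after {max_attempts} attempts: {error}")
```

A compile check only catches errors such as unknown tables or columns; whether the query answers the question correctly still has to be judged separately.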

For robust query generation, Snowflake uses a two-step approach: first, multiple agents each generate a candidate SQL query; then another agent synthesizes the final SQL from those expert responses. Snowflake uses Mistral Large and a fine-tuned Codestral for the initial SQL generation and uses Codestral for synthesizing the final SQL response.
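The two-step approach can be sketched as follows. All names here are hypothetical: the generators and the synthesizer stand in for model calls, and `majority_vote` is only a simple placeholder for a synthesizer, not the LLM-based synthesis the article describes.

```python
from collections import Counter
from typing import Callable, Sequence

Generator = Callable[[str], str]                   # question -> candidate SQL
Synthesizer = Callable[[str, Sequence[str]], str]  # (question, candidates) -> final SQL


def two_step_sql(
    question: str,
    generators: Sequence[Generator],
    synthesize: Synthesizer,
) -> str:
    """Step 1: collect one candidate per generator. Step 2: synthesize a final query."""
    if not generators:
        raise ValueError("at least one generator is required")
    candidates = [generate(question) for generate in generators]
    return synthesize(question, candidates)


def majority_vote(question: str, candidates: Sequence[str]) -> str:
    """Placeholder synthesizer: return the most frequent candidate."""
    return Counter(candidates).most_common(1)[0][0]
```

In a real system each generator would wrap a different model or prompt, and the synthesizer would itself be a model that reads all candidates before writing the final query.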

Cortex Analyst also needed to present queries and important contextual information conversationally using natural language that any user, regardless of their technical expertise, could understand. An important consideration was identifying ambiguous or unanswerable questions given available data and alerting the user or suggesting alternatives.

“Mistral AI was the perfect fit for text-to-SQL generation,” said Yusuf Ozuysal, Director of Engineering, Snowflake. “Our goal was to exceed accuracy when compared to existing benchmarks. We tried several other models, but even with fine-tuning, they did not meet the required execution accuracy.”

Mistral Large was critical in providing both the SQL knowledge and conversational quality. The additional explanation of how the query was generated, together with reference documentation, gives users confidence in the output and the ability to research more when needed.

Why Mistral AI?

Snowflake selected Mistral Large and Codestral models for their:

Superior natural language understanding and SQL generation capabilities

Strong performance in both code generation and reasoning tasks

Ability to provide high-quality conversational explanations

Competitive advantage in SQL expertise compared to other models

Results and Impact: Setting New Standards in Accuracy and User Experience

The results were impressive. Based on internal evaluation sets, using Mistral Large and Codestral, Cortex Analyst is consistently close to 2x more accurate than single-shot SQL generation from state-of-the-art LLMs and delivers approximately 14% higher accuracy than another competing text-to-SQL solution in the market. Additionally, Cortex Analyst achieved ~90% or higher accuracy reliably across customer evaluations as well as in benchmark tests, which is critical for enterprise use cases.
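The article does not say how these accuracy figures were computed. One common way to score text-to-SQL systems is execution accuracy: a generated query counts as correct when it returns the same rows as a reference query, which tolerates the fact that a correct query can be written more than one way. The sketch below assumes that definition and uses SQLite as a stand-in database; the function names are hypothetical.

```python
import sqlite3
from collections import Counter
from typing import Sequence, Tuple


def same_result(conn: sqlite3.Connection, predicted_sql: str, reference_sql: str) -> bool:
    """True if both queries return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to run counts as incorrect
    reference = conn.execute(reference_sql).fetchall()
    return Counter(predicted) == Counter(reference)


def execution_accuracy(
    conn: sqlite3.Connection,
    pairs: Sequence[Tuple[str, str]],
) -> float:
    """Fraction of (predicted, reference) pairs whose results match."""
    if not pairs:
        raise ValueError("at least one query pair is required")
    hits = sum(same_result(conn, predicted, reference) for predicted, reference in pairs)
    return hits / len(pairs)
```

Comparing results rather than SQL text means two differently written but equivalent queries both score as correct. Ignoring row order is a simplification that would need refining for questions where ordering matters.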

Beyond numerical improvements, the collaboration between Snowflake and Mistral AI has set a new standard for user experience in generative AI SQL products. It has enhanced natural language interactions for non-technical users and improved the ability to identify ambiguous questions and suggest alternatives. This partnership has allowed Snowflake to further deliver on their promise of democratizing generative AI for customers and delivering value on their enterprise data.

Looking Ahead: Expanding the Horizons of Data Analytics

As Snowflake increases the accessibility of data insights through AI-powered data applications, the collaboration with Mistral AI opens up new possibilities. Higher quality outputs lead to user satisfaction and adoption, allowing data teams to confidently build business applications using state-of-the-art solutions without the accuracy, latency, and cost challenges that come with building and maintaining internal models.

This partnership between Snowflake and Mistral AI is just the beginning of a journey to provide customers with fast and insightful access to data, making self-service analytics more accessible and efficient for enterprises across various sectors.

To learn more about why businesses across the globe trust Mistral AI for open, efficient and performant LLMs, check out our customer stories at https://mistral.ai/customers.