Chicory is an analytics and optimization platform for Spark environments. It initially focuses on SparkSQL queries within Spark Python projects. Using Large Language Models (LLMs), the suite analyzes SparkSQL queries together with real-time metadata and context from the database, including Data Definition Language (DDL) and statistical information. The primary aim is to improve the efficiency and performance of SparkSQL queries and Spark transformations by providing precise, actionable optimization recommendations. Over time, the tool suite is intended to cover a wide range of Spark pipelines.

The platform uses AI-driven analysis to interpret complex SparkSQL queries and offer tailored recommendations that can improve the performance of data processing tasks. It addresses the need for tools that simplify and optimize the increasingly complex data transformations and analyses performed in Spark environments.


This AI-powered workflow analyzes SparkSQL queries within a Spark Python project and provides optimization recommendations. It uses Large Language Models (LLMs) for the analysis and real-time metadata from a database (DB), such as Data Definition Language (DDL) and statistics, for context. The goal is to improve the efficiency and performance of SparkSQL queries by offering actionable insights.
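
The workflow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Chicory's actual implementation: the names `TableContext`, `extract_sql_queries`, and `build_prompt` are hypothetical, the metadata is passed in by hand rather than fetched from a live database, and the LLM call itself is left out. The sketch finds SQL strings passed to `.sql(...)` calls in a Python source file and pairs each query with the DDL and row counts of the tables it mentions.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass


@dataclass
class TableContext:
    """Metadata for one table (hypothetical shape, not Chicory's schema)."""
    ddl: str
    row_count: int


def extract_sql_queries(source: str) -> list[str]:
    """Return string literals passed as the first argument to any `.sql(...)` call."""
    queries = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "sql"
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            queries.append(node.args[0].value.strip())
    return queries


def build_prompt(query: str, tables: dict[str, TableContext]) -> str:
    """Combine a query with the DDL and statistics of the tables it mentions."""
    sections = [f"Query:\n{query}"]
    for name, ctx in tables.items():
        # Naive substring match (assumption); a real tool would parse the SQL.
        if name in query:
            sections.append(f"Table {name} ({ctx.row_count} rows):\n{ctx.ddl}")
    sections.append("Suggest optimizations for this SparkSQL query.")
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Example project source and metadata, both invented for illustration.
    project_source = 'df = spark.sql("SELECT id FROM orders WHERE total > 10")'
    metadata = {
        "orders": TableContext("CREATE TABLE orders (id INT, total DOUBLE)", 500),
    }
    for sql in extract_sql_queries(project_source):
        print(build_prompt(sql, metadata))
```

Only literal query strings are detected here. Queries built with f-strings or string concatenation would need additional handling, and the resulting prompt would still have to be sent to an LLM of your choice.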
