Vanna AI
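What it is: Vanna is an open-source Python framework that combines retrieval-augmented generation (RAG) with an LLM to convert natural-language questions into SQL. You train it on your DDL, documentation strings, and a handful of known-good ("golden") question-and-SQL examples. At question time it retrieves the most relevant of these as context, so the generated SQL targets your specific warehouse.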
Why it matters for data work
Generic text-to-SQL fails on real schemas because the model has no idea what your columns mean. Vanna's training-data approach (feed it real questions and the SQL that answers them) can push accuracy into the high 90s for the queries your team actually asks, provided the examples cover those queries. Measure it on your own question set before relying on that figure.
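To make the idea concrete, here is a minimal sketch of how retrieved examples end up in the prompt. It is not Vanna's implementation: it uses only the Python standard library, swaps the vector store for a crude word-overlap score, and the schema, example questions, and function names (tokens, jaccard, build_prompt) are all hypothetical.

```python
import re


def tokens(text):
    """Lowercase word set used for a crude similarity score."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a, b):
    """Overlap of two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def build_prompt(question, ddl, examples, k=2):
    """Assemble an LLM prompt from the schema plus the k most similar examples."""
    ranked = sorted(
        examples,
        key=lambda ex: jaccard(question, ex["question"]),
        reverse=True,
    )
    parts = ["Schema:", ddl, "", "Similar questions and their SQL:"]
    for ex in ranked[:k]:
        parts.append(f"Q: {ex['question']}")
        parts.append(f"SQL: {ex['sql']}")
    parts += ["", f"Q: {question}", "SQL:"]
    return "\n".join(parts)


# Hypothetical schema and golden examples, for illustration only.
DDL = "CREATE TABLE orders (id INT, product TEXT, amount_usd REAL, ordered_at DATE);"
EXAMPLES = [
    {"question": "total revenue by product",
     "sql": "SELECT product, SUM(amount_usd) FROM orders GROUP BY product;"},
    {"question": "number of orders per day",
     "sql": "SELECT ordered_at, COUNT(*) FROM orders GROUP BY ordered_at;"},
    {"question": "largest single order",
     "sql": "SELECT MAX(amount_usd) FROM orders;"},
]

# With k=1 only the closest example (the revenue query) is included.
prompt = build_prompt("monthly revenue for product X", DDL, EXAMPLES, k=1)
print(prompt)
```

The point of the sketch is the shape of the prompt: the model sees your column names and a worked query close to the one being asked, which a generic text-to-SQL call never gets. A real deployment would use embeddings in the vector store rather than word overlap.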
Install & configure
pip install vanna
Pick a vector store (ChromaDB, Pinecone, Qdrant) and an LLM (OpenAI, Anthropic, or a local model), and build a Vanna instance (vn) from the pair. Train it with vn.train(ddl=...), vn.train(documentation=...), and vn.train(question=..., sql=...) calls. Then call vn.ask(question).
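The training step can be sketched as a small loader that feeds each kind of item through the three call shapes named above. How you construct vn depends on the vector store and LLM you picked and on the Vanna version, so that part is left out; the RecordingStub class, the train_all helper, and the sample schema are hypothetical stand-ins so the sketch runs without credentials.

```python
def train_all(vn, ddl_statements, docs, examples):
    """Send every training item to vn.train and return how many were sent."""
    for ddl in ddl_statements:
        vn.train(ddl=ddl)
    for doc in docs:
        vn.train(documentation=doc)
    for question, sql in examples:
        vn.train(question=question, sql=sql)
    return len(ddl_statements) + len(docs) + len(examples)


class RecordingStub:
    """Hypothetical stand-in for a configured Vanna instance.

    It records train() calls instead of writing to a vector store,
    and ask() returns a placeholder string instead of calling an LLM.
    """

    def __init__(self):
        self.calls = []

    def train(self, **kwargs):
        self.calls.append(kwargs)

    def ask(self, question):
        return f"(stub) would generate SQL for: {question}"


# Hypothetical training data, for illustration only.
DDL = ["CREATE TABLE orders (id INT, product TEXT, amount_usd REAL, ordered_at DATE);"]
DOCS = ["amount_usd is the order total in US dollars, after discounts."]
EXAMPLES = [
    ("total revenue by product",
     "SELECT product, SUM(amount_usd) FROM orders GROUP BY product;"),
]

vn = RecordingStub()  # replace with your configured Vanna object
count = train_all(vn, DDL, DOCS, EXAMPLES)
answer = vn.ask("monthly revenue for product X this year")
print(count, answer)
```

Keeping the training data in plain lists like this makes it easy to version alongside your schema and to re-run the load when you switch vector stores.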
Example usage
Stand up a Slack bot in an afternoon: an analyst asks "monthly revenue for product X this year" and the bot replies with a chart and the underlying SQL. Wrong answer? Add the corrected question-and-SQL pair to the training set, and the next ask of that question has the fix to draw on.
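The correction loop might look like the sketch below. It relies only on the vn.train(question=..., sql=...) call mentioned earlier; the record_correction helper, its read-only guard, and the stub class are assumptions of this example, not part of Vanna.

```python
def record_correction(vn, question, corrected_sql):
    """Store a reviewed question/SQL pair as training data.

    Returns True if the pair was sent to vn.train, False if it was rejected.
    """
    question = question.strip()
    sql = corrected_sql.strip()
    # Guard chosen for this sketch: accept only non-empty, read-only statements.
    if not question or not sql.lower().startswith(("select", "with")):
        return False
    vn.train(question=question, sql=sql)
    return True


class TrainStub:
    """Hypothetical stand-in for a configured Vanna instance."""

    def __init__(self):
        self.pairs = []

    def train(self, question, sql):
        self.pairs.append((question, sql))


vn = TrainStub()  # replace with your configured Vanna object
accepted = record_correction(
    vn,
    "monthly revenue for product X this year",
    "SELECT ordered_at, SUM(amount_usd) FROM orders WHERE product = 'X' GROUP BY ordered_at;",
)
rejected = record_correction(vn, "clean up old rows", "DELETE FROM orders;")
print(accepted, rejected, len(vn.pairs))
```

Rejecting anything that is not a SELECT or WITH statement is a conservative choice for a chat-driven workflow, since every stored example influences later answers.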
Author & links
Author: Vanna AI
Repo: github.com/vanna-ai/vanna
License: MIT
Related skills
For a full UI/BI experience instead of a Python library, see Wren AI.