🐘EllieSQL: Cost-Efficient Text-to-SQL
with Complexity-Aware Routing

¹The Hong Kong University of Science and Technology (Guangzhou), ²The Hong Kong University of Science and Technology
*Equal Contribution.

Teaser Figure: Framework of EllieSQL

📖Introduction

Text-to-SQL automatically translates natural language queries to SQL, allowing non-technical users to retrieve data from databases without specialized SQL knowledge. Despite the success of advanced LLM-based Text-to-SQL approaches on leaderboards, their unsustainable computational costs—often overlooked—stand as the "elephant in the room" in current leaderboard-driven research, limiting their economic practicability for real-world deployment and widespread adoption.

To tackle this, we propose EllieSQL as an exploratory step: a complexity-aware routing framework that assigns queries to suitable SQL generation pipelines based on estimated complexity. We investigate multiple routers that direct simple queries to efficient approaches while reserving computationally intensive methods for complex cases. Drawing from economics, we introduce the Token Elasticity of Performance (TEP) metric, which captures cost-efficiency by quantifying the responsiveness of performance gains relative to token investment in SQL generation. Experiments show that, compared to always using the most advanced method in our study, EllieSQL with the Qwen2.5-0.5B-DPO router reduces token use by over 40% without compromising performance on the Bird development set, achieving more than a 2× boost in TEP over non-routing approaches. This not only advances the pursuit of cost-efficient Text-to-SQL but also invites the community to weigh resource efficiency alongside performance, contributing to progress in sustainable Text-to-SQL.

🛠️Implementations

As shown in the teaser figure, EllieSQL operates in three key phases: schema linking, routing, and tiered SQL generation. At the heart of EllieSQL lies the router, the decision-making core that dynamically directs each query to the appropriate tier of SQL generation methods based on its estimated complexity. The router is positioned in Phase II, after schema linking, because the filtered schema provides structural information that strongly correlates with query complexity.

Our motivation stems from the following observation: in practice, Text-to-SQL tasks vary widely in complexity, and not every case needs the most powerful, sophisticated, and resource-intensive approach to be resolved. Existing leading approaches, however, apply complex reasoning and computationally expensive methods to all queries indiscriminately, wasting computation on simple, straightforward queries that lightweight approaches could handle. This inefficiency reveals a clear opportunity for optimization through complexity-aware routing.

We investigate various router implementations across three categories in our experiments. Qwen2.5-Coder-0.5B (Qwen) and RoBERTa-base are fine-tuned as routers. Following CHASE-SQL, we implement our three-tiered SQL generation methods (Basic, Intermediate, Advanced) in Phase III, as shown in the teaser figure.

Please refer to our paper for more details.
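The three-phase flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `LinkedSchema`, `heuristic_router`, `route_and_generate`, and the stub generators are hypothetical names, and the size-based thresholds merely stand in for the fine-tuned routers (Qwen, RoBERTa-base) used in the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Tier(Enum):
    """The three generation tiers named in the text (Phase III)."""
    BASIC = "basic"
    INTERMEDIATE = "intermediate"
    ADVANCED = "advanced"


@dataclass
class LinkedSchema:
    """Hypothetical Phase I output: tables and columns kept by schema linking."""
    tables: List[str]
    columns: List[str]


Router = Callable[[str, LinkedSchema], Tier]
Generator = Callable[[str, LinkedSchema], str]


def heuristic_router(question: str, schema: LinkedSchema) -> Tier:
    """Stand-in for the Phase II router.

    Assumption: a larger linked schema signals a harder query.
    The thresholds are arbitrary and are not taken from the paper.
    """
    n_tables, n_columns = len(schema.tables), len(schema.columns)
    if n_tables <= 1 and n_columns <= 3:
        return Tier.BASIC
    if n_tables <= 2:
        return Tier.INTERMEDIATE
    return Tier.ADVANCED


def route_and_generate(
    question: str,
    schema: LinkedSchema,
    router: Router,
    generators: Dict[Tier, Generator],
) -> str:
    """Pick a tier with the router, then call only that tier's generator."""
    tier = router(question, schema)
    return generators[tier](question, schema)


def make_stub_generator(tier: Tier) -> Generator:
    """Placeholder generator; a real pipeline would call an LLM here."""
    def generate(question: str, schema: LinkedSchema) -> str:
        return f"-- [{tier.value}] SQL for: {question}"
    return generate


GENERATORS: Dict[Tier, Generator] = {tier: make_stub_generator(tier) for tier in Tier}
```

Because the router only returns a tier label, a learned classifier can replace `heuristic_router` without touching the generation code, and the expensive tiers are invoked only for the queries routed to them.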

🧪Experiments

Experiment Objective. The primary objective of our experiments is to validate whether our complexity-aware routing framework for Text-to-SQL can maintain performance comparable to consistently using the most advanced method while significantly reducing token consumption. Therefore, our analysis focuses on relative performance differences and comparisons between methods, rather than absolute values.

Metrics. Execution Accuracy (EX) evaluates the performance of a Text-to-SQL system by checking whether the execution result sets of the gold SQL queries and the predicted SQL queries are identical. Performance Gap Recovered (PGR) assesses the effectiveness of routers (Ong et al., 2025), while Token Elasticity of Performance (TEP) measures the responsiveness of performance gains relative to token investments.

Results. The tables and figures below show the performance across different experiments (both base and routing methods) and the relationships between PGR and TEP, EX and token cost, and G_A allocation and other metrics, respectively.

For full results and detailed analysis, please refer to our paper.
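To make the three metrics concrete, here is a minimal sketch under stated assumptions. `execution_match` compares result sets on a SQLite connection, as the EX description suggests. `pgr` uses the usual form from the routing literature: the share of the gap between the weakest and strongest method that the router closes. `tep` assumes the textbook economic elasticity (relative performance change divided by relative token change). The exact definitions in the paper may differ, and all function and parameter names are hypothetical.

```python
import sqlite3


def execution_match(conn: sqlite3.Connection, gold_sql: str, pred_sql: str) -> bool:
    """EX for one example: True if both queries return the same set of rows.

    A predicted query that fails to execute counts as a miss.
    """
    try:
        pred_rows = set(conn.execute(pred_sql).fetchall())
    except sqlite3.Error:
        return False
    gold_rows = set(conn.execute(gold_sql).fetchall())
    return pred_rows == gold_rows


def pgr(ex_router: float, ex_basic: float, ex_advanced: float) -> float:
    """Performance Gap Recovered (assumed form).

    0.0 means the router performs no better than always using the basic
    method; 1.0 means it matches always using the advanced method.
    """
    return (ex_router - ex_basic) / (ex_advanced - ex_basic)


def tep(ex_method: float, tokens_method: float,
        ex_base: float, tokens_base: float) -> float:
    """Token Elasticity of Performance (assumed elasticity form).

    Relative EX gain over a baseline divided by the relative increase
    in token use over that same baseline.
    """
    rel_perf_gain = (ex_method - ex_base) / ex_base
    rel_token_increase = (tokens_method - tokens_base) / tokens_base
    return rel_perf_gain / rel_token_increase
```

Under this reading, a method that buys the same EX gain with fewer extra tokens has a higher TEP, which is the sense in which routing can improve cost-efficiency without raising accuracy.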

🌟Our Contributions

We invite the community to further investigate how to balance performance with resource expenditure, in the hope of contributing to more practical and sustainable Text-to-SQL. This paper is an exploration, a stepping stone rather than a definitive answer, intended to spark further research and discussion. In it, we try to confront the elephant with the following contributions:

- Elephant in the Room. We highlight a critical yet overlooked cost-efficiency limitation in current Text-to-SQL methods, which hinders their practical deployment beyond laboratories.
- TEP Metric. We introduce Token Elasticity of Performance (TEP), an economics-inspired metric for evaluating the responsiveness of performance gains relative to token investments in SQL generation.
- EllieSQL. We propose the EllieSQL framework, with various router implementations that direct queries to appropriate tiered SQL generation pipelines based on estimated complexity.
- Extensive Experiments. Experiments demonstrate the potential and effectiveness of EllieSQL with various routers. Notably, with the Qwen2.5-0.5B-DPO router, we reduce token use by over 40% without sacrificing performance compared with consistently deploying the most advanced SQL generation pipeline, achieving more than a 2× boost in TEP.

✏️BibTeX Citation

If you find our work useful or inspiring, please kindly cite:

@misc{zhu2025elliesql,
      title={EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing}, 
      author={Yizhang Zhu and Runzhi Jiang and Boyan Li and Nan Tang and Yuyu Luo},
      year={2025},
      eprint={2503.22402},
      archivePrefix={arXiv},
      primaryClass={cs.DB},
      url={https://arxiv.org/abs/2503.22402}, 
}