Text-to-SQL automatically translates natural language queries to SQL, allowing non-technical users to retrieve data from databases without specialized SQL knowledge. Despite the success of advanced LLM-based Text-to-SQL approaches on leaderboards, their unsustainable computational costs are often overlooked and stand as the "elephant in the room" in current leaderboard-driven research, limiting their economic practicality for real-world deployment and widespread adoption.
To tackle this, we propose EllieSQL as an exploratory step: a complexity-aware routing framework that assigns queries to suitable SQL generation pipelines based on their estimated complexity. We investigate multiple routers that direct simple queries to efficient approaches while reserving computationally intensive methods for complex cases. Drawing from economics, we introduce the Token Elasticity of Performance (TEP) metric, which captures cost-efficiency by quantifying how responsive performance gains are to token investment in SQL generation. Experiments show that, compared to always using the most advanced methods in our study, EllieSQL with the Qwen2.5-0.5B-DPO router reduces token use by over 40% without compromising performance on the Bird development set, achieving more than a 2× boost in TEP over non-routing approaches. This not only advances the pursuit of cost-efficient Text-to-SQL but also invites the community to weigh resource efficiency alongside performance, contributing to progress in sustainable Text-to-SQL.
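As a rough illustration of the two ideas above, the following minimal sketch routes a question to one of several pipelines and computes an elasticity-style cost-efficiency ratio. Everything in it is an assumption made for illustration: the tier names (`basic`, `intermediate`, `advanced`), the keyword heuristic standing in for a learned router such as Qwen2.5-0.5B-DPO, the placeholder pipelines, and the function `token_elasticity`, which follows the textbook definition of elasticity (relative change in performance divided by relative change in tokens) and may differ from the exact TEP formula in the paper. The numbers in the usage lines are made up and are not results from our experiments.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Pipeline:
    """A SQL generation pipeline (hypothetical container for this sketch)."""
    name: str
    generate: Callable[[str], str]  # maps a question to a SQL string


def toy_complexity(question: str) -> str:
    """Keyword heuristic standing in for a learned complexity router.

    Assumption: three tiers; a real router would be a trained classifier.
    """
    q = question.lower()
    hard_cues = ("for each", "average", "more than", "compared", "rank")
    hits = sum(cue in q for cue in hard_cues)
    if hits >= 2:
        return "advanced"
    if hits == 1:
        return "intermediate"
    return "basic"


def route(question: str, pipelines: Dict[str, Pipeline],
          classify: Callable[[str], str] = toy_complexity) -> Pipeline:
    """Pick the pipeline registered for the question's estimated tier."""
    return pipelines[classify(question)]


def token_elasticity(perf: float, tokens: float,
                     base_perf: float, base_tokens: float) -> float:
    """Relative performance change per relative token change vs. a baseline.

    Assumption: textbook elasticity form, not necessarily the paper's TEP.
    """
    rel_perf = (perf - base_perf) / base_perf
    rel_tokens = (tokens - base_tokens) / base_tokens
    if rel_tokens == 0:
        raise ValueError("token usage equals the baseline; ratio undefined")
    return rel_perf / rel_tokens


# Placeholder pipelines: each returns a fixed string instead of calling an LLM.
pipelines = {
    tier: Pipeline(name=tier, generate=lambda q, t=tier: f"-- SQL from {t} pipeline")
    for tier in ("basic", "intermediate", "advanced")
}

simple = route("List all customer names", pipelines)
complex_ = route("For each region, show average sales", pipelines)
print(simple.name, complex_.name)

# Made-up numbers: accuracy 0.50 -> 0.60 while tokens double from 1000 to 2000.
print(token_elasticity(perf=0.60, tokens=2000, base_perf=0.50, base_tokens=1000))
```

Under this reading, a higher ratio means each additional unit of relative token spend buys more relative performance, which is why sending simple questions to cheaper pipelines can raise the ratio without lowering accuracy.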
We invite the whole community to pursue further research on balancing performance with resource expenditure, in the hope of contributing to more practical and sustainable Text-to-SQL. This paper is an exploration, a stepping stone rather than a definitive answer, intended to spark further research and discussion. In this paper, we try to confront the elephant with contributions summarized as follows:
If you find our work useful or inspiring, please kindly cite:
@misc{zhu2025elliesql,
      title={EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing},
      author={Yizhang Zhu and Runzhi Jiang and Boyan Li and Nan Tang and Yuyu Luo},
      year={2025},
      eprint={2503.22402},
      archivePrefix={arXiv},
      primaryClass={cs.DB},
      url={https://arxiv.org/abs/2503.22402},
}