Hi, I'm

Jiale Lao

PhD Student in Computer Science

Cornell University

I research how to leverage large language models to enhance the efficiency, usability, and generality of database systems.


About Me

I am a PhD student in Computer Science at Cornell University, advised by Professor Immanuel Trummer. My research focuses on leveraging advanced techniques from Natural Language Processing (e.g., Large Language Models) to enhance the efficiency, usability, and generality of database systems.

As a B.Eng student at Sichuan University, I was fortunate to be advised by Professor Mingjie Tang and by Professor Jianguo Wang of Purdue University, with a focus on using machine learning techniques to optimize database performance.

I have published multiple papers at SIGMOD and VLDB and received the SIGMOD Research Highlight Award 2024.

Database Systems, Machine Learning, Large Language Models, Query Optimization, DB Tuning

News

Mar 2026 GenDB is available now! The next generation of query processing — synthesized, not engineered!
Nov 2025 SQLBarber is accepted by SIGMOD 2026!
Nov 2025 SemBench is under revision at VLDB 2026!
Nov 2025 SemBench is available now! Let us go for Semantic Query Processing!
Aug 2025 ToxicSQL is accepted by SIGMOD 2026!
Mar 2025 SQLBarber Demo is accepted by SIGMOD 2025!
Dec 2024 GPTuner wins SIGMOD Research Highlight Award! [10 papers selected]
Mar 2024 GPTuner Demo is accepted by SIGMOD 2024!
Mar 2024 GPTuner is accepted by VLDB 2024!

Publications

2026

Preprint

GenDB: The Next Generation of Query Processing -- Synthesized, Not Engineered

Jiale Lao, Immanuel Trummer

arXiv preprint, 2026

Preprint

IGenBench: Benchmarking the Reliability of Text-to-Infographic Generation

Yinghao Tang, Xueding Liu, Boyuan Zhang, Tingfeng Lan, Yupeng Xie, Jiale Lao, Yiyao Wang, Haoxuan Li, Tingting Gao, Bo Pan, Luoxuan Weng, Xiuqi Huang, Minfeng Zhu, Yingchaojie Feng, Yuyu Luo, Wei Chen

arXiv preprint, 2026

Under Revision, VLDB

SemBench: A Benchmark for Semantic Query Processing Engines

Jiale Lao, Andreas Zimmerer, Olga Ovcharenko, Tianji Cong, Matthew Russo, Gerardo Vitagliano, Michael Cochez, Fatma Özcan, Gautam Gupta, Thibaud Hottelier, H. V. Jagadish, Kris Kissel, Sebastian Schelter, Andreas Kipf, Immanuel Trummer

Under revision at Proceedings of Very Large Data Bases Conference (VLDB), 2026

SIGMOD

SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads

Jiale Lao, Immanuel Trummer

Proceedings of ACM Conference on Management of Data (SIGMOD), 2026

SIGMOD

Are Your LLM-based Text-to-SQL Models Secure? Exploring SQL Injection via Backdoor Attacks

Meiyu Lin, Haichuan Zhang, Jiale Lao, Renyuan Li, Yuanchun Zhou, Carl Yang, Yang Cao, Mingjie Tang

Proceedings of ACM Conference on Management of Data (SIGMOD), 2026

2025

Preprint

QUITE: A Query Rewrite System Beyond Rules with LLM Agents

Yuyang Song, Hanxu Yan, Jiale Lao, Yibo Wang, Yufei Li, Yuanchun Zhou, Jianguo Wang, Mingjie Tang

arXiv preprint, 2025

SIGMOD

Demonstrating SQLBarber: Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads

Jiale Lao, Immanuel Trummer

Proceedings of ACM Conference on Management of Data (SIGMOD), 2025

SIGMOD Research Highlight

GPTuner: An LLM-Based Database Tuning System

Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Mingjie Tang, Jianguo Wang

Proceedings of ACM Conference on Management of Data (SIGMOD), 2025 — SIGMOD Research Highlight Award 2024

2024

VLDB

GPTuner: A Manual-Reading Database Tuning System

Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Mingjie Tang, Jianguo Wang

Proceedings of Very Large Data Bases Conference (VLDB), 2024

SIGMOD

A Demonstration of GPTuner: A GPT-Based Manual-Reading Database Tuning System

Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Yuanchun Zhou, Mingjie Tang, Jianguo Wang

Proceedings of ACM Conference on Management of Data (SIGMOD), 2024

Invited Talks

Feb 2026

GPTuner: An LLM-Based Database Tuning System

Research Talk at DEEM Lab, Berlin

Nov 2025

QUITE: A Query Rewrite System Beyond Rules with LLM Agents

Cornell Database Seminar

Nov 2024

GPTuner: An LLM-Based Database Tuning System

Cornell Database Seminar

Experience

May 2026 — Aug 2026

Student Researcher

Google

Advised by Fatma Özcan and Helena Caminal

Semantic query processing over multi-modal data

Jan 2026 — May 2026

Teaching Assistant

Cornell University — CS 5322 Efficient Analysis of Large Data Sets

Advised by Prof. Immanuel Trummer

May 2025 — Present

Research Assistant

Database Group, Cornell University

Advised by Prof. Immanuel Trummer

Customized and Realistic SQL Workload Generation with Large Language Models

Jan 2025 — May 2025

Teaching Assistant

Cornell University — CS 3110 Functional Programming

Advised by Prof. Anshuman Mohan

Aug 2024 — Jan 2025

Teaching Assistant

Cornell University — CS 4320/5320 Introduction to Database Systems

Advised by Prof. Immanuel Trummer and Prof. Sainyam Galhotra

Apr 2023 — Aug 2024

Research Assistant

Database Systems Group, Purdue University

Advised by Prof. Jianguo Wang

Automatic Database Optimization with LLMs; Distance Indexing via GNNs

Oct 2022 — Aug 2024

Research Assistant

AI and System Lab, Sichuan University

Advised by Prof. Mingjie Tang

Automatic Database Optimization with LLMs; Distance Indexing via GNNs

Education

PhD in Computer Science

Cornell University

2024 — Present

B.Eng in Software Engineering

Sichuan University

2020 — 2024