Efficient and Scalable Estimation of Tool Representations in Vector Space
Suhong Moon*,
Siddharth Jha*,
Lutfi Erdogan,
Sehoon Kim,
Woosang Lim,
Kurt Keutzer,
Amir Gholami
arXiv, 2024
|
Characterizing Prompt Compression Methods for Long Context Inference
Siddharth Jha,
Lutfi Erdogan,
Sehoon Kim,
Kurt Keutzer,
Amir Gholami
ES-FoMo @ ICML (Oral), 2024
|
Learned Best-Effort LLM Serving
Siddharth Jha,
Coleman Hooper,
Xiaoxuan Liu,
Sehoon Kim,
Kurt Keutzer
ES-FoMo @ ICML, 2024
|
TinyAgent: Function Calling at the Edge
Lutfi Erdogan*,
Nicholas Lee*,
Siddharth Jha*,
Sehoon Kim,
Ryan Tabrizi,
Suhong Moon,
Coleman Hooper,
Gopala Anumanchipalli,
Kurt Keutzer,
Amir Gholami
arXiv, 2024
|
Text2SQL is Not Enough: Unifying AI and Databases with TAG
Asim Biswal*,
Liana Patel*,
Siddharth Jha,
Amog Kamsetty,
Shu Liu,
Joseph Gonzalez,
Carlos Guestrin,
Matei Zaharia
arXiv, 2024
|
LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data
Liana Patel,
Siddharth Jha,
Carlos Guestrin,
Matei Zaharia
arXiv, 2024
|
An Evaluation of Memory Optimization Methods for Training Neural Networks
Xiaoxuan Liu,
Siddharth Jha,
Alvin Cheung
arXiv, 2023
|
Leveraging Application Data Constraints to Optimize Database-Backed Web Applications
Xiaoxuan Liu,
Shuxian Wang,
Mengzhu Sun,
Sicheng Pan,
Ge Li,
Siddharth Jha,
Cong Yan,
Junwen Yang,
Shan Lu,
Alvin Cheung
VLDB, 2023
|