Yikun Han | 韩颐堃

I am a first-year PhD student in Information Sciences at the University of Illinois Urbana-Champaign. Previously, I received my Master's degree in Data Science from the University of Michigan.

My research interests span the intersection of natural language processing, information retrieval, and AI for healthcare.

I am fortunate to be advised by Prof. Halil Kilicoglu and Prof. Yue Guo at UIUC. During my time at Michigan, I was supervised by Prof. Ambuj Tewari and collaborated closely with the AI Health Lab at the University of Texas at Austin.

Email  /  GitHub  /  CV  /  Google Scholar  /  ORCID  /  LinkedIn


Publications

High-Fidelity Tuning of Olfactory Mixture Distances in the Perceptual Space of Smell Through a Community Effort


Vahid Satarifard*, Laura Sisson*, Yikun Han*, Pedro Ilidio*, Matej Hladis*, Maxence Lalis*, Xuebo Song, Wenjie Yin, Aharon Ravia, CiCi Xingyu Zheng, Gaia Andreoletti, Jake Albrecht, Robert Pellegrino, Zehua Wang, Stephen Yang, Robbe D'hondt, Achilleas Ghinis, Jasper de Boer, Felipe Kenji Nakano, Alireza Gharahighehi, DREAM Olfactory Mixtures Prediction Consortium, Benjamin Sanchez-Lengeling, Andreas Keller, Leslie B. Vosshall, Sebastien Fiorucci, Ambuj Tewari, Jeremie Topin, Celine Vens, Marten Bjorkman, Danica Kragic, Noam Sobel, Nicholas A. Christakis, Joel D. Mainland, Pablo Meyer
bioRxiv, 2025
arxiv / code

We present an ensemble model derived from the DREAM Olfactory Mixtures Prediction Challenge that accurately predicts the perceptual similarity of complex odor mixtures. By aggregating top-performing architectures, our approach outperforms state-of-the-art methods, establishing a robust, validated framework for mapping molecular combinations to human olfactory perception.

Teaching Machine Olfaction in an Undergraduate Deep Learning Course: An Interdisciplinary Approach Based on Chemistry, Machine Learning, and Sensory Evaluation


Yikun Han, Michelle Krell Kydd, Joseph Ward, Ambuj Tewari
arXiv, 2025
code

We integrated machine olfaction into an undergraduate deep learning course, introducing smell as a new modality alongside traditional data types. Hands-on activities and graph neural networks enhanced student engagement and comprehension. We discuss challenges and future improvements.

Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation


Yijun Tian*, Yikun Han*, Xiusi Chen*, Wei Wang, Nitesh V. Chawla
International Conference on Web Search and Data Mining (WSDM), 2025
paper / arxiv / code

We present TinyLLM, a knowledge distillation approach that transfers reasoning abilities from multiple large language models (LLMs) to smaller ones. TinyLLM enables smaller models to generate both accurate answers and rationales, achieving superior performance despite a significantly reduced model size.

Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models


Kyle Cox, Jiawei Xu, Yikun Han, Abby Xu, Tianhao Li, Chi-Yang Hsu, Tianlong Chen, Walter Gerych, Ying Ding
AAAI Conference on Artificial Intelligence (AAAI), 2025
paper / arxiv / code

We explore prompt sensitivity in large language models (LLMs), where semantically identical prompts can yield vastly different outputs. By modeling this sensitivity as generalization error, we improve uncertainty calibration using paraphrased prompts. Additionally, we propose a new metric to quantify uncertainty caused by prompt variations, offering insights into how LLMs handle semantic continuity in natural language.

When Large Language Models Meet Vector Databases: A Survey


Zhi Jing*, Yongye Su*, Yikun Han*
Artificial Intelligence x Multimedia (AIxMM), 2025
paper / arxiv

We survey the integration of Large Language Models (LLMs) and Vector Databases (VecDBs), highlighting VecDBs’ role in addressing LLM challenges like hallucinations, outdated knowledge, and memory inefficiencies. This review outlines foundational concepts and explores how VecDBs enhance LLM performance by efficiently managing vector data, paving the way for future advancements in data handling and knowledge extraction.

A Community Detection and Graph-Neural-Network-Based Link Prediction Approach for Scientific Literature


Chunjiang Liu*, Yikun Han*, Haiyun Xu, Shihan Yang, Kaidi Wang, Yongye Su
Mathematics, 2024
paper / arxiv

We integrate the Louvain community detection algorithm with various GNN models to improve link prediction in scientific literature networks. This approach consistently boosts performance, with models like GAT seeing AUC increases from 0.777 to 0.823, demonstrating the effectiveness of combining community insights with GNNs.

A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge


Le Ma*, Ran Zhang*, Yikun Han*, Shirui Yu, Zaitian Wang, Zhiyuan Ning, Jinghan Zhang, Ping Xu, Pengjiang Li, Wei Ju, Chong Chen, Dongjie Wang, Kunpeng Liu, Pengyang Wang, Pengfei Wang, Yanjie Fu, Chunjiang Liu, Yuanchun Zhou, Chang-Tien Lu
arXiv, 2023
arxiv

We present a comprehensive survey of vector database techniques—covering hash-, tree-, graph-, and quantization-based ANNS methods—and outline integration opportunities with large language models for emerging research.


Competitions

DREAM Olfactory Mixtures Prediction Challenge


Yikun Han, Zehua Wang, Stephen Yang, Ambuj Tewari
RECOMB/ISCB Conference on Regulatory & Systems Genomics with DREAM Challenges, 2024
writeup / video / code / website / news / slide

We use pre-trained graph neural networks and boosting techniques to enhance odor mixture discriminability, transforming single molecule embeddings into mixture predictions with improved robustness and accuracy.


Research Internships

University of Michigan (Aug. 2023 - Aug. 2025)


Advisor: Prof. Ambuj Tewari

Research Topics:

[1] Graph Neural Networks

[2] Molecular Property Prediction

[3] Protein-Ligand Affinity Prediction

University of Texas at Austin (Feb. 2024 - Aug. 2025)


Advisor: Prof. Ying Ding, Prof. Jiliang Tang

Research Topics:

[1] Graph Retrieval-Augmented Generation

[2] Medical AI

[3] Collaborator Recommendation

University of Notre Dame (Dec. 2023 - Mar. 2024)


Advisor: Prof. Nitesh V. Chawla

Research Topics:

[1] Knowledge Distillation

[2] Multi-Teacher Collaboration

[3] In-Context Learning

Tianyuan Mathematical Center in Southwest China (May. 2022 - Nov. 2022)


Advisor: Prof. Gang Chen

Research Topics:

[1] LAPACK Optimization

[2] Parallel Computation for Large-Scale Matrices

[3] High-Performance Matrix Factorization and Back Substitution


Education

University of Illinois Urbana-Champaign (Aug. 2025 - May. 2030)


PhD

Information Sciences

GPA: 4.00/4.00

University of Michigan (Aug. 2023 - May. 2025)


Master's

Data Science

GPA: 3.97/4.00

Sichuan University (Sep. 2019 - Jun. 2023)


Bachelor's

Information Resources Management

GPA: 3.87/4.00

Rank: 2/76


Awards

RSGDREAM Travel Award, 2024

Outstanding Graduate, 2023

Second Prize Scholarship, 2022

Outstanding Student, 2021

Outstanding Student, 2020


Service

Program Committee Member: GenAI4Health@NeurIPS 2025

Reviewer: ICWSM 2026, AMIA 2026, IEEE TNNLS


Miscellanea

- I keep a Book List 📚 of books I have read since 2021.
- I enjoy sports, including Skiing 🎿, Badminton 🏸, and Running 🏃.




Design and source code from Jon Barron's website