Yikun Han | 韩颐堃

I am a first-year PhD student in Information Sciences at the University of Illinois Urbana-Champaign. Previously, I received my Master's degree in Data Science from the University of Michigan.

My research interests span the intersection of natural language processing, information retrieval, and AI for healthcare.

I am fortunate to be advised by Prof. Halil Kilicoglu and Prof. Yue Guo at UIUC. During my time at Michigan, I was supervised by Prof. Ambuj Tewari and collaborated closely with the AI Health Lab at the University of Texas at Austin.

Email  /  GitHub  /  CV  /  Google Scholar  /  ORCID  /  LinkedIn


Publications

High-Fidelity Tuning of Olfactory Mixture Distances in the Perceptual Space of Smell Through a Community Effort


Vahid Satarifard*, Laura Sisson*, Yikun Han*, Pedro Ilidio*, Matej Hladis*, Maxence Lalis*, Xuebo Song, Wenjie Yin, Aharon Ravia, CiCi Xingyu Zheng, Gaia Andreoletti, Jake Albrecht, Robert Pellegrino, Zehua Wang, Stephen Yang, Robbe D'hondt, Achilleas Ghinis, Jasper de Boer, Felipe Kenji Nakano, Alireza Gharahighehi, DREAM Olfactory Mixtures Prediction Consortium, Benjamin Sanchez-Lengeling, Andreas Keller, Leslie B. Vosshall, Sebastien Fiorucci, Ambuj Tewari, Jeremie Topin, Celine Vens, Marten Bjorkman, Danica Kragic, Noam Sobel, Nicholas A. Christakis, Joel D. Mainland, Pablo Meyer
bioRxiv, 2025
arxiv / code

We present an ensemble model derived from the DREAM Olfactory Mixtures Prediction Challenge that accurately predicts the perceptual similarity of complex odor mixtures. By aggregating top-performing architectures, our approach outperforms state-of-the-art methods, establishing a robust, validated framework for mapping molecular combinations to human olfactory perception.

Teaching Machine Olfaction in an Undergraduate Deep Learning Course: An Interdisciplinary Approach Based on Chemistry, Machine Learning, and Sensory Evaluation


Yikun Han, Michelle Krell Kydd, Joseph Ward, Ambuj Tewari
arXiv, 2025
code

We integrated machine olfaction into an undergraduate deep learning course, introducing smell as a new modality alongside traditional data types. Hands-on activities and graph neural networks enhanced student engagement and comprehension. We discuss challenges and future improvements.

Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation


Yijun Tian*, Yikun Han*, Xiusi Chen*, Wei Wang, Nitesh V. Chawla
International Conference on Web Search and Data Mining (WSDM), 2025
paper / arxiv / code

We present TinyLLM, a knowledge distillation approach that transfers reasoning abilities from multiple large language models (LLMs) to smaller ones. TinyLLM enables smaller models to generate both accurate answers and rationales, achieving superior performance despite a significantly reduced model size.

Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models


Kyle Cox, Jiawei Xu, Yikun Han, Abby Xu, Tianhao Li, Chi-Yang Hsu, Tianlong Chen, Walter Gerych, Ying Ding
AAAI Conference on Artificial Intelligence (AAAI), 2025
paper / arxiv / code

We explore prompt sensitivity in large language models (LLMs), where semantically identical prompts can yield vastly different outputs. By modeling this sensitivity as generalization error, we improve uncertainty calibration using paraphrased prompts. Additionally, we propose a new metric to quantify uncertainty caused by prompt variations, offering insights into how LLMs handle semantic continuity in natural language.

When Large Language Models Meet Vector Databases: A Survey


Zhi Jing*, Yongye Su*, Yikun Han*
Artificial Intelligence x Multimedia (AIxMM), 2025
paper / arxiv

We survey the integration of Large Language Models (LLMs) and Vector Databases (VecDBs), highlighting VecDBs’ role in addressing LLM challenges like hallucinations, outdated knowledge, and memory inefficiencies. This review outlines foundational concepts and explores how VecDBs enhance LLM performance by efficiently managing vector data, paving the way for future advancements in data handling and knowledge extraction.

A Community Detection and Graph-Neural-Network-Based Link Prediction Approach for Scientific Literature


Chunjiang Liu*, Yikun Han*, Haiyun Xu, Shihan Yang, Kaidi Wang, Yongye Su
Mathematics, 2024
paper / arxiv

We integrate the Louvain community detection algorithm with various GNN models to improve link prediction in scientific literature networks. This approach consistently boosts performance, with models like GAT seeing AUC increases from 0.777 to 0.823, demonstrating the effectiveness of combining community insights with GNNs.

A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge


Le Ma*, Ran Zhang*, Yikun Han*, Shirui Yu, Zaitian Wang, Zhiyuan Ning, Jinghan Zhang, Ping Xu, Pengjiang Li, Wei Ju, Chong Chen, Dongjie Wang, Kunpeng Liu, Pengyang Wang, Pengfei Wang, Yanjie Fu, Chunjiang Liu, Yuanchun Zhou, Chang-Tien Lu
arXiv, 2023
arxiv

We present a comprehensive survey of vector database techniques—covering hash-, tree-, graph-, and quantization-based ANNS methods—and outline integration opportunities with large language models for emerging research.


Competitions

DREAM Olfactory Mixtures Prediction Challenge


Yikun Han, Zehua Wang, Stephen Yang, Ambuj Tewari
RECOMB/ISCB Conference on Regulatory & Systems Genomics with DREAM Challenges, 2024
writeup / video / code / website / news / slide

We use pre-trained graph neural networks and boosting techniques to enhance odor mixture discriminability, transforming single molecule embeddings into mixture predictions with improved robustness and accuracy.


Research Internships

University of Michigan (Aug. 2023 - Aug. 2025)


Advisor: Prof. Ambuj Tewari

Research Topics:

[1] Graph Neural Networks

[2] Molecular Property Prediction

[3] Protein-Ligand Affinity Prediction

University of Texas at Austin (Feb. 2024 - Aug. 2025)


Advisor: Prof. Ying Ding, Prof. Jiliang Tang

Research Topics:

[1] Graph Retrieval-Augmented Generation

[2] Medical AI

[3] Collaborator Recommendation

University of Notre Dame (Dec. 2023 - Mar. 2024)


Advisor: Prof. Nitesh V. Chawla

Research Topics:

[1] Knowledge Distillation

[2] Multi-Teacher Collaboration

[3] In-Context Learning

Tianyuan Mathematical Center in Southwest China (May. 2022 - Nov. 2022)


Advisor: Prof. Gang Chen

Research Topics:

[1] LAPACK Optimization

[2] Parallel Computation for Large-Scale Matrices

[3] High-Performance Matrix Factorization and Back Substitution


Education

University of Illinois Urbana-Champaign (Aug. 2025 - May. 2030)


PhD

Information Sciences

GPA: 4.00/4.00

University of Michigan (Aug. 2023 - May. 2025)


Master's

Data Science

GPA: 3.97/4.00

Sichuan University (Sep. 2019 - Jun. 2023)


Bachelor's

Information Resources Management

GPA: 3.87/4.00

Rank: 2/76


Awards

RSGDREAM Travel Award, 2024

Outstanding Graduate, 2023

Second Prize Scholarship, 2022

Outstanding Student, 2021

Outstanding Student, 2020


Service

Program Committee Member: GenAI4Health@NeurIPS 2025

Reviewer: ICWSM 2026, AMIA 2026, IEEE TNNLS


Miscellanea

- I keep a Book List 📚 of books I have read since 2021.
- I enjoy sports, including Skiing 🎿, Badminton 🏸, and Running 🏃.




Design and source code from Jon Barron's website