Puxuan Yu, 余璞轩
Hi there! I am a Ph.D. candidate in the CIIR and CICS at UMass Amherst. I am advised by James Allan and Negin Rahimi. I obtained my bachelor's degree from Wuhan University in 2018.
My research focuses on the intersection of information retrieval (IR) and natural language processing (NLP). I am interested in applying NLP approaches to various IR tasks, including entity retrieval and ad-hoc retrieval (ranking, diversification, dense retrieval, sparse retrieval, and multilingual retrieval). Currently, my thesis explores the "explanation" factor of search systems.
Recent Updates
- [Active] I'm on the industry job market seeking a full-time position (RS/AS/MLE/SDE). Please feel free to reach out if you're interested in my work. (CV)
- [Feb. 2024] My internship work with Dataminr on using LLMs for scale calibration of neural ranking models is up on arXiv!
- [Dec. 2023] One full paper accepted at ECIR'24. See you in Glasgow, Scotland, UK!
- [Oct. 2023] Joining Dataminr for a Fall research internship!
- [Sep. 2023] I defended my proposal titled "Leveraging Explanations for Information Retrieval Systems under Data Scarcity."
- [Aug. 2023] One full paper accepted at CIKM'23. See you in Birmingham, UK!
Publications
Explain then Rank: Scale Calibration of Neural Rankers via Natural Language Explanations from Large Language Models.
Puxuan Yu, Daniel Cohen, Hemank Lamba, Joel Tetreault, Alex Jaimes.
Preprint.
[Text: arXiv]
Improved Learned Sparse Retrieval with Corpus-specific Vocabularies.
Puxuan Yu, Antonio Mallia, Matthias Petri.
ECIR 2024, full paper.
[Text: arXiv]
Search Result Diversification Using Query Aspects as Bottlenecks.
Puxuan Yu, Negin Rahimi, Zhiqi Huang, James Allan.
CIKM 2023, full paper.
[Text: CIIR / ACM]
Improving Cross-lingual Information Retrieval on Low-Resource Languages via Optimal Transport Distillation.
Zhiqi Huang, Puxuan Yu, James Allan.
WSDM 2023, full paper.
[Text: CIIR / ACM]
Towards Explainable Search Results: A Listwise Explanation Generator.
Puxuan Yu, Negin Rahimi, James Allan.
SIGIR 2022, full paper.
[Text: CIIR / ACM][Github]
AutoName: A Corpus-Based Set Naming Framework.
Zhiqi Huang, Negin Rahimi, Puxuan Yu, Jingbo Shang, James Allan.
SIGIR 2021, short paper.
[Text: ACM]
Cross-lingual Language Model Pretraining for Retrieval.
Puxuan Yu, Hongliang Fei, Ping Li.
TheWebConf 2021, full paper.
[Text: CIIR / ACM][Github]
Learning to Rank Entities for Set Expansion from Unstructured Data.
Puxuan Yu, Negin Rahimi, Zhiqi Huang, James Allan.
ICTIR 2020, full paper.
[Text: CIIR / ACM][Slides]
A Study of Neural Matching Models for Cross-lingual IR.
Puxuan Yu, James Allan.
SIGIR 2020, short paper.
[Text: arXiv / ACM][Slides]
Corpus-based Set Expansion with Lexical Features and Distributed Representations.
Puxuan Yu, Zhiqi Huang, Negin Rahimi, James Allan.
SIGIR 2019, short paper.
[Text: CIIR / ACM][Github]
Hide-n-Seek: An Intent-aware Privacy Protection Plugin for Personalized Web Search.
Puxuan Yu, Wasi Uddin Ahmad, Hongning Wang.
SIGIR 2018, demo paper.
[Text: ACM]
Work Experience
- AI Platform, Dataminr, Research Intern, Oct. 2023 - Jan. 2024
- Web Information Group, Amazon Alexa AI, Applied Scientist Intern, May - Aug. 2022
- Cognitive Computing Lab, Baidu Research USA, Research Intern, May - Dec. 2020
- HCDM Group, University of Virginia, Research Intern, Jul. - Sep. 2017
Teaching Assistantships
- Search Engines (COMPSCI-446 @ UMass), Spring 2019
- Computer Literacy (COMPSCI-105 @ UMass), Fall 2018
Miscellaneous
- 🏠 I spent the first 22 years of my life in my hometown, Wuhan, China, until I joined UMass.
- 🇬🇧 I currently live in Lincoln, UK, and work on my research remotely.
- 🏸 I am a huge fan of the Boston Celtics 🏀 and Tottenham Hotspur ⚽.
Contact
pxyu[AT]cs[DOT]umass[DOT]edu