Hi! I'm Xiaoyan Bai

I am currently a senior majoring in Computer Science in the Department of Computer Science and Engineering at the University of Michigan - Ann Arbor. I am also pursuing a minor in Art & Design at the Stamps School of Art & Design. In 2022-2023, I took part in Explore CS Research, a program sponsored by Google in collaboration with Girls Encoded. I am currently a member of the Language and Information Technologies (LIT) lab, where I explore NLP and AI, and I also work on NLP inference-time efficiency in Prof. Atul Prakash's group. I am interested in research in Natural Language Processing as well as Game Design & Development. Check out the games I made!

Research Interest: As machine learning continues to advance, so does the potential for the digital divide and inequality to widen. These issues motivate my research on building responsible, efficient, and more accessible tools for social good. I believe this can be done by building interpretable models and more cost-efficient methods.

Publications

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity (Pre-print, 2024)
Andrew Lee, Xiaoyan Bai, Itamar Pres, Martin Wattenberg, Jonathan K. Kummerfeld, Rada Mihalcea

Learn To be Efficient: Build Structured Sparsity in Large Language Models (Pre-print, 2024)
Haizhong Zheng, Xiaoyan Bai, Beidi Chen, Fan Lai, Atul Prakash

CV & Experience

I am currently enrolled in a dual degree program, studying Electrical and Computer Engineering at Shanghai Jiao Tong University and Computer Science at the University of Michigan - Ann Arbor.

  • During the winter of 2021, I interned at Emogent, where I helped develop Irene, a human-machine interactive product regarded as a hyper-realistic artificial intelligence.
  • During the summer of 2022, I worked as a research assistant in the Li WeiDong Laboratory with Binglei Zhao as my advisor. We used computer visualization tools to study how depression affects people's imagery and visual rumination.
  • Since Fall 2022, I have been an undergraduate research assistant in the LIT lab, working on NLP research for social good.
  • Since May 2023, I have also been an undergraduate research assistant working on NLP inference-time efficiency in Prof. Atul Prakash's group.
  • In the summer of 2023, I worked as a teaching assistant for Serious Game and AI at the MIT Beaver Works Summer Institute, teaching high school students machine learning basics and game design and development so that they could use games and AI to simulate social problems.
  • In Fall 2023, I am working as a grader for EECS 376: Foundations of Computer Science.

    Poetry Generation Model

  • Pre-processed poetry data and fine-tuned a pretrained language model to generate poems
  • Explored repetitive behavior in text generation tasks
  • Skills: NLP, Python, Hugging Face, PyTorch

    How Do Language Models Solve Math Problems

  • Analyzed what contributes to the GPT-2 model's decision-making process using LIME and residual stream analysis
  • Led the team in a class research project on understanding LLMs
  • Won the prize for "Most Interesting Research Problems" in the class
  • Skills: Python, PyTorch

    Search Engine

  • Implemented a basic Google-style search engine for looking up content by query
  • Developed both the client-side front end and the back-end search algorithm using MapReduce and a pipeline design
  • Skills: Java, Python, HTML, React