Aliyah R. Hsu


Berkeley Way West, 8th floor

Pronouns: she/her

I’m a 5th-year PhD student in UC Berkeley’s EECS department and BAIR, researching natural language processing (NLP), interpretability, and AI-assisted clinical decision-making. I’m broadly interested in language model applications and in building trust and safety into large language models (LLMs) for high-stakes domains.

I am fortunate to have been awarded the UC Berkeley Chancellor’s Fellowship and the EECS Excellence Award, and to be advised by Bin Yu.

I received double Bachelor’s degrees in electrical engineering and economics from National Taiwan University (NTU). During my time at NTU, I had the pleasure of working with Lin-Shan Lee, Hung-Yi Lee, and Yu-Chiang Frank Wang on research in dialogue response generation and computer vision.

I am on the industry job market during the 2024–2025 academic year. I’m happy to connect if you think I might be a good fit for your team.

You can find my CV here, which includes a non-exhaustive list of people from many institutions (UCSF, MIT, UT Austin, Microsoft Research, Salesforce) who took a leap of faith to support, teach, and guide me in research.

news

Mar 5, 2025 Our paper Enhancing CBMs Through Binary Distillation with Applications to Test-time Intervention, which decomposes Concept Bottleneck Model (CBM) predictions into interpretable binary-concept-interaction attributions to guide adaptive test-time intervention, was accepted to the ICLR 2025 BuildingTrust Workshop :sparkles:
Jan 22, 2025 Our paper Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition, which proposes a novel circuit discovery algorithm for more efficient mechanistic interpretability of large language models, was accepted to ICLR 2025 :sparkles:
May 20, 2024 Returned to the Salesforce Einstein Language Intelligence Data Science team as an Applied Scientist Intern! :smiley:
May 7, 2024 Will be attending ICLR 2024 in Vienna, Austria!
Jan 15, 2024 Our paper Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making, which investigates the effect of pre-training data distributions on transformer feature spaces, was accepted to ICLR 2024 :sparkles: