Aliyah R. Hsu
Berkeley Way West, 8th floor
Pronouns: she/her
I’m a 5th-year PhD student at UC Berkeley’s EECS department and BAIR, researching natural language processing (NLP), interpretability, and AI-assisted clinical decision-making. I’m generally interested in language model applications and in building trust and safety in large language models (LLMs) in high-stakes domains.
I am fortunate to be advised by Bin Yu and to have been awarded the UC Berkeley Chancellor’s Fellowship and the EECS Excellence Award.
I received double Bachelor’s degrees in electrical engineering and economics from National Taiwan University (NTU). During my time at NTU, I had the pleasure of working with Lin-Shan Lee, Hung-Yi Lee, and Yu-Chiang Frank Wang on research focusing on dialogue response generation and computer vision.
I am on the industry job market during the 2024-2025 academic year. Happy to connect if you think I might be a good fit for your team.
You can find my CV here, which includes a non-exhaustive list of people from many institutions (UCSF, MIT, UT Austin, Microsoft Research, Salesforce) who took a leap of faith to support, teach, and guide me in research.
news
Mar 5, 2025 | Our paper Enhancing CBMs Through Binary Distillation with Applications to Test-time Intervention, which decomposes Concept Bottleneck Model (CBM) predictions into interpretable binary-concept-interaction attributions to guide adaptive test-time intervention, was accepted to the ICLR 2025 BuildingTrust Workshop!
Jan 22, 2025 | Our paper Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition, which proposes a novel circuit discovery algorithm for more efficient mechanistic interpretability in large language models, was accepted to ICLR 2025!
May 20, 2024 | Returned to the Salesforce Einstein Language Intelligence Data Science Team as an Applied Scientist Intern!
May 7, 2024 | Will be attending ICLR 2024 in Vienna, Austria!
Jan 15, 2024 | Our paper Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making, which investigates the effect of pre-training data distributions on transformer feature spaces, was accepted to ICLR 2024!