Research

My research focuses on trustworthy NLP, AI safety, LLM interpretability, and representation analysis.

I am particularly interested in understanding and mitigating subtle biases in language and multimodal models, and in improving their reliability and generalization in real-world settings.

My recent work spans embedding-level bias mitigation, bias evaluation in vision-language models, and post-training methods for preserving generalization in vision-language-action systems.

Current Research

Previous Research

Visiting / Industry Research Experience