Hi, or "Annyeonghaseyo!" I am an incoming Ph.D. student in Computer Science at the University of Cambridge, advised by Prof. Andreas Vlachos. Prior to this, I received an M.S. in Computer Science & Engineering from Korea University as a member of DMIS Lab, advised by Prof. Jaewoo Kang, and a B.S. in Linguistics and Computer Science from the same institution. I worked as a research intern at Aigen Science Inc., Nara-Information, and Song Lab (hosted by Prof. Sanghoun Song).
My research covers the broad field of natural language processing, with a primary focus on driving Post-Training innovations through Mechanistic Interpretability. I trace the internal computations of LLMs to design interpretable alignment techniques for Safety, Knowledge Editing, and Reasoning. Moving forward, I envision devising fine-grained post-training methods that resolve localized breakdowns without robustness trade-offs. Beyond general models, I also apply RAG, RL techniques, and these alignment insights to Biomedical and Molecular LMs.
You can find my CV here. My Gates Cambridge Scholar biography is here.
[jɛin bɑk]
[Under Review] Adaptive Latent Block Insertion via Online Reward Optimization
Yewon Cho, Hyeongsoon Hwang, Yein Park, Hyeon Hwang, Jaewoo Kang
[Under Review] OpenThoughts-Agent: Data Recipes for Agentic Models
Negin Raoof, Richard Zhuang, Marianna Nezhurina, Etash Kumar Guha, Atula Tejaswi, Ryan Marten, Charlie F. Ruan, Tyler Griggs, Alexander Glenn Shaw, Hritik Bansal, E. Kelly Buchanan, Artem Gazizov, Reinhard Heckel, Chinmay Hegde, Sankalp Jajee, Daanish Khazi, Emmanouil Koukoumidis, Xiangyi Li, Hange Liu, Shlok Natarajan, Harsh Raj, Nicholas Roberts, Ethan Shen, Nishad Singhi, Michael Siu, Ashima Suvarna, Hanwen Xing, Patrick Yubeaton, Robert Zhang, Leon Liangyu Chen, Xiaokun Chen, Steven Dillmann, Saadia Gabriel, Xunyi Jiang, Anurag Kashyap, Boxuan Li, Yein Park, Minh Pham, Sujay Sanghavi, Lin Shi, Yixin Wang, Zhiwei Xu, Erica Zhang, Siyan Zhao, Wanjia Zhao, Jenia Jitsev, Alex Dimakis, Benjamin Feuer, Ludwig Schmidt
[Under Review] MolDeTox: Evaluating Language Model's Stepwise Fragment Editing for Molecular Detoxification
Jueon Park, WonJune Jang, Jiwoo Lee, Yein Park, Jaewoo Kang
[Under Review] Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training
Yein Park, Minbyul Jeong, Jaewoo Kang
[ICML 2026 SD4H] Distilling Expert-level Planning for Real-world Diabetes Prescribing
Yein Park†, Sihyeon Park†, Taeyun Roh, Mujeen Sung, Jaehoon Yun, Sooyeon Lee, Sanghoon Lee, Jaewoo Kang
[ACL 2026 Findings] ToxReason: A Benchmark for Mechanistic Chemical Toxicity Reasoning via Adverse Outcome Pathway
Jueon Park, WonJune Jang, Chanhwi Kim, Yein Park, Jaewoo Kang
[ICLR 2026] ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack
Yein Park, Jungwoo Park, Jaewoo Kang
[IEEE BIBM 2025] CoTox: Chain-of-Thought-Based Molecular Toxicity Reasoning and Prediction
Jueon Park, Yein Park, Minju Song, Soyon Park, Donghyeon Lee, Seungheun Baek, Jaewoo Kang
[🥇1st Rank] DMIS Lab at MedHopQA-2025: Ensemble Multi-Retrieval Methodologies with Reasoning Language Model Decision
Jongmyung Jung†, Hyeongsoon Hwang†, Yein Park†, Minju Song, Jaehoon Yoon, Hyeon Hwang, Sanghoon Lee, Jiwoong Sohn, Jaewoo Kang
[EMNLP 2025 Findings] Assessing LLM Reasoning Steps via Principal Knowledge Grounding
Hyeon Hwang, Yewon Cho, Chanwoong Yoon, Yein Park, Minju Song, Kyungjae Lee, Gangwoo Kim, Jaewoo Kang
[ACL 2025] Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information
Yein Park, Chanwoong Yoon, Jungwoo Park, Minbyul Jeong, Jaewoo Kang
[ICLR 2025] ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
Yein Park, Chanwoong Yoon, Jungwoo Park, Donghyeon Lee, Minbyul Jeong, Jaewoo Kang
[NAACL 2025] Rationale-Guided Retrieval Augmented Generation for Medical Question Answering
Jiwoong Sohn, Yein Park, Chanwoong Yoon, Sihyeon Park, Hyeon Hwang, Mujeen Sung, Hyunjae Kim, Jaewoo Kang
(advisor: Prof. Andreas Vlachos)
(supervisor: Prof. Ludwig Schmidt at Stanford University)
Data Mining & Information Systems Lab (advisor: Prof. Jaewoo Kang)
(advisor: Prof. Seok-Hoon You)
Research Intern Advisor: Sanghoon Lee
- Developed an on-premise clinical multi-agent system for diabetes management to assist medical experts.
- Building small expert LLMs for real-world prescribing scenarios with GDPO-based search RL.
Research Intern Metaverse lab team
- Designed a domain-specific sLLM for local government.
- Developed a tour recommendation system that combines an LLM with RAG.
- Assisted a Korean cultural content project using commercial AI systems.
Researcher Korean AI dataset team (host: Prof. Sanghoun Song)
- Joined a Korean legal/biomedical AI dataset project.
- Worked as a crowdworker for legal data tagging and labelling.
- Safety Training Methods using Transformer Circuits.
- Differentiating Specific Emergent Attention Heads using Transformer Circuits.
- Temporal Alignment through Chronosemanticity.
- Supported research on knowledge distillation methods.
- Utilized Knowledge Hierarchy.
- Developed a Korean local tour recommendation system that combines an LLM with RAG.
- Designed a recommendation pipeline using personalized tour data and efficient ML methods.
- [Available GPTs ver]
- [link to the post]
- Developed an sLLM specialized for the administrative and civil affairs work of local government officers.
- Developed summarization and search agents within the overall LLM pipeline.
- [link to the post]
- The Fortieth Annual Conference on Neural Information Processing Systems
- Participated as a reviewer
- 3rd Conference on Language Modeling
- Participated as a reviewer
- The 64th Annual Meeting of the Association for Computational Linguistics
- Participated as a secondary reviewer
- The 2025 Conference on Empirical Methods in Natural Language Processing
- Participated as a secondary reviewer
- The 63rd Annual Meeting of the Association for Computational Linguistics
- Participated as a secondary reviewer