I'm a master's student in Computer Science at Humboldt University of Berlin, currently awaiting the results of my thesis titled "Cross-lingual Transfer of Pre-Trained Language Models to Vietnamese".
For the past six years, I have been working as a student assistant in the Speech and Language Technology Group at the German Research Center for Artificial Intelligence (DFKI), focusing on Natural Language Processing (NLP), especially Information Extraction. During my time at DFKI, I wrote my bachelor's thesis on weak supervision for event extraction. Most recently, I have been working on biomedical relation extraction with Large Language Models (LLMs) using the LangChain framework.
I recently submitted my Master’s thesis titled "Cross-lingual Transfer of Pre-Trained Language Models to Vietnamese".
In this work, I explored cross-lingual transfer techniques for adapting pre-trained language models to Vietnamese, focusing on tokenizer replacement and efficient strategies for initializing the new token embeddings.
👉 Check out the code and experiments here
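To give a rough idea of what embedding initialization after a tokenizer swap can look like, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the method from the thesis: tokens shared by both vocabularies keep their source embedding, new tokens get the mean of the source sub-token embeddings they decompose into, and everything else falls back to the mean of all source embeddings. The names `init_embeddings` and `toy_tokenize`, and the toy vectors, are hypothetical.

```python
from statistics import fmean


def init_embeddings(src_emb, new_vocab, src_tokenize):
    """Build an embedding table for new_vocab from source embeddings.

    src_emb:      dict mapping source token -> list of floats
    new_vocab:    iterable of tokens from the new (target) tokenizer
    src_tokenize: callable that splits a target token into source tokens
    """
    # Fallback vector: dimension-wise mean over all source embeddings.
    mean_vec = [fmean(col) for col in zip(*src_emb.values())]

    new_emb = {}
    for tok in new_vocab:
        if tok in src_emb:
            # Token exists in both vocabularies: copy its embedding.
            new_emb[tok] = list(src_emb[tok])
            continue
        # Otherwise average the embeddings of its known source sub-tokens.
        pieces = [p for p in src_tokenize(tok) if p in src_emb]
        if pieces:
            new_emb[tok] = [fmean(col) for col in zip(*(src_emb[p] for p in pieces))]
        else:
            new_emb[tok] = list(mean_vec)
    return new_emb


# Toy data (hypothetical): a 2-dimensional source embedding table.
src_emb = {"vi": [1.0, 0.0], "et": [0.0, 1.0], "a": [2.0, 2.0]}


def toy_tokenize(token):
    # Stand-in for the source tokenizer: only knows how to split "viet".
    return {"viet": ["vi", "et"]}.get(token, [])


new_emb = init_embeddings(src_emb, ["a", "viet", "xyz"], toy_tokenize)
```

In a real setup the vectors would come from the pre-trained model's embedding matrix and the split from the source tokenizer; the sketch only shows the copy / average / fallback logic.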
For my Bachelor’s thesis titled "Investigating Weak Supervision for the Extraction of Mobility Relations and Events in German Text",
I explored weak supervision techniques for event extraction. In particular, I worked with Snorkel, a framework for programmatically generating training data through labeling functions based on heuristics.
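
The idea behind labeling functions can be sketched in a few lines of plain Python. This is a simplified illustration that does not use Snorkel's API: the keyword heuristics, label names, and example sentences are hypothetical, and a simple majority vote stands in for the learned label model that Snorkel uses to combine labeling function outputs.

```python
from collections import Counter

# Label scheme (hypothetical): each labeling function may also abstain.
ABSTAIN, NO_EVENT, EVENT = -1, 0, 1


def lf_delay_keyword(text):
    # Heuristic: mentions of a delay hint at a mobility event.
    return EVENT if "delay" in text.lower() else ABSTAIN


def lf_traffic_jam(text):
    # Heuristic: "traffic jam" hints at a mobility event.
    return EVENT if "traffic jam" in text.lower() else ABSTAIN


def lf_weather_report(text):
    # Heuristic: weather talk is usually not a mobility event.
    return NO_EVENT if "sunny" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_delay_keyword, lf_traffic_jam, lf_weather_report]


def weak_label(text):
    """Combine labeling function votes by majority; abstain on ties or no votes."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


texts = [
    "Traffic jam on the A100 causes a delay of 20 minutes.",
    "Sunny weather expected in Berlin.",
    "The museum opens at ten.",
]
labels = [weak_label(t) for t in texts]
```

The resulting noisy labels can then be used to train a downstream model, with abstained examples filtered out.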
This work was conducted at the Speech and Language Technology Lab, where I contributed to ongoing research efforts and related open-source projects. These include:
- 🔧 eventx: Implementation of joint classification of events and arguments
- 🔍 wsee: Codebase developed as part of my thesis to investigate weak supervision for extracting mobility-related events and relations
- 📊 MobIE Dataset: A dataset of mobility-related named entity and n-ary relation annotations
- 📄 Published paper @ KONVENS 2021: Hennig, L., Truong, P. T., & Gabryszak, A. (2021). MobIE: A German Dataset for Named Entity Recognition, Entity Linking and Relation Extraction in the Mobility Domain. In Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021) (pp. 223–227). Düsseldorf, Germany.