While research has attempted to identify and mitigate such biases, most efforts have been concentrated around English, lagging behind the rapid advancement of LLMs in multilingual settings. In this paper, we introduce SHADES, a new multilingual dataset designed to help address this issue by examining culturally specific stereotypes that may be learned by LLMs. The dataset includes stereotypes from 20 geopolitical regions and languages, spanning multiple identity categories subject to discrimination worldwide.
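As a rough illustration of how a stereotype resource like this can be used to probe an LLM (this is not a procedure prescribed by SHADES), one can compare a model's likelihood of a stereotypical statement against a minimally edited contrast sentence. The checkpoint and example sentences below are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; any causal LM checkpoint would serve the same illustrative purpose.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

@torch.no_grad()
def mean_log_prob(sentence: str) -> float:
    """Average per-token log-probability of a sentence under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood over the predicted tokens.
    return -out.loss.item()

# Illustrative pair: a stereotypical claim vs. a minimally edited contrast sentence.
stereotype = "People from group X are bad drivers."
contrast = "People from group X are careful drivers."
print(mean_log_prob(stereotype) > mean_log_prob(contrast))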
We present a comprehensive investigation into training strategies for massively multilingual LVLMs. First, we conduct a series of multi-stage experiments spanning 13 downstream vision-language tasks and 43 languages, systematically examining (1) how many training languages can be included without degrading English performance, and the optimal language distributions of (2) pre-training and (3) instruction-tuning data. Further, we (4) investigate how to improve multilingual text-in-image understanding, and introduce a new benchmark for the task.
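One standard way to parameterize the language distribution of multilingual training data is temperature-based sampling over per-language corpus sizes. The sketch below shows that general technique only; it is not the distribution search described above, and the corpus sizes and temperature value are made up.

```python
import numpy as np

def language_sampling_probs(corpus_sizes: dict, tau: float = 0.5) -> dict:
    """Temperature-scaled sampling distribution over languages.

    corpus_sizes: mapping from language code to number of training examples.
    tau < 1 upsamples low-resource languages relative to their raw share.
    """
    langs = list(corpus_sizes)
    sizes = np.array([corpus_sizes[l] for l in langs], dtype=float)
    probs = sizes / sizes.sum()   # raw proportions
    probs = probs ** tau          # temperature scaling
    probs = probs / probs.sum()   # renormalize
    return dict(zip(langs, probs))

# Illustrative English-heavy corpus with two smaller languages.
print(language_sampling_probs({"en": 1_000_000, "de": 100_000, "sw": 10_000}, tau=0.5))
```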
We uncover a surprising multilingual bias occurring in a popular class of multimodal vision-language models (VLMs). Including an image in the query to a LLaVA-style VLM significantly increases the likelihood of the model returning an English response, regardless of the language of the query.
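One way to observe this effect in practice (not necessarily the paper's exact protocol) is to send the same non-English prompt to an off-the-shelf LLaVA-style checkpoint with and without an image and detect the language of each response. The sketch below assumes the Hugging Face llava-hf/llava-1.5-7b-hf checkpoint, the langdetect package, and a local example.jpg; all of these are illustrative choices.

```python
import torch
from PIL import Image
from langdetect import detect
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def answer(prompt_text, image=None):
    """Generate a response to the same prompt, with or without an image in the query."""
    if image is not None:
        prompt = f"USER: <image>\n{prompt_text} ASSISTANT:"
        inputs = processor(text=prompt, images=image, return_tensors="pt").to(
            model.device, torch.float16
        )
    else:
        prompt = f"USER: {prompt_text} ASSISTANT:"
        inputs = processor(text=prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return processor.decode(new_tokens, skip_special_tokens=True)

# A German query: does adding an image flip the response language to English?
query = "Beschreibe bitte, was auf dem Bild zu sehen ist."  # "Please describe what is in the image."
img = Image.open("example.jpg")
print(detect(answer(query)), detect(answer(query, image=img)))
```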
We investigate the basic multilingual capabilities of state-of-the-art open LLMs beyond their intended use. Specifically, we introduce a new silver-standard benchmark, which we use to assess the models' multilingual language fidelity and question-answering accuracy.
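As an illustration of what such an evaluation might compute (the benchmark's own metric definitions may differ), the sketch below scores language fidelity as the share of responses whose detected language matches the prompt language, and QA accuracy as a simple substring match against a gold answer. The langdetect dependency and the record format are assumptions.

```python
from langdetect import detect

def evaluate(records):
    """records: list of dicts with keys
    'lang' (prompt language code), 'response' (model output), 'gold' (reference answer)."""
    fidelity_hits, accuracy_hits = 0, 0
    for r in records:
        try:
            response_lang = detect(r["response"])
        except Exception:
            response_lang = "unknown"  # empty or undetectable responses count as misses
        fidelity_hits += int(response_lang == r["lang"])
        accuracy_hits += int(r["gold"].strip().lower() in r["response"].strip().lower())
    n = len(records)
    return {"language_fidelity": fidelity_hits / n, "qa_accuracy": accuracy_hits / n}

# Example with two illustrative records.
print(evaluate([
    {"lang": "de", "response": "Die Hauptstadt von Frankreich ist Paris.", "gold": "Paris"},
    {"lang": "de", "response": "The capital of France is Paris.", "gold": "Paris"},
]))
```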