Large Language Models Discriminate Against Speakers of German Dialects
We examine whether negative societal stereotypes against speakers of German dialects are mirrored by large language models (LLMs).
We present the first evaluation of the ability of language models to jointly reason over time and space. To enable our analysis, we create GeoTemp, a dataset of 320k prompts covering 289 cities in 217 countries and 37 time zones.
We introduce GIMMICK, an extensive multimodal benchmark designed to assess a broad spectrum of cultural knowledge across 144 countries representing six global macro-regions. GIMMICK comprises six tasks built upon three new datasets that span 728 unique cultural events or facets, on which we evaluated 20 LVLMs and 11 LLMs, including five proprietary and 26 open-weight models of all sizes.
While research has attempted to identify and mitigate such biases, most efforts have been concentrated around English, lagging behind the rapid advancement of LLMs in multilingual settings. In this paper, we introduce a new multilingual dataset, SHADES, designed to help address this issue by examining culturally specific stereotypes that may be learned by LLMs. The dataset includes stereotypes from 20 geopolitical regions and languages, spanning multiple identity categories subject to discrimination worldwide.