A new multilingual study from researchers in Germany and partner institutions reveals that the language a text prompt is written in can shift the gender presentation of generated faces, and these shifts are far from random. The underlying systems amplify familiar stereotypes about occupations and personality traits, turning linguistic assumptions into visual results. The investigation shows that however advanced modern text-to-image generators have become, they still reflect, and sometimes intensify, cultural patterns about gender roles.
Testing Nine Languages and Thousands of Prompts
The benchmark is known as the Multilingual Assessment of Gender Bias in Image Generation. It evaluates occupations and descriptive adjectives with carefully controlled phrasing. The set includes languages that mark gender directly in nouns, such as German, Spanish, French, Italian, and Arabic. It also includes English and Japanese, which primarily carry gender through pronouns rather than the form of the occupation word. Korean and Chinese are present as well, representing languages without grammatical gender in nouns or pronouns. This wide linguistic range allowed the researchers to investigate whether the same job title or description leads to similar images when prompts are identical in content.
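The language coverage described above can be grouped by how each language marks gender. The grouping below is taken from the article itself; the structure is purely illustrative, not the benchmark's own code.

```python
# Languages in the benchmark, grouped by gender-marking strategy
# (grouping as described in the article; illustrative only).
LANGUAGES = {
    "gendered_nouns": ["German", "Spanish", "French", "Italian", "Arabic"],
    "pronoun_gender": ["English", "Japanese"],
    "no_gender":      ["Korean", "Chinese"],
}

total = sum(len(group) for group in LANGUAGES.values())
print(total)  # 9 languages in total
```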
Prompt Structure Can Influence Visual Interpretation
The benchmark varies how each occupation is phrased. One type of prompt refers to an occupation using the default noun, which traditionally acts as a generic masculine term in languages that rely on grammatical gender.
Another type avoids the occupation noun entirely by replacing it with a description of the work that a person performs.
Feminine versions of job titles appear in languages where they exist. German even has a gender star notation (for example, Lehrer*innen) that tries to make references more inclusive by inserting an asterisk into the written form of a word. These variants were introduced to test whether changing prompt structure reduces bias, or whether the models keep showing strong patterns even when the language attempts to remove gender cues.
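The prompt styles described above can be sketched for a single German occupation. The template wording, the occupation forms, and the function name below are assumptions for illustration, not the benchmark's actual prompts.

```python
# Illustrative sketch: the four German prompt styles for one occupation.
# Templates and word forms are hypothetical, not taken from the benchmark.
def german_prompt_variants(masc, fem, description):
    """Return the four prompt styles described in the article."""
    return {
        "generic_masculine": f"Ein Foto von einem {masc}",            # default (generic masculine) noun
        "feminine_form":     f"Ein Foto von einer {fem}",             # explicit feminine job title
        "gender_star":       f"Ein Foto von einem*r {masc}*in",       # gender star notation
        "indirect":          f"Ein Foto von einer Person, die {description}",  # describes the work instead
    }

variants = german_prompt_variants("Arzt", "Ärztin", "Patienten behandelt")
for style, prompt in variants.items():
    print(f"{style}: {prompt}")
```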
A Large-Scale Image Evaluation Process
Researchers measured how far the results deviated from an equal split between male and female presentations. A measure of absolute deviation from balance indicates how strongly stereotypes emerge when the model interprets a role such as accountant, nurse, firefighter, or software engineer.
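A minimal sketch of such a balance score, assuming the metric is the absolute difference between the observed share of female-presenting faces and a 50/50 split (the study's exact formula may differ):

```python
# Balance-deviation sketch: 0.0 means a perfectly balanced output,
# 0.5 means every generated face was classified as one gender.
# This is an assumed formulation, not the paper's verbatim metric.
def deviation_from_balance(n_female, n_male):
    total = n_female + n_male
    if total == 0:
        raise ValueError("no classified faces")
    return abs(n_female / total - 0.5)

# e.g. 92 of 100 generated "nurse" images classified as female-presenting:
print(deviation_from_balance(92, 8))  # ≈ 0.42, a strong skew
```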
Bias Patterns Show Up Consistently Across Models
These tendencies appear repeatedly across the different platforms tested, which suggests the bias stems from shared exposure to large datasets shaped by real-world social structures. The study found that some languages produced noticeably stronger stereotypes than others, yet the degree of grammatical gender in a language did not reliably predict the degree of bias. Shifting from one European language to another could change the portrayal significantly, even when both languages handle gender in similar ways.
Gender Neutral Phrasing Reduces Bias but Creates New Challenges
Language Choices That Try to Ensure Fairness May Backfire
More Attention Needed for Global Fairness
Bias Remains a Persistent Issue in Image Generation
Notes: This post was edited/created using GenAI tools.
by Irfan Ahmad via Digital Information World