Lukas Birkenmaier
lukasbirkenmai1.bsky.social
Research Associate and PhD Candidate @gesis_org | Interested in text as data, validity, political communication; 🔗https://lukasbirkenmaier.de/
We also test the predictive ability of our measures by confirming previous findings on the relationship between personality and ideology, and provide extensive further exploratory analysis (e.g., on gender differences) in the Appendix.
December 2, 2025 at 11:35 AM
Next, we conducted "functional tests", inspired by software development practices, to verify whether models can accurately classify intentionally designed statements. Only DeepSeek-V3 and GPT-4o passed all functional tests.
December 2, 2025 at 11:34 AM
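A minimal sketch of what such a functional test could look like in Python. Everything here is a hypothetical illustration, not the paper's actual test suite: the designed statements, the "none" label, the `run_functional_tests` helper, and the keyword-based `toy_classifier` are all assumptions. Only the "agency" and "communion" labels come from the posts.

```python
from typing import Callable, List, Tuple

# Hypothetical designed statements, each paired with the label it should receive.
CASES: List[Tuple[str, str]] = [
    ("I will fight for this reform and get it done.", "agency"),
    ("We must care for one another and stand together.", "communion"),
    ("The committee meets on Tuesday.", "none"),
]


def run_functional_tests(
    classify: Callable[[str], str],
    cases: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (text, expected, predicted) for every case the classifier gets wrong."""
    failures = []
    for text, expected in cases:
        predicted = classify(text)
        if predicted != expected:
            failures.append((text, expected, predicted))
    return failures


def toy_classifier(text: str) -> str:
    """Stand-in keyword classifier (an assumption); swap in any model wrapper."""
    lowered = text.lower()
    if any(word in lowered for word in ("fight", "get it done", "lead")):
        return "agency"
    if any(word in lowered for word in ("care", "together", "support")):
        return "communion"
    return "none"


failures = run_functional_tests(toy_classifier, CASES)
print("passed all functional tests" if not failures else failures)
```

A model "passes" in this sketch only if the list of failures is empty, which mirrors the all-or-nothing framing of unit tests in software development.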
How well do different approaches perform on human-labeled data?
Our findings show that state-of-the-art instruction-tuned models perform best, reaching macro F1 scores of nearly 0.8 for both traits.
December 2, 2025 at 11:31 AM
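For readers unfamiliar with the macro F1 score mentioned above: it is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. A minimal sketch in plain Python follows. The toy labels and predictions are invented for illustration and are not data from the paper, and the "none" label is an assumption.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores: List[float] = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Invented toy example: human labels vs. model predictions.
human = ["agency", "agency", "none", "none"]
model = ["agency", "none", "none", "none"]
print(round(macro_f1(human, model), 3))
```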
We build a full pipeline, from conceptualizing political personality traits to codebook development, human annotation, model comparison, and systematic validation. We also use multiple data sources (interviews, social media, speeches) and multiple model architectures.
December 2, 2025 at 11:28 AM
⏰ I’m happy to share that a major paper from my dissertation was published today (🔗 doi.org/10.1017/pan....) in Political Analysis. In the paper, Clemens Lechner and I conduct an extensive validation study on how to measure politicians’ public personality traits using computational text analysis!
December 2, 2025 at 11:09 AM
For 2), we confirm previous findings that Donald Trump signals more agency-related cues, whereas Kamala Harris signals more communion-related cues during their presidential debate.
April 26, 2025 at 10:33 AM
For 1), we see a clear and consistent negative relationship between the parties’ CHES ("left-right") score and the politicians’ share of communion. This association is particularly pronounced for the economic dimension (left panel), but also present in the cultural dimension (right panel).
April 26, 2025 at 10:31 AM
We observe that DeepSeek-V3 achieved the strongest performance, closely followed by GPT-4o. At the same time, traditional methods (SVM, XLM-RoBERTa) showed weaker results across validation steps, e.g., comparison with human labels (left plot) and functional tests with designed examples (right plot).
April 26, 2025 at 10:27 AM
💻 We then apply a systematic research design that includes
1) extensive operationalisation of two traits (agency and communion) using a validated framework,
2) human labelling and validation, and
3) multiple methods (SVM, XLM-RoBERTa, GPT-4o, Llama-3-8B, DeepSeek-V3) and measurement strategies!
April 26, 2025 at 10:17 AM
We started conceptualizing how "politicians' personality" can be measured by focusing on observable personality cues as the main empirical indicators in language. These cues reflect an amalgamation of politicians' true intrinsic traits shaped by (unobservable) strategic considerations.
April 26, 2025 at 10:08 AM
Had a great CPSS workshop organized by @indiiigosky @GabriellaLapesa @chklamm on NLP and computational social science! I am still in rainy Vienna for the next few days if you want to meet for a coffee or beer in the afternoon ☕️ 🙂
September 15, 2024 at 7:03 AM
Key findings interviews:
💡 Scholars also engage in more conceptual validation steps, which are, however, often not reported. These steps, while not necessarily empirical, nevertheless play a critical role in examining assumptions, limitations, and implications.
November 28, 2023 at 11:01 AM
Key findings review:
💡The total number of validation steps varied greatly across studies (0≤n≤6)
💡 Only 9% of validation steps were properly labelled
💡 Overall focus on external (i.e., output comparison) over internal (i.e., evaluation of measurement model) validation
November 28, 2023 at 11:01 AM
Second day of the RUDE conference on rural-urban divides in Frankfurt! Thanks for the opportunity to present early work with @msaeltzer.bsky.social @wurthmann.bsky.social on politicians' geographical representation using computational text analysis 💻
November 18, 2023 at 9:17 AM
Great evening news: the first paper of my dissertation just got accepted for publication in Communication Methods and Measures 🎉 w/ Claudia Wagner & Clemens Lechner
osf.io/preprints/so...
November 16, 2023 at 9:49 PM
Still some time to go (22.02.24), but save the date and register today if you want to learn about hands-on multi-class and multi-label text classification in Python (+ want to support a good cause :) 👇
bit.ly/3tYhHb4
November 4, 2023 at 1:56 PM
We just launched the (Beta) project website for ValiTex, our framework for validating computational text-based measures: www.valitex.info
👉Scroll the page for info (🔗paper arxiv.org/abs/2307.02863)
👉Download a checklist for different use cases
👉Let us know what you think about it

polisky cssky
October 13, 2023 at 12:42 PM