I am a Senior Data Scientist working at KPMG Lighthouse, KPMG’s Center of Data & Analytics in Germany. I mainly work on Natural Language Processing (NLP) topics, especially information extraction from documents such as invoices, contracts or form sheets. I also like to serve as a translator between non-technical stakeholders and the tech team.
Before that, I completed my PhD in 2019 at the Graduate School of Social Science and Economics at the Chair for Quantitative Methods at the University of Mannheim. During my PhD, I mainly worked on the application of machine learning approaches to political science questions, with a particular focus on predictive modelling and text analysis, e.g. predicting constitutional court decision-making with machine learning and measuring vagueness in judicial texts.
PhD, Chair of Quantitative Methods, 2019
University of Mannheim
Double Degree MA Public Administration, 2015
University of Konstanz & Sciences Po Grenoble
BA in Political Science & Public Law, 2013
University of Mannheim
Responsibilities include:
This paper shows how a misunderstanding of cross-validation can lead to reporting incorrect performance measures. We demonstrate the severity of this problem with an experiment and an application to a recently published paper.
In this dissertation chapter, I seek to develop a measure of vague language in written constitutional court rulings. I use two different methods to approach this: a dictionary approach expanded using word embeddings, and a machine learning classifier (using both traditional NLP classifiers and recent deep learning classifiers).
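To illustrate the general idea of expanding a dictionary with word embeddings, here is a minimal sketch. It is not the method used in the chapter: the toy three-dimensional vectors, the seed word, the similarity threshold, and the function names `expand_dictionary` and `vagueness_score` are all assumptions made up for this example. A real application would use embeddings trained on a large corpus.

```python
import math

# Toy embeddings, invented purely for illustration (assumption, not real data).
EMBEDDINGS = {
    "vague": [0.9, 0.1, 0.0],
    "unclear": [0.8, 0.2, 0.1],
    "ambiguous": [0.85, 0.15, 0.05],
    "court": [0.0, 0.9, 0.3],
    "ruling": [0.1, 0.8, 0.4],
    "precise": [-0.7, 0.1, 0.6],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_dictionary(seeds, embeddings, threshold=0.95):
    """Add every word whose similarity to any seed word reaches the threshold.

    The threshold of 0.95 is an arbitrary choice for this toy example.
    """
    known_seeds = [s for s in seeds if s in embeddings]
    expanded = set(seeds)
    for word, vector in embeddings.items():
        if word in expanded:
            continue
        if any(cosine(vector, embeddings[s]) >= threshold for s in known_seeds):
            expanded.add(word)
    return expanded


def vagueness_score(tokens, dictionary):
    """Share of tokens that appear in the dictionary (hypothetical measure)."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in dictionary) / len(tokens)


vague_terms = expand_dictionary({"vague"}, EMBEDDINGS)
score = vagueness_score(["the", "ruling", "is", "unclear"], vague_terms)
```

With these toy vectors, the seed word "vague" pulls in "unclear" and "ambiguous" but not "court", "ruling" or "precise". In practice, expanded terms would typically be reviewed by hand before being used for measurement.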
Ex ante forecasting approaches using machine learning have become increasingly popular for analyzing and predicting judicial outcomes. Yet, existing work on the prediction of court decision-making has two important limitations. First, it focuses exclusively on the US Supreme Court. This raises concerns about the external validity of previous studies and their implications for courts in different legal traditions. Second, none of the existing studies have explicitly tested the relative contribution of legal context versus political context factors to the forecast of court decisions. This study addresses these two points by ex ante predicting over 2,900 decisions of the German Federal Constitutional Court. I find that similar methodological approaches successfully applied to predict Supreme Court decisions also work for Kelsenian European constitutional court types. My results also show that the legal context of a decision is already a good predictor. However, the predictive performance is significantly improved when information about the political context of a decision is added. These findings therefore support the view that constitutional court decision-making is multifaceted and best characterized by the ensemble of both legal and political factors.
Political scientists pervasively use data that contains sensitive information, e.g. micro-level data about individuals. However, researchers face a dilemma: while data has to be publicly available to make research reproducible, information about individuals needs to be protected. Synthetic copies of original data can address this concern, because ideally they contain all relevant statistical characteristics without disclosing private information. But generating synthetic data that captures (possibly undiscovered) statistical relationships is challenging. Moreover, how to fully control the amount of information disclosed during this process remains an unsolved problem. To that end, differentially private generative adversarial networks (DP-GANs) have been proposed in the (computer science) literature. We experimentally evaluate the trade-off between data utility and privacy protection in a simulation study by looking at evaluation metrics that are important for social scientists, specifically regression coefficients, marginal distributions and correlation structures. Our findings suggest that, on average, higher levels of privacy protection negatively affect the quality of the synthetic data. We hope to encourage interdisciplinary work between computer scientists and social scientists to develop more powerful DP-GANs in the future.
Which candidates does the public want as judges on the Federal Constitutional Court? Constitutional courts need public support. This support also derives from the legitimacy of the elected judges. We argue that political actors use (non-)institutionalized selection criteria to determine the (1) legal and (2) political-ideological orientation of the court. Candidates have characteristics on both dimensions. Using a discrete choice experiment, we identify the characteristics the public prefers. We show what role respondents’ political position plays in the evaluation of candidates, and we compare the “ideal judge” with current judges as well as with candidates around Stephan Harbarth’s election. The results expand our understanding of judicial legitimacy.
We offer a dynamic Bayesian forecasting model for multiparty elections. It combines data from published pre-election public opinion polls with information from fundamentals-based forecasting models. The model takes care of the multiparty nature of the setting and allows making statements about the probability of other quantities of interest, such as the probability of a plurality of votes for a party or the majority for certain coalitions in parliament. We present results from two ex ante forecasts of elections that took place in 2017 and are able to show that the model outperforms fundamentals-based forecasting models in terms of accuracy and the calibration of uncertainty. Provided that historical and current polling data are available, the model can be applied to any multiparty setting.
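As a rough illustration of how such probability statements can be read off a simulation-based forecast, the sketch below counts how often an event occurs across draws of vote shares. It does not reproduce the model itself: the four draws, the party labels A to D, and the function names `prob_plurality` and `prob_majority` are invented for this example. A combined vote share above 50 percent stands in for a parliamentary majority, which ignores seat allocation rules and electoral thresholds.

```python
# Hypothetical draws of party vote shares in percent (assumption, not model output).
# In a real forecast there would be thousands of draws from the posterior.
DRAWS = [
    {"A": 34.0, "B": 30.0, "C": 20.0, "D": 16.0},
    {"A": 31.0, "B": 33.0, "C": 19.0, "D": 17.0},
    {"A": 35.0, "B": 29.0, "C": 22.0, "D": 14.0},
    {"A": 32.0, "B": 31.0, "C": 16.0, "D": 21.0},
]


def prob_plurality(draws, party):
    """Share of draws in which `party` has the largest vote share."""
    hits = sum(1 for shares in draws if max(shares, key=shares.get) == party)
    return hits / len(draws)


def prob_majority(draws, coalition):
    """Share of draws in which the coalition's combined share exceeds 50 percent.

    Simplification: vote share is used as a stand-in for seat share.
    """
    hits = sum(
        1 for shares in draws if sum(shares[p] for p in coalition) > 50.0
    )
    return hits / len(draws)


p_plurality_a = prob_plurality(DRAWS, "A")
p_majority_ac = prob_majority(DRAWS, ["A", "C"])
```

With these made-up draws, party A wins a plurality in three of four draws, and the coalition of A and C exceeds 50 percent in two of four. The same counting logic applies to any other quantity of interest that can be computed from a single draw.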
We present results of an ex ante forecast of party-specific vote shares at the German Federal Election 2017. To that end, we combine data from published trial heat polls with structural information. The model takes care of the multi-party nature of the setting and allows making statements about the probability of certain events, such as the plurality of votes for a party or the majority for coalition options in parliament. The forecasts of our model are continuously being updated on the platform zweitstimme.org. The value of our approach goes beyond the realms of academia: we equip journalists, political pundits, and ordinary citizens with information that can help them make sense of the parties’ latent support and ultimately make better-informed voting decisions.