Today, Thomas Metz made me aware of a dataset about ministers in Eastern German federal states (Bundesländer) by Sebastian Jäckle. The dataset includes the variable “duration of incumbency” in days for 291 ministers between 1990 and 2011.

I was curious to look at the distribution of duration, with the intention to be brave as a physicist and infer a simple stochastic model which reproduces that distribution. I copied the duration data into a MATLAB vector `duration`, made histograms, fitted different distributions, and ran Kolmogorov-Smirnov (KS) tests. As `duration` is a discrete random variable (days starting from inauguration), distributions living on the nonnegative integers are the natural candidates. The classical one-parameter distributions, Poisson and geometric, failed to deliver fitting distributions, but the negative binomial (NB) did surprisingly well.

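The best fit yielded parameters $r \approx 1.79$ and $p \approx 0.0011$. The Kolmogorov-Smirnov test did not reject the hypothesis that the duration data came from the distribution with these parameters (p-value 0.32), but it does reject under reasonably small changes of the two parameters. Thus, it is reasonable to assume that `duration` follows a negative binomial distribution $\mathrm{NB}(r, p)$ with these parameters.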
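How strongly the test reacts to the parameters can be explored by rescaling the fitted values and rerunning it. The following is only a sketch under assumptions: it uses synthetic stand-in data drawn from a negative binomial with the rounded values 1.79 and 0.0011 quoted in this post (not the real dataset), and the names `factors` and `cdfTable` as well as the chosen rescaling factors are hypothetical. It needs the Statistics Toolbox.

```matlab
% Synthetic stand-in data (assumption): replace with the real duration vector.
rng(1);
duration = nbinrnd(1.79, 0.0011, 291, 1);

par = mle(duration, 'distribution', 'nbin');   % fitted parameters of the NB
x = (0:max(duration))';                        % grid spanning all observations

% Rescale each fitted parameter by a few hypothetical factors and rerun the test.
factors = [0.8 0.9 1 1.1 1.2];
for fr = factors
    for fp = factors
        cdfTable = [x, cdf('nbin', x, fr*par(1), fp*par(2))];
        [h, pval] = kstest(duration, 'CDF', cdfTable);
        fprintf('r x %.1f, p x %.1f: h = %d, p-value = %.3f\n', fr, fp, h, pval);
    end
end
```

Rows with `h = 1` mark parameter combinations that the test rejects at the default 5% level.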

What model does this imply? Let us look at the individual days in the incumbency of a minister. Assume that every day is either a success or a failure, where a failure happens with probability $p$. The negative binomial is the distribution of the number of successful days until $r$ failures have occurred (there is an extension to a non-integer number of failures). Our model is thus that a minister's incumbency ends after a certain number of failures (whatever that means in practice). The best fit suggests that under this model 1.79 failures are allowed during a minister's incumbency and that failures are relatively rare events happening with probability 0.11% every day, i.e. on average the first failure happens at approximately day 900.
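For reference, the model can be written out as follows. This is a sketch in the wording used here, with $X$ the number of successful days, $r$ the number of tolerated failures and $p$ the daily failure probability. The symbols are my choice, and the numbers are the rounded values 1.79 and 0.11% quoted above.

$$P(X = k) = \frac{\Gamma(k + r)}{k!\,\Gamma(r)}\, p^{r} (1 - p)^{k}, \qquad k = 0, 1, 2, \dots$$

$$E[X] = \frac{r\,(1 - p)}{p} \approx \frac{1.79 \cdot 0.9989}{0.0011} \approx 1625 \text{ days}$$

The waiting time until the first failure is geometric with mean $1/p \approx 1/0.0011 \approx 909$ days, which is the "approximately day 900" mentioned above. The mean incumbency of roughly 1625 days corresponds to about four and a half years.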

**Further notes:**

Here is the MATLAB code which delivers the results:

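```matlab
% computation
dist = 'nbin';                                % distribution to fit
bins = 0:500:7500;                            % histogram bin centers (days)
hi = hist(duration, bins);                    % counts per bin
par = mle(duration, 'distribution', dist);    % maximum-likelihood parameters

% plots
clf;
bar(bins, hi/sum(hi)/(bins(2)-bins(1)));      % histogram normalized to a density
hold on;
x = 0:8000;
plot(x, pdf(dist, x, par(1), par(2)), 'r', 'LineWidth', 3);

% Kolmogorov-Smirnov test
[h, p] = kstest(duration, [x' cdf(dist, x, par(1), par(2))'])
```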

You can test other two-parameter distributions by changing `dist`, e.g. to `'logn'` or `'gam'`. To check a one-parameter distribution, you additionally have to remove `,par(2)` from the code. It turns out that the gamma distribution also delivers a fit which is not rejected. This is reasonable because it is sometimes seen as the continuous-valued version of the negative binomial. The Weibull distribution was not rejected either (although with a much lower p-value), which shows that other models might be appropriate as well. As always with statistics and real-world data, I assume that the KS test would reject my theory given a larger dataset, as some deviations from the negative binomial trend would certainly become dominant (e.g. caused by election cycles).
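The comparison across distributions can also be run in one loop instead of editing `dist` by hand. This is a minimal sketch, assuming synthetic stand-in data (drawn from a negative binomial with the rounded values 1.79 and 0.0011, not the real dataset) and the Statistics Toolbox. The cell array `dists` and the filter that drops zero durations are my additions, the latter because the continuous candidates are only defined for positive values.

```matlab
% Synthetic stand-in data (assumption): replace with the real duration vector.
rng(1);
duration = nbinrnd(1.79, 0.0011, 291, 1);
duration = duration(duration > 0);   % gamma, lognormal and Weibull need positive data

x = (0:max(duration))';              % grid spanning all observations
dists = {'nbin', 'gam', 'logn', 'wbl'};

for k = 1:numel(dists)
    % Fit the two parameters by maximum likelihood, then test against the fitted CDF.
    par = mle(duration, 'distribution', dists{k});
    [h, pval] = kstest(duration, 'CDF', [x, cdf(dists{k}, x, par(1), par(2))]);
    fprintf('%-5s h = %d, p-value = %.3f\n', dists{k}, h, pval);
end
```

Since the stand-in data are themselves negative binomial, the output only illustrates the mechanics. The p-values reported in this post come from the real data.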

I hope this “finger exercise” on finding a simple stochastic model that fits is inspiring for political scientists, although every political theory would likely immediately reject it.
