```python
import os

import numpy as np
import scipy.stats as st
import polars as pl

import cmdstanpy
import arviz as az

import iqplot
import bebi103

import bokeh.io
import bokeh.plotting

bokeh.io.output_notebook()
```
In this lesson, we will learn how to use Markov chain Monte Carlo to do parameter estimation. To get the basic idea behind MCMC, imagine for a moment that we can draw samples out of the posterior distribution. This means that the probability of choosing given values of a set of parameters is proportional to the posterior probability of that set of values. If we drew many many such samples, we could reconstruct the posterior from the samples, e.g., by making histograms. That’s a big thing to imagine: that we can draw properly weighted samples. But, it turns out that we can! That is what MCMC allows us to do.
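As a toy illustration of this idea (pure NumPy, not MCMC; the Gamma distribution and its parameters here are arbitrary stand-ins for a posterior we can sample from): summaries of the samples converge to the corresponding summaries of the distribution.

```python
import numpy as np

rng = np.random.default_rng(3252)

# Pretend this Gamma distribution is a posterior we can sample from
a, b = 3.0, 2.0  # shape and rate; arbitrary illustrative values
draws = rng.gamma(a, 1 / b, size=100_000)

# Sample summaries approximate the distribution's summaries
print(draws.mean())  # ≈ a / b = 1.5
print(draws.var())   # ≈ a / b**2 = 0.75
```

With enough samples, a histogram of `draws` would likewise trace out the Gamma PDF; this is exactly what we will do with MCMC samples of a posterior.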
We already discussed some theory behind this seemingly miraculous capability. For this lesson, we will just use the fact that we can do the sampling to learn about posterior distributions in the context of parameter estimation.
28.1 The data set
The data set we will use is synthetic, but will serve to instruct on parameter estimation techniques. The synthetic experiment is as described in Section 27.2. In the experiment, we take a sample of retinal tissue and expose it to a constant light source. We measure the spiking activity of a single retinal ganglion cell (RGC) for one minute, recording \(n\) spikes.
```python
# Load in as dataframe
df = pl.read_csv(os.path.join(data_path, 'rgc_spike_times_1.csv'))

# Spike times in milliseconds for convenience
spike_times = df['spike time (ms)'].to_numpy()

# Interspike intervals
y = np.concatenate(((spike_times[0],), np.diff(spike_times)))

# Make ECDF
bokeh.io.show(
    iqplot.ecdf(y, 'interspike interval (ms)')
)
```
Just glancing at this, we see that we have plenty of interspike intervals and they appear to be Exponentially distributed.
28.2 Statistical model, take 1
We will now formulate a generative model. We will choose an Exponential likelihood, assuming that spike arrivals in these constant conditions are best modeled as a Poisson process. Let \(t_i\) be the time at which spike \(i\) arrives (with \(t_0 = 0\) by definition), and define interspike interval \(i\) as \(y_i = t_i - t_{i-1}\). Then, our likelihood is
\[\begin{align}
y_i \sim \text{Expon}(\beta)\;\forall i.
\end{align}
\]
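To see what this likelihood generates, we can simulate it forward (a sketch; \(\beta = 0.025\) kHz, i.e., 25 Hz, is an arbitrary illustrative rate): draw Exponential interspike intervals and accumulate them into spike times.

```python
import numpy as np

rng = np.random.default_rng(3252)

beta = 0.025  # illustrative rate in kHz (i.e., 25 Hz)
n = 1000

# Exponential interspike intervals in ms; NumPy parametrizes by scale = 1/beta
y = rng.exponential(1 / beta, size=n)

# Spike times are cumulative sums of the interspike intervals
spike_times = np.cumsum(y)

print(y.mean())  # ≈ 1 / beta = 40 ms
```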
We need to specify a prior for the rate parameter \(\beta\). The rate parameter must be positive, so we will choose a distribution defined on positive numbers. We will choose to use a Gamma distribution. I do not know much about spiking, so I will choose a weakly informative prior with 95% of the probability mass lying between a spiking rate of 1 and 1000 Hz. Using the Distribution Explorer’s quantile setting tool, I find that I should have Gamma parameters \(a = 0.65\) and \(b = 2.9\times 10^{-3}\) s (the rate parameter of the Gamma distribution has units inverse to those of \(\beta\)). That is fine if I am going to work in units of seconds, but my data are in units of milliseconds, so I should work in units of kHz to be consistent. By the change of variables formula, I should use \(b = 2.9\) ms. So, my generative model, with the understanding that all units are consistent with the interspike intervals being in units of milliseconds, is

\[\begin{align}
&\beta \sim \text{Gamma}(0.65, 2.9),\\[0.5em]
&y_i \sim \text{Expon}(\beta)\;\forall i.
\end{align}
\]
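We can sanity check this prior with SciPy. Working in units of seconds and Hz, the Gamma rate parameter is \(b = 2.9\times 10^{-3}\), and about 95% of the prior probability mass should fall between 1 and 1000 Hz. (Note that `scipy.stats.gamma` is parametrized by a shape parameter and a *scale* parameter, the reciprocal of the rate.)

```python
import scipy.stats as st

a = 0.65
b = 2.9e-3  # Gamma rate parameter, working in units of Hz for beta

# scipy parametrizes the Gamma distribution by shape and scale = 1/rate
mass = st.gamma.cdf(1000, a, scale=1/b) - st.gamma.cdf(1, a, scale=1/b)

print(mass)  # ≈ 0.95
```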
We code up this model in Stan, computing the interspike intervals from the sorted spike times in the `transformed data` block.

```stan
data {
  int<lower=2> n;
  array[n] real spike_times;
}


transformed data {
  // Parameters for the prior distribution for beta
  real a = 0.65;
  real b = 2.9;

  // Sorted spike times
  array[n] real t = sort_asc(spike_times);

  // Interspike intervals
  array[n] real y;
  y[1] = t[1];
  for (i in 2:n) {
    y[i] = t[i] - t[i-1];
  }
}


parameters {
  real<lower=0> beta_;
}


model {
  beta_ ~ gamma(a, b);
  y ~ exponential(beta_);
}
```
Let’s run the model and see what we get for a posterior!
```python
data = dict(n=len(spike_times), spike_times=spike_times)

with bebi103.stan.disable_logging():
    sm = cmdstanpy.CmdStanModel(stan_file='rgc_spike_times_expon.stan')
    samples = az.from_cmdstanpy(sm.sample(data=data))
```
We can plot our MCMC samples as an ECDF or histogram. The binning bias of a histogram is less of an issue with MCMC samples because we can take arbitrarily many of them. So, let’s just plot a histogram.
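As a minimal sketch of the binning step (pure NumPy; the stand-in draws below play the role of the posterior samples, which in practice come from `samples.posterior['beta_']`):

```python
import numpy as np

rng = np.random.default_rng(3252)

# Stand-in for posterior draws of beta_ (kHz); real draws would come from
# samples.posterior['beta_'].values.ravel()
beta_samples = rng.gamma(250, 1 / 10_000, size=4000)

# Density-normalized histogram; with many draws, the bar heights
# approximate the posterior PDF
heights, edges = np.histogram(beta_samples, bins=50, density=True)

# Sanity check: a density histogram integrates to one
print(np.sum(heights * np.diff(edges)))  # 1.0 (to numerical precision)
```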
We have estimated our rate as about 0.024 kHz, or about 24 Hz. Conveniently, for our choice of prior, we may write the posterior PDF down analytically. You will prove this in an exercise. Given that our prior is Gamma\((a, b)\), the posterior is

\[\begin{align}
\beta \mid y \sim \text{Gamma}\left(a + n, b + \sum_{i=1}^n y_i\right).
\end{align}
\]
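A quick numerical check of this conjugate update (a sketch with simulated data; \(\beta = 0.025\) kHz is an arbitrary “true” rate): with plenty of data, the analytical posterior should concentrate near the rate that generated the data.

```python
import numpy as np

rng = np.random.default_rng(3252)

# Simulate interspike intervals with a known rate (ms and kHz units)
beta_true = 0.025
n = 5000
y = rng.exponential(1 / beta_true, size=n)

# Conjugate update: Gamma(a, b) prior -> Gamma(a + n, b + sum(y)) posterior
a, b = 0.65, 2.9
a_post = a + n
b_post = b + y.sum()

# With lots of data, the posterior mean lands near the true rate
print(a_post / b_post)  # ≈ beta_true
```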
In the next chapter (Chapter 29), we discuss how to concisely report summaries of the posterior. In this case, the posterior is simple, and the above plot works very well. We could alternatively report the median and central 95% credible interval.
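From MCMC draws, the median and central 95% credible interval are just percentiles of the samples (a sketch with stand-in draws; with real output, `beta_samples` would come from the ArviZ object as above):

```python
import numpy as np

rng = np.random.default_rng(3252)

# Stand-in for posterior draws of beta_ (kHz)
beta_samples = rng.gamma(250, 1 / 10_000, size=4000)

# Median and central 95% credible interval from the samples
median = np.percentile(beta_samples, 50)
ci_low, ci_high = np.percentile(beta_samples, [2.5, 97.5])

print(median, ci_low, ci_high)
```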