Molecular Simulation/Umbrella Sampling

Umbrella sampling is a sampling method used in computational physics and chemistry. This sampling can sample the rare states which normal molecular dynamic sampling ignored. Therefore, umbrella sampling can improve free energy calculation when a system is undergoing a systematic change.

Biased molecular dynamics simulations
Normal MD simulations samples system in equilibrium. In an MD simulation of the time series of the C-C-C-C dihedral angle of n-butane(aq), only gauche states and trans states are sampled. Because this simulation is only performed in 2 ns, states with high free energy (e.g. cis state) are less likely to happen. These configurations are ignored and it is impossible to calculate the free energy of these states from this simulation. An artificial bias potential is needed in this case to help the molecule cross the energy barrier. With bias potential, rare states can be effectively sampled. In this case, a harmonic bias potential $$w(\phi)=-\mathcal{V}(\phi)$$ is needed to counteract the dihedral barrier.

$$w(\phi)=-k_\phi(1+\cos(n\phi-\delta))$$ High free energy states were captured by biased simulation. In order to calculate the free energy profile of these states, biased probability distribution has to be converted to an unbiased probability distribution.

Acquire free energy profile from biased simulations
The potential energy $$V'(r)$$ includes the bias potential $$w(s)$$at the reaction coordinate $$s$$ is $$\mathcal{V}'(r)=\mathcal{V}(r)+w(s)$$

The probability distribution of this potential is

$$P'(s)=\frac{\int{\exp\left(\frac{-\mathcal{V}'(r)}{k_B T}\right)\delta(f_\alpha(r_1,...,r_N)-s)d^Nr}}{\int{\exp\left(\frac{-\mathcal{V}'(r)}{k_B T}\right)}d^Nr}$$

The probability distribution of unbiased potential is

$$P(s)=\frac{\int{\exp\left(\frac{-\mathcal{V}(r)}{k_B T}\right)\delta(f_\alpha(r_1,...,r_N)-s)d^Nr}}{\int{\exp\left(\frac{-\mathcal{V}(r)}{k_B T}\right)}d^Nr}$$

From this equation, we can derive,

$$P'(s)=\exp\left(\frac{w(s)}{k_B T}\right) P(s) \left\langle \exp\left(\frac{-w(s)}{k_B T}\right) \right\rangle ^{-1}$$

Free energy profile can be calculated from probability distribution by,

$$A(s)=k_B T \ln P(s)$$

Using this relation, the PMF of the biased simulation can be converted to unbiased PMF by:

$$A(s)=A'(s)-w(s)-k_B T \ln \left\langle \exp\left(\frac{-w(s)}{k_B T}\right) \right\rangle$$

$$-k_B T \ln \left \langle \exp\left(\frac{-w(s)}{k_B T}\right) \right\rangle$$ term is denoted as $$F_i$$. It is generally a constant and in some cases does not affect the relative energy and no needed to calculate. It can be calculated by :

$$\exp\left(\frac{-F_i}{k_B T}\right)=\int{P(s)\exp\left(\frac{-w(s)}{k_B T}\right)} ds=\int{\exp\left(\frac{-w(s)-A(s)}{k_B T}\right)}ds$$

This method provides free energy profile of all possible states. In umbrella sampling of n-butane(aq), the chosen bias potential covered all reaction coordinates. General cases are more complex, which leads to a more complex determination of bias potential.

Choice of Bias Potential
The previous section discussed the biased molecular dynamic simulation of n-butane(aq). The reaction coordinate is one-dimensional and periodic, and the bias potential was chosen to be the negative the dihedral potential of n-butane. The optimum bias potential is the opposite of the free energy $$A(s)$$. However, $$A(s)$$ is unknown for most cases. For general cases, the bias potential needs to be adjusted along the reaction coordinate. Thus, a harmonic bias potential restrained on a reference point $$s_{0,i}$$ with respect to a window $$i$$ on the reaction coordinate is introduced :

$$w_i(s)=\frac{1}{2} k_i(s-s_{0,i})^2$$

Therefore, a full umbrella sampling can be obtained by a series of biased MD simulation on different reference points on the reaction coordinate.

Calculation of the Potential of Mean Force from Umbrella Sampling Data
The Weighted Histogram Analysis Method (WHAM) transferred a series of biased distribution functions to a single distribution function. The value of $$F_i$$ needs to be estimated to give the correct value of $$A(s)$$: $$A(s)=A'(s)-w(s)+F_i$$

The true distribution P(s) is the weight average of each step :

$$P(s)=\sum_{i=1}^{window}p_i(s)P_i(s)$$

And $$p_i=N_i \exp\left(\frac{-w(s)+F_i}{k_B T}\right)$$, where $$N_i$$ is the total number of steps sampled for window $$i$$.

Combined with $$\exp\left(\frac{-F_i}{k_B T}\right)=\int{P(s) \exp\left(\frac{-w(s)}{k_B T}\right)}ds=\int{\exp\left(\frac{-w(s)-A(s)}{k_B T}\right)}ds$$, both $$P(s)$$ and $$F_i$$ can be obtained.

The other way to analyze umbrella sampling is Umbrella Integration, see.