Parallel Spectral Numerical Methods/The Cubic Nonlinear Schrodinger Equation

= The Cubic Nonlinear Schrödinger Equation =

Background
The cubic nonlinear Schrödinger equation occurs in a variety of areas, including, quantum mechanics, nonlinear optics and surface water waves. A general introduction can be found at http://en.wikipedia.org/wiki/Schrodinger_equation and http://en.wikipedia.org/wiki/Nonlinear_Schrodinger_equation. A mathematical introduction to Schrödinger equations can be found in Sulem and Sulem and Yang. In this section we will introduce the idea of operator splitting and then go on to explain how this can be applied to the nonlinear Schrödinger equation in one, two and three dimensions. In one dimension, one can show that the cubic nonlinear Schrödinger equation is subcritical, and hence one has solutions which exist for all time. In two dimensions, it is $$H^1$$ critical, and so solutions may exhibit blow-up of the $$H^1$$ norm, that is the integral of the square of the gradient of the solution can become infinite in finite time. Finally, in three dimensions, the nonlinear Schrödinger equation is $$L^2$$ supercritical, and so the integral of the square of the solution can also become infinite in finite time. For an introduction to norms and Hilbert spaces, see a textbook on partial differential equations or analysis, such as Evans, Linares and Ponce , Lieb and Loss or Renardy and Rogers. A question of interest is how this blow-up occurs and numerical simulations are often used to understand this; see Sulem and Sulem for examples of this. The cubic nonlinear Schrödinger equation is given by

$$i\psi_t + \Delta \psi \pm| \psi |^2\psi =0,$$

where $$\psi$$ is the wave function and $$\Delta$$ is the Laplacian operator, so in one dimension it is $$\partial_{xx}$$, in two dimensions, $$\partial_{xx}+\partial_{yy}$$ and in three dimensions it is $$\partial_{xx}+\partial_{yy}+\partial_{zz}$$. The $$+$$ corresponds to the focusing cubic nonlinear Schrödinger equation and the $$-$$ corresponds to the defocusing cubic nonlinear Schrödinger equation. This equation has many conserved quantities, including the &ldquo;mass&rdquo;,

$$\int_{\Omega}|\psi|^2 \text{d}^n\mathbf{x} $$

and the &ldquo;energy&rdquo;,

$$\int_{\Omega}\frac{1}{2}|\nabla \psi|^2\mp\frac{1}{4}|\psi|^4\text{d}^n\mathbf{x} $$

where $$n$$ is the dimension and $$\Omega$$ is the domain of the solution. As explained by Klein, these two quantities can provide useful checks on the accuracy of numerically generated solutions.

Splitting
We will consider a numerical method to solve this equation known as splitting. This method occurs in several applications, and is a useful numerical method when the equation can be split into two separate equations, each of which can either be solved exactly, or each part is best solved by a different numerical method. Introductions to splitting can be found in Holden et al., McLachlan and Quispel , Thalhammer , Shen, Tang and Wang , Weideman and Herbst and Yang , and also at http://en.wikipedia.org/wiki/Split-step_method. For those interested in a comparison of time stepping methods for the nonlinear Schrödinger equation, see Klein. To describe the basic idea of the method, we consider an example given in Holden et al., which is the ordinary differential equation,

We can solve this equation relatively simply by separation of variables to find that

$$u(t)=\frac{4}{4+\exp(t)}.$$

Now, an interesting observation is that we can also solve the equations $$u_t=u^2$$ and $$u_t=-u$$ individually. For the first we get that $$u(t)=\frac{u(0)}{1-tu(0)}$$ and for the second we get that $$u(t)=u(0)\exp(-t)$$. The principle behind splitting is to solve these two separate equations alternately for short periods of time. We will describe Strang splitting, although there are other forms of splitting, such as Godunov splitting and also additive splittings. We will not describe these here, but refer you to the previously mentioned references, in particular Holden et al. . To understand how we can solve the differential equation using splitting, consider the linear ordinary differential equation

$$u_t=u+2u$$ with $$u(0)=1$$.

We can first solve $$p_t=p$$ for a time $$\delta t/2$$ and then using $$q(0)=p(\delta t/2)$$, we solve $$q_t=2q$$ also for a time $$\delta t$$ to get $$q(\delta t)$$ and finally solve $$r_t=r$$ for a time $$\delta t/2$$ with initial data $$r(0)=q(\delta t)$$. Thus in this case $$p(\delta t)=\exp(\delta t/2)$$, $$q(\delta t)=p(\delta t/2)\exp(2\delta t)=\exp(5\delta t/2)$$ and $$u(\delta t)\approx r(\delta t/2)=q(\delta t)\exp(\delta t/2)=\exp(3\delta t)$$, which in this case is the exact solution. One can perform a similar splitting for matrix differential equations. Consider solving $$\mathbf u_t = (\mathbf A + \mathbf B)\mathbf u$$, where $$\mathbf A$$ and $$\mathbf B$$ are $$n\times n$$ matrices, the exact solution is $$\mathbf u=\exp\left((\mathbf A + \mathbf B) t\right)\mathbf u(t=0)$$, and an approximate solution produced after one time step of splitting is $$u(\delta t)\approx u(0)\exp(\mathbf A \delta t)\exp(\mathbf B \delta t)$$, which is not in general equal to $$u(t=0)\exp\left((\mathbf A + \mathbf B) \delta t\right)$$ unless the matrices $$\mathbf A$$ and $$\mathbf B$$ commute, and so the error in doing splitting in this case is of the form $$(\mathbf A \mathbf B - \mathbf B \mathbf A)\delta t$$. Listing $$ uses Matlab to demonstrate how to do splitting for eq. $$.

A Matlab program which uses Strang splitting to solve an ODE Code download

A Python program which uses Strang splitting to solve an ODE. Code download

Exercises

 * 1) Modify the Matlab code to calculate the error at time 1 for several different choices of timestep. Numerically verify that Strang splitting is second order accurate.
 * 2) Modify the Matlab code to use Godunov splitting where one solves $$u1_t=u1$$ for a time $$\delta t$$ and then using $$u1(\delta t)$$ as initial data solves $$u2_t=2u2$$ also for a time $$\delta t$$ to get the approximation to $$u(\delta t)$$. Calculate the error at time 1 for several different choices of timestep. Numerically verify that Godunov splitting is first order accurate.

Serial
For the nonlinear Schrödinger equation

we first solve

exactly using the Fourier transform to get

$$\psi(\delta{t}/2,\cdot)$$.

We then solve

with

$$\psi(\delta{t}/2,\cdot)$$

as initial data for a time step of $$\delta t$$. As explained by Klein and Thalhammer, this can be solved exactly in real space because in eq.$$, $$| \psi |^2$$ is a conserved quantity at every point in space and time. To show this, let $$\psi^*$$ denote the complex conjugate of $$\psi$$, so that

$$\frac{\mathrm{d}| \psi |^2}{\mathrm{d}t} = \psi^*\frac{\mathrm{d}\psi}{\mathrm{d}t} +\frac{\mathrm{d}\psi^*}{\mathrm{d}t}\psi = \psi^*\left(\pm i|\psi|^2\psi\right)+\left(\pm i|\psi|^2\psi\right)^*\psi = 0$$.

Another half step using eq. $$ is then computed using the solution produced by solving eq. $$ to obtain the approximate solution at time $$\delta t$$. Example Matlab codes demonstrating splitting follow.

Example Matlab and Python Programs for the Nonlinear Schrödinger Equation
The program in listing $$ computes an approximation to an explicitly known exact solution to the focusing nonlinear Schrödinger equation.

A Matlab program which uses Strang splitting to solve the one dimensional nonlinear Schrödinger equation Code download

A Python program which uses Strang splitting to solve the one-dimensional nonlinear Schrödinger equation. Code download

A Matlab program which uses Strang splitting to solve the two dimensional nonlinear Schrödinger equation Code download

A Python program which uses Strang splitting to solve the two-dimensional nonlinear Schrödinger equation. Code download

A Matlab program which uses Strang splitting to solve the three dimensional nonlinear Schrödinger equation Code download

Example One-Dimensional Fortran Program for the Nonlinear Schrödinger Equation
Before considering parallel programs, we need to understand how to write a Fortran code for the one-dimensional nonlinear Schrödinger equation. Below is an example Fortran program followed by a Matlab plotting script to visualize the results. In compiling the Fortran program a standard Fortran compiler and the FFTW library are required. Since the commands required for this are similar to those in the makefile for the heat equation, we do not include them here.

A Fortran program to solve the 1D nonlinear Schrödinger equation using splitting Code download

A Matlab program which plots a numerical solution to a 1D nonlinear Schrödinger equation generated by listing $$ Code download

Shared Memory Parallel: OpenMP
We recall that OpenMP is a set of compiler directives that can allow one to easily make a Fortran, C or C++ program run on a shared memory machine – that is a computer for which all compute processes can access the same globally addressed memory space. It allows for easy parallelization of serial programs which have already been written in one of the aforementioned languages.

We will demonstrate one form of parallelizm for the two dimensional nonlinear Schrödinger equation in which we will parallelize the loops using OpenMP commands, but will use the threaded FFTW library to parallelize the transforms for us. The example programs are in listing $$. A second method to parallelize the loops and Fast Fourier transforms explicitly using OpenMP commands is outlined in the exercises.

An OpenMP Fortran program to solve the 2D nonlinear Schrödinger equation using splitting and threaded FFTW Code download

An example makefile for compiling the OpenMP program in listing $$ Code download

The example assumes one is using Flux and has loaded environments for the GCC compiler as well as the GCC compiled version of FFTW. To use the Intel compiler to with this code, the OMP stack size needs to be explicitly set to be large enough. If one is using the PGI compilers instead of the GCC compilers, change the flag $$-fopenmp$$ to $$-mp$$ in the makefile.

A Matlab program which plots a numerical solution to a 2D nonlinear Schrödinger equation generated by listing $$ Code download

An example submission script for use on Flux. Change {your_username} appropriately. Code download

Exercises
  Download the example Matlab programs which accompany the pre-print by Klein, Muite and Roidot. Examine how the mass and energy for these Schrödinger like equations are computed. Add code to check conservation of mass and energy to the Matlab programs for the nonlinear Schrödinger equation.  The Gross-Pitaevskii equation is given by $$i\psi_t+|\psi|^2\psi +V(\mathbf x)\psi=0 $$ where we will take $$V(\mathbf x)=\lVert \mathbf x \rVert^2_{l^2}=\sum_{k=1}^Nx_k^2 $$ in which $$N$$ is the space dimension. Show that this equation can be solved by splitting it into

and

Be sure to explain how eqs. $$, $$ are solved.  Modify the Matlab codes to solve the Gross-Pitaevskii equation in one, two and three dimensions.   Modify the serial Fortran codes to solve the Gross-Pitaevskii equation in one, two and three dimensions.   Listings $$ and $$ give an alternate method of parallelizing an OpenMP program. Make the program in listing $$ as efficient as possible and as similar to that in $$, but without changing the parallelization strategy. Compare the speed of the two different programs. Try to vary the number of grid points and cores used. Which code is faster on your system? Why do you think this is?

An OpenMP Fortran program to solve the 2D nonlinear Schrödinger equation using splitting Code download

An example makefile for compiling the OpenMP program in listing $$ Code download

The example assumes one is using Flux and has loaded environments for the intel compiler as well as the Intel compiled version of FFTW. If one is using the freely available GCC compilers instead of the Intel compilers, change the flag $$-openmp$$ to $$-fopenmp$$ in the makefile. 

 Modify the OpenMP Fortran codes to solve the Gross-Pitaevskii equation in two and three dimensions.   Some quantum hydrodynamic models for plasmas are very similar to the nonlinear Schrodinger equation and can also be numerically approximated using splitting methods. A model for a plasma used by Eliasson and Shukla is

and

where $$\Psi$$ is the effective wave function, $$\phi$$ the electrostatic potential and $$D$$ the dimension, typically 1,2 or 3. This equation can be solved in a similar manner to the Davey-Stewartson equations in Klein, Muite and Roidot. Specifically, first solve

using the Fourier transform so that

$$P(\delta t)=\exp\left(-i\Delta \delta t\right)P(0)$$

where $$P(0)=\Psi(0)$$. Then solve

using the Fourier transform. Finally, solve

using the fact that at each grid point $$\phi - | Q |^{4/D}$$ is a constant, so the solution is $$Q(\delta t)=\exp\left[ i\left(\phi-| \Phi |^{4/D}\right)\delta t\right]Q(0)$$ with $$Q(0)=P(\delta t)$$ and $$\Psi(\delta t)\approx Q(\delta t)$$.  The operator splitting method can be used for equations other than the nonlinear Schrödinger equation. Another equation for which operator splitting can be used is the complex Ginzburg-Landau equation $$\frac{\partial A}{\partial t}=A+(1+i\alpha)\Delta A- (1+i\beta)|A|^2A,$$ where $$A$$ is a complex function, typically of one, two or three variables. An example one dimensional code is provided in listing $$, based on an earlier finite difference code by Blanes, Casa, Chartier and Miura, using the methods described in Blanes et al. . By using complex coefficients, Blanes et al. can create high order splitting methods for parabolic equations. Previous attempts to do this have failed since if only real coefficients are used, a backward step which is required for methods higher than second order leads to numerical instability. Modify the example code to solve the complex Ginzburg-Landau equation in one, two and then in three spatial dimensions. The linear part $$\frac{\partial A}{\partial t}=A+(1+i\alpha)\Delta A$$ can be solved explicitly using the Fourier transform. To solve the nonlinear part, $$\frac{\partial A}{\partial t}=- (1+i\beta)|A|^2A$$ consider $$\frac{\partial |A|^2}{\partial t}=\frac{\partial A}{\partial t}A^*+\frac{\partial A^*}{\partial t}A=2|A|^4$$ and solve this exactly for $$|A|^2$$. To recover the phase, observe that $$\frac{\partial \log(A)}{\partial t}=- (1+i\beta)|A|^2$$ which can also be integrated explicitly since $$|A|^2(t)$$ is known.

A Matlab program which uses 16th order splitting to solve the cubic nonlinear Schrödinger equation Code download </ol>

Distributed Memory Parallel: MPI
For this section, we will use the library 2DECOMP&FFT available from http://www.2decomp.org/index.html. The website includes some examples which indicate how this library should be used, in particular the sample code at http://www.2decomp.org/case_study1.html is a very helpful indication of how one converts a code that uses FFTW to one that uses MPI and the aforementioned library.

Before creating a parallel MPI code using 2DECOMP&FFT, we will generate a serial Fortran code that uses splitting to solve the 3D nonlinear Schrodinger equation. Rather than using loop-based parallelization to do a sequence of one dimensional fast Fourier transforms, we will use FFTW's three dimensional FFT, so that the serial version and MPI parallel version have the same structure. The serial version is in listing $$. This file can be compiled in a similar manner to that in | Listing A of the chapter Fortran Programs.

A Fortran program to solve the 3D nonlinear Schrödinger equation using splitting and FFTW Code download

In comparison to the previous programs, the program in listing $$ writes out its final data as a binary file. This is often significantly faster than writing out a text file, and the resulting file is usually much smaller in size. This is important when many such files are written and/or if individual files are large. Due to the formatting change, the binary file also needs to be read in slightly differently. The Matlab script in listing $$ shows how to do this.

Matlab program which plots a numerical solution to a 3D nonlinear Schrödinger equation generated by listings $$ or $$ Code download

We now modify the above code to use MPI and the library 2DECOMP&FFT. The library 2DECOMP&FFT hides most of the details of MPI although there are a few commands which it is useful for the user to understand. These commands are:


 * <tt>USE mpi</tt> or <tt>INCLUDE 'mpif.h'</tt>
 * MPI_INIT
 * MPI_COMM_SIZE
 * MPI_COMM_RANK
 * MPI_FINALIZE

The program is listed in listing $$, please compare this to the serial code in $$. The library 2DECOMP&FFT does a domain decomposition of the arrays so that separate parts of the arrays are on separate processors. The library can also perform a Fourier transform on the arrays even though they are stored on different processors – the library does all the necessary message passing and transpositions required to perform the Fourier transform. It should be noted that the order of the entries in the arrays after the Fourier transform is not necessarily the same as the order used by FFTW. However, the correct ordering of the entries is returned by the structure <tt>decomp</tt> and so this structure is used to obtain starting and stopping entries for the loops. We assume that the library 2DECOMP&FFT has been installed in an appropriate location.

A Fortran program to solve the 3D nonlinear Schrödinger equation using splitting and 2DECOMP&FFT Code download

To compile the code modify the makefile in listing $$ appropriately for your system. Type make and then you should be ready to run it after it compiles. It is assumed that FFTW is the underlying one dimensional fast Fourier transform library called by 2DECOMP&FFT, so this should be changed if another library is used. Note that you will need to rebuild the program if you want to change the problem size or the runtime.

An example makefile for a parallel Fortran program to solve the 3D nonlinear Schrödinger equation using splitting and 2DECOMP&FFT Code download

Exercises

 * 1) The ASCII character set requires 7 bits per character and so at least 7 bits are required to store a digit between 0 and 9. A double precision number in IEEE arithmetic requires 64 bits to store a double precision number with approximately 15 decimal digits and approximately a 3 decimal digit exponent. How many bits are required to store a IEEE double precision number? Suppose a file has $$10^6$$ double precision numbers. What is the minimum size of the file if the numbers are stored as IEEE double precision numbers? What is the minimum size of the file if the numbers are stored as characters?
 * 2) Write an MPI code using 2DECOMP&FFT to solve the Gross-Pitaevskii equation in three dimensions.
 * 3) Learn to use either VisIt (https://wci.llnl.gov/codes/visit/) or Paraview (http://www.paraview.org/) and write a script to visualize two and three dimensional output in a manner that is similar to the Matlab codes.