Parallel Spectral Numerical Methods/Visualization with ParaView CoProcessing

The ParaView CoProcessing plugin allows an application code to be instrumented to connect to a ParaView server in order to execute a visualization pipeline. The pipeline can produce either images in a variety of formats or VTK XML parallel file format data sets.

A user creates a pipeline in the ParaView client using a sample, probably lower resolution, data set. This pipeline is then exported using the CoProcessor plugin as a Python script which will be loaded by the application.

The application developer also needs to write some Fortran/C++ code to convert the application’s data structures into a format ParaView can understand. In the case of completely regular grids, this is straightforward.

= Using ParaView CoProcessing on a Single Computation Node =

ParaView client setup
The first requirement is to have a ParaView client which can export the Python script. As of ParaView 3.14.1, this requires building from source. The standard instructions apply, with some additional instructions for building the script generator plugin. After building, the first time you launch ParaView, go to Tools > Manage Plugins and set the CoProcessingPlugin to load on startup. After the plugin is loaded, there should be two new menu items: Writers and CoProcessing.

Client build notes:

 * Might have to install CMake
 * Might have to build Qt
 * Might have to turn off building the Manta plugin

Existing code alterations
The following instructions pertain to the Navier-Stokes CUDA Fortran code as tested on NCSA's Forge during the summer of 2012.

Simulation code
navierstokes.cuf: added 4 lines and removed the .datbin writing. Code download

ns2dcnadaptor.f90
Defines navierstokescoprocessor subroutine. This calls some functions from the ParaView supplied FortranAdaptorAPI.h and the user defined functions in ns2dcnVTKDataSet.cxx. Code download

ns2dcnVTKDataSet.cxx
Defines functions that put simulation data into a format which the ParaView/VTK libs can understand. Very straightforward for completely regular grids. Expanding to 3D is simple. Unstructured meshes will require more effort. Code download

Creating a visualization pipeline in the ParaView Client
The script pipeline is generated in the ParaView client by using the CoProcessing Script Generator Plugin.


 * File > Open > browse to the data set. The file should now be visible in the Pipeline Browser window.
 * In the Properties window, click Apply. The data should now appear in the layout window.
 * To change the colormap, click on the toolbar button that’s a rainbow with a green circle.
 * To display the colormap in the layout, click on the toolbar button that’s a vertical rainbow.
 * Set the size of the viewport in Tools > Lock View Size Custom. This will help you arrange the items for the final image. Make a note of the dimensions; you’ll need them later for a bug workaround.
 * If writing out VTK data sets, go to Writers > Parallel Image Data Writer. This should add a writer to the pipeline.
 * To export the Python pipeline, go to CoProcessing > Export State. This should launch a wizard.
 * Click Next.
 * Select the dataset you want to connect to your simulation, Add it, and click Next.
 * The default Simulation Name is “input”. If you change this, be sure to update ns2dcnVTKDataSet.cxx. Click Next.
 * Check whether or not to output images. Click Finish.
 * Save the file. The file name will be referenced in navierstokes.cuf, so save as something like pipeline-test-??.py and then cp that to pipeline.py to avoid recompiling for new pipelines.

Python script issues/bugs
If the pipeline is set to output images, then there are two issues that can be considered bugs in the script.


 * 1) No matter what viewport size is set in the client, it doesn’t get used by the script.
 * 2) Color map labels will be drawn with a box surrounding them.

The workaround for the first is to add a method call to ViewSize in the script, set to the viewport dimensions noted in the client. The second requires a simple edit. Both are labeled in the sample pipeline.py with #mvm comments.

Set the environment
Forge required loading the ParaView OSMesa module:

$ module load paraview/3.14.1-OSMesa

On Nautilus the module is paraview/3.14.1.

Run the simulation
Run as you would the uninstrumented version.

Convert the images to an animation
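There are a variety of ways to do this. If ffmpeg is available:

ffmpeg -sameq -i path/to/images/image%04d.png myanimation.mpg

Recent ffmpeg releases have removed the -sameq option; -qscale 0 is the suggested replacement there.

Alternatively, if writing out the VTK data sets, one can work with those in ParaView directly.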

dt and effect on animation time.
Smooth video is about 30 fps, which corresponds to dt = 0.033… between frames. If you write out VTK data sets at a larger dt, you can use linear time interpolation in ParaView. If you write out images at a larger dt to stitch into an mpg, the options available produced much reduced video quality.
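The relation between the simulation time step, the output interval and the frame rate is simple arithmetic. The Python sketch below assumes, purely for illustration, that one unit of simulation time should play back as one second of video. The function names are hypothetical and are not part of ParaView.

```python
def steps_between_outputs(sim_dt, fps=30.0):
    """Number of simulation steps between coprocessor outputs so that
    consecutive frames are roughly 1/fps apart in simulation time."""
    return max(1, round((1.0 / fps) / sim_dt))


def lerp(value_a, value_b, fraction):
    """Linear interpolation between two saved values, 0 <= fraction <= 1.
    This is the kind of in-between frame that linear time interpolation
    produces from two stored time steps."""
    return value_a + fraction * (value_b - value_a)
```

With a simulation time step of 0.001, for example, steps_between_outputs(0.001) suggests writing output every 33 steps.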

ParaView Server setup

 * Forge did not have X support, so an offscreen Mesa (OSMesa) build was necessary.
 * Nautilus supports X and so can render with the supplied OpenGL libraries.

= Summary of Changes for New API =

The coprocessing API has had some minor changes since ParaView 3.14.1. The following applies for ParaView 3.98+ and Catalyst 1.0 alpha+. These were tested on Beacon in offload mode at NICS on August 20, 2013.

The code changes are:
 * In the simulation code, coprocessorinitialize is now coprocessorinitializewithpython, the arguments are the same.
 * In the C++ data set helper code, include vtkCPPythonAdaptorAPI.h instead of FortranAdaptorAPI.h.
 * In the C++ data set helper code, change from ParaViewCoprocessing namespace to vtkCPPythonAdaptorAPI namespace.

The libraries to link against have also changed, for a ParaView 4.0.1 build these are:
 * -lvtkPVCatalystPython26D-pv4.0
 * -lvtkPVCatalyst-pv4.0
 * -lvtkCommonCore-pv4.0
 * -lvtkCommonDataModel-pv4.0
 * -lvtkPVPythonCatalyst-pv4.0

Example with New API
Simulation: navierstokes.f90 -- only change is the initialization function name. Code download

Fortran adaptor: NSadaptor.f90 -- No API changes, included for completeness. Code download

C++ VTK data helper: PointBasedDataSet.cxx -- Changes are a new header and a new namespace. Notice also the use of vtkSmartPointer instead of raw pointer handling. Code download

= Using ParaView CoProcessing with MPI and Domain Decomposition =

The following notes were developed while working on NS3D-MPI, NavierStokes3DfftIMR.f90, KgSemiImp3D.f90, and NSLsplitting.f90 on Nautilus at NICS.

Nodal interpretation
When working with a single CPU node and non-partitioned data, the simulation arrays can be passed directly to ParaView and added to the pipeline using the vtkDataSet::GetPointData()->AddArray() method. The vtkImageData’s points then correspond exactly to the simulation’s mesh nodes. This vtkImageData will have an implied cellular structure, with each cell corner being one of the simulation nodes.

Without special alterations for MPI this code will run; however, ParaView won’t be able to create the cells whose corners are on different ranks. Loading the .pvti gives an error that the extents are incorrect. The individual .vti files can be loaded to confirm this and to see the cellular gaps between the pencils.

E.g., say the array is dimension(20,10,10) and has been decomposed into 2 pencils, (1:10,:,:) and (11:20,:,:). ParaView considers the data for (10:11,:,:) to be missing and a gap will be present.
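The missing layer can be counted directly. The following Python sketch uses hypothetical helper names and no ParaView calls. It relies only on the fact that n points along an axis bound n - 1 cells, and compares the cells of the whole array with the cells each pencil can build on its own.

```python
def cells_along_axis(first_point, last_point):
    """Cells bounded by the inclusive point range first_point..last_point."""
    return last_point - first_point


# x-direction of the dimension(20,10,10) example, split into two pencils.
whole_cells = cells_along_axis(1, 20)
pencil_ranges = [(1, 10), (11, 20)]
pencil_cells = sum(cells_along_axis(a, b) for a, b in pencil_ranges)

# Cells that no rank can build because their corners live on two ranks.
missing_cells = whole_cells - pencil_cells
```

The single missing layer of cells is the one between points 10 and 11.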

Working with point halos
Halos provide one way to fill in the gaps. The 2decomp&fft halo API creates arrays containing the pencil data surrounded by halo elements. E.g., say a pencil is (10:20,10:20,:), the halo array is then (9:21,9:21,:). Using the halos raises some issues:


 * Halo elements can extend outside of the computational domain and these will contain garbage.
 * Not every pencil needs to send the same halo elements to the coprocessor. This requires additional decision logic in the program.
 * From the C++ point of view, a halo array section is not densely packed, so reading it as a contiguous block picks up garbage.
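The decision logic can be reduced to a per-direction rule. The sketch below is one possible formulation, not the code used in the simulation: in each decomposed direction a pencil is extended by one halo layer on its upper side unless it already ends at the global extent. The function names and the 1-based inclusive index ranges are assumptions made for the illustration.

```python
def range_to_send(start, end, global_end):
    """Inclusive index range to pass to the coprocessor in one direction.

    A pencil that ends at the global extent sends only its own points.
    Otherwise it also sends one halo layer on its upper side, which fills
    the gap to the neighbouring pencil and stays inside the domain.
    """
    if end == global_end:
        return (start, end)
    return (start, end + 1)


def ranges_to_send(pencil, global_ends):
    """Apply the rule to every (start, end) pair of a pencil."""
    return [range_to_send(s, e, g) for (s, e), g in zip(pencil, global_ends)]
```

For a pencil covering (1:10, 11:20) in a 20 x 20 decomposed plane, ranges_to_send([(1, 10), (11, 20)], [20, 20]) extends only the first direction.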

Code changes in simulation code

 * Added halo arrays.
 * Added 1D arrays for passing packed arrays to C++.
 * Added branching for which halo elements to pass, based on pencil extents:
   * If the pencil is furthest from the index origin, send just the pencil, no halo elements.
   * Else if the pencil is against either furthest extent, but not both, send the halo elements furthest in the other extent.
   * Else send the halo elements furthest in both extents.


 * Added reshaping of the halo sections. Arrays are passed by reference from F90 to C++. C++ doesn’t understand F90 array sections and will just read incrementally from the reference (this is an issue with any array sectioning, not just halos). One way around this is to reshape the section as a 1D array, which makes the desired data contiguous in memory.
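The packing issue described above can be illustrated without Fortran or C++. The Python sketch below models a small 2D array stored in column-major order and compares a straight walk through memory with the elements the section really contains. The helper names are hypothetical and the 4 x 3 array is an arbitrary example.

```python
def column_major_offset(i, j, nx):
    """0-based memory offset of the 1-based element (i, j) of an array
    with nx rows stored in Fortran (column-major) order."""
    return (i - 1) + (j - 1) * nx


def pack_section(flat, i0, i1, j0, j1, nx):
    """Copy the section (i0:i1, j0:j1) into a new contiguous list,
    keeping Fortran element order (first index varies fastest)."""
    return [flat[column_major_offset(i, j, nx)]
            for j in range(j0, j1 + 1)
            for i in range(i0, i1 + 1)]


# A 4 x 3 array whose values equal their memory offsets.
nx, ny = 4, 3
flat = list(range(nx * ny))

# What a routine sees if it is handed only the address of element (2, 1)
# and told to read 6 values: it walks straight through memory.
start = column_major_offset(2, 1, nx)
naive_read = flat[start:start + 6]

# What the section (2:3, 1:3) actually contains.
packed = pack_section(flat, 2, 3, 1, 3, nx)
```

Here naive_read and packed differ, which is the reason the section is copied into a 1D array before it is passed on.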

Code download

Code changes in f90 adaptor glue code
Code download
 * Changed the adaptor to accept the subextents of the current pencil.

Code changes in the C++ wrapping code
Code download
 * Specified the partitioned data extents relative to the entire data.
 * Specified the relative spacing of the cells.

Pros

 * Some VTK/ParaView filters work with ghost (halo) cells; the code could be extended to handle those.

Cons

 * Much code added to the simulation code itself.
 * Additional memory required for the halos and packed arrays.
 * Additional overhead of reshaping arrays for C++.
 * A different decomposition library might require changing the halo logic.
 * Would not work as is with a 3D decomposition library.

Reinterpreting simulation nodes as visualization cells
Another way to look at the data is that each node of the simulation is a cell in the visualization domain. Then when the pencils are sent to ParaView, there aren’t any gaps of missing cells.

Code changes in simulation code
Code download
 * None, beyond the regular CoProcessing calls.

Code changes in f90 adaptor glue code
Code download
 * Accept the subextents of the current pencil.

Code changes in C++ wrapping code
Code download
 * Changed how the extents are calculated.
 * Changed point data calls to cell data calls.
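A minimal Python sketch of the changed extent calculation, assuming the simulation passes Fortran 1-based inclusive node ranges: the first index is reduced by one and the last is kept, so each node becomes one cell of the point extent. The helper names are hypothetical.

```python
def node_range_to_extent(first_node, last_node):
    """Map a Fortran 1-based inclusive node range to a 0-based VTK point
    extent in which every simulation node becomes one cell."""
    return (first_node - 1, last_node)


def cells_in_extent(extent):
    """Number of cells along one axis of a VTK point extent."""
    lo, hi = extent
    return hi - lo


# The two pencils of the earlier dimension(20,10,10) example.
pencils = [(1, 10), (11, 20)]
extents = [node_range_to_extent(a, b) for a, b in pencils]
whole_extent = node_range_to_extent(1, 20)
```

The two extents share the point plane at index 10, so the pencils tile the whole extent with no missing cells.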

Pros

 * Minimal code changes, none in simulation code (beyond the coprocessing calls).
 * No halo or packed arrays needed.
 * Should work with different decomposition libraries without change.

Cons

 * Requires extra handling in the ParaView client to handle the differences between cell and point rendering.
 * Won’t work as is with VTK/ParaView filters that require ghost cells.

Working with Fortran complex type
C++ does not have a native complex data type. This requires some additional handling in the ParaView client:


 * 1) Run the original non-coprocessing simulation at a reduced mesh size.
 * 2) Open this data in the ParaView client using the RAW binary reader.
 * 3) Set the number of Scalar Components to 2
 * 4) Create the pipeline. The default names of the real and imaginary parts are ImageFile_X and ImageFile_Y, respectively.

See the code for NLSadaptor.f90 for details on passing a complex to the coprocessor. Code download
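A Fortran complex array is stored as consecutive (real, imaginary) pairs, which is why it can be handed over as a plain array of doubles with two components per element. The Python sketch below only models that memory layout. The function names are hypothetical and nothing here calls ParaView.

```python
def as_two_component(values):
    """Flatten complex numbers into [re0, im0, re1, im1, ...], the order in
    which a Fortran complex array is laid out in memory."""
    flat = []
    for z in values:
        flat.append(z.real)
        flat.append(z.imag)
    return flat


def component(flat, index, n_components=2):
    """Extract one component (0 = real, 1 = imaginary) from the flat data."""
    return flat[index::n_components]


field = [complex(1.0, 2.0), complex(3.0, 4.0)]
flat = as_two_component(field)
```

The flat array has twice as many entries as the field has elements, and component(flat, 0) and component(flat, 1) recover the real and imaginary parts.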

This also requires some minor code changes in the accompanying C++ file. Code Download

// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx

// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h"

// CoProcessor specific headers
// Routines that take the place of VTK dataset object creation.
// Called from Fortran code which also calls the Fortran Adaptor API
// supplied with ParaView source.
// Note: names mangled with trailing underscores for linker visibility.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include "vtkCellData.h"
// One further #include followed here; the header name is missing from this listing.

// These will be called from the Fortran "glue" code.
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.

// Creates the data container for the CoProcessor.
// Takes the extents for both the global dataset and the particular subsection
// visible to the current MPI rank.
// Note: expects to receive Fortran base-1 indices.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen,
                                   int* yst, int* yen, int* zst, int* zen)
{
  if (!ParaViewCoProcessing::GetCoProcessorData())
  {
    vtkGenericWarningMacro("Unable to access CoProcessorData.");
    return;
  }

  // The simulation grid is a 3-dimensional topologically and geometrically
  // regular grid. In VTK/ParaView, this is considered an image data set.
  vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();

  // For Fortran complex, need to set that there are 2 scalar components
  img->SetNumberOfScalarComponents(2);

  // Indexing based on change from F90 to C++, and also from nodal to cellular.
  // Extents are given in terms of nodes, not cells.
  img->SetExtent(*xst - 1, *xen, *yst - 1, *yen, *zst - 1, *zen);
  // Setting spacing is important so that the camera position in the pipeline makes
  // sense if using different sized meshes between setting up the pipeline and running
  // the simulation. Origin can often be ignored.
  img->SetSpacing(1.0 / *nx, 1.0 / *ny, 1.0 / *nz); // considering passing in as args.
  ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
  ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx, 0, *ny, 0, *nz);
}

// Add field to the data container.
// Separate from above because this could be dynamic, grid is static.
// Might be an issue, VTK assumes row major C storage.
// Underscore is by-hand name mangling for fortran linking.
// Note: Expects the address of the data, has no way of determining
// if the array is densely packed or not.
// Note 2: Assumes "name" points to null-terminated array of chars.
// Easiest way to do that is to concatenate in caller.
extern "C" void addfield_(double* a, char* name)
{
  vtkSmartPointer<vtkCPInputDataDescription> idd =
    ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
  vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());

  if (!img)
  {
    vtkGenericWarningMacro("No adaptor grid to attach field data to.");
    return;
  }

  if (idd->IsFieldNeeded(name))
  {
    vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
    field->SetNumberOfComponents(2);
    field->SetName(name);
    field->SetArray(a, 2 * img->GetNumberOfCells(), 1);
    img->GetCellData()->AddArray(field);
  }
}

Working with cell data

 * Some ParaView filters and representations will only work with point data, e.g. volume rendering. Use the Cell Data to Point Data filter to convert. Be sure to check the option to pass the cellular data through the filter. If a simulation runs with no errors and produces no output, a possible culprit is failing to have checked that option when creating the pipeline.

vtkKDTreeGenerator warnings

 * Pipelines that use volume rendering will generate warnings about Region IDs being 0. These can often be ignored.

= ParaView CoProcessing Resources =


 * ParaView Wiki Entry
 * SC10 Coprocessing Tutorial
 * The ParaView source code, in particular the PHASTA Fortran example.
 * paraview-users mailing list