MINC/SoftwareDevelopment/MINC1-programmers-guide

=Introduction= The MINC file format (Medical Image NetCDF) is based on the NetCDF file format (Network Common Data Form) distributed by the Unidata Program Center. NetCDF provides a software interface for storing named, multi-dimensional variables in files in a machine-independent way. This interface removes applications from the concerns of portability and file structure and encourages a self-describing form of data.

Each NetCDF multi-dimensional variable in a file is described by a name, by its dimensions and by attributes. For example, an image stored in a file might be stored as byte data in a variable called "image", with dimensions "x" and "y" (each of length 256) and with an attribute called "long_name" which is a string describing the content of the image. Many variables can be stored in one file and each variable can have many attributes. Dimensions exist independently of variables and can subscript more than one variable.

MINC provides three things on top of the NetCDF interface. It provides a standard for dimension, variable and attribute names suitable for medical imaging, it provides some convenience functions to complement the NetCDF interface (not specific to the MINC conventions) and it provides convenience functions for using MINC files.

=An Introduction to NetCDF= (For a complete description, see the NetCDF User's Guide).

The NetCDF file
It is useful to look at an example file while considering the NetCDF interface. Fortunately, the NetCDF package provides utilities (ncdump and ncgen) for converting the binary NetCDF files to an ascii format call CDL. A simple NetCDF file, converted to CDL notation by ncdump, is given below:

The sample file stores a 3 by 4 image of double precision values. The first thing defined are the dimensions : xcoord and ycoord. Dimensions can represent physical dimensions like x coordinate, y coordinate etc., or they can represent abstract things like lookup-table index. Each dimension has a name and a length and when joined with other dimensions defines the shape of a variable --- the variable image is subscripted by ycoord and xcoord.

Dimensions can also be used across variables, relating them to each other. For example, if the file contained another image also subscripted by ycoord and xcoord, we would have the important information that the two variables were sampled on the same grid. Also, coordinate systems can be defined by creating variables by the same name as the dimension, like xcoord in the example above, giving the x coordinate of each point in the image.

Variables are the next thing defined in the cdl file. Each variable has a name, data type and a shape specified by a list of dimensions (up to a maximum of MAX_VAR_DIMS = 32 dimensions per variable). The data types are ''NC_CHAR, NC_BYTE, NC_SHORT, NC_INT, NC_FLOAT and NC_DOUBLE''. Information about each variable is stored in attributes. The attribute "long_name" gives a character string describing the variable "image". Attributes are either scalars or vectors of one of the six types listed above (a character string is a vector of type NC_CHAR).

Programming with NetCDF
Programming with NetCDF can be quite simple. The file listed above was produced by the following program:

The first executable line of the program creates a new NetCDF file. An open file is either in ``define mode or in ``data mode. In define mode, dimensions, variables and attributes can be defined, but data cannot be written to or read from variables. In data mode, variable values can be written or read, but no changes to dimensions or variables can be made and attributes can only be written if they exist already and will not get larger with the write. Newly created files are automatically in define mode.

The lines following the call to nccreate define the dimensions and variables in the file. Notice that the NetCDF file, dimensions and variables are all identified by an integer returned when they are created. These id's are subsequently used to refer to each object. The attribute "long_name" for the image variable is identified only by its name.

Once everything is defined, ncendef puts the file into data mode and values are written. The values to write are defined by a vector of starting indices and a vector of the number of values to write in each dimension. This defines a multi-dimensional rectangle within the variable called a hyperslab. In the C interface, the first element of the vector refers to the slowest varying index of the variable, so in this example, the array vals has the xcoord varying fastest. In the FORTRAN interface, the convention has the first subscript varying fastest. These conventions follow the language conventions for multi-dimensional arrays.

=The MINC format= It is possible to build MINC format files using only NetCDF function calls, however a C include file is provided to facilitate (and help ensure) adherence to the standard. This file defines useful constants, standard dimension, variable and attribute names and some attribute values. It also declares the functions defined in a library of minc convenience functions, designed to make an application programmer's life easier.

The MINC standard
Various requirements for file formats have been put forward. One such list is as follows: A protocol should be: 1) simple, 2) self describing, 3) maintainable, 4) extensible, 5) N dimensional, and 6) have a universal data structure. The NetCDF format meets all of these requirements, suggesting that it is a good place to start. I would, however, add some more requirements to the list. Implied in the above list is the requirement that there be a standard for accessing data (how do I get the patient name) --- this is not provided by NetCDF. Furthermore, a useful format should come with a software interface that makes it easy to use, particularly in a development environment. Finally, a format that stores many associated pieces of information should also provide some data organization.

The MINC format attempts to add these things to the NetCDF format.

MINC variable types
Medical imaging tends to produce files with a large amount of ancillary data (patient information, image information, acquisition information, etc.). To organise this information in a useful fashion, MINC uses variables to group together related attributes. The variable itself may or may not contain useful data. For example, the variable MIimage contains the image data and has attributes relevant to this data. The variable MIpatient has no relevant variable data, but serves to group together all attributes describing the patient (name, birthdate, etc.). This sort of variable is called a group variable.

Variables that correspond to dimensions are called dimension variables and describe the coordinate system corresponding to the dimension. An example is MIxspace --- both a dimension and a variable describing the x coordinate of other variables.

The NetCDF conventions allow for these dimension variables to specify the coordinate at each point, but there is nothing to describe the width of the sample at that point. MINC provides the convention of dimension width variables, e.g. MIxspace_width, to give this information.

Finally, it is possible to have attributes that vary over some of the dimensions of the variable. For example, if we have a volume of image data, varying over MIxspace, MIyspace and MIzspace, we may want an attribute giving the maximum value of the each image, varying over MIzspace. To achieve this we use a variable, called a variable attribute, pointed to by an attribute of the image variable.

Thus MINC introduces a number of types of variables: group variables, dimension variables, dimension width variables and variable attributes.

Data organization
MINC attempts to provide some level of data organization through a hierarchy of group variables. As mentioned above, attributes are grouped according to type in group variables. Each group variable can have an MIparent and an MIchildren attribute --- the former specifying the name of another variable that is above this one in the hierarchy, the latter specifying a newline-separated list of variables that come below this one in the hierarchy. At the root of the hierarchy is the MIrootvariable variable, with no parent. Although it is not necessary to make use of this structure, it can provide a mechanism for ordering large amounts of information.

MINC dimension variable and attribute names
The NetCDF format says nothing about variable and dimension naming conventions and little about attribute names. It does provide a few standards, such as the attribute "long_name" for describing a variable, which have been adopted by the MINC standard. MINC defines a set of standard names for commonly used entities and the include file defines constants specifying these names. These are described at length in the MINC reference manual. The most interesting of these is MIimage, the name of the variable used for storing the actual image data in the file.

Image dimensions
The MINC standard gives some special status to the concept of an image. There is nothing inherent in NetCDF that suggests any special status for particular dimensions, but it can be convenient to place limitations on what can vary over which dimensions in an imaging context. For example, the requirement that the variables that specify how to rescale images (see later section on pixel values) not vary with image dimensions means that we can treat the image as a simple unit. In the simplest case, the image dimensions are simply the two fastest varying dimensions of the MIimage variable.

It can also be helpful to allow for vector fields --- images or image volumes that have a vector of values at each point. A simple example of a vector field is an RGB image. At each point in space, there are three values: red, green and blue. The dimension MIvector_dimension is used for the components of the vector and it should be the fastest varying dimension in the MIimage variable. If it is present, then the three fastest varying dimensions of MIimage are the image dimensions.

MINC coordinate system
The MINC standard defines how spatial coordinates should be oriented relative to patients. Files are free to have data stored in the desired direction, but positive world coordinates are given a definite meaning in the medical imaging context. The standard is that the positive x axis points from the patient's left to right, the positive y axis points from posterior to anterior and the positive z axis points from inferior to superior.

The conversion of element index to world coordinates is done using the dimension variable attributes MIdirection_cosines, MIstep and MIstart. If the direction cosines are c=(c_x, c_y, c_z), then the vector between adjacent elements along an axis is step x c. If start(i) and c(i) are the MIstart and MIdirection_cosines attributes for dimension i (one of MIxspace, MIyspace and MIzspace), then the first element of the image variable is at world coordinate \sum_i start(i) c(i).

If the direction cosines are not present, then they are assumed to be (1,0,0) for MIxspace, (0,1,0) for MIyspace and (0,0,1) for MIzspace. Direction cosines are unit vectors and should be normalized. As well, the step attribute should carry the information about axis flipping (negative or positive) rather than the direction cosine.

Pixel values and real values
In medical imaging, pixel values are frequently stored as bytes or shorts, but there is a generally a real value associated with each pixel as well. This real value is obtained by a scale factor and offset associated with each image or image volume. The MINC standard indicates how pixel values should be interpreted.

Image data in the MIimage variable can be stored as bytes, shorts, ints (32-bit), floats or doubles. NetCDF conventions use the attributes MIvalid_range or MIvalid_max and MIvalid_min to indicate the range of values that can be found in the variable. For short values, for example, we might have a valid range of 0 to 32000. To convert these integers to real values, we could use a scale and offset. However, these values would have to change if the data where converted to bytes in the range 23 to 228.

If we specify an image maximum and minimum to which MIvalid_max and MIvalid_min should be mapped by an appropriate scale and offset, then we can convert type and valid range without having to change the real maximum and minimum. To allow the maximum dynamic range in an image, we use the variables MIimagemax and MIimagemin to store the real maximum and minimum --- these can vary over any of the non-image dimensions of MIimage.

General convenience functions
MINC provides a number of convenience functions that have nothing to do with medical imaging, but that make the use of NetCDF files a little easier. One of the drawbacks of the NetCDF format is that data can come in any form (byte, short, int, float, double) and the calling program must handle the general case. Rather than restrict this, MINC provides functions to convert types.

The first set of convenience functions are for type conversion.


 * - reads an attribute vector, specifying the numeric type desired and the maximum number of values to read.
 * - reads one attribute value of the specified type.
 * - read a character attribute of a specified maximum length.
 * - write a double precision attribute.
 * - write a string attribute.
 * - get a hyperslab of values of the specified type.
 * - get a single value of the specified type.
 * - put a hyperslab of values of the specified type.
 * - put a single value of the specified type.

Next we have some functions for handling coordinate vectors.


 * - set a vector of coordinates to a single value.
 * - translate the coordinates for one variable to a vector for subscripting another variable.

Finally, there are functions for dealing with variables as groups of attributes, making it easier to modify a file while keeping ancillary information.


 * - copy all of the attributes of one variable to another (possibly across files).
 * - copy a variable definition (including attributes) from one file to another.
 * - copy a variable's values from one variable to another (possibly across files).
 * - copy all variable definitions from one file to another, excluding a list of variables.
 * - copy all variable values from one file to another, excluding a list of variables.

MINC specific convenience functions
A few routines are provided to deal with some of the minc structures. miattput_pointer and miattget_pointer put/get a pointer to a variable attribute. miadd_child helps maintain the hierarchy of variables by handling the MIparent and MIchildren attributes of two variables. Finally micreate_std_variable and micreate_group_variable create some of the standard variables and fill in a few of the default attributes.

Image conversion variables
One of the requirements for file formats mentioned earlier was a software interface to make the interface easy to use. The biggest difficulty in using a flexible format is that the application must handle many possibilities. Where images are concerned, this means various data types and scale factors, and images of differing sizes. The image conversion variable functions of MINC attempt to remove this complication for the programmer.

An image conversion variable (icv) is essentially a specification of what the program wants images to look like, in type, scale and dimension. When an MINC image is read through an icv, it is converted for the calling program to a standard format regardless of how data is stored in the file.

There are two categories of conversion: Type and range conversions change the datatype (and sign) of image values and optionally scale them for proper normalization. Dimension conversions allow programs to specify image dimension size and image axis orientation (should MIxspace coordinates be increasing or decreasing? should the patient's left side appear on the left or right of the image?).

ICV routines
Accessing a file through an icv is a straight-forward process. Create the icv with miicv_create, set properties (like desired datatype) with the miicv_set routines, attach the icv to a NetCDF variable with miicv_attach and access the data with miicv_get or miicv_put. The icv can be detached from a NetCDF variable with miicv_detach and can be freed with miicv_free.

Icv properties are strings, integers, long integers or doubles. For example, MI_ICV_SIGN (the sign of variable values) is a string, while MI_ICV_IMAGE_MAX (image maximum) is a double precision value. Four functions --- miicv_setint, miicv_setlong, miicv_setdbl and miicv_setstr --- are provided to simplify the setting of property values. Programs can inquire about property values with miicv_inqint, miicv_inqlong, miicv_inqdbl and miicv_inqstr.

Type and range conversion
Pixel values are converted for type and sign by specifying values for the properties MI_ICV_TYPE and MI_ICV_SIGN (they default to NC_SHORT and MI_SIGNED). Values can also be converted for valid range and for normalization. These conversions are enabled by setting MI_ICV_DO_RANGE to TRUE (the default).

If MI_ICV_DO_NORM is FALSE (the default) then only conversions for valid range are made. This means that if the input file has shorts in the range 0 to 4095, then they can be converted to bytes in the range 64 to 248 (for example). The real image maximum and minimum (MIimagemax and MIimagemin) are ignored. The valid range is specified by the properties MI_ICV_VALID_MAX and MI_ICV_VALID_MIN, which default to the legal range for the type and sign.

We may want to scale values so that they are normalized either to all values in the MIimage variable or to some user-defined range. To do normalization, set MI_ICV_DO_NORM to TRUE. Setting MI_ICV_USER_NORM to FALSE (the default) causes normalization to the real maximum and minimum of the variable (the maximum of MIimagemax and the minimum of MIimagemin). If MI_ICV_USER_NORM is true then the values of MI_ICV_IMAGE_MAX and MI_ICV_IMAGE_MIN are used (defaulting to 1.0 and 0.0).

When either MI_ICV_TYPE or the file type is floating-point, then the conversion to and from real values is always done using the real image maximum and minimum information. If the internal type is integer and MI_ICV_DO_NORM is FALSE, then the rescaling is done so that the slice maximum maps to the valid range of the internal values.

Note that when converting to integer types, values are rounded to the nearest integer and limited to be within the legal range for the data type.

The above transformations are simple enough, but the use of floating-point values adds to the complexity, since in general we do not want to rescale these values to get the real values. The various possibilities are described in greater detail below.

The details of pixel value conversion
The easiest way to think about the rescaling is through four ranges (maximum-minimum pairs). In the file variable, values have a valid range var_vrange and these correspond to real values var_imgrange. The user/application wants to convert real values usr_imgrange to a useful valid range usr_vrange. From var_vrange, var_imgrange, usr_imgrange and usr_vrange, we can determine a scale and offset for converting pixel values: Input values are scaled to real values by var_vrange to var_imgrange and then scaled again to user values by usr_imgrange to usr_vrange.

If either of the vrange variables are not specified, they default to maximum possible range for integer types. For floating point types, usr_vrange is set equal to usr_imgrange so that no conversion of real values is done.

If normalization is not being done, then for integer types var_imgrange and usr_imgrange are set to [0,1] (scale down to [0,1] and scale up again). When normalizibng, usr_imgrange is set to either the full range of the variable ([0,1] if not found) or the user's requested range. If the variable values are floating point, then var_imgrange is set to var_vrange (no scaling to real values), otherwise var_imgrange is read for each image (again, [0,1] if not found).

What this means for reading and writing images is discussed below.

Reading with pixel conversion
When reading into internal floating point values, normalization has no effect. When reading integers without normalization, each image is scaled to full range. With normalization they are scaled to the specified range and slices can be compared.

When the input file is missing either MIimagemax/MIimagemin (var_imgrange information) or MIvalid_range, the routines try to provide sensible defaults, but funny things can still happen. The biggest problem is the absence of MIvalid_range if the defaults are not correct (full range for integer values and [0,1] for floating point). When converting floating point values to an integer type, there will be overflows if values are outside the range [0,1].

Writing with pixel conversion
The conversion routines can be used for writing values. This can be useful for data compression --- e.g. converting internal floats to byte values in the file, or converting internal shorts to bytes. When doing this with normalization (to rescale bytes to the slice maximum, for example) it is important to write the slice maximum and minimum in MIimagemax and MIimagemin before writing the slice.

The other concern is that MIvalid_range or MIvalid_max and MIvalid_min be written properly (especially if the defaults are not correct). When writing floating point values, MIvalid_range should be set to the full range of values in the variable. In this case, the attribute does not have to be set correctly before writing the variable, but if it exists, the values should be reasonable (maximum greater than minimum and values not likely to cause overflow). These will be set automatically if the routine micreate_std_variable is used with NC_FILL mode on (the default).

Example Reading values
Read an image without normalization

Read an image with normalization

Read a floating point image

Example Writing values
Writing from floating point to byte values

If we were writing a floating point image, the only difference (apart from changing NC_BYTE to NC_FLOAT) would be that we would rewrite MIvalid_range at the end of the file with the full range of floating point values.

Dimension conversion
One of the problems of arbitrary dimensioned images is that it becomes necessary for software to handle the general case. It is easier to write application software if it is known in advance that all images will have a specific size (e.g. 256 x 256) and a specific orientation (e.g. the first pixel is at the patient's anterior, right side).

By setting the icv property MI_ICV_DO_DIM_CONV to TRUE these conversions can be done automatically. The orientation of spatial axes is determined by the properties MI_ICV_XDIM_DIR, MI_ICV_YDIM_DIR and MI_ICV_ZDIM_DIR. These affect any image dimensions that are MI?space or MI?frequency where ? corresponds to x, y or z. These properties can have values MI_ICV_POSITIVE, MI_ICV_NEGATIVE or MI_ICV_ANYDIR. The last of these will prevent flipping of the dimension. The first two will flip the dimension if necessary so that the attribute MIstep of the dimension variable will have the correct sign.

The two image dimensions are referred to as dimensions A and B. Dimension A is the fastest varying dimension of the two. Setting properties MI_ICV_ADIM_SIZE and MI_ICV_BDIM_SIZE specify the desired size for the image dimension. Dimensions are resized so that the file image will fit entirely in the calling program's image, and is centred in the image. The size MI_ICV_ANYSIZE allows one of the dimensions to have a variable size. If property MI_ICV_KEEP_ASPECT is set to TRUE, then the two dimensions are rescaled by the same amount. It is possible to inquire about the new step and start, corresponding to attributes MIstep and MIstart (where pixel position = ipixel*step+start, with ipixel counting from zero). The properties MI_ICV_?DIM_STEP and MI_ICV_?DIM_START (? = A or B) are set automatically and can be inquired but not set.

Although vector images are allowed, many applications would rather only deal with scalar images (one intensity value at each point). Setting MI_ICV_DO_SCALAR to TRUE (the default) will cause vector images to be converted to scalar images by averaging the components. (Thus, RGB images are automatically converted to gray-scale images in this simple way).

It can sometimes be useful for a program to perform dimension conversions on three (or perhaps more) dimensions, not just the two standard image dimensions. To perform dimension flipping and/or resizing on dimensions beyond the usual two, the property MI_ICV_NUM_IMGDIMS can be set to an integer value between one and MI_MAX_IMGDIMS. To set the size of a dimension, set the property MI_ICV_DIM_SIZE (analogous to MI_ICV_ADIM_SIZE). To specify the dimension to be set, add the dimension to the property (adding zero corresponds to the fastest varying dimension --- add zero for the ``A dimension, one for the ``B dimension, etc.). Voxel separation and location can be inquired about through the properties MI_ICV_DIM_STEP and MI_ICV_DIM_START (analogous to MI_ICV_ADIM_STEP and MI_ICV_ADIM_START), again adding the dimension number to the property.

Example Reading with dimension conversion
Reading a 256 x 256 image with the first pixel at the patient's inferior, posterior, left side as short values between 0 and 32000: