Sensory Systems/Computer Models/Descriptive Simulations of Visual Information Processing

Computer Simulation of the Visual System
In this section an overview in the simulation of processing done by the early levels of the visual system will be given. The implementation to reproduce the action of the visual system will thereby be done with MATLAB and its toolboxes. The processing done by the early visual system was discussed in the section before and can be put together with some of the functions they perform in the following schematic overview. A good description of the image processing can be found in (Cormack 2000).

As we can see in the above overview different stages of the image processing have to be considered to simulate the response of the visual system to a stimulus. The next section will therefore give a brief discussion in Image Processing. But first of all we will be concerned with the Simulation of Sensory Organ Components.

Anatomical Parameters of the Eye
The average eye has an anterior corneal radius of curvature of $$r_C$$ = 7.8 mm, and an aqueous refractive index of 1.336. The length of the eye is $$L_E$$ = 24.2 mm. The iris is approximately flat, and the edge of the iris (also called limbus) has a radius $$r_L$$ = 5.86 mm.

Optics of the Eyeball
The optics of the eyeball are characterized by its 2-D spatial impulse response function, the Point Spread Function (PSF)

$$ h(r) = 0.95\cdot \exp\left( -2.6\cdot |r|^{1.36} \right) + 0.05\cdot\exp\left( -2.4\cdot |r|^{1.74} \right) , $$

in which $$r$$ is the radial distance in minutes of arc from the center of the image.

Practical implementation
Obviously, the effect on a given digital image depends on the distance of that image from your eyes. As a simple place-holder, substitute this filter with a Gaussian filter with height 30, and with a standard deviation of 1.5.

In one dimension, a Gaussian is described by

$$ g(x) = a \cdot \exp \left( -\frac{x^2}{2\sigma^2} \right). $$

Activity of Ganglion Cells


Ignoring the we can approximate the response of ganglion cells with a Difference of Gaussians (DOG, Wikipedia )
 * temporal response
 * effect of wavelength (especially for the cones)
 * opening of the iris
 * sampling and distribution of photo receptors
 * bleaching of the photo-pigment

$$ f(x;\sigma) = \frac{1}{\sigma_1\sqrt{2\pi}} \, \exp \left( -\frac{x^2}{2\sigma_1^2} \right)-\frac{1}{\sigma_2\sqrt{2\pi}} \, \exp \left( -\frac{x^2}{2\sigma_2^2} \right). $$

The source code for a Python implementation is available under .

The values of $$\sigma_1$$ and $$\sigma_2$$ have a ratio of approximately 1:1.6, but vary as a function of eccentricity. For midget cells (or P-cells), the Receptive Field Size (RFS) is approximately

$$RFS_P \approx 2 \cdot \text{Eccentricity} ,$$

and for M-cells approximately

$$RFS_M \approx 6 \cdot \text{Eccentricity} ,$$

where the RFS is given in arcmin, and the Eccentricity in mm distance from the center of the fovea (Cormack 2000).

Activity of simple cells in the primary visual cortex (V1)
Again ignoring temporal properties, the activity of simple cells in the primary visual cortex (V1) can be modeled with the use of Gabor filters (Wikipedia ). A Gabor filter is a linear filter whose impulse response is defined by a harmonic function (sinusoid) multiplied by a Gaussian function. The Gaussian function causes the amplitude of the harmonic function to diminish away from the origin, but near the origin, the properties of the harmonic function dominate


 * $$g(x,y;\lambda,\theta,\psi,\sigma,\gamma)=\exp\left(-\frac{x'^2+\gamma^2y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda}+\psi\right) ,$$

where


 * $$x' = x \cos\theta + y \sin\theta\, ,$$

and


 * $$y' = -x \sin\theta + y \cos\theta\, .$$

In this equation, $$\lambda$$ represents the wavelength of the cosine factor, $$\theta$$ represents the orientation of the normal to the parallel stripes of a Gabor function (Wikipedia ), $$\psi$$ is the phase offset, $$\sigma$$ is the sigma of the Gaussian envelope and $$\gamma$$ is the spatial aspect ratio, and specifies the ellipticity of the support of the Gabor function.

The size of simple-cell receptive fields depends on its position relative to the fovea, but less strictly so than for retinal ganglion cells. The smallest fields, in and near the fovea, are about one-quarter degree by one-quarter degree, with the center region as small as a few minutes of arc (the same as the diameter of the smallest receptive-field centers in retinal ganglion cells). In the retinal periphery, simple-cell receptive fields can be about 1 degree by 1 degree. .

Gabor-like functions arise naturally, simply from the statistics of everyday scenes . An example how even the statistics of a simple image can lead to the emergence of Gabor-like receptive fields, written in Python, is presented in ; and a (Python-)demonstration of the effects of filtering an image with Gabor-functions can be found at.



This is an example implementation in MATLAB:

And an equivalent Pyhon implementation would be:

Image Processing
One major technical tool to understand is the way a computer handles images. We have to know how we can edit images and what techniques we have to rearrange images.

Grayscale


For a computer an image is nothing more than a huge amount of little squares. These squares are called "pixel". In a grayscale image, each of this pixel carries a number n, often it holds $$0\leq n \leq 255$$. This number n, represents the exactly color of this square in the image. This means, in a grayscale image we can use 256 different grayscales, where 255 means a white spot, and 0 means the square is black. To be honest, we could even use more than 256 different levels of gray. In the mentioned way, every pixels uses exactly 1 byte (or 8 bit) of memory to be saved. (Due to the binary system of a computer it holds: 28=256) If you think it is necessary to have more different gray scales in your image, this is not a problem. You just can use more memory to save the picture. But just remember, this could be a hard task for huge images. Further quite often you have the problem that your sensing device (e.g. your monitor) can not show more than this 256 different gray colors.

Colour


Representing a colourful image is only slightly more complicated than the grayscale picture. All you have to know is that the computer works with a additive colour mixture of the three main colors Red, Green and Blue. This are the so called RGB colours.

Also these images are saved by pixels. But now every pixel has to know 3 values between 0 and 256, for every Color 1 value. So know we have 2563= 16,777,216 different colours which can be represented. Similar to the grayscale images also here holds, that no color means black, and having all color means white. That means, the colour (0,0,0) is black, whereas (0,0,255) means blue and (255,255,255) is white.

Orientation


WARNING - There are two common, but different ways to describe the location of a point in 2 dimensions: 1) The x/y notation, with x typically pointing to the left 2) The row/column orientation Carefully watch out which coordinates you are using to describe your data, as the two descriptions are not consistent!

1D Filter
In many technical applications, we find some primitive basis in which we easily can describe features. In 1 dimensional cases filters are not a big deal, therefore we can use this filters for changing images. The so called "Savitzky- Golay Filter" allows to smooth incoming signals. The filter was described in 1964 by Abraham Savitzky and Marcel J. E. Golay. It is a impulse-respond filter (IR).

For better understanding, lets look at a example. In 1d we usually deal with vectors. One such given vector, we call x and it holds: $$\mathbf{x} = (x_1,x_2,\dots,x_n) \; with \; n \; \in \mathbb{N}$$. Our purpose is to smooth that vector x. To do so all we need is another vector $$\mathbf(w) = (w_1,w_2,\dots,w_m) \; with \; n>m \; \in \mathbb{N}$$, this vector we call a weight vector.



With $$y(k)=\displaystyle \sum_{i=1}^m w(i)x(k-m+i)$$ we now have a smoothed vector y. This vector is smoother than the vector before, because we only save the average over a few entries in the vector. These means the newly found vectorentries, depends on some entries right left and right of the entry to smooth. One major drawback of this approach is, the newly found vector y only has n-m entries instead of n as the original vector x.

Drawing this new vector would lead to the same function as before, just with less amplitude. So no data is lost, but we have less fluctuation.

2D Filter
Going from the 1d case to the 2d case is done by simply make out of vectors matrices. As already mentioned, a gray-level image is for a computer or for a softwaretool as MATLAB nothing more, than a huge matrix filled with natural numbers, often between 0 and 255.



The weight vector is now a weight-matrix. But still we use the filter by adding up different matrix-element-multiplications. $$y(n,m)=\displaystyle \sum_{i=1}^k \sum_{j=1}^l w_{ij}\times x(n-1+i,m-1+j)$$

Dilation and Erosion
For linear filters as seen before, it holds that they are commutative. Cite from wikipedia: One says that x commutes with y under ∗ if:
 * $$ x * y = y * x \,$$

In other words, it does not matter how many and in which sequence different linear filters you use. E.g. if a Savitzky-Golay filter is applied to some date, and then a second Savitzky-Golay filter for calculationg the first derivative, the result is the same if the sequence of filters is reversed. It even holds, that there would have been one filter, which does the same as the two applied.

In contrast morphological operations on an image are non-linear operations and the final result depends on the sequence. If we think of any image, it is defined by pixels with values xij. Further this image is assumed to be a black-and-white image, so we have
 * $$ x_{ij}= 0\;or\;1, \forall i,j $$

To define a morphological operation we have to set a structural element SE. As example, a 3x3-Matrix as a part of the image.

The definition of erosion E says:


 * $$E(M)=\left\{ \begin{align}

& 0,\ \,if\sum\limits_{i,j=0}^{3}<9 \\ & 1,\ \,else \\ \end{align} \right.\ \ ,with\ {{(se)}_{ij}},M\in SE$$

So in words, if any of the pixels in the structural element M has value 0, the erosion sets the value of M, a specific pixel in M, to zero. Otherwise E(M)=1

And for the dilation D it holds, if any value in SE is 1, the dilation of M, D(M), is set to 1.
 * $$D(M)=\left\{ \begin{align}

& 1,\; if \sum_{i,j=0}^3 (se)_{ij} >=1 \\ & 0,\; else \end{align} \right. \; \; ,with\; (se)_{ij},M \in SE $$.

Morphological Operations on this image ... ... lead to the following results:

Compositions of Dilation and Erosion: Opening and Closing of Images
There are two compositions of dilation and erosion. One called opening the other called closing. It holds:

\begin{align} opening & = & dilation \circ erosion \\ closing & = & erosion \circ dilation \end{align} $$