Engineering Acoustics/The Human Ear and Sound Perception

Summary: This page gives a brief overview of the human auditory system, transducer analogies, and some nonlinear effects pertaining to specific characteristics of the auditory system. Results from the discipline of psychoacoustics are also presented.

Introduction
The human ear is a small physical device with disproportionately impressive capabilities. On one hand, it can withstand sounds with acoustic pressures close to 1 kPa, which are about the loudest encountered in nature; on the other hand, it can detect pressures that correspond to eardrum displacements of about one tenth the diameter of a hydrogen atom. When one includes the information processing done in the brain and the physiological response that it elicits, one can see why the human auditory system has been giving researchers a hard time since the turn of the twentieth century. Some researchers have approached the auditory system as a very complicated, active transducer: one that transmits the wave information first acoustically, then mechanically, then hydrodynamically, and finally electrically to the brain. Others, like the legendary Georg von Bekesy, maintained that the continuously regenerative nature of the living organism should be taken into account when considering the behavior of the auditory system. Humans, however, are no strangers to complicated problems. After all, we have been to the moon, so what is going on?

The Problem with humans
In order to explore the behavior of any physical system, one needs a set of variables describing the system. These variables should be well defined and arise naturally from the physical principles governing the behavior of the system. The same physical principles also provide any researcher with well-established means of assessing what constitutes a valid measurement. Furthermore, in any well-behaved physical system the experimenter has control over the variables, to such an extent that he or she can hold most of the variables constant and individually vary a few of them to evaluate the relationship between them and quantify their dependence. Additionally, in any linear system the principle of superposition holds, so that the overall effect of varying several variables at the same time equals a linear combination of the individual contributions observed when each variable is varied independently while everything else is kept constant. Together, these conditions make for a very happy researcher.

However, problems arise when one sets out to evaluate the human auditory system, because hearing is a sensation and, just like every other sensation, it is an internal, subjective process. To resolve this problem one has to venture into the realm of psychophysics and the principles of psychological measurement. One cannot directly measure sensation, but one can measure the response that the sensation elicits. With this approach one can measure quantities such as just noticeable differences, perceptible excitation, increased nervous activity, etc. However, the validity or relevance of those measurements cannot readily be confirmed from first principles. The nature of the human auditory system is such that one is not able to decouple and independently vary any of the variables of interest (however those might be defined), and even if one could, the principle of superposition does not, in general, apply.

Non Linearity | Part 1
After acknowledging the difficulties involved in quantifying the behavior of the auditory system and developing models of hearing, one should take a look at specific sources of non-linearity and the mechanisms through which such behavior is imposed upon the auditory system. There is probably no better example of such behavior than what is called the acoustic, auditory or intra-aural reflex.

The Acoustic Reflex
The acoustic reflex in humans refers to the tendency of the middle ear muscles controlling the behavior of the ossicles (the little bones in the middle ear) to tense under an intense acoustic stimulus, thereby stiffening the ossicular chain and limiting the motion of the stapes (the last bone in the chain). This reduction in the motion of the stapes amounts to a real, rather than a perceived, reduction in the amplitude of the vibrations transmitted through the middle ear to the inner ear. The reflex serves to protect the sensitive inner ear from damage during exposure to loud sounds.

Unfortunately, although fast, the auditory reflex is not an instantaneous reaction. For low frequencies, the response takes from 20 to 40 ms to be elicited and therefore offers no protection against loud impulsive sounds like gunshots and explosions. With the onset of the auditory reflex the entire ear exhibits a marked change in acoustic impedance, which was observed by Geffcken in 1934 and measured by Bekesy and other researchers in subsequent years. It is argued, however, that the onset of the auditory reflex occurs only for sounds of very high intensity and therefore its effect on perception is limited. On the other hand, the same reflex can be voluntarily elicited by, for example, vocalizing. According to Lawrence A. Kinsler, it seems that the mechanical characteristics of the ear are mainly responsible for the response elicited by the auditory system and hence for sound perception. The exact nature of the auditory reflex, and the precise range over which it has the most effect, are beyond the scope of this article.

Perceived loudness of pure tones
The intensity and loudness of sound are two highly interdependent quantities. Loudness belongs to the psychological attributes of sound and intensity is a precisely defined and measurable physical quantity. Because of their strong similarity, the two quantities were once thought to be one and the same, since if one increases the intensity of a particular sound, the sound becomes louder. In the simplest and clearest terms: intensity is measured sound level and loudness is perceived sound level.

The measured sound level is expressed in terms of intensity and intensity level, while the perceived sound level is expressed in terms of loudness and loudness level.

Sound intensity is defined as the acoustic power per unit area and is measured in watts per square meter:
 * $$I=\frac{\text{Power}}{\text{Unit Area}}\quad\left[\frac{W}{m^2}\right]$$

However, the human ear is capable of detecting sound intensities ranging from 1×10⁻¹² W m⁻² to 1×10² W m⁻² (above which permanent damage to the ear will occur). This gives a scale in which the maximum value is 100 000 000 000 000 (10¹⁴) times larger than the minimum.

In order to provide more insight and get around the cumbersome numbers, we use the Intensity Level IL, which is defined as the intensity relative to the reference intensity of 1×10⁻¹² W m⁻² on a logarithmic scale, and it has units of decibels.
 * $$I_L=10\log\left(\frac{I}{I_{ref}}\right)\quad[dB]$$
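The definition above can be tried out numerically. The following is only a minimal sketch: the function name `intensity_level` and the constant `I_REF` are illustrative choices, the logarithm is taken to base 10, and the reference is taken to be the conventional 1×10⁻¹² W m⁻².

```python
import math

I_REF = 1e-12  # assumed reference intensity in W/m^2 (conventional threshold of hearing)


def intensity_level(intensity, i_ref=I_REF):
    """Return the intensity level in dB for an intensity given in W/m^2."""
    if intensity <= 0:
        raise ValueError("intensity must be positive")
    return 10.0 * math.log10(intensity / i_ref)


if __name__ == "__main__":
    # Sweep from the threshold of hearing up to the damage limit quoted in the text.
    for intensity in (1e-12, 1e-6, 1.0, 1e2):
        print(f"{intensity:g} W/m^2 -> {intensity_level(intensity):.0f} dB")
```

On this scale the range of intensities quoted above collapses to the far more manageable span of 0 dB to 140 dB.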

For plane waves in air at standard temperature and pressure, acoustic pressure and intensity are linked by the following relation:


 * $$I=\frac{P^2}{\rho c}$$

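Here P is the (root mean square) acoustic pressure, ρ is the air density and c is the speed of sound in air. Substituting this relation into the definition of the intensity level gives: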
 * $$I_L=10\log\frac{\left(\frac{P^2}{\rho c}\right)}{\left(\frac{P_{ref}^2}{\rho c}\right)}=10\log{\left(\frac{P}{P_{ref}}\right)}^2=20\log\frac{P}{P_{ref}}=SPL$$

The expression on the right is called the Sound Pressure Level (SPL); it is the counterpart of the Intensity Level, expressed in terms of acoustic pressure. The reference pressure used is 20 μPa, which is very close to the average minimum audible acoustic pressure in air in the absence of any noise. It is important to note that the minimum audible pressure is averaged over multiple subjects; for a certain percentage of the population, negative sound pressure levels are therefore perceptible, i.e. they can perceive sound pressures smaller than the reference pressure. The chosen reference pressure corresponds to the reference intensity through the aforementioned relationship, so that for plane waves SPL and IL are, for practical purposes, identical.
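How closely SPL and IL agree can be checked with a few lines of code. This is a sketch under stated assumptions: the air density (1.21 kg/m³) and speed of sound (343 m/s) are typical room-temperature values that do not come from the text, pressures are taken as rms values, and all function names are illustrative.

```python
import math

P_REF = 20e-6  # reference pressure in Pa (from the text)
I_REF = 1e-12  # assumed reference intensity in W/m^2
RHO = 1.21     # assumed air density in kg/m^3 (about 20 degrees C)
C = 343.0      # assumed speed of sound in m/s (about 20 degrees C)


def spl(p_rms):
    """Sound pressure level in dB for an rms pressure in Pa."""
    return 20.0 * math.log10(p_rms / P_REF)


def plane_wave_intensity(p_rms, rho=RHO, c=C):
    """Intensity in W/m^2 of a plane wave with the given rms pressure."""
    return p_rms ** 2 / (rho * c)


def intensity_level(intensity):
    """Intensity level in dB for an intensity in W/m^2."""
    return 10.0 * math.log10(intensity / I_REF)


if __name__ == "__main__":
    for p in (20e-6, 0.02, 1.0, 20.0):
        il = intensity_level(plane_wave_intensity(p))
        print(f"p = {p:g} Pa: SPL = {spl(p):.2f} dB, IL = {il:.2f} dB")
```

With these assumed air properties the two levels differ by a constant of roughly 0.2 dB, because 20 μPa corresponds to slightly less than 1×10⁻¹² W m⁻². The exact offset depends on the values chosen for ρ and c.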

Qualitative expressions such as loud, not very loud, and extremely loud are used to describe loudness. Although these expressions are adequate for describing the sensation in any specific individual, they do a very poor job of quantifying the result. These qualitative expressions have been made quantitative for pure tones, i.e. sinusoidal waves, through the use of Loudness Level and Loudness.

The loudness level of a particular test tone is an indirect measure of loudness, and it is defined as the Sound Pressure Level (SPL) of a 1000 Hz pure tone that sounds as loud as the test tone. The 1000 Hz tone was chosen arbitrarily and retained as the standard. The Loudness Level is measured in phons. The Loudness Level of the just audible 1000 Hz tone is 3 phons, because the minimum perceptible SPL of a 1 kHz tone is 3 dB. Increments in phons are logarithmic because the SPL is measured in decibels.

The loudness level is very useful in quantifying the sensation; however, it does not describe the relation between sounds of different loudness levels. In other words, it gives no insight into how much louder a sound of, e.g., 50 phons is than a sound of 20 phons. To get around this problem, we use Loudness, which has units of sones. Loudness is based on the 40 dB, 1000 Hz pure tone, which is defined to have a loudness of 1 sone. The Loudness scale is derived by increasing or decreasing the SPL of the 1 kHz tone until it "sounds twice as loud as before" or "half as loud", etc. Successive doubling and halving of the loudness creates the rest of the scale. The Loudness of the remaining tones is determined by the same equal-loudness judgment that provides the Loudness Levels.
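The relation between phons and sones can be illustrated with a commonly quoted rule of thumb, which is not stated in the text and is introduced here as an assumption: above roughly 40 phons, every increase of 10 phons doubles the loudness in sones. The function names below are illustrative, and the rule is only an approximation that becomes poor at low levels.

```python
import math


def phons_to_sones(phons):
    """Approximate loudness in sones (assumed rule: +10 phons doubles loudness).

    Anchored at 40 phons = 1 sone, as in the text; reasonable only above ~40 phons.
    """
    return 2.0 ** ((phons - 40.0) / 10.0)


def sones_to_phons(sones):
    """Inverse of phons_to_sones under the same assumed rule."""
    if sones <= 0:
        raise ValueError("loudness in sones must be positive")
    return 40.0 + 10.0 * math.log2(sones)


if __name__ == "__main__":
    for level in (40, 50, 60, 80):
        print(f"{level} phons -> {phons_to_sones(level):g} sones")
```

Under this assumed rule a tone of 60 phons would be judged about four times as loud as one of 40 phons, even though its loudness level is only 1.5 times larger, which is why the sone scale is needed in addition to the phon scale.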

Loudness and Loudness Level are best illustrated, and are most useful, when plotted against the SPL of pure tones in what are called equal loudness contours, or Fletcher-Munson curves, so named after the early researchers. The way loudness is measured, however, has been significantly altered and standardized since such measurements were first made.

The cochlea
The cochlea, or inner ear, constitutes the hydrodynamic part of the ear. It is a small, hollow, snail-shaped member formed from bone and filled with colorless liquid. It has an uncoiled length of about 35 mm and a cross-sectional area of about 4 mm² at the end closest to the middle ear, which tapers off to about 1 mm² at the far end. It contains two different fluids in three channels that run together from the base, at the stapes, to the apex of the cochlea; two of the channels, however, are separated only by Reissner's membrane, which is thin and flexible enough to be neglected from a hydromechanical point of view. The vibrations are transmitted directly from the base-plate of the stapes, the last of the three ossicles, to the contained fluid. The cochlea is divided down the middle by the cochlear partition, which is partly bony and partly formed by the gelatinous basilar membrane. It is on this membrane that the organ of Corti and the well-known hair cells reside.

The basilar membrane
As previously mentioned, the basilar membrane is a flexible gelatinous membrane that divides the cochlea longitudinally. It is the flexible part of the cochlear partition (the other part being rather bony), and it carries about 25 000 nerve endings attached to numerous hair cells arranged on the surface of the membrane. It extends from the base to just before the apex of the cochlea, at which point it terminates at the helicotrema. This creates two hydromechanically distinct channels, with the baseplate of the stapes attached to the entrance of the upper channel at the oval window, and a highly flexible membrane called the round window sealing the lower channel. The two channels connect at the apex through the helicotrema, which is basically a gap through the cochlear partition.



What's up with all the hair?
The hair cells that populate the top surface of the basilar membrane are the last link in the chain that transforms the mechanical energy of the acoustic wave into electrical impulses. These cells are arranged in an inner row and an outer row in the organ of Corti (which runs along the basilar membrane), and they are surrounded by electrically charged cells at different potentials (synapses).

Each group of hair cells along the basilar membrane responds to a limited range of frequencies. Due to the similarity between this behavior of the inner ear and the behavior of a band-pass filter, these frequency ranges have been named critical bandwidths.

Non Linearity | Part 2
Now that a little bit more has been presented about the workings of the inner ear, more peculiarities of the idiosyncratic auditory system can be illustrated, starting with a non linear effect that is fairly common and very noticeable when it occurs. It is the phenomenon of beating.

Beating Phenomena
Beating phenomena are characteristic of multiple-degree-of-freedom systems in which the various degrees of freedom are coupled to some extent and which receive two harmonic excitations at slightly different frequencies. The excitations can be summed as follows:
 * $$x=A_1e^{j(\omega_1 t+\Phi_1)}+A_2e^{j(\omega_2 t+\Phi_2)}$$

The resulting vibration is no longer simple harmonic. The inner ear is a continuous system, with the basilar membrane serving as a complicated band-pass filter that separates frequencies. When one or both ears are exposed to sound that consists of two tones with a slight difference in their frequencies, the non-uniform distribution and strong localization of the hair cells on the surface of the basilar membrane result in the same group (or critical bandwidth) of hair cells being excited by both tonal components of the incident sound.

As a result, the listener perceives the combined sound as a single-frequency tone with periodically varying intensity. This is known as beating. The tones remain indistinguishable until the frequency separation between them is greater than the critical bandwidth. It is interesting to note that if the two tones are presented separately, one to each ear, then no beating occurs and the ear is able to resolve the difference.
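The periodic variation in intensity follows directly from summing the two excitations. The sketch below assumes zero initial phases and uses illustrative names and frequencies (440 Hz and 444 Hz) that are not from the text. It evaluates the magnitude of the complex sum, which varies slowly at the difference frequency and acts as the envelope of the combined tone.

```python
import math


def beat_frequency(f1, f2):
    """Rate in Hz at which the combined tone swells and fades."""
    return abs(f1 - f2)


def two_tone(t, f1, f2, a1=1.0, a2=1.0):
    """Real part of A1*exp(j*w1*t) + A2*exp(j*w2*t), with zero initial phases."""
    return (a1 * math.cos(2.0 * math.pi * f1 * t)
            + a2 * math.cos(2.0 * math.pi * f2 * t))


def envelope(t, f1, f2, a1=1.0, a2=1.0):
    """Magnitude of the complex sum, i.e. the slowly varying amplitude."""
    dphi = 2.0 * math.pi * (f1 - f2) * t
    squared = a1 ** 2 + a2 ** 2 + 2.0 * a1 * a2 * math.cos(dphi)
    return math.sqrt(max(0.0, squared))  # guard against tiny negative round-off


if __name__ == "__main__":
    f1, f2 = 440.0, 444.0  # hypothetical pair of tones 4 Hz apart
    print(f"beat frequency: {beat_frequency(f1, f2):g} Hz")
    for t in (0.0, 0.0625, 0.125, 0.1875, 0.25):
        print(f"t = {t:.4f} s, envelope = {envelope(t, f1, f2):.3f}")
```

For equal amplitudes the envelope swings between the sum of the amplitudes and zero once every 1/|f1 − f2| seconds, which is the swelling and fading that the listener hears.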
