Engineering Acoustics/Human Voice Production

Physiology of Vocal Fold
The human vocal folds are a pair of lip-like tissues located inside the larynx, and they are the sound source for humans and many animals. The larynx sits at the top of the trachea. It is composed mainly of cartilage and muscle, and its largest cartilage, the thyroid cartilage, is well known as the "Adam's apple."

The organ has two main functions: to act as the last protector of the airway, and to act as the sound source for the voice. This page focuses on the latter function. The following image shows a cross section of the vocal folds. The three-dimensional geometry was built from CT scan data.



Links on Physiology: Discover The Larynx

Voice Production
Although the science behind sound production by the vocal folds is complex, it can be thought of as similar to a brass player's lips or a whistle made from blades of grass. The vocal folds (or the lips, or the pair of grass blades) form a constriction in the airflow, and as air is forced through the narrow opening, the vocal folds oscillate. This causes a periodic change in air pressure, which is perceived as sound.

Vocal Folds Video

When airflow reaches the vocal folds, it forces open the two folds, which are initially nearly closed. Because of their stiffness, the folds then tend to close the opening again, the airflow forces them open once more, and the cycle repeats. This creates an oscillation of the vocal folds, which in turn, as stated above, creates sound. However, this is a damped oscillation, meaning that it will eventually settle at an equilibrium position and stop. So how are we able to "sustain" sound?

As shown later, the answer seems to lie in the changing shape of the vocal folds. The folds have different shapes in the opening and closing stages of the oscillation. This affects the pressure in the opening and creates the extra pressure needed to push the vocal folds open and sustain the oscillation. This is explained in more detail in the "Single mass model" section.
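The energy argument can be illustrated with a minimal numerical sketch. It is not the model developed on this page: it assumes a generic mass-spring-damper with made-up, non-physiological parameter values, and it replaces the shape-dependent glottal pressure with a force that simply takes one value while the fold moves outward and a smaller value while it moves inward. The function and parameter names are hypothetical.

```python
def simulate(f_open, f_close, m=1.0, c=0.5, k=100.0,
             x0=0.02, dt=1e-3, t_end=20.0):
    """Integrate m*x'' + c*x' + k*x = F with semi-implicit Euler.

    F equals f_open while the mass moves outward (v > 0) and f_close
    otherwise. All values are illustrative, not physiological.
    """
    x, v = x0, 0.0
    history = []
    for _ in range(int(round(t_end / dt))):
        force = f_open if v > 0.0 else f_close
        a = (force - c * v - k * x) / m
        v += a * dt   # update the velocity first ...
        x += v * dt   # ... then the position (semi-implicit Euler)
        history.append(x)
    return history


def late_amplitude(history, n_last=2000):
    """Half the peak-to-peak displacement over the last n_last samples."""
    tail = history[-n_last:]
    return 0.5 * (max(tail) - min(tail))


# Same force in both directions: the oscillation dies out.
damped = late_amplitude(simulate(f_open=1.0, f_close=1.0))
# Larger force while opening than while closing: net energy input per cycle.
sustained = late_amplitude(simulate(f_open=2.0, f_close=0.0))
print(f"damped: {damped:.2e}, sustained: {sustained:.2e}")
```

With a direction-independent force the motion decays toward equilibrium, whereas the asymmetric force feeds energy into every cycle until damping balances it, so the amplitude settles at a finite value.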

This flow-induced oscillation, like many fluid mechanics problems, is not easy to model. Numerous attempts have been made to model the oscillation of the vocal folds, ranging from a single mass-spring-damper system to finite element models. This page uses a single-mass model to explain the basic physics behind the oscillation of a vocal fold.

Information on vocal fold models: National Center for Voice and Speech

Single mass model


Figure 1: Schematics

The simplest way to simulate the motion of the vocal folds is to use a single mass-spring-damper system, as shown above. The mass represents one vocal fold, and the second vocal fold is assumed to be its mirror image about the axis of symmetry. Position 3 represents a location immediately past the exit (the end of the mass), and position 2 represents the glottis (the region between the two vocal folds).

The Pressure Force
The major driving force behind the oscillation of the vocal folds is the pressure in the glottis. Bernoulli's equation from fluid mechanics states that, along a streamline:

$$P + \frac{1}{2}\rho U^2 + \rho gh = \text{constant}$$ -EQN 1

Neglecting the gravitational potential term and applying EQN 1 to positions 2 and 3 of Figure 1,

$$P_2 + \frac{1}{2}\rho U_2^2 = P_3 + \frac{1}{2}\rho U_3^2$$ -EQN 2

The pressure and the velocity at position 3 are taken to be fixed, which makes the right-hand side of EQN 2 constant. EQN 2 then shows that an oscillating pressure at position 2 requires an oscillating velocity at position 2. The flow velocity inside the glottis can be studied using the theory of orifice flow.
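As a small worked example, EQN 2 can be rearranged for the glottal pressure, $$P_2 = P_3 + \frac{1}{2}\rho (U_3^2 - U_2^2)$$. The sketch below evaluates this; the function name, the air density of 1.2 kg/m³ and the velocities are illustrative assumptions, and pressures are gauge pressures in pascals.

```python
def glottal_pressure(p3, u3, u2, rho=1.2):
    """Return P_2 from EQN 2: P_2 = P_3 + 0.5*rho*(U_3**2 - U_2**2).

    rho = 1.2 kg/m^3 is an assumed air density.
    """
    return p3 + 0.5 * rho * (u3**2 - u2**2)


# Flow faster in the glottis than at the exit: glottal pressure drops.
p2_fast = glottal_pressure(p3=0.0, u3=10.0, u2=30.0)
# Flow slower in the glottis than at the exit: glottal pressure rises.
p2_slow = glottal_pressure(p3=0.0, u3=10.0, u2=5.0)
print(p2_fast, p2_slow)
```

With conditions at position 3 held fixed, any change in the glottal velocity shows up directly as a change in the glottal pressure.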

The constriction of airflow at the vocal folds is much like an orifice flow with one major difference: with vocal folds, the orifice profile is continuously changing. The orifice profile for the vocal folds can open or close, as well as change the shape of the opening. In Figure 1, the profile is converging, but in another stage of oscillation it takes a diverging shape.

The orifice flow is described by Blevins as:

$$ U = C\sqrt{\frac{2(P_1 - P_3)}{\rho}}$$ -EQN 3

Where the constant C is the orifice coefficient, governed by the shape and the opening size of the orifice. This number is determined experimentally, and it changes throughout the different stages of oscillation.
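The orifice relation can be evaluated directly. The sketch below assumes the usual square-root form, $$U = C\sqrt{2(P_1 - P_3)/\rho}$$, which is the dimensionally consistent reading of EQN 3. The function name, the coefficient value of 0.6, the 800 Pa pressure drop and the air density are illustrative assumptions, not measured values.

```python
import math


def orifice_velocity(p1, p3, c_orifice, rho=1.2):
    """Orifice flow velocity U = C*sqrt(2*(P_1 - P_3)/rho).

    c_orifice is the experimentally determined orifice coefficient;
    rho = 1.2 kg/m^3 is an assumed air density.
    """
    if p1 < p3:
        raise ValueError("expected p1 >= p3 for flow from position 1 to 3")
    return c_orifice * math.sqrt(2.0 * (p1 - p3) / rho)


# Illustrative numbers only: 800 Pa pressure drop, C = 0.6.
u = orifice_velocity(p1=800.0, p3=0.0, c_orifice=0.6)
print(f"U = {u:.2f} m/s")
```

Because C changes with the shape and size of the opening, a simulation would call such a function with a different coefficient at each stage of the oscillation.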

Solving equations 2 and 3, the pressure force throughout the glottal region can be determined.

The Collision Force
As the video of the vocal folds shows, the folds can close completely during oscillation. When this happens, the Bernoulli equation no longer applies, and the collision force becomes the dominant force. For this analysis, the Hertz collision model was applied:

$$ F_H = k_H \delta^{3/2} (1 + b_H \delta')$$ -EQN 4

where

$$ k_H = \frac{4}{3} \frac{E}{1 - \mu_H^2} \sqrt{r}$$

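Here $$\delta$$ is the penetration distance of the vocal fold past the line of symmetry and $$\delta'$$ is its rate of change. In the usual Hertz contact notation, $$E$$ is Young's modulus, $$\mu_H$$ is Poisson's ratio, $$r$$ is the radius of curvature of the contacting surface, and $$b_H$$ is a damping coefficient.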
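EQN 4 translates into a short function. The sketch below assumes that $$\delta'$$ is the time derivative of the penetration, that $$E$$, $$\mu_H$$ and $$r$$ are Young's modulus, Poisson's ratio and a contact radius, and that the force is zero when the folds are not in contact. The function names and the tissue values are hypothetical.

```python
import math


def hertz_stiffness(young_modulus, poisson_ratio, radius):
    """k_H = (4/3) * E / (1 - mu_H**2) * sqrt(r)."""
    return (4.0 / 3.0) * young_modulus / (1.0 - poisson_ratio**2) * math.sqrt(radius)


def hertz_force(delta, delta_dot, k_h, b_h):
    """Collision force of EQN 4; zero when there is no penetration."""
    if delta <= 0.0:
        return 0.0
    return k_h * delta**1.5 * (1.0 + b_h * delta_dot)


# Illustrative tissue values only (assumptions, not measurements).
k_h = hertz_stiffness(young_modulus=5.0e3, poisson_ratio=0.4, radius=1.0e-3)
f_small = hertz_force(delta=1.0e-4, delta_dot=0.0, k_h=k_h, b_h=0.1)
f_large = hertz_force(delta=4.0e-4, delta_dot=0.0, k_h=k_h, b_h=0.1)
print(f_small, f_large, f_large / f_small)
```

The 3/2 power makes the force grow faster than linearly: quadrupling the penetration multiplies the static force by eight.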

Simulation of the Model
The pressure and the collision forces were inserted into the equation of motion, and the result was simulated.
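
The equation of motion itself is not written out above. As a sketch only, assuming the standard form for a single mass-spring-damper system, with the symbols $$m$$, $$b$$, $$k$$, $$x$$ and $$F_P$$ introduced here for illustration rather than taken from the original model, it would read:

$$ m\ddot{x} + b\dot{x} + kx = F_P + F_H $$

Here $$x$$ is the displacement of the mass from its rest position, $$F_P$$ is the pressure force obtained from EQN 2 and EQN 3, and $$F_H$$ is the collision force of EQN 4, which acts only while the folds are in contact. The sign conventions in the original simulation may differ.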



Figure 2: Area Opening and Volumetric Flow Rate

Figure 2 shows that an oscillating volumetric flow rate was achieved by passing a constant airflow through the vocal folds. The simulation showed that the collision force limits the amplitude of the oscillation rather than driving it. This indicates that the pressure force is what sustains the oscillation.

The Acoustic Output
This model showed that the changing profile of glottal opening causes an oscillating volumetric flow rate through the vocal folds. This will in turn cause an oscillating pressure past the vocal folds. This method of producing sound is unusual, because in most other means of sound production, air is compressed periodically by a solid such as a speaker cone.

Past the vocal folds, the sound enters the vocal tract, which consists of the oral cavity and the nasal cavity. These cavities act as acoustic filters, modifying the character of the sound. The acoustics of the vocal tract have traditionally been described using source-filter theory: whereas the glottis produces a sound containing many frequencies, the vocal tract selects a subset of these frequencies for radiation from the mouth. This filtering gives each person's voice its unique character.

Two Mass Model
The basic two-mass model is shown in Figure 3, and the two-mass model of the vocal fold is shown in Figure 4.





Three Mass Model
Figure 5 shows the three-mass model of the vocal fold.

Rotating Plate Model
The motion of the vocal fold cover can be described with two degrees of freedom: the rotation $$\theta$$ of the cover mass $$M_2$$ and its displacement $$r$$. The equations of motion are:


 * $$ I_c\ddot{\theta}+ B \dot{\theta} +k \theta= T$$


 * $$ M_2\ddot{r}+b(\dot{r}-\dot{r}_b)+k_2(r-r_b)=F$$

where, in these equations:

$$T$$ is the applied aerodynamic torque

$$I_c$$ is the moment of inertia of the rotating cover

$$B$$ is the rotational damping

$$k$$ is the rotational stiffness

$$k_2$$ is the translational stiffness

$$b$$ is the translational damping

$$F$$ is the force

$$M_2$$ is the cover mass

$$r_b$$ is the displacement of the body

$$r$$ is the displacement of the cover

The equation of motion for the body mass can be written as:
 * $$ M_1\ddot{r}_b+b(\dot{r}_b-\dot{r})+k_2(r_b-r)+K_1r_b+B\dot{r}_b=0$$

where:

$$K_1$$ is the body stiffness

$$M_1$$ is the body mass

$$B$$ is the body damping (the symbol is reused here; it is not the rotational damping of the cover equation)
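The three equations of motion can be collected into one first-order system for numerical integration. The sketch below is a minimal illustration under stated assumptions: the cover mass in the translational equation is taken to be $$M_2$$, the rotational and body damping are given separate hypothetical names (`B_rot`, `B_body`) because the text uses $$B$$ for both, and every parameter value is made up.

```python
def body_cover_derivatives(state, torque, force, p):
    """Time derivatives of (theta, theta_dot, r, r_dot, r_b, r_b_dot).

    p is a dict of model parameters; its key names are hypothetical.
    """
    theta, theta_dot, r, r_dot, r_b, r_b_dot = state
    # Cover rotation: I_c*theta'' + B*theta' + k*theta = T
    theta_ddot = (torque - p["B_rot"] * theta_dot - p["k_rot"] * theta) / p["I_c"]
    # Cover translation: M_2*r'' + b*(r' - r_b') + k_2*(r - r_b) = F
    r_ddot = (force - p["b"] * (r_dot - r_b_dot) - p["k2"] * (r - r_b)) / p["M2"]
    # Body: M_1*r_b'' + b*(r_b' - r') + k_2*(r_b - r) + K_1*r_b + B*r_b' = 0
    r_b_ddot = -(p["b"] * (r_b_dot - r_dot) + p["k2"] * (r_b - r)
                 + p["K1"] * r_b + p["B_body"] * r_b_dot) / p["M1"]
    return (theta_dot, theta_ddot, r_dot, r_ddot, r_b_dot, r_b_ddot)


# Illustrative parameter values only (not physiological data).
params = {"I_c": 1e-9, "B_rot": 1e-6, "k_rot": 0.02,
          "M2": 1e-5, "b": 0.01, "k2": 100.0,
          "M1": 1e-4, "K1": 50.0, "B_body": 0.02}
T, F = 1e-6, 0.01

# Static equilibrium under constant loads: theta = T/k, r_b = F/K_1,
# r = r_b + F/k_2. All derivatives should vanish there.
r_b_eq = F / params["K1"]
equilibrium = (T / params["k_rot"], 0.0, r_b_eq + F / params["k2"], 0.0, r_b_eq, 0.0)
derivs = body_cover_derivatives(equilibrium, T, F, params)
print(derivs)
```

The returned tuple can be passed to any explicit time-stepping scheme. Checking that it vanishes at the static equilibrium is a quick sanity test of the signs in the coupling terms.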

Lumped-element flow circuit for the vocal tract
The image below shows the lumped-element flow circuit for the vocal tract airway. The input impedance of the vocal tract can be represented by resistive and inertive lumped elements. From the circuit shown, we have:




 * $$ P_L-R_s u-I_s\dot{u} -\frac{1}{2} \rho u^2/a_g^2-R_eu-I_e \dot{u}=0$$

where

$$P_L$$ is the steady lung pressure

$$u$$ is the volume flow through the glottis and $$a_g$$ is the glottal area

$$\rho$$ is the air density

$$R_s$$ is the subglottal resistance

$$I_s$$ is the subglottal inertance

$$R_e$$ is the supraglottal (epilaryngeal) input resistance

$$I_e$$ is the supraglottal (epilaryngeal) input inertance
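
For a steady flow the inertance terms vanish ($$\dot{u}=0$$) and the circuit equation reduces to a quadratic in $$u$$. The sketch below solves it for the non-negative root, reading $$u$$ as the volume flow and $$a_g$$ as the glottal area. The function name and all numerical values are illustrative assumptions in SI units.

```python
import math


def steady_glottal_flow(p_lung, a_g, r_s, r_e, rho=1.2):
    """Non-negative root of P_L - (R_s + R_e)*u - 0.5*rho*u**2/a_g**2 = 0.

    rho = 1.2 kg/m^3 is an assumed air density.
    """
    k = 0.5 * rho / a_g**2   # coefficient of the kinetic pressure term
    r = r_s + r_e            # total series resistance
    return (-r + math.sqrt(r * r + 4.0 * k * p_lung)) / (2.0 * k)


# Illustrative values: 800 Pa lung pressure, 0.1 cm^2 glottal area.
p_lung, a_g, r_s, r_e = 800.0, 1.0e-5, 1.0e5, 1.0e5
u_steady = steady_glottal_flow(p_lung, a_g, r_s, r_e)
residual = p_lung - (r_s + r_e) * u_steady - 0.5 * 1.2 * u_steady**2 / a_g**2
print(f"u = {u_steady:.3e} m^3/s, residual = {residual:.2e} Pa")
```

When $$a_g$$ varies in time, the full equation with the inertance terms becomes an ordinary differential equation for $$u$$, and this steady value is the flow it relaxes toward for a fixed opening.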