Cognitive Science: An Introduction/Categorization

The concept of categorization is defined to be the process of organizing objects, ideas, and events into groups of similar attributes called categories. It is one of the most basic cognitive processes that humans use to aid in their interaction and perception of their environment. Over the years, many theories have been developed to illustrate how this process is modelled in the brain and how it influences other cognitive concepts such as perceptual processing, learning and decision making. Categorization is also a major area of study in artificial intelligence and computer vision by using software to create cognitive models.

Classical Categorization
The way in which cognitive categorization works is not definitive as there are many theories that are currently present. The oldest theory of this concept is called classical categorization. In this model, a certain set of attributes must be met in order for an object or idea to be considered in a category. As an example, for an object to be considered a bird, it must have feathers, lay eggs and have a beak. That object would not be in the category if it does not meet any of the category's criteria.

Prototype Theory
A more modern theory of categorization was developed by psychologist Eleanor Rosch in the 1970s. This theory, called the prototype theory, is defined as an object or idea being in a category if it resembles a certain representation or a prototype. This differs from the classical view as an object does not need to meet all the rules of a category, and these prototypes or models can change over time. The theory also states that there are three levels of categorization. The top level, also known as the superordinate category, consists of a high degree of generality and inclusion. For example, the superordinate category of a dog is a mammal. The next level down is basic-level classification, where entities in the superordinate category are similar but still can be easily divided by important characteristics. This is where a dog and a cat would be different categories. The subordinate category is the lowest level category with a low degree of generality and inclusion. The differentiation of dog breeds would be at this level.

In Infants
Visual categorization is exhibited from an early age. Research shows that infants are shown to categorize basic colours before the comprehension of language. Habituation is used to investigate infant perception, and using this technique, Bornstein, Kessen and Weiskopf discovered that at four months, humans were able to distinguish between blue, red, yellow and green.

In addition to colours, infants can categorize visual objects exceptionally well. Research shows that categorization processes vary between two and three dimensional stimuli, and eighteen-month-old infants are able to categorize three dimensional objects better than two dimensional stimuli. Booth, Schuler, and Zajicek demonstrated that object function aids in category formation in sixteen-month-olds. Infants were unable to categorize objects when the object performed a different movement than was previously shown to them. It has been speculated that maternal odour and multisensory input enhances visual categorization abilities in infants. However, in a study, it was shown that maternal odour does not enhance four-month-old infants’ ability to categorize cars from other objects using scalp electroencephalogram (EEG).

In the Elderly
Aging influences cognitive abilities and humans of old age are affected by cognitive decline. In a 2016 study, seventeen young and ten elderly subjects were tasked to categorize visuals into two abstract categories. The elderly subjects had difficulty categorizing the exceptions that did not fit in the categories presented to them, but performance for categorizing prototypical stimuli stayed intact due to higher fixation rates to visual features. This demonstrates that subjects of old age have higher perceptual abilities that compensate for their cognitive decline.

Development of Social Categories
Before children enter the formal education system, they have already developed implicit and explicit social preferences and categories. Factors that compose these social categories include gender, race, and spoken language. From a young age, humans tend to interact with members of society that are in the same social category as them. Children make social inferences based social categories and expect members in a group to share beliefs, traits and norms. In a recent study, children with a variety of different ages were asked which people would likely break social conventions. The study showed that children ages three to eleven expected only in-group members to follow social norms while seven to eleven year olds used social group membership to determine who would or wouldn't follow social conventions. Therefore, younger children have a basic view of social categories which develop in later childhood.

Children also categorize out-group members more strongly than in-group members. Woo, Quinn, Méary, Lee, and Pascalis conducted a study where fifty nine- and ten-year-old Malay and Malay Chinese children and forty Malay and Malay Chinese adults with limited exposure to Caucasian individuals had to categorize Caucasian, Malaysian and Chinese faces. The results show that the children were able to distinguish between Caucasian faces easily but had trouble categorizing Chinese and Malaysian individuals. Adults on the other hand were able to distinguish faces at varying speeds (from fastest to slowest Caucasian, Chinese and Malaysian). This illustrates that children have a basic view of races but do have a concept of social categories in which in-group members are members in their society (Malay and Chinese). Adults on the other hand, are able to distinguish other races other than theirs due to other-race categorization advantage where in which their own race faces are easier to differ from other races.

Video Categorization
Software and computer systems heavily use the prototype theory and conceptual clustering in their categorization efforts. In a 2018 study, a new method was created to categorize videos into different genres. These genres include: Animation, Outlet, Sports, e-Learning, Medical, Weather, Defense, Economics, Animal Planet and Technology. This method consisted of a new combination of a fuzzy set of attributes to study shapes of video input. A certain genre may have a certain combination of shapes detected in the frame of the video input. The correlation features are then extracted from each category, and translated into a feature matrix. This feature matrix goes through a neural network to determine what genre or category the video is in.

Text Categorization
In addition to video categorization, text categorization is crucial for search engines, spam filtering as well as document indexing. Text categorization is defined to be the process of organizing unlabeled textual documents into categories that are predefined. Many machine learning algorithms have been created using many techniques such as Naïve Bayes, k-nearest neighbors, neural networks and Support Vector Machines.

Before training a model to predict which documents are organized into what categories, the training data is often preprocessed. This phase entails highlighting key words or phrases in text that differentiate textual categories between one another. A text document first gets tokenized, the process of removing whitespace between words and replacing them with a slash. All the frequently used words that carry no information are removed (stop word filtering), and prefixes and suffixes of words are also stripped leaving the root of the word to be processed (stemming).

As mentioned above, there are a variety of algorithms that have been used in categorizing text. However, Bagging and Adaboost algorithms have been shown to be better performers for stability and accuracy in multiple studies. Bagging is a technique of creating different models from different training datasets. Conversely, Adaboost applies weights to these models based on how accurate each model is to correctly predicting the category of a document.