Issues in Interdisciplinarity 2020-21/Evidence in Intelligence Testing

Introduction
The means by which intelligence is defined and tested have significant implications for individuals' livelihoods in modern society, so the subject has been a source of controversy.

Intelligence is an abstract concept formulated from variable aspects, and different disciplines have conflicting views on what counts as evidence when measuring it. Some even argue that intelligence testing should cease. However, it can be a valuable theoretical and practical tool for studying mental development and assessing cognitive abilities. Resolving interdisciplinary conflicts through integration could produce more holistic approaches to intelligence testing that benefit individuals and society. Tests need to be universal across machines, animals and humans, and therefore across disciplines, with results standardised against humans. Reaching a common understanding of what counts as evidence of intelligence would mean we can be certain when a machine has achieved intelligence.

This Wikibook chapter looks at evidence in intelligence testing from the perspectives of psychology, anthropology and sociology, and computer science, with the goal of highlighting the conflicts between them.

Evidence in Intelligence Testing in Psychology


In psychology, the most common measure of intelligence is the Intelligence Quotient (IQ). There are multiple modern IQ tests, the most common in the English-speaking world being the Wechsler Intelligence Scale, which has different versions for adults and children. David Wechsler believed intelligence to be made up of interconnecting elements of cognitive ability that could be isolated and measured. The current version of the adult test, WAIS-IV, is composed of four indexes: verbal comprehension, perceptual reasoning, working memory, and processing speed.

The IQ score is calculated by comparing an individual's performance against the average. Scores are normally distributed, with the average score being the most common (a mean of 100 and a standard deviation of 15 IQ points). IQ scores are therefore estimates of intelligence rather than direct measures. IQ tests have been shown to have high statistical reliability, with a confidence interval of approximately 10 points and a standard error of 3 points. Performance in IQ tests can vary on an individual basis due to external factors such as motivation and anxiety, which casts doubt on their validity. Nevertheless, IQ scores correlate with performance in jobs and schools, and some psychologists regard this as sufficient evidence that IQ tests are viable for practical use in education and employment.
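The arithmetic behind such a score can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw score and norm-group values are hypothetical, real tests convert raw scores using age-based norm tables, and the 90% confidence level is an assumption chosen because, with a standard error of 3 points, it gives an interval roughly 10 points wide.

```python
from statistics import NormalDist

MEAN_IQ = 100  # mean of the IQ scale
SD_IQ = 15     # standard deviation of the IQ scale


def deviation_iq(raw_score, norm_mean, norm_sd):
    """Rescale a raw score to the IQ scale via its z-score in the norm group."""
    z = (raw_score - norm_mean) / norm_sd
    return MEAN_IQ + SD_IQ * z


def confidence_interval(iq, sem=3.0, level=0.90):
    """Interval around an observed IQ, given the standard error of measurement.

    The 90% level is an assumption; the text does not state one.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_crit * sem
    return iq - margin, iq + margin


def percentile(iq):
    """Percentage of the normal distribution scoring at or below this IQ."""
    return 100 * NormalDist(MEAN_IQ, SD_IQ).cdf(iq)


# Hypothetical example: a raw score of 60 where the norm group
# has a mean of 50 and a standard deviation of 10.
iq = deviation_iq(60, norm_mean=50, norm_sd=10)
low, high = confidence_interval(iq)
print(f"IQ {iq:.0f}, interval {low:.1f} to {high:.1f}, percentile {percentile(iq):.1f}")
```

The point of the sketch is that an IQ score only states where a person sits relative to a comparison group, and that any single score carries a margin of error of several points.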

Other psychologists have been critical of IQ tests, with Wayne Weiten stating "IQ tests are valid measures of the kind of intelligence necessary to do well in academic work. But if the purpose is to assess intelligence in a broader sense, the validity of IQ tests is questionable." IQ tests can measure some forms of intelligence, but broader aspects, including creativity and emotional intelligence, are unaccounted for. Alternative tests have been developed in response to these criticisms, such as the Mayer-Salovey-Caruso Emotional Intelligence Test.

An Anthropological and Sociological Perspective
Anthropological and sociological understandings focus on an individual's interactions with their wider culture and society. Since intelligence is a contested concept whose definitions vary both across and within cultures, those within anthropology argue that tests measuring intelligence must be emic, i.e. derived from within the culture of the individual being tested. Due to qualitative differences between cultures, some argue that quantitative comparisons made across cultures are unhelpful. Taking an anthropological perspective, Sternberg and Kaufman have argued that cultures "designate as “intelligent” the cognitive, social and behavioural attributes that they value as adaptive to the requirements of living in those cultures". Similarly, some sociologists argue that intelligence is a social construct, and that evidence of it is the product of a particular sociohistorical context concerned with issues of social stratification and inequality.

In relation to measuring intelligence, those taking anthropological and sociological perspectives have highlighted the modifiability of intelligence, along with other external factors that influence performance in particular testing methods. Berry and Irvine have emphasised the need to appreciate the different levels of context that influence the performance of intelligence, including ecological and experimental contexts. Their work has shown how cognitive styles develop in accordance with environmental demands. The experimental context (the context in which intelligence is being tested) therefore needs to align with the individual's own learning and everyday context. One example comes from Brazilian street children who relied on their mathematics skills to run their own businesses and survive, yet performed poorly when given the same mathematical problems in school, because the problems were presented abstractly and removed from their real-world context. Conventional IQ and psychometric-based tests are thus considered problematic because they assume that intelligence and cognitive performance are acontextual, especially when used to compare individuals from diverse cultural and economic backgrounds.

Evidence in Intelligence Testing in Computer Science
Language as Evidence

The imitation game follows from Descartes’ idea that versatility of language in new and challenging situations is the first test of intelligence. An interrogator, separated from a machine and a human, asks questions. Through its answers, the machine tries to trick the interrogator into believing that it is the human, while the human tries to help the interrogator guess correctly. If the machine is successful, this provides evidence that it is intelligent, according to Turing. The machine must mimic a human, pretending that it is unable to perform complex equations and faking natural-looking human spelling mistakes. The evidence for intelligence here is adaptability, rather than the ability to handle complexity emphasised by the g-factor view.

The HAL project measures conversational ability as evidence of intelligence and assigns the machine an estimate of human maturity. The machine’s speech is examined for evidence using measures including vocabulary size and response types.

Cybernetics

Cybernetics studies goal-focused systems (both animal and mechanical) in which communication, or conversation, between system and environment is a prerequisite for activity. The system’s response as it interacts with the changing environment is taken as evidence of intelligence. Evidence of intelligence is therefore focused on activity and purposeful behaviour rather than reason.

Universal Intelligence Test

A machine receives rewards for interacting with an environment. It must learn the structure of the environment and which actions are rewarded in order to maximise the amount it receives. Extra reward is given for applying Occam’s razor, which is an intuitive, yet intelligent, method.
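The reward-based idea can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: the environments are simple "choose an action, maybe get a reward" tasks, the complexity numbers are hand-assigned rather than computed, and Occam's razor is approximated by weighting simpler environments more heavily (a weight of 2 to the power of minus the complexity), in the spirit of formal definitions of universal intelligence. It is not the test procedure itself.

```python
import random


class RandomAgent:
    """Baseline agent that ignores rewards and picks actions at random."""

    def __init__(self, n_actions, rng):
        self.n_actions = n_actions
        self.rng = rng

    def act(self):
        return self.rng.randrange(self.n_actions)

    def learn(self, action, reward):
        pass  # learns nothing


class GreedyLearner:
    """Tries every action once, then mostly repeats the best one so far."""

    def __init__(self, n_actions, rng, explore=0.1):
        self.counts = [0] * n_actions
        self.values = [0.0] * n_actions
        self.rng = rng
        self.explore = explore

    def act(self):
        if 0 in self.counts:  # an action has not been tried yet
            return self.counts.index(0)
        if self.rng.random() < self.explore:  # occasional exploration
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def learn(self, action, reward):
        # Incremental update of the average reward seen for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run_episode(agent, reward_probs, steps, rng):
    """Average reward the agent earns in one environment."""
    total = 0.0
    for _ in range(steps):
        action = agent.act()
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        agent.learn(action, reward)
        total += reward
    return total / steps


def weighted_score(make_agent, environments, steps=500, seed=0):
    """Sum of average rewards, with simpler environments weighted more."""
    rng = random.Random(seed)
    score = 0.0
    for complexity, reward_probs in environments:
        agent = make_agent(len(reward_probs), rng)
        score += 2.0 ** -complexity * run_episode(agent, reward_probs, steps, rng)
    return score


# Hypothetical environments: (assumed complexity, reward probability per action).
ENVIRONMENTS = [
    (1, [0.1, 0.9]),
    (2, [0.8, 0.2, 0.1]),
    (3, [0.2, 0.3, 0.9, 0.1]),
]

print("random agent:  ", weighted_score(RandomAgent, ENVIRONMENTS))
print("greedy learner:", weighted_score(GreedyLearner, ENVIRONMENTS))
```

An agent that learns which actions pay off should score higher than one that acts at random, and because of the weighting, doing well in the simplest environment contributes the most to the final score.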

IQ

It is rarely considered evidence of intelligence when a machine completes a classical IQ test, something that has been possible since 1963, when an AI program passed geometric analogy tasks from WAIS.

Results are often comparable to those of humans. However, as in the case of number completion, machines can solve problems in a different way, which means that their results have different error distributions.
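A small Python sketch can make the number completion point concrete. It assumes one particular machine strategy, repeated differencing, which treats every series as if it followed a polynomial pattern; this is an illustrative choice, not the method used by any specific program discussed here.

```python
def next_term(series):
    """Predict the next term of a non-empty series by repeated differencing.

    Assumes the series follows a polynomial pattern.
    """
    rows = [list(series)]
    # Keep taking differences until a row is constant.
    while len(set(rows[-1])) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    # A constant row continues unchanged; add it back up through the rows.
    nxt = rows[-1][-1]
    for row in reversed(rows[:-1]):
        nxt = row[-1] + nxt
    return nxt


print(next_term([2, 4, 6, 8]))    # arithmetic series
print(next_term([1, 4, 9, 16]))   # square numbers
print(next_term([1, 2, 4, 8]))    # doubling series
```

For the first two series the method gives 10 and 25, as most people would. For the doubling series it gives 15, where a person would normally answer 16. The program is right on the same kinds of items as humans but wrong in a different way, which is what a different error distribution means in practice.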

A related field is psychometric AI, which aims to build machines that can solve a range of established tests. A single test is of little use for evaluating a machine, since the machine can be specialised to solve it.

Conclusion
There is still much debate over what is considered evidence in intelligence testing. In machines, testing is focused on evidence for real-time flexibility, adaptability, and creativity, especially in terms of language and conversation. Like anthropological and sociological approaches, computer science tends to focus on interactions with the external environment. This goes against the universal, static approach of IQ tests, which, in emphasising evidence for intrinsic characteristics of intelligence measured in isolation, tends to ignore context. Engagement between these disciplines, alongside disciplines not explored here, could help move away from this narrow understanding of intelligence and towards a more practical one.