Inclusive Data Research Skills for Arts and Humanities/Data agencies

Introduction and context
The objectives of this section were to:


 * 1) Co-develop the working concept 'Data Agencies' as an approach to data research methods for marginalised and excluded groups
 * 2) Create a self-sustaining community of arts and humanities researchers interested in data and digital research
 * 3) Co-write a Data Agency toolkit for arts and humanities researchers interested in data and digital research, using Wikibooks as a tool.

The core question is: how do we approach data research methods in ways that are empowering for marginalised and excluded groups?

"Critical science and decolonial theory, when used in combination, can pinpoint the limitations of an AI system and its potential ethical and social ramifications, becoming a "sociotechnical foresight tool" for the development of ethical AI" (Royer, 2020: 22).

Decolonial theory has its foundations in race, law, feminism, queer theory, and philosophical technology studies. Understanding the blind spots and limitations of a particular technology necessitates exposing the power dynamics and political relationships that support its application. By infusing a decolonial critical approach into AI, data and socio-technical communities, we could establish insights and approaches that better connect research and technology development to established ethical values grounded in decolonial theory. This will require the development of new research cultures, as well as original technical research in equality and fairness, including the definition of fairness and its ideals, translation, and privacy.

Encouraging inclusive dialogue in research methodologies could contribute to the development of a responsible AI and a renewed responsibility to current technologies. By critically engaging with the past and present, researchers must try to unlearn colonial reasoning, reinstate norms of living that were previously incompatible with life, and create new forms of political and affective transdisciplinary research communities to address these challenges.

Ontological Turn - Digital and the real.

Important distinctions concerning dualism have been adopted in humanities theory: "A diverse body of work known as the “ontological turn” has made important contributions to anthropological theory. In this article, I build on this work to address one of the most important theoretical and political issues haunting contemporary theories of technology: the opposition of the “digital” to the “real.” This fundamentally misrepresents the relationship between the online and offline, in both directions. First, it flies in the face of the myriad ways that the online is real. Second (and just as problematically), it implies that everything physical is real. Work in the ontological turn can help correct this misrepresentation regarding the reality of the digital" (Boellstorff, 2016: 387).

To expand: define how we are using ontology here in relation to data and the digital (Holbraad; links between decolonial theory, animism, and process philosophy).

Ontology is understood here as a way of enacting a reality or realities: performing the process, and constructing and enacting those lifeworlds. What we believe our world is made up of plays a part in how we interact with it, and our understandings of that world (reality/realities) are also shaped by our actions.

Data epistemologies
What are data epistemologies?

Data epistemologies: how to move past critique and challenge/complement scientific disciplines (challenge the dualism between coloniser/colonised to redefine decolonisation).


 * "Epistemic cultures are cultures of creating and warranting knowledge. This is what the choice of the term 'epistemic' rather than simply 'knowledge' suggests ... [i]t brings into focus the content of the different knowledge-oriented lifeworlds, the different meanings of the empirical... particular ontologies of instruments, specific models of epistemic subjects" (Knorr-Cetina as cited in Lury 2021: 11)
 * "Big Data, coupled with new data analytics, challenges established epistemologies across the sciences, social sciences and humanities, and assesses the extent to which they are engendering paradigm shifts across multiple disciplines. In particular, it critically explores new forms of empiricism that declare ‘the end of theory’, the creation of data-driven rather than knowledge-driven science, and the development of digital humanities and computational social sciences that propose radically different ways to make sense of culture, history, economy and society" (Kitchin 2014: 1).

There are gaps between data literacies (individualised), data infrastructures (e.g. university computing, data tools and developers, disciplinary approaches) and critical data thinking (e.g. data feminism, data conscience, critical data literacies, the CARE principles).

Data Epistemologies map

Examples of data epistemologies
Nodocentrism and paranodality (Mejias 2009; Mejias 2013; Barnes 2020)


 * …“the digital network as part of a media economy that reproduces inequality through a hegemonic–yet consensual and pleasurable–culture of participation. To support my thesis, I consider the politics of inclusion and exclusion of the network. In order for something to be relevant or visible within the network it needs to be rendered as a node (a phenomenon I refer to as “nodocentrism”). Thus, digital networks are constituted as totalities by what they include as much as by what they exclude” (Mejias 2013)
 * How do we do paranodality?
 * Engage multiple methods, such as Social Network Analysis combined with participant observation, action research, ethnography, and interviews (to be developed for the toolkit)
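
As one possible starting point for the toolkit, the sketch below pairs a small network edge list with a list of people known from fieldwork, and reports who never appears as a node. It is only an illustration of the idea of looking beside the network: the function names, the people, and the data are all hypothetical, and the approach is an assumption made here, not a method taken from Mejias.

```python
# Hypothetical illustration: all names and data below are invented.

def degree(edges):
    """Count connections per node, the view Social Network Analysis gives."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


def paranodal(edges, fieldwork_people):
    """Return people known from fieldwork who never appear as a node."""
    nodes = {person for edge in edges for person in edge}
    return sorted(set(fieldwork_people) - nodes)


# Interactions captured by a platform (what the network "sees").
edges = [("Ama", "Ben"), ("Ben", "Chidi"), ("Ama", "Chidi")]

# People met through participant observation and interviews.
fieldwork_people = ["Ama", "Ben", "Chidi", "Dede", "Efua"]

print(degree(edges))
print(paranodal(edges, fieldwork_people))
```

In this toy case the network measures describe only three people, while the fieldwork list shows two more who are invisible to the network data. Why they are absent is a question the code cannot answer; that is where ethnography and interviews come in.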

The Alternative Epistemologies of Data Activism (Milan & Velden, 2016)


 * "Data activism indicates the range of sociotechnical practices that interrogate the fundamental paradigm shift brought about by datafication. Combining Science and Technology Studies with Social Movement Studies, this theoretical article offers a foretaste of a research agenda on data activism. It foregrounds democratic agency vis-à-vis datafication, and unites under the same label ways of affirmative engagement with data (“proactive data activism”, e. g. databased advocacy) and tactics of resistance to massive data collection (“reactive data activism”, e. g. encryption practices), understood as a continuum along which activists position and reposition themselves and their tactics".

Materiality of networks and data (e.g. Starosielski)


 * Materialities of data and of methods: where do they come from and where do they sit?
 * Actor-network theory and non-human actants. https://monoskop.org/images/e/e4/Latour_Bruno_We_Have_Never_Been_Modern.pdf

Multi-methods and/or other methods:


 * Ethnography
 * Auto/biographical approaches
 * Autoethnography/arts and humanities researchers
 * Quantitative/Qualitative
 * Multimodal representations
 * Centering community and worldview: choose the methods that work best for the community
 * Mixed methods and multi methods
 * Theory of change (empowered researcher in the process)
 * Starting from an ethics of care - CARE Principles.
 * Reflexivities of discomfort (Pillow, 2003)
 * Indigenous Data Sovereignty Networks.
 * Participatory Action Research (PAR) - Data Stewardship.

What does data agency look like? Definition
''' What are Data Agencies? '''

'Data Agencies' refers to the agency of both humans and non-humans. For example, data itself has agency. Which values shape Data Agencies? Where do they come from? How and why?

Values will be a crucial issue if we consider social environments/entanglements as a project of mutual creation, cooperatively constructed and rebuilt.

How are we defining agency? ''And how are we prioritizing it (who/what gets priority)? - develop''

Data agencies may be regarded as multilayered since they interact with a variety of systems at different times (people, social structures, economies, and technical systems), all of which may be examined using diverse approaches. Data agencies possess agency distinct from humans and exert multidirectional influence, representing varying meanings to individuals in different circumstances and times, influenced by the researcher's agency and reflexivity. These agencies can affect both humans and machines (as actors) and create areas of conflict that may generate prospective changes.

Data agencies are inherently complex, impacting individuals and data at multiple junctures through feedback loops. If we consider the concept of data agencies as a crucial link between the machine and the human, this situates machine learning technologies as a key analytical tool for decoding data agencies.

One way to think of data agencies is to consider the humility of the researcher and how the data challenges the researcher. For example, the researcher should be prepared to shift their own thinking and enter into a reciprocal relationship with the data and its agency (as an object of study), in which the researcher is not just trying to explain and interpret the data but allows it to shift their thinking and to become a participant in that movement of thought. The agency of the data could lead us to analytical humility and reflexivity, while allowing the researcher's agency and the data's agency to be conceptualised as mutual entities inhabiting different worlds (Holbraad et al., 2014).

Even if we try to adopt pre-existing frameworks or establish prioritization checklists to support research endeavors, a goal of decolonial data agency is to question methodologies and practices, within reason. There will always be more context that can be provided to support research conclusions, but balancing interpretability and explainability with performance and the ability to execute is as fundamental a part of a research process as it is of an AI model.

Questions

Where is the agency found?

How does it interact with the machine and human in symbiosis?

Where is the intersection of contact?

Agency is embedded in the data through historical processes (as an extension of the human mind), but the data also takes on an agency of its own.

Data and algorithmic code encounter the human, but where is the concrete set of connections between humans and machines?

Can we build a system that does not set the machine and data/algorithms against the body or being human, but is instead a coevolution that can be decolonised, with its biases revised?

If so, at which point in the process?

Where are the intersecting value systems that could define an agreed set of 'fairness' values to work with?

Definitions
 * Data should not stand alone but within the human experience.
 * Data as an extension of the human mind and historical practices of collecting data embedded in the data.
 * Data has agency itself.
 * Where do data agencies fit into the wider structure, and where does the data live?
 * Multilevel data agencies (not just the researcher; data agencies are intertwined with wider social structures).
 * Data and human are re-transformed through entanglements; the human cannot be separated from the data.
 * In encountering the data agency, we must realise that we may reach the limits of our own thinking (through alterity/difference).
 * Invent new resources to shift our own ways of thinking, and re-evaluate our own assumptions, positionality, and the categories of thinking we use in research (reflexivity).
 * Consider critiques and difficulties with data already existing in the agency of the data.
 * Representation: inclusion and exclusion in data sets and data agencies.

Would it be possible to decolonise data agencies?
Machine learning often superficially scrapes data instead of delving into in-depth analysis--expand.

Issues of alignment and values: competing values may prevent stakeholders from defining fairness while developing machine learning systems. Where is the intersection of an agreed-upon idea of 'fairness' between distinct value systems (which may be applied to frameworks)? Consider the concept of sameness (not universalism), but a sameness within/'inside' difference (Taussig on alterity/mimesis, and Graeber).

How can biases then be revised by decolonial approaches? For example, by animism, and through indigenous languages, meaning and context? Would it be possible to build a decolonial AI?

Decolonial thinking, from an African perspective, challenges the dualism inherent in scientific thought through relational, contextual, and historical approaches (Fanon, 1952). A decolonial approach could play a crucial role in identifying and addressing biases in machine learning by examining the historical and contextual factors that shape data agencies. This approach could enable a more nuanced understanding of the information being processed, thereby enhancing the accuracy and cultural sensitivity of the outcomes.

Deep learning mechanisms prioritise calculation in data sets and corpora (generative AI): what are the issues, and where can a decolonial position materialise?

Possible entry point: Assessment of base code at the test and evaluation stage may be helpful, considering the potential impact on model performance and efficiency from a decolonial perspective. Additionally, incorporating feedback loops during the training process could further optimise the deep learning mechanisms, and could show how use cases and biases in data sets might affect protected classes, so that these can be revised before the machine has an impact.
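
To make the test-and-evaluation entry point more concrete, the sketch below breaks a model's accuracy down by group instead of reporting a single overall score. The records, group names, and function names are hypothetical, and a gap in accuracy is only one narrow, contestable indicator. It is not a definition of fairness, which, as noted above, depends on whose values are applied.

```python
# Hypothetical illustration: records, group labels and predictions are invented.
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group from (group, true_label, predicted_label) rows."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def largest_gap(scores):
    """Difference between the best-served and worst-served groups."""
    return max(scores.values()) - min(scores.values())


records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 0),
]

scores = accuracy_by_group(records)
print(scores)
print(largest_gap(scores))
```

A check like this could be run before deployment and repeated as part of a feedback loop. Deciding which groups to compare, and what counts as an acceptable gap, remains a question for the communities concerned, not for the code.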

Ethnography and Large Language Models

Inclusiveness: African philosophy and artificial intelligence - value (mis)alignment.

Development theories prioritise control over the environment, individual freedom, self-interest, and market dynamics. These concepts contribute to welfare, private property, materialism, and the unification of value through the instrumentalization of the market and the production process. All these factors play a role in the labour force process and changes to market dynamics through the development of new technologies. Ultimately, these concepts play a critical role in shaping society's economic structure by influencing various aspects of the economy and technological development.

New digital divisions related to technological advancements are emerging, and issues such as capacity building, new labour redundancies, privacy concerns, and ethical challenges will continue to develop. Unlike previous waves of automation, a new wave of automation will impact a significant portion of the economy, particularly middle-class occupations. There will also be new opportunities for employment, new job positions, and increased productivity, and previously undervalued parts of economic activity, such as caring and social service occupations, will become increasingly valuable.

Is it possible to develop intercultural digital/data ethics? Deep learning principles that incorporate ontologies and indigenous languages, as well as diverse perspectives, could help address some of the ethical challenges in the digital age. Collaborative efforts between different cultures and communities will be essential to ensure that digital advancements are developed and implemented in a responsible and inclusive manner.

African Philosophy

The Nguni languages of southern Africa are the foundation for the concept of reciprocity known as Ubuntu ("I am because you are"), which enriches and refines the understanding of concepts such as the economy, inclusivity, and reciprocity. It holds that humans and non-human beings (nature and beliefs) are more important than economics. All living things, including nature and the earth, are closely interconnected and interdependent, emphasizing unity over separation. It encompasses the idea that each individual contributes to collective well-being, fostering harmony within society. 'Intelligence' may also be generated relationally, by beings as a whole (collective intelligence).

The grammar of African languages, e.g. Muntu and Bantu (which focus on personhood), enables the exploration of Ubuntu-related ideas embodying the fundamentals of African thought. While English is a noun-based language with teleological future thought, African languages (with complex noun class systems) prioritise motion and verbs, emphasising the present. For example, people who came before you are part of the living community, and future generations also shape the community that exists today. The decisions you make are influenced by those who preceded you and those who will come after you (links to critical race theory and historical perspectives; to expand with Taussig's Defacement and its analysis of the labour of the negative: https://www.sup.org/books/title/?id=432).

This is misaligned with transhumanism, which can be based on a restricted worldview: the concept of robots replacing humans and developing general intelligence presupposes the ability to separate and individualise intelligence, which is in contention with the concept of Ubuntu (collective intelligence).

Where are African ethics in AI located? AI and language.

Language is used in programming AI and could be a basis for a shared intercultural digital ethics. Including African languages in computer programs, connecting different dialects, embracing their African perspective, and moving away from English as the standard could all be considered.

How will bias continue to impact AI and data? Are there intervention points to reduce bias? Will such interventions decrease or exacerbate bias? Will they make AI more transparent or more opaque?

What does economics entail? Governance and the bottom-up approach?

Ontologies and deep learning: how can a framework be constructed from an intercultural viewpoint?

How could a decolonial approach that accounts for data agencies and bias in data be developed as a critical approach, and ultimately, how do we get beyond critique?

Can we use “polyvocality,” a movement towards the inclusion of multiple perspectives in and on data agencies? How reliable is the data to obtain the perspectives of different stakeholders and the relationships between them and the data agencies? What theoretical framework do we use to define perspectives and their relations with the underlying data? Understanding polyvocality in machine learning will require a methodology that can accurately capture the diverse perspectives of stakeholders. A combination of qualitative and quantitative research methods may be beneficial in developing a theoretical framework that defines perspectives and their interconnectedness within machine learning and data agencies.
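
One small, practical reading of polyvocality in a data set is to keep each stakeholder group's labels visible instead of collapsing them into a single "majority" answer. The sketch below contrasts the two. The stakeholder groups, labels, and function names are hypothetical and chosen only for illustration. It is a minimal sketch under those assumptions, not an established polyvocal methodology.

```python
# Hypothetical illustration: stakeholder groups and labels are invented.
from collections import Counter


def majority_label(annotations):
    """Collapse all annotations into the single most common label."""
    labels = [label for _, label in annotations]
    return Counter(labels).most_common(1)[0][0]


def labels_by_perspective(annotations):
    """Keep a separate label count for each stakeholder group."""
    views = {}
    for group, label in annotations:
        views.setdefault(group, Counter())[label] += 1
    return views


# Each row: (who is speaking, how they label the same item).
annotations = [
    ("community_member", "harmful"),
    ("community_member", "harmful"),
    ("platform_moderator", "acceptable"),
    ("platform_moderator", "acceptable"),
    ("platform_moderator", "acceptable"),
]

print(majority_label(annotations))
print(labels_by_perspective(annotations))
```

Here the majority label hides the fact that every community member disagreed with it. Keeping the per-group counts preserves that disagreement as data, which qualitative methods can then help interpret.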

Examples?
 * Data 4 Black Lives (D4BL)
 * Masakhane
 * Data cooperatives
 * Data stewards and stewardships
 * Data trusts
 * Data feminism
 * Machine learning for te reo Māori
 * Indigenous AI
 * AI Intersections database

Challenges to data agency approach
Political Economy approach to data agencies: rather than abandon the categories of “subject” and “object” and of “Society” and “Nature,” as suggested by proponents of “the ontological turn,” researchers can compare subject–object transformations and the naturalization of social power relations in the two contexts. In acknowledging the ultimate dependence of modern technology on exchange rates and financial strategies in a globalized economy, we realize that the agency of modern artifacts is also dependent on human subjectivity. In shifting the focus of comparative anthropology from ontology to political economy, we can detect that modern technology is a globalized form of magic (Hornborg, 2015: 35).

How do Data agencies relate to Roy Bhaskar's Critical Realism and the Philosophy of Meta-Reality Part II: Agency, Perfectibility, Novelty?

Would a multi methods approach to data agencies be an appropriate method?

Can a data toolkit be developed to analyse data agencies during research for arts and humanities researchers?

What kinds of things would help us?

 * Data epistemologies map - conceptual framework
 * Meta survey of methods
 * Diagnostic tool - what methods and skills are related to what kinds of questions?
 * Person-centred
 * "Data joy" (what does this look like?)
 * Advocacy and participatory approaches and centering community worldview.
 * Choose the research methods best suited to the community and to the outputs that you are aiming for in the research project.
 * Promote data and the DaRes project as a resource and an agency for the community.
 * Research should not be extractive.
 * Defining concept - data agencies, multi methods
 * Ethnography of large language models/linguistic anthropology

Data Empowerment for Empowerment of Arts and Humanities Research
Interdisciplinary work centred on data activism and justice research is key, as well as how to avoid mistranslation and develop a common language for arts and humanities researchers to use.

Add decolonial data methods and data agencies toolkit

Examples:

Part of this will include how to feed into technical and legal frameworks to raise awareness and question the coding and design of data sets. It will also include how researchers might be empowered by legal, policy, and technological advancements, and what else needs to change. Data empowerment could also entail legitimising a non-scientific worldview of data through the arts and humanities' multiple complexities and values.