Algorithm – a defined procedure, or set of rules, used in mathematical or computational problem solving to complete a particular task.
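For illustration, here is a classic algorithm, binary search, sketched in Python. The function name and example data are ours, invented for this sketch:

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2           # check the middle element
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1                 # discard the lower half
        else:
            high = mid - 1                # discard the upper half
    return -1

print(binary_search([2, 5, 8, 12, 16, 23, 38], 16))  # prints 4
```

The defined procedure (halve the search range at every step) completes the task in far fewer steps than checking every item in turn.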
Artificial Intelligence (AI) – a branch of computer science dealing with the ability of computers to simulate intelligent human behaviour.
Artificial General Intelligence (AGI) – the theoretical ability of computers to simulate general human intelligence, including the ability to think, reason, perceive, infer and hypothesise, but with fewer errors than a human. This is not yet a reality, but it is the goal of many large AI research enterprises.
Artificial Narrow Intelligence (ANI) – the ability of a computer to simulate human intelligence when focused on a particular task. A well-known example is the development of chess-playing computers, which beat the best human players in the late 1990s and are now accepted as training tools available on any smartphone.
Bias – Any pre-learned attitude or preference that affects a person’s response to another person, thing or idea. In the context of GenAI it commonly refers to a chatbot’s reflection of bias present in its training data (namely, the internet) in its responses to users’ queries.
Big data – a data set so large that it cannot be processed or manipulated with traditional data management tools. The term can also refer to the systems and tools developed to manage such large data sets.
Data mining – the practice and process of analysing large amounts of data to find new information, such as patterns or trends.
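As a minimal, hypothetical sketch of the idea, the Python below counts which pairs of items are most often bought together in a set of toy shopping baskets; real data mining applies the same principle to vastly larger data sets:

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping-basket data; each inner list is one transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "butter"],
    ["milk", "butter"],
    ["bread", "butter"],
    ["bread", "milk"],
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(1))  # [(('bread', 'milk'), 3)]
```

The "new information" here is the previously unrecorded trend that bread and milk are the most frequently co-purchased pair.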
Data tagging – a process in data classification and categorisation, in which digital ‘tags’ containing metadata are added to data. In the context of GenAI, training data for Large Language Models is tagged by humans so the AI can learn whether to include or exclude it from its responses. This may be to comply with legal requirements, or with ethical and moral codes.
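A toy sketch of the principle: each record below carries human-applied tags, and records tagged for exclusion are filtered out before training. The tags and records are invented for illustration:

```python
# Hypothetical tagged training records; the "tags" metadata is added by
# human reviewers so the model can learn what to exclude.
records = [
    {"text": "How do plants make food?", "tags": ["safe", "science"]},
    {"text": "Instructions for picking a lock", "tags": ["exclude", "unsafe"]},
    {"text": "A recipe for lemon cake", "tags": ["safe", "cooking"]},
]

training_data = [r["text"] for r in records if "exclude" not in r["tags"]]
print(training_data)  # the record tagged "exclude" is dropped
```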
Deep learning – a subset of machine learning that works with unstructured data and, through a process of self-correction, adjusts its outputs to increase its accuracy at a given task. In the context of AI, this process is closely related to reinforcement learning.
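The self-correction at the heart of deep learning is gradient descent: the model's parameters are nudged repeatedly in the direction that reduces its error. A minimal single-parameter sketch in Python, with toy numbers rather than a real multi-layer network:

```python
# One weight w is adjusted so the model's output y = w * x approaches a
# target; a real deep network repeats this for millions of weights.
x, target = 2.0, 10.0         # input and desired output (ideal w is 5.0)
w = 0.0                       # start with an uninformed weight
learning_rate = 0.1

for step in range(50):
    y = w * x                 # model's current output
    error = y - target        # how wrong the output is
    gradient = 2 * error * x  # slope of squared error with respect to w
    w -= learning_rate * gradient  # correct the weight a little

print(round(w, 3))  # close to 5.0 after repeated self-correction
```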
Generative Artificial Intelligence (GenAI) – AI that generates new content from existing data in response to prompts entered by a user. It does not copy from an original source, but rather paraphrases text or remixes images to produce new content. It learns via unsupervised training on big data sets, but does not reason or think for itself.
Hallucination – Answers from GenAI chatbots that sound plausible, but are untrue, or based on unsound reasoning. It is thought that hallucinations occur due to inconsistencies and bias in training data, ambiguity in natural language prompts, an inability to verify information, or lack of contextual understanding.
Large Language Model (LLM) – A specific type of GenAI that specialises in producing natural language. Chatbots such as Gemini, Copilot and GPT-4 are examples of LLMs.
Machine Learning – The capacity of computers to adapt and improve their performance at a task, without explicit instructions, by incorporating new data into an existing statistical model.
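As a toy illustration of incorporating new data into an existing statistical model, the sketch below keeps a running average price per square metre and improves its predictions as each new (hypothetical) sale arrives:

```python
# The "model" here is a running average price per square metre; learning
# is simply folding each new observation into that estimate.
class PricePerSqmModel:
    def __init__(self):
        self.estimate = 0.0   # current model parameter
        self.n = 0            # observations seen so far

    def update(self, price, area):
        # Incorporate new data into the existing statistical model.
        self.n += 1
        self.estimate += (price / area - self.estimate) / self.n

    def predict(self, area):
        return self.estimate * area

model = PricePerSqmModel()
for price, area in [(300_000, 100), (440_000, 150), (250_000, 80)]:
    model.update(price, area)

print(round(model.predict(120)))  # prediction improves as data accumulates
```

No explicit pricing rules were programmed; the model's performance comes entirely from the data it has seen.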
Natural Language Processing (NLP) – The analysis of natural language by a computer. It is a field of computer science that combines computational linguistics with machine and deep learning models to allow computers to understand text and speech, and respond accordingly. Voice-activated “smart” technologies and translation software are two of many everyday uses for NLP.
Neural networks – a computing architecture, or machine learning technique, where a number of processors are interconnected in a manner inspired by the arrangement of human neurons; they can learn through a process of trial and error.
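A minimal sketch of that trial-and-error learning: a single artificial neuron (a perceptron) learns the logical AND function by adjusting its weights after each mistake. All values are illustrative:

```python
# A single "neuron" with two inputs learns AND: output 1 only when both
# inputs are 1. Each wrong answer (the "error") nudges the weights.
weights = [0.0, 0.0]
bias = 0.0
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

for epoch in range(10):                     # repeated passes over the data
    for (x1, x2), target in data:
        output = 1 if x1 * weights[0] + x2 * weights[1] + bias > 0 else 0
        error = target - output             # trial ...
        weights[0] += 0.1 * error * x1      # ... and error-driven correction
        weights[1] += 0.1 * error * x2
        bias += 0.1 * error

print([1 if x1 * weights[0] + x2 * weights[1] + bias > 0 else 0
       for (x1, x2), _ in data])            # [0, 0, 0, 1]
```

A full neural network interconnects many such units in layers, but the learn-from-mistakes loop is the same idea.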
Pattern recognition – a data analysis method that uses machine learning to recognise patterns, or regularities, in big data.
Plug-in – a small piece of software that enhances the ability of a larger system or application to fulfil a specific task, e.g. a referencing plug-in on a web browser can pull the metadata from a webpage into a reference management system to create a bibliographic entry.
Recurrent neural networks (RNN) – a neural network that is trained to “remember” past data in order to predict what should come next. It is used for sequential tasks, such as language translation and natural language processing, as language is a sequential arrangement of letters that creates meaning.
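A minimal sketch of the recurrence itself, using untrained random weights and toy sizes: the hidden state h is fed back in at every step, so each new output depends on the whole sequence seen so far, which is the “memory” described above.

```python
import numpy as np

# Untrained toy RNN step: new state = tanh(input weights + memory weights).
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(3, 4)) * 0.5   # input-to-hidden weights
W_hh = rng.normal(size=(4, 4)) * 0.5   # hidden-to-hidden ("memory") weights

h = np.zeros(4)                        # hidden state starts empty
for x in [np.array([1.0, 0.0, 0.0]),  # a toy sequence of three inputs
          np.array([0.0, 1.0, 0.0]),
          np.array([0.0, 0.0, 1.0])]:
    h = np.tanh(x @ W_xh + h @ W_hh)   # new state mixes input with memory
    print(np.round(h, 2))              # state shifts as context accumulates
```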
Reinforcement learning – a process whereby a deep learning model learns to become more accurate at a specific task based on feedback, such as rewards for good outputs and penalties for poor ones. The algorithm commonly used to adjust the model’s weights in response to this feedback is called backpropagation.
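A minimal sketch of learning from feedback: tabular Q-learning, a classic reinforcement learning algorithm (simpler than the deep learning variant described above), on a toy five-cell corridor whose rightmost cell gives a reward. The environment and numbers are invented for illustration:

```python
import random

# Tabular Q-learning on a five-cell corridor: only the rightmost cell
# (state 4) gives a reward, so the agent must learn to keep moving right.
n_states, actions = 5, [-1, +1]
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for episode in range(200):
    s = 0
    while s != n_states - 1:
        if random.random() < epsilon:                  # occasionally explore
            a = random.choice(actions)
        else:                                          # otherwise act greedily
            a = max(actions, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0
        # Feedback nudges the value estimate towards reward + future value.
        best_next = max(Q[(s_next, act)] for act in actions)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned policy is to move right (+1) in every non-terminal cell.
print([max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states - 1)])
```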
Supervised learning – a machine learning method where the model is trained using data that has been labelled by a human, i.e. training using examples. This is useful for predicting future outcomes based on past trends where data already exists.
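A toy sketch of training from labelled examples: a one-nearest-neighbour classifier labels a new fruit by finding the most similar human-labelled example. The data is invented for illustration:

```python
# Features are (weight in grams, diameter in cm); labels come from a human.
labelled = [
    ((150, 7.0), "apple"),
    ((170, 7.5), "apple"),
    ((120, 6.0), "orange"),
    ((110, 5.8), "orange"),
]

def classify(features):
    def distance(example):
        (w, d), _ = example
        return (w - features[0]) ** 2 + (d - features[1]) ** 2
    return min(labelled, key=distance)[1]   # label of the closest example

print(classify((160, 7.2)))  # "apple": closest to the labelled apples
```

The model never receives a rule for telling fruit apart; it predicts a new outcome purely from past labelled examples.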
Turing Test – a ‘test’ devised by British computer scientist Alan Turing to determine whether a computer is “intelligent”. He posited that a human interrogator should put questions, over a fixed period of time, to both a computer and a human, and judge which was which based on their replies. A computer is deemed to have passed the Turing Test when the interrogator cannot distinguish its responses from the human’s.
Unsupervised Learning – a machine learning method where the model identifies patterns in unlabelled data and makes inferences for use on future, unseen data. This is useful for exploring raw data to see whether any patterns occur.
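A minimal sketch of finding patterns without labels: two-cluster k-means on a handful of one-dimensional points. No labels are supplied at any stage; the grouping emerges from the data alone.

```python
# Two-cluster k-means on toy 1-D data: the points fall into two natural
# groups (around 1 and around 5) that the algorithm discovers by itself.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centres = [points[0], points[3]]            # crude initial guesses

for _ in range(10):
    # Assign each point to its nearest centre ...
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: abs(p - centres[i]))
        clusters[nearest].append(p)
    # ... then move each centre to the mean of its cluster.
    centres = [sum(c) / len(c) for c in clusters]

print([round(c, 2) for c in centres])       # roughly [1.0, 5.07]
```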