Data Science, Machine Learning & Artificial Intelligence
- Computational Linguistics
- Data Analytics
- Data Management
- Information Retrieval
- Machine Learning
- Multi-Agent Systems
- Operational Research
- Recommender Systems
Data Science is concerned with collecting, maintaining and deriving meaning from data. Computers allow humans to search information for patterns in ways that are impossible or infeasible for the human brain. But unlike the human brain, computers don’t necessarily understand the meaning of their findings, and have previously been limited to answering very narrow questions. As the amount of data available for analysis, and the range of sources from which it comes, increase, researchers need to build ever more sophisticated models and algorithms to derive meaningful results. This is particularly important because the quality of data can be inconsistent across time or sources: data is often collected for one reason and then analysed by researchers for another. Data Science brings together key ideas from fields such as computer science (algorithms, representation, visualisation, application development), statistics (modelling, analysis, prediction), design (information design, interaction design), psychology and cognitive science (language and perception), and the humanities and social sciences (storytelling and narrative, social learning), among others.
Data Management focuses on the collection, organisation and storage of data, ensuring that it is fit for purpose, and establishes standard ways to organise and store it. It is also concerned with keeping the data intact and, in some cases, with anonymising data so that it can be used without compromising security or privacy. Data Analytics is the broad science of deriving meaning from data and contains a wide range of disciplines. As datasets grow ever larger, programs and processes are designed in which algorithms sort the data and surface relevant information or answers to questions. Information Retrieval allows end users to search large repositories of data for the information that they want, for example an online library catalogue or a personal collection of photos and videos. Recommender Systems are often linked to information retrieval. When a user searches for information in a large database, recommender systems use context, such as the user’s previous purchases or purchases by other users with a similar profile, to filter the results. Recommender system technologies are what allow Amazon to suggest interesting books, Spotify to suggest your next favourite song, or Netflix to suggest what you will watch next.
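As a rough illustration of filtering by "purchases from other users who match the user's profile", the sketch below scores unseen items by how similar their buyers are to the target user. It is a minimal sketch and not how any named service works: the purchase data, the function names `jaccard` and `recommend`, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical purchase histories (illustrative data): user -> set of items.
purchases = {
    "ana": {"book_a", "book_b", "book_c"},
    "ben": {"book_a", "book_b", "book_d"},
    "cat": {"book_e"},
}


def jaccard(a, b):
    """Overlap between two purchase sets: 0 if disjoint, 1 if identical."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, purchases, top_n=3):
    """Rank items the user has not bought, weighted by buyer similarity."""
    own = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        similarity = jaccard(own, items)
        if similarity == 0:
            continue  # users with nothing in common contribute nothing
        for item in items - own:
            scores[item] += similarity
    return [item for item, _ in scores.most_common(top_n)]


print(recommend("ana", purchases))
```

Here "ana" and "ben" share two purchases, so the one item only "ben" bought is suggested to "ana", while "cat", who shares nothing with anyone, receives no suggestions.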
In Operational Research (OR), complex problems are analysed and solutions or strategies are identified. Operational Research works to make complex systems more efficient, and this often requires modelling and analytics to first understand the moving parts of a system before modelling how they might work together most efficiently. Operational Research relates to data science because the information that goes into an OR problem can be large and unstructured, and needs to be processed. OR defines the information needed to make strategic decisions, and feeds into the data collection and analysis used to create the solution. For example, if researchers want to use OR to design the most efficient airport, the data they need relates to passengers, flight timetables, baggage transfer, staff and security requirements. Each of these categories of data will be collected in different ways, and at times may not be collected in a way that is useful to the OR researcher.
Computational Linguistics is a field that seeks to build computational models of language. These models can be used to understand more about how we communicate, and to build computer systems that can communicate with us. Advances in Computational Linguistics underpin the conversational interfaces that now appear on so many of our devices (e.g. Siri and Alexa), bots that create original content such as Twitter posts or news articles, and systems that help us deal with the massive amounts of text, video and audio data that now surround us.
Multi-Agent Systems can be used to model complex scenarios by simulating the behaviour of key actors in a system as intelligent agents. These might be vehicles in a traffic simulation used to optimise traffic flows, people in an epidemiological simulation used to model the spread of infectious disease, or animals in an environmental simulation used to model relationships between predators and prey. Multi-Agent Systems can capture emergent behaviours, arising from the interactions of many agents, that other modelling techniques cannot.
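The epidemiological case can be sketched as a very small agent-based simulation in which each person is an agent that is susceptible, infectious or recovered. This is a toy sketch under stated assumptions rather than a validated epidemic model: the function name `simulate`, every parameter value, and the rule that each infectious agent meets a few random agents per day are hypothetical choices made for illustration.

```python
import random


def simulate(n_agents=200, n_initial=5, contacts=4, p_transmit=0.1,
             days_infectious=5, n_days=60, seed=0):
    """Return a per-day list of (susceptible, infectious, recovered) counts."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    state = ["S"] * n_agents   # "S" susceptible, "I" infectious, "R" recovered
    days_left = [0] * n_agents

    for i in rng.sample(range(n_agents), n_initial):
        state[i] = "I"
        days_left[i] = days_infectious

    history = []
    for _ in range(n_days):
        infectious = [i for i, s in enumerate(state) if s == "I"]

        # Each infectious agent meets a few random agents (assumed rule).
        newly_infected = set()
        for i in infectious:
            for j in rng.sample(range(n_agents), contacts):
                if state[j] == "S" and rng.random() < p_transmit:
                    newly_infected.add(j)

        # Infectious agents recover after a fixed number of days.
        for i in infectious:
            days_left[i] -= 1
            if days_left[i] == 0:
                state[i] = "R"

        for j in newly_infected:
            state[j] = "I"
            days_left[j] = days_infectious

        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history


history = simulate()
print(history[-1])
```

No line of the code describes an epidemic curve; any rise and fall in the infectious count comes only from the individual contact and recovery rules, which is the sense in which such behaviour emerges from agent interactions.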
Artificial Intelligence can be defined as the science and engineering of making intelligent computer programs capable of performing tasks that require the subtleties of judgement, interpretation and generalisation that we associate with human intelligence. Developments in Artificial Intelligence help us to better understand how people make decisions and perform tasks, as well as to build computer systems that do these things. Machine Learning is a subfield of artificial intelligence (and a cornerstone of data science) that gives computers the ability to learn from data, observations and interaction with the world without being explicitly programmed. Machine Learning also offers the potential to build systems that can generalise, adapt to new settings and improve with experience. Machine Learning technologies have underpinned recent advances in self-driving cars, automated medical diagnoses, conversational interfaces, and other automated decision-making systems. Machine Learning also offers the potential for computer programs to move beyond the limits of human knowledge and imagination to bring entirely new insights. Machine Learning is seen as a means to achieve true Artificial Intelligence and computational creativity.
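To make "learning from data without being explicitly programmed" concrete, the sketch below uses a nearest-centroid classifier, chosen here only because it is one of the simplest learning methods. The decision rule is never written by hand: it is derived from labelled examples. The training points, the labels and the function names `fit_centroids` and `predict` are illustrative assumptions.

```python
def fit_centroids(examples):
    """Learn one mean feature vector per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, features):
    """Return the label whose learned centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((c - x) ** 2 for c, x in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical labelled observations (illustrative data).
training = [
    ([1.0, 1.0], "low"),
    ([1.5, 0.5], "low"),
    ([5.0, 5.0], "high"),
    ([4.5, 5.5], "high"),
]

centroids = fit_centroids(training)
print(predict(centroids, [0.5, 1.5]))
print(predict(centroids, [6.0, 4.0]))
```

Supplying different training examples changes the predictions without any change to the code, and the classifier also labels points it has never seen, which is a small-scale version of generalising to new settings.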