Real World Data Science

AI series: Healthy datasets for optimised AI performance

Even “correct” data can cause all manner of mischief when fed indiscriminately into AI training and test datasets. This article lays out the issues that can arise and possible routes to resolving them, drawing several lessons from healthcare.

Welcome to Real World Data Science, a new project from the Royal Statistical Society, in partnership with the American Statistical Association. We bring together data science students, practitioners, leaders, and educators to share knowledge, learn from, and be inspired by real-world applications of data science.

Latest content

AI Series: Generative AI models and the quest for human-level artificial intelligence

Diego Miranda-Saavedra explores some of the merits and limitations of modern machine learning models, and considers where these ‘intelligent’ systems might sit in the…

Diego Miranda-Saavedra

Apr 29, 2024

Article series: What is AI? Shedding light on the method and madness in these algorithms

AI has become a hot topic of debate but these discussions can circulate fears and fantasies more than a meaningful analysis of real world data. In this special issue we…

Anna Demming

Apr 22, 2024

Data science and AI in the public sector: An interview with ONS’s Penny Holborn

Penny Holborn, head of faculty for the Office for National Statistics Data Science Campus, talks to Jonathan Gillard about keeping up with new developments, the value of…

Jonathan Gillard

Mar 27, 2024

Is data science a new and exciting set of skills, necessary for analyzing 21st century data? Or is it (as some have claimed) a rebranding of statistics, which has carefully developed time-honored methods for data analysis over the past century? In this article, we use two popular data science algorithms to examine the difference between data science, statistics, and other occupations.

Jonathan Auerbach, David Kepplinger, and Nicholas Rios ask, ‘What is data science?’

Read their analysis

Viewpoints

£10m for UK regulators to ‘jumpstart’ AI capabilities, as government commits to white paper approach

It’s been a busy seven days for AI news in the UK as two major government reports were published, millions of pounds of new investments were announced, and warnings rang out…

Brian Tarran

Feb 8, 2024

UK government sets out 10 principles for use of generative AI

Government says staff need to understand what generative AI is, its limitations, and how to deploy the technology lawfully, ethically and securely.

Brian Tarran

Jan 22, 2024

When will the cherry trees bloom? Get ready to make and share your predictions!

The 2024 International Cherry Blossom Prediction Competition will open for entries on February 1. There’s cash and prizes on offer for the best entries, including having…

Brian Tarran

Jan 18, 2024

Case Studies

Deduplicating and linking large datasets using Splink

Robin Linacre introduces an open source tool, developed by the UK Ministry of Justice, which uses probabilistic record linkage to improve the quality of justice system data.

Robin Linacre

Nov 22, 2023

Learning from failure: ‘Red flags’ in body-worn camera data

When body-worn cameras were rolled out to juvenile correctional officers in Texas in 2018, senior leaders hoped proactive analysis of camera metadata could be used to…

Noah Wright

Nov 16, 2023

Food for Thought: The value of competitions for confidential data

The Food for Thought Challenge attracted new eyes from computer science and data science to think about how to address a critical real-world data linkage problem. And, in…

Steven Bedrick, Ophir Frieder, Julia Lane, and Philip Resnik

Aug 21, 2023

Generative AI is based on neural networks, which are so-called ‘black boxes’. This makes it difficult or impossible to explain the inner workings of the model which has potential implications if in the future you are challenged to justify decisioning or guidance based on the model.

Excerpt from the UK government’s framework for the use of generative AI

Read our key takeaways

Ideas

Article series: What is AI? Shedding light on the method and madness in these algorithms

AI has become a hot topic of debate but these discussions can circulate fears and fantasies more than a meaningful analysis of real world data. In this special issue we…

Anna Demming

Apr 22, 2024

What is data science? A closer look at science’s latest priority dispute

Two popular data science algorithms – naïve Bayes and eigen centrality – are used to examine the difference between data scientists, statisticians, and other occupations.

Jonathan Auerbach, David Kepplinger, and Nicholas Rios

Feb 19, 2024

Creating Christmas cards with R

The programming language R is capable of creating a wide variety of geometric shapes that can be used to construct high quality graphics – including festive images. In this…

Nicola Rennie

Dec 12, 2023

Careers

‘I fell in love with math, really, and fell into data science because of that’

A passion for solving mathematical problems led Niclas Thomas to a PhD in machine learning and then a career in data science in the retail sphere. Now head of data science…

Brian Tarran

Oct 4, 2023

‘I was inspired by the power that numerical data have to tell stories and promote policy change’

Claire Morton is an undergraduate student at Stanford University. In this Q&A, Claire explains how a high school job in a cell biology lab led to college studies in…

Brian Tarran

Jun 28, 2023

‘Living my identity takes courage. It is the same courage necessary to start a new business’

Albert Lee, the founding partner at Summit Consulting, describes his career journey – from mathematics and economics at university to the birth of data science and building…

Brian Tarran

Jun 28, 2023
