Next week, the Royal Statistical Society (RSS) is hosting a panel debate on “Evaluating artificial intelligence: How data science and statistics can make sense of AI models.” The event forms part of the AI Fringe programme of activities and is timed to precede the UK government’s AI Safety Summit at Bletchley Park.
RSS president Andrew Garrett is chairing this free, in-person event on 31 October, and he’ll be joined by five panellists to discuss big questions around AI model development, evaluation, risk and benefits:
- Mihaela van der Schaar, John Humphrey Plummer professor of machine learning, artificial intelligence and medicine at the University of Cambridge and a fellow at The Alan Turing Institute
- Detlef Nauck, head of AI and data science research, BT
- Stephanie Hare, researcher, broadcaster, and author of Technology is Not Neutral
- Mark Levene, principal scientist, department of data science, National Physical Laboratory
- Martin Goodson, chief executive, Evolution AI, and former chair of the RSS Data Science and AI Section
We sat down with Andy for a quick-fire Q&A to hear more of what’s in store for next week’s event.
The RSS event takes place one day before the UK government’s AI Safety Summit. Why is it so important for statisticians and data scientists to be involved in the debate over AI safety?
Statistics and data science are at the heart of the AI movement – it’s really a question of taking data and information, and using statistical algorithms to create outputs. That’s at the core of what we do as statisticians and data scientists. Although it’s called AI, it uses mathematical and statistical methods.
The AI Safety Summit focuses on risks posed by certain types of AI systems. Where do you see the biggest risk?
Risk depends upon the purpose and the impact of the AI. It’s very different whether something is being used to inform or recommend or persuade or decide. If it’s a decision-making system, say, there is a bigger risk associated with it if the decision to be made will have an important impact on your life – so, that might be a medical decision or a decision on whether you’re to receive benefits or housing, or how you’re treated in the judicial system. It’s important to understand what the AI is being used for and how much control you have over it, and also how much oversight there is. Will the decision be made solely by an algorithm, or is there human oversight?
There is a particular concern moving forward around misinformation and disinformation. That is a genuine concern, particularly with big elections coming up in the UK and beyond. People are sharing things that they don’t realise are disinformation, so it is incredibly important to understand where information is coming from – its provenance. We’ve also seen AI systems ‘hallucinate’, sometimes giving references for their outputs that don’t actually exist. So we need to constantly scrutinise where information is coming from and, if sources are quoted, ask whether they really exist.
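As a small, practical illustration of that “do quoted sources really exist?” check, the sketch below asks the public Crossref API whether a given DOI resolves to a real record. The DOI shown is a placeholder, and this is just one way such scrutiny might be partly automated – it isn’t a method described in the interview.

```python
# Illustrative only: check whether a cited DOI corresponds to a real record,
# using the public Crossref REST API. The DOI below is a placeholder.
import requests

def doi_exists(doi: str) -> bool:
    """Return True if Crossref holds a record for the given DOI."""
    response = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return response.status_code == 200

print(doi_exists("10.1000/placeholder-doi"))  # expected: False (placeholder DOI)
```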
The RSS policy response to the UK government’s AI regulation white paper urges investment in a Centre for AI Evaluation Methodology. What informed this recommendation?
The white paper uses the word “proportionate” many, many times. When you talk about safety, inevitably you move into the area of risk, and if you want safety measures to be proportionate, and you want those measures to be based upon risk, then – effectively – you have to understand how you evaluate risk, how you evaluate the probability of something happening, and what the impact might be if it does happen. That naturally lends itself to needing to evaluate not only the potential harm but also the potential benefit of using AI. I think the summit is focused more around harm, and the concerns around potential harms, rather than the trade-off between harm and benefit. But that was certainly the reason we got into talking about the importance of evaluation, and evaluation is used in other industries where there is high risk. Drug development is an example, as is healthcare, where treatments and new methods are evaluated so that people are informed about both the potential harms and the potential benefits.
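To make that risk framing concrete – the probability of an event weighed against its impact, and expected harm weighed against expected benefit – here is a minimal illustrative sketch in Python. The scenario and every number in it are hypothetical, purely to show the arithmetic.

```python
# Illustrative only: toy expected harm-vs-benefit comparison for a
# hypothetical AI decision system. All probabilities and impact scores
# are made up for the sake of the example.

scenarios = {
    # name: (probability of event per decision, impact score if it occurs)
    "wrongful benefit refusal (harm)":    (0.02, -80),
    "delayed decision (harm)":            (0.10, -10),
    "faster correct decision (benefit)":  (0.85, +15),
}

# Expected value of each scenario = probability x impact
expected_values = {name: p * impact for name, (p, impact) in scenarios.items()}

expected_harm = sum(v for v in expected_values.values() if v < 0)
expected_benefit = sum(v for v in expected_values.values() if v > 0)

for name, value in expected_values.items():
    print(f"{name}: expected value {value:+.2f}")

print(f"Net expected value per decision: {expected_harm + expected_benefit:+.2f}")
```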
A lot of the focus of the AI debate is on outputs. Should we be talking more about inputs – the data the models are trained on, where the data is coming from, and issues of quality and bias, etc.?
There should be more discussion of this, yes. Statisticians and data scientists, but statisticians in particular, are trained very much around the data generating process, so we naturally think about how data is gathered, the potential biases in a dataset, and the representation that you need for a study. We understand that the data going in is as important as the outputs produced. And I think it’s becoming more and more necessary to understand where exactly the data is coming from, whether it’s diverse, whether it’s representative of the populations you want to study, and so on. This is an incredibly important area, and provenance of data – like provenance of information – will become ever more important. So, with a large language model, for example, what information is it trained on? Did the developers have permission to use that information? How representative is that information? These sorts of questions need to be addressed, because your outputs will only ever be as good as your inputs.
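As an illustration of the kind of input-side check described here, the sketch below compares the age-group make-up of a hypothetical training set against a reference population using a chi-square goodness-of-fit test. The groups, counts and population shares are invented for the example.

```python
# Illustrative only: compare the composition of a (hypothetical) training
# dataset against a reference population to flag possible under-representation.
from scipy.stats import chisquare

# Observed counts per age group in the training data (made-up numbers)
observed = {"18-34": 5200, "35-54": 3100, "55+": 1700}

# Reference population shares (e.g. from a census; also made up here)
reference_shares = {"18-34": 0.35, "35-54": 0.35, "55+": 0.30}

total = sum(observed.values())
expected = [reference_shares[group] * total for group in observed]

stat, p_value = chisquare(list(observed.values()), f_exp=expected)
print(f"chi-square = {stat:.1f}, p = {p_value:.3g}")

for group, count in observed.items():
    share, ref = count / total, reference_shares[group]
    print(f"{group}: {share:.1%} of training data vs {ref:.1%} of population")
```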
AI systems are having wide-reaching effects across society. What impact are AI tools having on the work of statisticians and data scientists, and how would you evaluate the impacts so far: good, bad, or neutral?
You have to be a cynic and an evangelist at the same time. There is some very good work being done but also some very naive work. AI is not magic. It requires the same thoroughness and lifecycle management as anything else. Certainly in terms of pattern recognition and image recognition, it’s been very useful. With MRI images, for example, can you reduce the amount of time humans need to spend looking at the data because you have an AI tool helping with the assessment? Of course, the challenge then is, when you have an AI assessment, what do you compare it to? You could compare it to what an expert would assess, but is that a suitable reference point for saying something is a good system, knowing that humans themselves are not perfect? AI systems are able to handle large datasets and large images very quickly, so that speed is a potential advantage, although it depends on the level of human oversight. Where we’re seeing these tools being advantageous, I think, is where you have some human oversight but some of the heavy lifting is being done by the AI systems.
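To illustrate the reference-point problem raised here – comparing an AI assessment against an imperfect human expert – the sketch below computes raw agreement and Cohen’s kappa between two sets of made-up labels. High agreement with the expert is not the same as being correct, since the expert benchmark is itself imperfect.

```python
# Illustrative only: agreement between an AI tool's assessments and a human
# expert's, on made-up binary labels (e.g. "finding present" / "absent").
from sklearn.metrics import accuracy_score, cohen_kappa_score

expert_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
ai_labels     = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0]

print(f"Raw agreement: {accuracy_score(expert_labels, ai_labels):.2f}")
print(f"Cohen's kappa: {cohen_kappa_score(expert_labels, ai_labels):.2f}")
# Agreement with the expert measures consistency with the reference,
# not ground truth: the expert reference is itself imperfect.
```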
What do you hope will emerge from the panel debate at the RSS next week? Reaching consensus on such a big, broad topic is unlikely, perhaps, but what are the kinds of things that you’re hoping to learn and take away from the discussion?
We’ve got some very good practitioners on the panel, and I’m hoping that we’ll generate some really good discussion from the panel and some really good questions from the audience. When it comes to the AI conversation generally, there’s a danger that it has focused too much so far on either the academic view of things or the large tech company perspective, so we’re probably missing out a whole tranche of people who are working at the coalface on these things, working in smaller companies. So, I’d like to understand a little bit more about what is happening in that part of industry. I know there’s a big focus on things like building out capability in the UK, and that’s not simply a question of having people with expertise – it’s about having access to things like the right sort of computing environments. So, I think there’s going to be some interesting discussion around what’s holding back industry. Overall, though, what I’d like to see come out of this meeting is a more proportionate response, from people who are working on this on a day-to-day basis. Statisticians are good at that – at coming up with a measured response, an informed response. Do we have the same concerns about the existential threat of AI that have been discussed by some of the larger companies, for example?
Aside from coming along and contributing to this panel discussion, how else can statisticians and data scientists engage with the AI debate and help shape a collective response to this major issue?
I’d certainly encourage them to join the RSS and be a part of our work on this. We want to be a strong voice in the debate on AI because it is underpinned by statistical and mathematical techniques, as I mentioned at the start. We have a really important role to play in focusing attention not only on AI outputs but on inputs. That is an area that hasn’t received enough attention but it is one that statisticians understand incredibly well – and it’s one that brings into discussion issues such as ethics, consent, copyright, etc., and that’s very much where we should be engaging as well.
Register now for “Evaluating artificial intelligence: How data science and statistics can make sense of AI models,” a free, in-person debate at the RSS offices in London, 4 pm – 6 pm, Tuesday, October 31.
- Copyright and licence: © 2023 Royal Statistical Society. This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) International licence.
- How to cite: Tarran, Brian. 2023. “‘Statistics and data science are at the heart of the AI movement – we want to be a strong voice in the debate.’” Real World Data Science, October 25, 2023. URL