Episodes

  • With the public release of large language models like ChatGPT putting Artificial Intelligence (AI) firmly on our radar, this episode explores what benefits this technology might hold for statistics and analysis, as well as policymaking and public services.

    Joining host, Miles Fletcher, to discuss the groundbreaking work being done in this area by the Office for National Statistics (ONS) and across the wider UK Government scene are: Osama Rahman, Director of the ONS Data Science Campus; Richard Campbell, Head of Reproducible Data Science and Analysis; and Sam Rose, Deputy Director of Advanced Analytics and Head of Data Science and AI at the Department for Transport.

    Transcript

    MILES FLETCHER

    Welcome again to Statistically Speaking, the official podcast of the UK’s Office for National Statistics. I'm Miles Fletcher and, if you've been a regular listener to these podcasts, you'll have heard plenty of the natural intelligence displayed by my ONS colleagues. This time though, we're looking into the artificial stuff. We'll discuss the work being done by the ONS to take advantage of this great technological leap forward; what's going on with AI across the wider UK Government scene; and also talk about the importance of making sure every use of AI is carried out safely and responsibly. Guiding us through that are my ONS colleagues - with some of the most impressive job titles we've had to date - Osama Rahman is Director of the Data Science Campus. Richard Campbell is Head of Reproducible Data Science and Analysis. And completing our lineup, Sam Rose, Deputy Director of Advanced Analytics and head of data science and AI at the Department for Transport. Welcome to you all. Osama let's kick off then with some clarity on this AI thing. It's become the big phrase of our time now of course but when it comes to artificial intelligence and public data, what precisely are we talking about?

    OSAMA RAHMAN
    So artificial intelligence quite simply is the simulation of human intelligence processes by computing systems, and the simulation is the important bit, I think. Actually, people talk about data science, and they talk about machine learning - there's no clear-cut boundaries between these things, and there's a lot of overlap. So, you think about data science. It's the study of data to extract meaningful insights. It's multidisciplinary – maths, stats, computer programming, domain expertise, and you analyse large amounts of data to ask and answer questions. And then you think about machine learning. So that focuses on the development of computer algorithms that improve automatically through experience and by the use of data. So, in other words, machine learning enables computers to learn from data and make decisions or predictions without explicitly being programmed to do so. So, if you think about some of the stuff we do at the ONS, it's very important to be able to take a job and match it to an industrial classification - so that was a manually intensive process and now we use a lot of machine learning to guide that. So, machine learning is essentially a form of AI.

    MILES FLETCHER
    So is it fair to say then that the reason, or one of the main reasons, people are talking so much about AI now is because of the public release of these large language models? The chat bots if you like, to simpletons like me, the ChatGPTs and so forth. You know, they seem like glorified search engines or Oracles - you ask them a question and they tell you everything you need to know.

    OSAMA RAHMAN
    So that's a form of AI and the one everyone's interested in. But it's not the only form – like I said machine learning, some other applications in data science, where we try in government, you know, in trying to detect fraud and error. So, it's all interlinked.

    MILES FLETCHER
    When the ONS asked people recently for one of its own surveys, about how aware the public are about artificial intelligence, 42% of people said they used it in their home recently. What sort of things would people be using it for in the home? What are these everyday applications of AI and I mean, is this artificial intelligence strictly speaking?

    OSAMA RAHMAN
    If you use Spotify, or Amazon music or YouTube music, they get data on what music you listen to, and they match that with people who've been listening to similar music, and they make recommendations for you. And that's one of the ways people find out about new music or new movies if you use Netflix, so that's one pretty basic application, that I think a lot of people are using in the home.
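
    [Editor's sketch: the matching Osama describes can be illustrated with a very small similarity-based recommender. This is a minimal sketch, not how any of the named services work. The listener names, track names and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.]

```python
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Overlap between two listening histories: 0 means disjoint, 1 means identical."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target: str, histories: dict, top_n: int = 3) -> list:
    """Suggest tracks the target has not heard, weighted by how similar each other listener is."""
    heard = histories[target]
    scores = Counter()
    for user, items in histories.items():
        if user == target:
            continue
        weight = jaccard(heard, items)
        if weight == 0:
            continue  # listeners with nothing in common contribute nothing
        for item in items - heard:
            scores[item] += weight
    # Highest score first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical listening histories, invented for this example.
histories = {
    "ana": {"track_a", "track_b", "track_c"},
    "ben": {"track_a", "track_b", "track_d"},
    "cat": {"track_c", "track_e"},
    "dev": {"track_x", "track_y"},
}

print(recommend("ana", histories))
```

    [Here "ben" shares two tracks with "ana", so his unheard track ranks above the one from "cat", who shares only one. "dev" shares nothing and is ignored.]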

    MILES FLETCHER
    And when asked about what areas of AI they'd like to know more about, more than four in 10 adults reported that they'd like to know better how to judge the accuracy of information. I guess this is where the ONS might come in. Rich then, if I could just ask you to explain what we've been up to, what the Data Science Campus has been up to, to actually bring the power of artificial intelligence to our statistics.

    RICHARD CAMPBELL
    Thanks Miles. Yeah, a few things that ONS has been doing in this very broad sphere of artificial intelligence, and it's really in that overlap area that Osama mentioned with data science, so I'd pick out a few sorts of general areas there. So, one is automation. You know, we're always keen to look at how we can automate processes and make them more efficient. It frees up the time of our analysts to conduct more work. It means that we are more cost effective. It means that our statistics have better quality. It's something we've done for years but AI offers some new opportunities to do that. The other area which Osama touched on is the use of large language models, you know, we can get into the complexities of data. We can get much more out of data; we can complete tasks that would have been too complex or too time consuming for real data scientists. And this is good news, actually, because it frees up the data scientists to add real valuable human insights. Some of the places we've been using this. So, my team for example, which is called reproducible data science and analysis, and we use data science and engineering skills to develop computer systems to produce statistics where the data is a bit big, or what I tend to call a bit messy or a bit complex for our traditional computer systems. We use AI here through automation, as I mentioned, you know, really making sure that we're making systems as efficient and high quality as possible. Another thing we're interested in doing here is quite often we’re doing something called re-platforming systems. So, this is where we take a system that's been used to produce our statistics for years and years and look to move it on to new technology. Now we're exploring with Osama's team the potential for AI to do a lot of the grunt work for us there to sort of go in and say, right, what is going on in this system? How is it working, how can we improve it?
    One other thing I'll mention, if Osama doesn't mind me treading on the territory of his team, is the Stats Chat function that we've used on the ONS website. So, this is using AI to enable a far more intelligent interrogation of the vast range of statistics that we've got, so it no longer requires people to be really knowledgeable about our statistics. It enables them to ask quite open questions and to be guided to the most relevant data.

    MILES FLETCHER
    Because at the moment, if you want to really explore a topic by getting into the depths of the data, into the granular data, you’ve really got to know what you're looking for haven’t you? This again is an oracle that will come up with the answers for you and just present them all ready for your digestion.

    RICHARD CAMPBELL
    That's right. And I tend to think of these things as a starting point, rather than the whole answer. So, what it’s enabling you to do is to get to the meat of the issue a lot quicker. And then you can focus your energy as a user of our statistics in doing the analysis that you want rather than thinking “how do I find the right information in the first place?”

    MILES FLETCHER
    Osama, that sounds like an intriguing tool. Tell us precisely how it works then, what data does it capture, what's in scope?

    OSAMA RAHMAN
    So the scope is publicly available documents on the ONS website. And there's a specific reason for that. So, these AI tools, you can have it look at the whole internet, you can have it look at subsets of data, you can point it to specific bits of data, right? And what's important for us is actually the work of the ONS, that the statistics we produce are quality assured and relevant. And by providing these guardrails where you know, Stats Chat only looks at ONS published data, we have a degree of assurance that the data coming back to the user is likely to be of good quality and not based on who knows what information.
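
    [Editor's sketch: the guardrail idea, restricting a tool to one trusted set of documents, can be illustrated with a toy search that ignores anything from outside an allowed source. This is not how Stats Chat is implemented. The documents, the `search` function and the simple word-overlap scoring are hypothetical, chosen only to show the filtering step.]

```python
import re


def tokens(text: str) -> set:
    """Lower-case words in the text (no stemming or stop-word removal in this sketch)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def search(question: str, documents: list, allowed_source: str, min_overlap: int = 2) -> list:
    """Return titles of matching documents, considering only the allowed source."""
    q = tokens(question)
    hits = []
    for doc in documents:
        if doc["source"] != allowed_source:
            continue  # the guardrail: anything outside the trusted source is never considered
        overlap = len(q & tokens(doc["text"]))
        if overlap >= min_overlap:
            hits.append((overlap, doc["title"]))
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [title for _, title in hits]


# Made-up documents for illustration only.
documents = [
    {"source": "ons.gov.uk", "title": "Consumer price inflation",
     "text": "Annual inflation rate for consumer prices"},
    {"source": "ons.gov.uk", "title": "Labour market overview",
     "text": "Employment and unemployment rate estimates"},
    {"source": "example-blog.com", "title": "Hot takes",
     "text": "inflation rate rumours consumer prices"},
]

print(search("What is the inflation rate for consumer prices?", documents, "ons.gov.uk"))
```

    [The blog post matches the question well on wording but is never returned, because the source filter runs before any scoring.]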

    MILES FLETCHER
    Because when you use, to name one example, ChatGPT for example, the little warning comes back saying “ChatGPT can make mistakes, consider checking important information.” And I guess that's fundamental to all this isn't it. These tools, as intelligent as they might be, they're only as good - like any system - as the information that's going in the front end.

    OSAMA RAHMAN
    That's absolutely correct, which is why we have these guardrails where, you know, the functionality on Stats Chat is focused on published ONS information.

    MILES FLETCHER
    That does mean that something that's offered by an organisation like the ONS does have that sort of inbuilt potential to be trustworthy and widely used. But of course, you might say, to have a really good tool it's got to be drawing on masses of information from right across the world. And it's interesting how, and you mentioned that it's open-source data, of course, that's most available for these tools at the moment, but you're seeing proprietary data coming in as well. And this week, as we're recording this, the Financial Times, for example, has announced that it's done a deal with one of the big AI firms to put all of its content into their database. Do you think there's scope for organisations like the ONS around the world to collaborate on this and to provide you know, really powerful tools for the world to exchange knowledge and data this way?

    OSAMA RAHMAN
    So there is collaboration going on. There's collaboration, both within government - we're not the only department looking at these sorts of tools; there's also collaboration internationally. I think the difference you know... our information on our website is already publicly available. That's why it's on the net, it is a publication. But there's a difference in situation with the FT where, you know, a lot of the FT information is behind a paywall.

    MILES FLETCHER
    Yeah, it has a sort of democratising tendency that this publicly available information is being fed into these kinds of sources and these kinds of tools. That's big picture stuff. It's all very exciting work that's going on. But I'll come back to you Rich just for a second. What examples practically, because I think that the Stats Chat project is still a little way off actually being available publicly, isn't it?

    RICHARD CAMPBELL
    Yeah, I think it is still a little way off. So, I think the key thing that we're doing at the moment, and something we've done for years but that AI is helping with, is the use of automation principles. Just making things quicker. Now in a data science context, this might be going through very, very large data sets, looking for patterns that it would take an analyst a huge amount of time, and probably far more patience than they would have, to find.

    MILES FLETCHER
    So for example, in future then we might find that - and this is one issue that recurs in these podcasts - obviously about the limitations of official statistics is they tend to lag. This is another way of making sure that data gets processed faster. And therefore, the statistics are more timely, and therefore the insights they provide are really much more actionable than perhaps they might be at the moment.

    RICHARD CAMPBELL
    Yeah, that's spot on. There's potential in there for pace of getting the statistics from the point that the data exists to getting it into published statistics. There's potential there for us to be able to combine and bring more sources together. There's also some behind the scenes stuff that helps as well. So, for example, quite often we are coding up the systems to produce new or improved versions of official statistics. And we're looking at the possibility of AI speeding up and supporting that process, perhaps for example, by giving us an initial draft of the code. Now, why does that matter for people in the public, you know, does anybody actually care? Well, what it means is that we can do things quicker and more to the point we can focus the time of our expert data scientists and other analysts in really helping people understand the data and the analysis that we're producing.

    MILES FLETCHER
    Okay, so lots of interesting stuff in the pipeline there. But I’d like to bring in Sam now to talk about how AI is actually being used in government right now. Because in your work Sam at the Department for Transport, you've actually been working on some practical projects that have been gaining results in the real world.

    SAM ROSE
    We have - we've been doing loads actually, and my poor team probably haven't had any time to sit still for the last 18 months or so. And I think like most ministerial departments, we're doing lots and lots of work to automate existing processes, so much like Rich has alluded to in your space, we're looking at the things that take up most of the time for our policy colleagues and looking at how we can automate those. So, for example, drafting correspondence, or automating policy consultation processes, or all of that kind of corporate memory type stuff. Can we mine big banks of data be it text or otherwise and summarise that information or generate new insights that we wouldn't have been able to do previously? But I think slightly more relevant maybe for you guys, is the stuff we're doing on creating new datasets or improving datasets. So, a few things. We're training a machine learning model to identify heavy goods vehicles from Earth observation data. And that's because we don't have a single nationally representative data set that tells us where these heavy goods vehicles park or stop outside of existing kind of service stations, and what we want to understand is where are those big areas of tarmac or concrete where they're all parking up as part of their routine journeys, so that we can look at when we're rolling out the green infrastructure for heavy goods vehicles, we're looking at where the important places that we need to put that infrastructure are. And that data doesn't exist at the moment. So we're using machine learning to generate a new dataset that we wouldn't otherwise have.

    MILES FLETCHER
    And how widespread are these kinds of projects across government in the UK now?

    SAM ROSE
    So I think that there are loads of different things and I wouldn't be able to speak on behalf of everybody but I know lots of different areas of government are looking at similar kind of automation and productivity projects like our kind of drafting all of the knowledge management area. I think there's things like Osama alluded to where DEFRA for example, I think they're using Earth observation data to assess biodiversity for example. So, there's lots of stuff that's common between lots of government departments, and then there's lots of stuff that's very specific to individual departments. But all along the way there's lots of collaboration and working together to make sure we're all learning continuously and where we can collaborate on a single solution that we are.

    MILES FLETCHER
    I guess one of the central public concerns about the spread of AI once again that it will cost jobs, that it will do people out of the means of making a living that they've become used to. And I guess from government's point of view, it's all about doing much, much more with the resources that we have and making government much more effective.

    SAM ROSE
    Yes, absolutely. And it's not necessarily - and I think Rich mentioned this earlier - it's not necessarily about doing our jobs for us. It's about improving how we can do our jobs and being able to do more with less, I think, so freeing up the human to do the bit that the human really needs to do and enabling the technology to do the very repeatable, very automatable parts of the job. And indeed, in some instances, this technology can actually do the work better than humans. So be it identifying really complex patterns in datasets, for example. Or a good example from us in transport is we've trained a machine learning model to be able to look at images of electric vehicle charge point installations and be able to identify when a similar or the same image has been submitted more than once. Now that's estimated to have saved over 130 man years of time, you know, that's not a task that we would have been able to do with just humans.
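
    [Editor's sketch: the transcript does not say how the DfT model works. As a much simpler stand-in, the idea of spotting the same or a near-identical image submitted twice can be illustrated with an "average hash": each image becomes a string of bits recording which pixels are brighter than the image's mean, and two images count as near-duplicates when few bits differ. The tiny 2x2 grayscale grids and claim names below are invented, and a real pipeline would first downscale actual photographs.]

```python
from itertools import combinations


def average_hash(pixels: list) -> tuple:
    """One bit per pixel: True where the pixel is brighter than the image's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(value > mean for value in flat)


def hamming(a: tuple, b: tuple) -> int:
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_duplicates(images: dict, max_distance: int = 1) -> list:
    """Pairs of image names whose hashes differ in at most max_distance bits."""
    hashes = {name: average_hash(pixels) for name, pixels in images.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(hashes), 2)
        if hamming(hashes[a], hashes[b]) <= max_distance
    ]


# Hypothetical, already-downscaled grayscale images (0 = black, 255 = white).
images = {
    "claim_1": [[10, 200], [220, 30]],
    "claim_2": [[12, 198], [225, 28]],  # same picture with slightly different pixel values
    "claim_3": [[200, 10], [30, 220]],  # a genuinely different picture
}

print(find_near_duplicates(images))
```

    [Small changes in pixel values, such as those from re-saving a photo, leave the bright/dark pattern intact, which is why "claim_1" and "claim_2" are paired while "claim_3" is not.]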

    MILES FLETCHER
    And you would have to be pretty alert as a human and have a very high boredom threshold to process all that material yourself and spot the fraudsters.

    SAM ROSE
    Yeah, well, quite. And that's, I think, a really nice example of where again, it's not taking our jobs, but it's enabling us to do something that we wouldn't have been able to do previously and improve the service that we're providing.

    MILES FLETCHER
    Now, our ability collectively, whatever sort of organisation we're involved in, our ability to make the most of AI depends on of course having the right skills, and Osama I guess this is where the Data Science Campus comes in as the government's Centre of Excellence for data science, principally, but I guess also in this context, artificial intelligence as well. What work have you been involved in to make sure that the supply of those skills and knowledge is on tap for government?

    OSAMA RAHMAN
    So firstly, I would say we are a (one) centre of excellence within government. I think you know, what's been brilliant to see since the campus was set up has been that actually more and more government departments have excellent data science, AI teams. Sam leads one at DfT. There is, of course, 10DS (or 10 Data Science) at number 10 [Downing Street]. There's a Cabinet Office team. So, there's lots of teams that now work in this area. Some of the stuff we've been doing is we have various training programmes that we have run. We have senior data masterclasses so that actually, senior leaders within government can understand better the power of data. 10DS, Sam's area, have all been running hackathons, which actually improve skills as well. So, it's no longer just us who are building capability. I think it's great to see that across government and across departments there are teams improving skills within their departments, bringing in others from outside to work with them. So, there's a lot going on there.

    SAM ROSE
    Just really quickly, it's important to think that skills are not just skills of data scientists, but skills of everybody's ability to use this kind of technology. There's a lot of work going on at the moment looking at what we need to do both internally to government, but also out there in all of our sectors to make sure that our workforce has the skills it needs to be able to more rapidly kind of adopt and be able to take advantage of all the benefits that this technology brings to us. I mean from a very personal point of view, and I don't really know all of the answers to this, but you know, I'm thinking about what actually, if large language models can help us to generate efficient code, then actually, what skills do I need in my data scientists? If it's not writing code, is it actually the analytical thinking and being able to understand how to apply these kinds of technologies? So, I think it changes what we need in the workforce that we have.

    MILES FLETCHER
    Inevitably, though, if we're talking about this kind of technology being rolled out across government and thereby increasing the power of government to know more about more people, then concerns obviously, about the ethical use of data come in...

    RICHARD CAMPBELL
    Maybe if I can just come in on that one Miles. Using data safely and responsibly - it's built into our very DNA in ONS and across government. And our keenness to sort of learn how to use new tools and new techniques is always going to be tempered by our need to ensure that we are responsibly using the data that's been entrusted to us. And I think we need to sort of strike a balance here. We need to ensure that we don't take this responsibility as an excuse to not try and adopt new technology such as AI, but it also means we have to do so with care and responsibility and to do it at an appropriate pace. The key thing, I think, for me is ensuring that we can retain control of the data that we've been entrusted with. And so, understanding what AI is doing with that data, considering what data we're giving access to it, what data is being processed, and what data is being generated. And this is really at the forefront of our minds and our collective use of this. I think our approach - and Osama touched on this earlier - is to sort of be novel and start with open source and non-sensitive data first, so that will help us learn how we can effectively use it before we go on to some of the more sensitive data that we hold.

    SAM ROSE
    We have to have ethics and data protection at the heart of everything we do, which then does have the tendency I think necessarily to reduce the pace of our ability to roll things out a little bit. But as government we do, I think have more responsibility. We can't have those kind of oops moments that some of the big tech companies have had when they're trying to reverse engineer the data to remove bias and that you know, things like that that then fundamentally undermine the output of their models. I think when you're doing a job that affects individual people, and providing services that affect citizens then we don't really have the luxury of getting it wrong like that, and we have to try to make sure we get it right first time. So, all of the things that Richard said about starting with, you know, safer datasets and working our way up before we deploy these models is kind of fundamental to how we're going to learn and ensure that we're doing it safely and securely.

    MILES FLETCHER
    Osama what's your take on the ethics question?

    OSAMA RAHMAN
    First of all, I would echo everything just said. You know the Statistics Code of Practice is an annex to the Civil Service Code, it applies to all of us not just statisticians - I'll point that out. It is I think, not just in the ONS, I think for analysts and data scientists and specialists across government, this is kind of built into their DNA. Central Digital and Data Office has put together guidance and circulated it across government on the safe use of AI within government. So, within government, we do take this quite seriously. And then actually in terms of the use of some of these techniques, I think pointing these tools at data and information that we know is accurate is an important starting point - so having those guardrails. If it's going to be used for decision making, then having a human in the loop is quite important to make sure that the use is ethical. So, there's a bunch of safety checks that we do put in which I think allow for us to have some assurance that the use of these tools will be safe and ethical.

    RICHARD CAMPBELL
    I think just as one additional point is, you know, this isn't a new challenge for us. It's a different flavour of a challenge that we faced in considering new technology in the past. So, we can think in fairly recent times the use of cloud technology to securely and safely store data. If we go further back the use of the Internet, go back further, again, the use of computers to hold data. And what I think we've demonstrated time and time again, is that we do approach these things responsibly and maturely. But we do find opportunities to use all of them to improve the quality of statistics and analysis and the service that we offer the public.

    MILES FLETCHER
    Looking to the future then, and this is a very fast-moving future of course, I'd like to get your takes on also where you see us in five years’ time in 10 years’ time with this. I mean starting with the Office for National Statistics – Osama and Rich particularly on this. How will we start to see the published statistics and the big key topics, but also the granular insights that we provide on all kinds of areas. How will we see that changing and developing do you think? Where are you going to put your money?

    RICHARD CAMPBELL
    I think predicting the future in this way is quite a dangerous game. I’m thinking back to you know, if we had this podcast in the year 2000 and we asked “how would the internet form part of our working lives?” We would have predicted something which would have been quite different from the impact that it had. Saying all that I think it will make a fundamental difference to the way that we work. I see that it will be integrated in the day-to-day tasks that we do in a similar way that we used computers to speed up and change the way that we produced statistics. I think it will enable our users to far better interact and engage with data and analysis. So, it will be less of us producing a specific finalised product for them, and more for them to be able to sort of get in ask questions, probe and really, really interact. And I think lastly, it will give us more potential to work and analyse data because one thing, and I think this is really important to say, AI will give more opportunities for analysts. It won’t take them away. It will give them more space, more tools to work with to produce better, more complex, more useful datasets and analysis for ONS and for its users.

    SAM ROSE
    I was just going to add that I think it will fundamentally change the nature of what we do. A little bit like Rich said, the sort of work that we do will be different, but really critically, I think in a few years’ time we won't really notice that change. I was thinking that most people have forgotten that 10 or more years ago before you left the house to go somewhere new, you would have consulted your map. Whereas actually nobody, or very few people, do that anymore. So, I think we're going to forget very quickly that lots of what we will be doing will be AI driven.

    MILES FLETCHER
    So it's a big evolutionary step forward, if not quite a revolution. Do you agree with that Osama?

    OSAMA RAHMAN
    Absolutely, because some of us have actually been using sort of transformers-based models, which is what these large language models are based on for... My team has been working with those for at least the last eight years. But I wanted to just pick up on what Rich just said. And it is an evolution right. And you can't separate the tools from the data. And one of the things we're getting now is data that is much more granular and of much higher velocity than the data we were used to. So that allows us to look at things at a more local level, at a more timely level. What I do completely agree with Rich on is actually a lot of these tools and methodologies allow the technical production of statistics to get more efficient, which then allows you to produce more statistics at a disaggregated level - at a regional level or local authority area level or looking at different sub populations. It allows us to update statistics more frequently. But then also what it allows us to do, because it's not just about the production of the statistics, it's about what those statistics actually tell you is going on. And I think it allows the people we have at the ONS and other government departments to spend more time on the real value added which is “what does this mean?”

    MILES FLETCHER
    It's interesting if you're researching a particular topic, it must be good to sort of evolve your methodology quickly and to refine your processes on the run as it were to explore a particular topic. One thing of course we need in statistics is consistency of methodology and approach. Does that limit do you think, either of you, the ability for statistics to get more insightful to get more germane to issues because we have to stick to accepted methodologies to provide that consistency over the long run?

    RICHARD CAMPBELL
    I don't think it does Miles. I mean, you're right there. There's always a challenge for us in that, that consistency is really important, that comparability in a time series. Equally, users do want us to look for improvements, more detail, whether that's granularity or whatever else. And actually, we've got a really good successful track record of both maintaining the consistency of our statistics, while at the same time introducing new and improved methods. We do it with GDP. We do it with inflation, we do it with population, it's something that we do time and time again. And, actually, I think automation AI offers up some really exciting opportunities here in terms of methods that can be applied. There's actually an element of it, which will help us in the understanding and documentation and consistent application of the methods as well. It’s perhaps one of the less – if you don’t mind me using the word - “sexy” applications of AI but using it to ensure that our documentation is absolutely spot on and done quickly. To ensure that we are applying methods really quickly and consistently. I think AI offers us potential to do that even better.

    MILES FLETCHER
    “Sexy” in the particular way that we refer to progress in data science.

    RICHARD CAMPBELL
    Yes, quite.

    OSAMA RAHMAN
    Can I just come in on this? And it's possibly worth using a specific example of putting out statistics on prices. In the old days you’d basically have people going out into the field, and that's where you'd find a basket of goods, and using pen and paper would collect prices. Now where a lot of national statistical organisations are going to is actually getting scanner data, because most things when you pay for them nowadays in many parts of the world, it's scanned first, electronically rather than rung up through a cash register of some sort. So, scanner data provides a lot of information about what is being purchased, and at what price it's being sold at various retail outlets. And so, you have this data which again is much more granular and has much higher velocity than price data you can collect through surveys, and you know, how you integrate that into the production of pricing statistics and other economic statistics is really, you know, a really interesting question and work that a lot of national statistical organisations are working on. So, the basic methodology remains the same. It's, you know, kind of define a basket of goods, but you expand the scale of the basket, get prices on what each of the elements of those baskets is being sold at, and then produce a price measure, an inflation measure, right. But these tools and the increasing quantity of data allow us to do that. But you know, the basic methodology is kind of the same, but actually the increase in this data allows us to do that in kind of a different way. It's an evolution.
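
    [Editor's sketch: the "define a basket, get prices, produce a price measure" steps can be shown with a small fixed-basket (Laspeyres-type) index computed from scanner-style records. This is an illustration under assumptions, not the ONS's production method: the products, prices and quantities are invented, and using the average price paid per unit as each product's price is one simple choice among several.]

```python
from collections import defaultdict


def unit_values(scans: list) -> tuple:
    """Per product: average price paid (total spend / total quantity) and total quantity."""
    spend = defaultdict(float)
    quantity = defaultdict(float)
    for product, price, units in scans:
        spend[product] += price * units
        quantity[product] += units
    prices = {product: spend[product] / quantity[product] for product in spend}
    return prices, dict(quantity)


def laspeyres_index(base_scans: list, current_scans: list) -> float:
    """Cost of the base-period basket at current prices, relative to base prices (base = 100)."""
    base_prices, base_quantity = unit_values(base_scans)
    current_prices, _ = unit_values(current_scans)
    basket = base_prices.keys() & current_prices.keys()  # products seen in both periods
    if not basket:
        raise ValueError("no products appear in both periods")
    base_cost = sum(base_prices[p] * base_quantity[p] for p in basket)
    current_cost = sum(current_prices[p] * base_quantity[p] for p in basket)
    return 100.0 * current_cost / base_cost


# Hypothetical scanner records: (product, price per unit, units sold).
base_scans = [("bread", 1.00, 6), ("bread", 1.00, 4), ("milk", 0.50, 20)]
current_scans = [("bread", 1.20, 10), ("milk", 0.50, 20)]

print(laspeyres_index(base_scans, current_scans))
```

    [In this example bread rises from 1.00 to 1.20 while milk is unchanged, so the base basket costs 22 instead of 20 and the index moves from 100 to about 110.]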

    MILES FLETCHER
    It does all suggest though, that perhaps the survey might finally be replaced - the big social surveys that the ONS runs. Do you think that the survey's days are numbered, therefore, because of AI?

    RICHARD CAMPBELL

    No.

    OSAMA RAHMAN

    No.

    [LAUGHTER]

    MILES FLETCHER

    A resounding no.

    RICHARD CAMPBELL
    That was a resounding no, and it's not a pre-rehearsed one. And maybe I'll just take us back Miles. So, if we went back about the best part of 10 years, everyone was talking about big data. You know, the days of a survey was gone. What we needed was these big, complex, sometimes quite messy data sources that were collected for a variety of other reasons, and that we could utilise those to sort of answer all of the statistical questions that we had. Now, what we found out actually is that yes, these data sources can give us a lot of potential; data science is helping us make the most of them; AI is helping us make even more from them. What we also learned though, is that they work best when they're complementing the surveys, rather than trying to replace. Think of it a bit as horses for courses. Actually, though, I want to give an example of where AI might be able to help us improve the response rates on surveys. So, AI might be able to help respondents navigate through some of the surveys, helping them understand what it is that they're being asked. Helping them answer a bit more efficiently. So that might actually remove a barrier that some people, some businesses have to respond to surveys. So, you never know we might see a bit of an uptick in response rates with a bit of AI’s help.

    OSAMA RAHMAN
    And I think the other thing I would add is what surveys are particularly good at is getting information on the extremes of the distribution. It's great if you think everything's going to be generated through digital footprints data and online services, but actually not everyone... some people... apparently dumb phones are coming back into fashion. Or there are groups that you know, for whatever reason, are not picked up in other forms of data. And actually, surveys are really important for accessing, getting information about hard to reach groups at the end of the distribution.

    MILES FLETCHER
    I think that’s kind of reassuring that for all the promise of AI in this brave new world, that we hope won't be a dystopian future, but whether it will deliver all those things that we've been talking about in terms of better insights, faster statistics, and all that. It's still good to hear though, isn't it, that there is no substitute for speaking to real human beings directly?

    OSAMA RAHMAN
    I agree entirely.

    MILES FLETCHER
    Well, that's it for another episode of Statistically Speaking and, in summary, I suppose the use of AI feels like a natural evolution with a number of potential benefits, and potentially huge benefits, but with its adoption we need as always to be thoughtful and ethical. So, thanks to all our guests: Sam Rose, Osama Rahman, and Rich Campbell, and of course, thanks to you as always for listening. You can subscribe to future episodes of this podcast on Spotify, Apple Podcasts and all the other major podcast platforms. You can also follow us on X - formerly known as Twitter - via the @ONSfocus feed. I’m Miles Fletcher and from myself and producer Steve Milne, until next time, goodbye.

    ENDS

  • The ONS podcast returns, this time looking at the importance of communicating uncertainty in statistics. Joining host Miles Fletcher to discuss is Sir Robert Chote, Chair of the UKSA; Dr Craig McLaren, of the ONS; and Professor Mairi Spowage, director of the Fraser of Allander Institute.

    Transcript

    MILES FLETCHER

    Welcome back to Statistically Speaking, the official podcast of the UK’s Office for National Statistics. I'm Miles Fletcher and to kick off this brand new season we're going to venture boldly into the world of uncertainty. Now, it is of course the case that nearly all important statistics are in fact estimates. They may be based on huge datasets calculated with the most robust methodologies, but at the end of the day they are statistical judgments subject to some degree of uncertainty. So, how should statisticians best communicate that uncertainty while still maintaining trust in the statistics themselves? It's a hot topic right now and to help us understand it, we have another cast of key players. I'm joined by the chair of the UK Statistics Authority Sir Robert Chote, Dr. Craig McLaren, head of national accounts and GDP here at the ONS, and from Scotland by Professor Mairi Spowage, director of the renowned Fraser of Allander Institute at the University of Strathclyde. Welcome to you all.

    Well, Sir Robert, somebody once famously said that decimal points in GDP are an economist’s way of showing they've got a sense of humour. And while that's quite amusing - particularly if you're not an economist - there's an important truth in there, isn't there? When we say GDP has gone up by 0.6%, we really mean that's our best estimate.


    SIR ROBERT CHOTE
    It is. I mean, I've come at this having been a consumer of economic statistics for 30 years in different ways. I started out as a journalist on the Independent and the Financial Times writing about the new numbers as they were published each day, and then I had 10 years using them as an economic and fiscal forecaster. So I come at this very much from the spirit of a consumer and am now obviously delighted to be working with producers as well. And you're always I think, conscious in those roles of the uncertainty that lies around particular economic estimates. Now, there are some numbers that are published, they are published once, and you are conscious that that's the number that stays there. But there is uncertainty about how accurately that is reflecting the real world position and that's naturally the case. You then have the world of in particular, the national accounts, which are numbers, where you have initial estimates that the producer returns to and updates as the information sets that you have available to draw your conclusions develops over time. And it's very important to remember on the national accounts that that's not a bug, that's a feature of the system. And what you're trying to do is to measure a very complicated set of transactions you're trying to do in three ways, measuring what the economy produces, measuring incomes, measuring expenditure. You do that in different ways with information that flows in at different times. So it's a complex task and necessarily the picture evolves. 
    So I think from the perspective of a user, it's important to be aware of the uncertainty and it's important when you're presenting and publishing statistics to help people engage with that, because if you are making decisions based on statistics, if you're simply trying to gain an understanding of what's going on in the economy or society, generally speaking you shouldn't be betting the farm on the assumption that any particular number is, as you say, going to be right to decimal places. And the more that producers can do to help people engage with that in an informed and intelligent way, and therefore mean that decisions that people take on the basis of this are more informed, the better.


    MF
    So it needs to be near enough to be reliable, but at the same time we need to know about the uncertainty. So how near is the system at the moment as far as these important indicators are concerned to getting that right?

    SRC
    Well, I think there's an awful lot of effort that goes into ensuring that you are presenting on the basis of the information set that you have the best available estimates that you can, and I think there's an awful lot of effort that goes into thinking about quality, that thinks about quality assurance when these are put together, that thinks about the communication how they mesh in with the rest of the, for example, the economic picture that you have, so you can reasonably assure yourself that you're providing people with the best possible estimate that you can at any given moment. But at the same time, you want to try to guide people by saying, well, this is an estimate, there's no guarantee that this is going to exactly reflect the real world, the more that you can do to put some sort of numerical context around that the more the reliable basis you have for people who are using those numbers, and thinking about as I say, particularly in the case of those statistics that may be revised in future as you get more information. You can learn things, obviously from the direction, the size of revisions to numbers that have happened in the past, in order to give people a sense of how much confidence they should place in any given number produced at any given point in that cycle of evolution as the numbers get firmer over time.

    MF
    If you're looking to use the statistics to make some decision with your business or personal life, where do you look for the small print? Where do you look for the guidance on how reliable this number is going to be?

    SRC
    Well, there's plenty of guidance published in different ways. It depends, obviously on the specific statistics in question, but I think it's very important for producers to ensure that when people come for example to websites or to releases that have the headline numbers that are going to be reported, that it's reasonably straightforward to get to a discussion of where do these numbers come from? How are they calculated? What's the degree of uncertainty that lies around that arising from these things? And so not everybody is obviously going to have an appetite for the technical discussion there. But providing that in a reasonably accessible, reasonably findable way, is important and I think a key principle is that if you're upfront about explaining how numbers are generated, explaining about the uncertainty that lies around them in as quantified way as you can, that actually increases and enhances trust in the underlying production and communication process and in the numbers rather than undermining it. I think you have to give the consumers of these numbers by and large the credit for understanding that these things are only estimates and that if you're upfront about that, and you talk as intelligently and clearly as you can about the uncertainties - potential for revision, for example - then that enhances people's confidence. It doesn't undermine it.

    MF
    You mentioned there about enhancing trust and that's the crux of all this. At a time we're told of growing public mistrust in national institutions and so forth, isn't there a risk that the downside of talking more about uncertainty in statistics is the more aware people will become of it and the less those statistics are going to be trusted?

    SRC
    I think in general, if you are clear with people about how a number is calculated, the uncertainty that lies around it, the potential for revision, how things have evolved in the past - that’s not for everybody, but for most people - is likely to enhance their trust and crucially, their understanding of the numbers that you're presenting and the context that you're putting around those. So making that available - as I say, you have to recognise that different people will have different appetites for the technical detail around this - then there are different ways of presenting the uncertainty not only about, you know, outturn statistics, but in my old gig around forecasts of where things are going in the future and doing that and testing it out with your users as to what they find helpful and what they don't is a valuable thing to be doing.

    MF
    You've been the stats regulator for a little while now. Do you think policymakers, perhaps under pressure to achieve certain outcomes, put too much reliance on statistics when it suits them, in order to show progress against some policy objective? I mean, do the limitations of statistics sometimes go out of the window when it's convenient? What's your view of how well uncertainty is being treated by those in government and elsewhere?

    SRC
    Well, I think certainly in my time as a forecaster, you were constantly reminding users of forecasts, and consumers of them, that again, they're based on the best available information set that you have at the time. You explain where the judgements have come from but in particular, if you're trying to set policy in order to achieve a target for a particular statistic at some point in the future, for example, a measure of the budget deficit, then having an understanding of the uncertainty, the nature of it, the potential size of it in that context, helps you avoid making promises that it's not really in your power to keep with the best will in the world, given those uncertainties. And sometimes that message is taken closer to heart than at other times.

    MF
    Time I think to bring in Craig now at this point, as head of national accounts and the team that produces GDP at the ONS to talk about uncertainty in the real world of statistical production. With this specific example, Craig, you're trying to produce a single number, one single number that sums up progress or lack of it in the economy as a whole. What do you do to make the users of the statistics and the wider public aware of the fact that you're producing in GDP one very broad estimate with a lot of uncertainty built in?

    CRAIG MCLAREN
    Thanks, Miles. I mean, firstly, the UK economy - incredibly complex isn't it? The last set of numbers, we've got 2.7 trillion pounds worth of value. So if you think about how we bring all of those numbers together, then absolutely what we're doing is providing the best estimate at the time and then we start to think about this trade off between timeliness and accuracy. So even when we bring all of those data sources together, we often balance between what can we understand at the point of time, and then equally as we get more information from our businesses and our data suppliers, we evolve our estimates to understand more about the complex nature of the UK economy. So where we do that and how we do that it's looking quite closely at our data sources. So for example, we do a lot of surveys about businesses, and that uses data provided by businesses and that can come with a little bit of a what we call a time lag. So clearly when we run our monthly business surveys that's quite timely. We get that information quite quickly. But actually when we want to understand more detail about the UK economy, we have what we call structural surveys, and they're like our annual surveys. So over time, it can take us a couple of years actually to get a more complete picture of the UK economy. So in that time, absolutely. We may revise the estimate. Some businesses might say, well, we forgot about this. We're going to send you a revised number. We look at quite closely about the interplay between all the dynamics of the different parts of the economy, and then we confront the data set. So I think by bringing all this information together, both on the timeliness but also as we get a more complete picture, we start to refine our estimates. So in practice, what we do find is as we evolve our estimates, we can monitor that. 
    We do look quite closely at the revisions of GDP, then we can produce analysis that helps our users understand those revisions and then we quite heavily focus on the need for rapid information that helps policymakers. So how can policymakers take this in a short period of time, but then we provide this information to understand the revision properties of what we would call that about how our estimates can change and evolve over time as we get additional information going forwards.

    MF
    So let's just look at the specifics, just to help people understand the process and how you put what you've just explained so well into action. Craig, the last quarterly estimate of GDP showed the economy contracted slightly.

    CM
    That's exactly right Miles and I think where we do produce our estimates in a timely basis, absolutely they will be subject to revision or more information as we get them. So this is why it's important, perhaps not to just focus on a single estimate. And I know in our most recent year in the economy, when that's all pretty flat, for example, or there's sort of a small fall, we do have a challenge in our communication. And that becomes a little bit back to the user understanding about how these numbers are compiled. And also perhaps how can you use additional information as part of that? So as I mentioned the UK economy is very complex, GDP is a part of that, but we also have other broader indicators as well. So when we do talk about small movements in the economy, we do need to think about the wider picture alongside that.

    MF
    Okay, so the last quarterly estimate, what was the potential for revision there? Just how big could that have been?

    CM
    We don't formally produce what we call range estimates at the moment. We are working quite closely with colleagues about how we might do that. So if you think about all the information that comes together to produce GDP, some of that is survey based, which will have a degree of perhaps error around it, but we also use administrative data sources as well. So we have access to VAT records, anonymized of course, which we bring into our estimates. So the complex nature around the 300 different data sources that we bring in to make GDP means that having a range can be quite a statistical challenge. So what we do is we can actually look at our historical record of GDP revisions, and by doing that, we see that in perhaps normal times, they are quite unbiased. And by that, I mean we don't expect to see that to be significant either way. So we may revise up by perhaps 0.1 or down by 0.1, but overall, it's quite a sort of considered picture and we don't see radical revisions to our first estimates over time.

    MF
    You're saying that when revisions happen they are as likely to be up as they are to be down and there's no historical bias in there either way, because presumably, if there was that bias detectable, you would have acted some time ago, to make sure it was removed from the methodology.

    CM

    Exactly. Exactly.

    MF

    Just staying with this whole business of trying to make a very fast estimate because it is by international standards, a fast estimate of a very, very big subject. How much data in percentage terms would you say you’ve got at the point of that first estimate as a proportion of all the data you're eventually going to get when you produce your final number?


    CM
    It does depend on the indicator Miles. So the UK is one of the few countries in the world that produces monthly GDP. So we are quite rapid in producing monthly GDP. Robert did mention in the introduction of this session that with monthly GDP we do an output measure. So this is information we have quite quickly from businesses. So our monthly GDP estimate is based on one of the measures of the economy. So that uses the output measure. We get that from very rapid surveys, and that has quite a good coverage around 60 or 70% that we can get quite quickly. But then as we confront with our different measures of GDP, that's when the other sources come in. So we have our expenditure measure which takes a bit longer and then we have our income measure as well. So we have this process in the UK working for a monthly GDP which is quite rapid. We then bring in additional data sources and each of these measures have their own strengths and weaknesses until we can finally confront them fully in what we call an annual framework. And then often that takes us a couple of years to fully bring together all those different data sources so we can see the evolution of our GDP estimates as additional data comes in.

    MF
    Now looking back to what happened during the pandemic, of course, we saw this incredible downturn in the economy as the effects of lockdown took effect on international travel that shuddered to a halt for a while and everyone was staying at home for long periods. The ONS said at that point, it was the most significant downturn it had ever recorded. But then that was closely followed of course when those restrictions were eased by the most dramatic recovery ever recorded. Just how difficult was it to precisely manage the sheer scale of that change, delivered over quite a short period, relatively speaking, just how good a job did the system do under those very testing circumstances?

    CM
    It was incredibly challenging and I think not just for official statistics of course but for a range of outputs as well. Viewing it in context now, I think when the economy is going relatively stable, perhaps a 0.1 or 0.2 change, we might start to be a bit nervous if we saw some revisions to that, but if you think about, I believe at the time it was around a 20% drop in activity, and actually the challenge of ensuring that our surveys were capturing what was happening in the economy in the UK, and in the ONS we stood up some additional surveys to provide us with additional information so we could understand what was happening. We still have that survey, that's a fortnightly survey. So the challenge that we had was to try and get the information in near real time to provide us with the confidence and also obtaining information from businesses that are not at their place of work, so they weren't responding to our surveys. So we had to pivot to using perhaps telephone, collecting information in a different way really to understand the impact on the economy. So when we look back now, in retrospect, perhaps a 20% drop, should that have been 21 or 22%? It's all relative to the size of the drop is my main point I would make. So in the context of providing the information at the time, we were quite fortunate in the survey on the data collection front to really have a world leading survey for businesses that provided that information in near real time, which we could then use to understand the impacts on different parts of the UK economy. And I think now when we get new information on an annual basis, we can go back and just confront that data set and understand how reliable those estimates were, of course.

    MF
    Of course the UK was not alone in making some quite significant revisions subsequently to its initial estimates, what was done, though, at the time to let the users of the statistics know that because of those circumstances, which were so unusual, because the pace of change you were seeing was so dramatic, that perhaps there was a need for special caution around what the data was seeming to say about the state of the economy?

    CM
    Exactly, and it was unprecedented of course as well. So in our communication and coming back to how we communicate statistics, and also the understanding as well. We added some additional phrasing, if you like Miles, to ensure people did sort of understand and perhaps acknowledge the fact that in times like this, there is an additional degree of uncertainty. So the phrasing becomes very important, of course to reflect that these are estimates they're our first estimate at the time, they perhaps will be maybe more revised than perhaps typically we would expect to happen. So the narrative and communication and phrasing, and the use of the term estimate, for example, became incredibly important in the time of the pandemic. And it's also incredibly important in the context of smaller movements as well. So while we had this large impact on COVID, it was our best estimate at the time, and I think it's important to reflect that, and as we get more and more understanding of our data sources, then those numbers will be revised. So what we did do was really make sure that was front and centre to our communications just to reflect the fact that there can be additional information after the fact but this is the best estimate at the time and there's a degree of uncertainty. And we've continued that work working closely with colleagues in the regulator to understand about how best we can continue to improve the way that we communicate around uncertainty in what is a complex compilation process as well.

    MF

    Professor Mairi Spowage. You've heard Sir Robert talking earlier about the importance of understanding uncertainty in statistics and the need to make sure our statistical system can deal with that, and explain it to people properly. You've heard Craig also there explain from a production point of view the length to which the ONS goes to deal with the uncertainty in its initial estimates of GDP and the experience of dealing with those dramatic swings around the pandemic. What is your personal take on this from your understanding of what the wider public and the users of economic statistics have a right to expect? What do you make of all that?

    MAIRI SPOWAGE

    So I think I’d just like to start by agreeing with Robert, that explaining uncertainty to users is really important. And in my view, and certainly some research that some of my colleagues at the Economic Statistics Centre of Excellence have done, which show that actually it increases confidence in statistics, because we all know that GDP statistics will be updated as more information comes in when these are presented as revisions to the initial estimates. And I think the more you can do to set expectations of users that this is normal, and sort of core part of estimation of what's going on in the economy, the better when these revisions inevitably happen. We very much see ourselves as not just a user of statistics, but also I guess a filter through which others consume them. We discuss the statistics that ONS produce a lot, and I think we like to highlight for example, if it's the first estimate that more information will be coming in where revisions have happened. And particularly when you're quite close to zero, as we've been over the last year or so, you know, folks can get quite excited about it being slightly above or below zero, but generally the statistics are in the same area even though they may be slightly negative or slightly positive.

    MF

    Yes, and I'd urge people to have a listen to our other podcast on the whole subject of what is a recession to perhaps get some more understanding of just how easily these so called technical recessions can in fact be revised away. So overall then Mairi, do you think the system is doing enough that people do appreciate, particularly on the subject of GDP, of course, because we've had this really powerful example recently, is doing enough to communicate the inherent uncertainty among those early estimates, or perhaps we could be doing more?

    MS

    Yeah, absolutely. Obviously, there's different types of uncertainty and the way that you can communicate and talk about uncertainty when you're producing GDP statistics is slightly different to that, that you might talk about things like labour market statistics, you know. I know there are a lot of issues with labour market statistics at the moment, but obviously, the issues with labour market statistics in normal times is really about the fact it's based on a survey and that therefore has an inherent uncertainty due to the sampling that has to be done. And it might mean that a seemingly you know, an increase in say unemployment from one quarter to the next isn't actually a significant difference. Whereas with GDP, it's much more about the fact that this is only a small proportion of the data that will eventually be used to estimate what's happened in this period in the economy. And over time we’ll sort of be building it up. I think the ONS are doing a good job in trying to communicate uncertainty in statistics but I think we could always do more. I think having you know, statisticians come on and talk about the statistics and pointing these things out proactively is a good idea. So much more media engagement is definitely a good idea. As I said, we try and through you know informal means like blogs and podcasts like this, to talk about the data that have been produced. And you know, when there are interesting features of it, which are driving some of the changes and to what extent those might change. So, one of the features over the last year for 2023 has been the influence of, you know, things like public sector strikes on the data, because when there's less activity in the public sector that also changes the profile of growth over the year quite a lot. And that's been very influential over 2023. So I think it's important that there's more discussion about this and, to be honest, more knowledge in economic circles about how these statistics are put together. 
    Or you know, I'm an economic statistician rather than an economist per se, and I think the more knowledge and awareness that can be amongst economic commentators on these issues, I think the better because if we’re upfront about the uncertainty, I think it increases the confidence when these revisions inevitably happen.

    MF

    Perhaps then it is the way the statistics are told in the media and elsewhere? Of course, they're invested by those observers with more authority perhaps than they deserve. Particularly, of course, it must be very tempting if you're a politician and the numbers are going your way, then obviously you want people to believe they are absolutely 100% accurate.

    MS

    Absolutely. We're in a funny situation at the moment. I mean, you know, our research institute focuses a lot on the Scottish economy. And the data for Scotland for 2023 shows... Yes, it shows two quarters of contraction and two quarters of growth, but they're not joined together. So there wasn't a technical recession in Scotland. But you know, over the year, basically, the Scottish and UK economies have had a really poor year with hardly any growth. But you know, I haven't seen it yet, but I'm expecting that there will be some people, you know, sort of crowing about that, like it's really showing that the Scottish economy is doing better or something when it's not really. So there will always be politicians who try to you know, over interpret changes in the data. Another example would be the first estimates of quarterly growth in the first part of 2023 showed 0.4 growth in Scotland compared to 0.1 in the UK, and there were politicians saying that Scotland was growing four times as fast as the UK. These things will happen, but you know, one of our roles to be honest is in our regular blogs and communications with the policy community, particularly in Scotland, but also beyond, is to point these things out and say that they're a bit silly. That no doubt these things will be revised and come closer together and nobody should get too excited about them.

    MF

    Thinking particularly about when you're looking at levels of geography different from the UK for yourselves in Scotland and from where I'm sitting here in Wales as well, for that matter. Do the data tend to become more or less accurate, should we have more or less confidence in the sort of datasets we're seeing for those different levels of geography?

    MS

    Well, generally it becomes more unreliable, and it's subject to more uncertainty. A lot of the data that's used is based on business surveys for estimating what's going on in the economy. And there are two areas of uncertainty there. The samples at smaller geographies are smaller so it's greater uncertainty because of sampling variability. But there's also a key problem on the data infrastructure in the UK that business data - this is across GB because Northern Ireland's is collected slightly differently - is collected on units which are GB wide. So it does make estimating what's going on in the parts of GB quite challenging. And there are some additional estimation procedures that need to be done to actually say what's going on in Scotland or in Wales. So it does add an additional layer of uncertainty to any sort of economic estimation at sub-UK geographies.

    MF

    I should add at that point that improving the quality of regional sub-national data has always been an important part of the ONS’s work and continues to be part of its strategic drive. But Sir Robert, from what you've seen recently, particularly over the last year, the way that GDP estimates have been used in the media and in politics, and particularly the whole business of comparing quite small differences in GDP change internationally and the significance that's invested in that, the relative growth rates between one country or another. Has there been too much discussion around that; has too much weight been put on that recently from where you've been sitting?

    SRC

    Well, I think just to pick up on the point that Mairi was making, you can end up investing, you know, much too much significance in comparisons of what's going on in one place and in another place over a relatively short period of time in which there's likely to be noise in those numbers. So, as she said, the idea of taking one area where the growth is 0.4 in a quarter and another where it’s 0.1 and saying that one economy is growing four times more quickly, while strictly true on the basis of those numbers, is really not an informative comparison; you have to look over a longer period for both. When you get to international comparisons, there's the additional issue that, although there are international standards and best practices as to how, for example, national accounts are put together, the way in which this is actually carried out from place to place can differ in ways that make those sorts of comparisons, again particularly over short periods but also when the economy is doing strange things as it was during the course of the pandemic, particularly tricky. So in the GDP context, obviously, there was the question mark about having big changes in the composition of what the education and health sectors were doing as we went into the period of lockdown, and therefore judging how the output of those sectors had changed was a really very tricky conceptual judgement to make. And one of the issues that arose about trying to make international comparisons is that different people will be doing that in different ways, depending in part on how they measure the outputs of health and education under normal circumstances. So if you are going to do international comparisons, it’s certainly better to look over a longer period, so you're avoiding being misled by short-term noise, but also to have a care to the way in which methodologies may differ, which may matter more at some times than at others, to see if this is actually a meaningful comparison of like for like.
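
Editorial aside: the arithmetic behind the "four times as fast" claim can be sketched as follows. The 0.4 and 0.1 figures come from the discussion above; the revised figures are purely hypothetical, chosen only to show how sensitive a ratio of two small growth rates is to a revision of a tenth of a percentage point.

```python
def growth_ratio(growth_a: float, growth_b: float) -> float:
    """Ratio of two quarterly growth rates, both given in per cent."""
    if growth_b == 0:
        raise ValueError("ratio is undefined when the second growth rate is zero")
    return growth_a / growth_b


# First estimates quoted in the discussion: 0.4% against 0.1%.
first_estimate = growth_ratio(0.4, 0.1)

# Hypothetical revision of a tenth of a percentage point each way (an assumption).
after_revision = growth_ratio(0.3, 0.2)

# In levels the gap is small: an index of 100 becomes about 100.4 against 100.1.
level_gap = 100 * (1 + 0.4 / 100) - 100 * (1 + 0.1 / 100)

print(f"ratio on first estimates: {first_estimate:.1f}")
print(f"ratio after hypothetical revision: {after_revision:.1f}")
print(f"gap in index levels: {level_gap:.1f}")
```

The ratio moves a long way while the gap in levels stays at a few tenths of an index point, which is why comparisons over longer periods are more informative.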

    MF

    It's also worth pointing out, as I think we have in previous podcasts, that the UK is one of the few economies that does actually seek to measure the actual output of public services, whereas some countries just make broad assumptions about what those sectors have been doing. But it's also worth mentioning, I think that some countries simply don't revise as much as we do because their system makes an initial estimate, and then they don't return to it for some years in the case of a number of countries.

    SRC

    Yes, that's true. And so the question sometimes arises - and I think this arose relatively recently in the UK context - of a set of revisions that changes the international comparison when you know that some countries have not yet done essentially the same set of revisions. For example, the way in which you try to pull together the estimates of output, income and expenditure at times afterwards, as you have more information from annual surveys and more information on incomes, for example, from the tax system. So, again, at any given moment, even though you're in both cases trying to say, well, what's our best sense of what was going on a year ago, different countries will be at different stages of the statistical production process. Of the eventual total information set on which you base your estimates, some countries will have incorporated more than others, and so a revision that you're doing this month, somebody else may not do for six months. That again complicates the picture, and really again suggests that when looking at international comparisons at too high a frequency or too much in the recent past, there are bigger uncertainties and caveats that you ought to be placing around big calls and big interpretations based on that.

    MF

    Yes, and while it's hard enough to know where you are at any given point in the economy, it's even harder of course - infinitely harder you might say - to work out where on earth you're going to go next. You've spent a lot of time in the forecasting business, how are forecasters, and I know the Bank of England in particular is taking a good look at this at the moment - the data it relies upon in order to make its forecast - what can the statistical system be doing to support organisations with that unenviable task of having to look into the future and guide us on what is going to happen next?

    SRC

    Well, I think from the perspective of the forecasters themselves, many of the same principles that we've been talking about in terms of how the statistical system should communicate uncertainty apply in spades in the case of forecasts, where explaining how you've reached the judgements that you have, the uncertainty that you know from past forecast errors, the particular sensitivity of a forecast to a judgement that you may be making in some part of it - the more you can do to explain that increases people's trust rather than reduces it. From the perspective of the statistical producer helping the forecaster, I think, again, explanation: if you have got particular difficulties, particular reasons why you think there might be greater uncertainty than in the past around particular numbers, it's very important. The current evolution of the labour market statistics is a good example of that - you need to be talking to the big users and the big forecasters about the particular uncertainties there may be at a given time so they can take account of that as best they can. On the other hand, having been a forecaster for 10 years, I certainly took the view that for forecasters to complain about revisions in economic data is like sailors complaining about waves in the sea. I'm afraid that is what you're dealing with. That's what you have to sail on and everybody makes their best effort to come up with the best possible numbers, but it's a fact of life. And your knowledge and understanding of what's going on in the past now, and how that informs your judgements in the future, evolves over time. It doesn't remain static and you're gazing through a murky cloud at some times, but that doesn't reduce the importance of doing the best job you can.

    MF

    Final word then for the forecasters and for everybody else. The statistics are reliable but understand their limitations.

    SRC

    Yeah.

    MF
    Well, that's it for another episode of Statistically Speaking. Thanks to Sir Robert Chote, Professor Mairi Spowage, and Dr. Craig McLaren, and of course, thanks to you, as always for listening.

    You can subscribe to future episodes of this podcast on Spotify, Apple podcasts, and all the other major podcast platforms. And you can follow us on X, previously known as Twitter, via the @ONSfocus feed.

    I'm Miles Fletcher and, from myself and producer Steve Milne, goodbye.

    ENDS.

  • In this episode Miles is joined by the National Statistician, Sir Ian Diamond, to reflect on what has been a busy and transformative year at the Office for National Statistics.

    Transcript


    MILES FLETCHER
    This is “Statistically Speaking”, the official podcast of the UK Office for National Statistics, I’m Miles Fletcher. This is our 20th episode, in fact, a milestone of sorts, though not a statistically significant one. What is significant is that we're joined, once again, to look back at the highlights from another 12 months here at the ONS by none other than the National Statistician himself, Professor Sir Ian Diamond. Ian, thanks for joining us again. The year started for you with being reappointed as the national statistician. As 2023 developed, how glad did you feel to be back?

    SIR IAN DIAMOND

    Of course, you know, I was hugely privileged to be invited to continue. It's one of the most exciting things you could ever do and I will continue to do everything in my power to bring great statistics to the service of our nation.

    MF
    To business then, and this time last year, we sat in this very room talking about the results of Census 2021, which were coming in quite fresh then. And we'd seen the fastest growth of the population, you told us, since the baby boom of the early 1960s. Over the course of the year much more data has become available from that census and this time, we've been able to make it available for people in much richer ways, including interactive maps and create-your-own-dataset tools. What does that say about the population data generally and the way that people can access and use it now? How significant is that sort of development?

    SID
    Well I think we need to recognise that the sorts of things that we can do now, with the use of brilliant technology, brilliant data science and brilliant computing, are enabling us to understand our population more, and to be able to make our data more accessible. 50, 60, 70 years ago, 150 years ago, we would have just produced, about six or seven years after the census, a report with many, many tables and people would have just been able to look at those tables. Now, we're able to produce data which enables people to build their own tables, to ask questions of data. It’s too easy to say, tell me something interesting, you know, the population of Dorset is this. Okay, that's fine, but actually you want to know much more about whether that's high or low. You want to know much more about the structure of the population, what its needs for services are, I could go on and on. And each individual will have different questions to ask of the data, and enabling each individual to ask those questions which are important to them, and therefore for the census to be more used, is I think, an incredibly beautiful thing.

    MF
    And you can go onto the website there and create a picture...

    SID
    Anyone can go onto the website, anyone can start to ask whatever questions they want of the data. And to get very clearly, properly statistically disclosed answers which enable them to use those data in whatever way they wish to.

    MF
    And it's a demonstration of obviously the richness of data that's available now from all kinds of sources, and behind that has been a discussion of, that's gone on here in the ONS and beyond this year, about what the future holds for population statistics and how we can develop those and bring those on. There's been a big consultation going on at the moment. What's the engagement with that consultation been like?

    SID

    Well the engagement's been great, we’ve had around 700 responses, and it addresses some fundamental questions. So the census is a really beautiful thing. But at the same time, the census, the last one done on the 21st of March 2021, was out of date by the 22nd of March 2021, and more and more out of date as you go on, and many of our users say to us that they want more timely data. Also by its very nature a census is a pretty constrained data set. We in our country have never been prepared to ask, for example, income on the census, yet this is one of the most demanded questions. We don't ask it because it is believed that it is too sensitive. And so there are many, many, many questions that we simply can't ask because of space. There are many more questions that we simply cannot ask in the granularity that we want to. We've been doing some work recently to reconcile the differences between estimates of the number of Welsh speakers from surveys and estimates of the number of people in the census who report they speak Welsh. Frankly, it would be better if we were able to ask them to get information in a more granular way. And so while the census is an incredibly beautiful thing, we also need to recognise that as time goes on, the technology and the availability of data allowing us to link data become much more of a great opportunity, and we have been undertaking a lot of research, research which was asked for by the government in 2013, following the report by Chris Skinner, the late Chris Skinner, Joe Hollis, and Mike Murphy, which is a brilliant report. We said at the moment we need to do another census in 2021. That's what we have done and I believe it to be one of the best coverages there has ever been. And yet we need to assess whether administrative data could be used in future to provide more timely, more flexible and more accessible data and that's what the consultation is about. I will be making a recommendation to the UKSA (UK Statistics Authority) board in the future, in the near future I have to say. And I think it is worth saying that what the consultation says to us is that people are very, very, very much in favour of the direction of travel but at the same time, while accepting our prototype, as yet unconvinced about the data flows and the sustainability of those data flows to enable us to do it. And so we are looking at how to respond to those very important analyses and we will do so in the near future.

    MF
    When can the people who contributed to that consultation, roughly when should they expect to hear from us?

    SID
    I think the expectation is we'll publish something by the end of quarter one in 2024.

    MF
    Surveys have continued to be a very important part of what the ONS does, these very large national surveys, and yet one of the biggest challenges has been maintaining coverage and particularly response rates, and obviously with the Labour Force Survey recently that has been a particular issue for the ONS, hasn’t it? Where do things stand now as we move into modernising the traditional Labour Force Survey and moving to a new model? Because it's an issue statistics bodies around the world have been dealing with: it's harder to get people to complete surveys like they used to.

    SID
    I think it's a fair point that response rates globally are a challenge, not only in national statistics institutes but in the private sector organisations that also collect data. So we need to recognise that. A part of that is that historically, one could find people at home, knock on doors, have that conversation with people, and perhaps post pandemic people are less willing to have a conversation at the house. Also, people are very busy. They work in multiple occupations. They are not always in, and they live in housing accommodation which is more and more difficult to access. There is no kind of single magic bullet here that we could press, or we would have. The first thing to say, Miles, is that we recognise that, and that's why we worked with our colleagues at His Majesty's Treasury to provide a project to go to what we call a transformed Labour Force Survey. And I think that that's a hugely exciting project for a number of reasons. One, the Labour Force Survey has been around for a long time and the questionnaire had become a little bit unwieldy. And also we wanted to enable people to have much more flexibility in the time at which they answered the questions. We are in the field with the pilots for that survey. They've been pretty good. There are good response rates. There are also some challenges around getting the questions right. These aren’t challenges that stress me, that's why you do a pilot, but at the end of the day we're hoping to be able to transform into that new Labour Force Survey early in 2024, in the first half of 2024. We're working very closely in doing that with our major stakeholders, the Bank of England, His Majesty's Treasury and the Office for Budget Responsibility (OBR), to take a joint decision on when people feel comfortable that we have had enough dual running to enable us to move forward. The other question that I'd have to raise around surveys more generally is inflation, which we have all been subjected to in many, many areas in the last couple of years. Inflation in the cost of survey collection has increased massively, and so in the last year we've had to make real judgments about how we maintain quality. And in the next few years, we will really be needing to think through exactly how we conduct our surveys and the cost of doing so.

    MF
    Yes. Of all people, those who should be aware of inflation are the people who report it, and certainly those relatively high rates of inflation have impacted us as much as anybody else. The challenges of running surveys notwithstanding, the interest of government bodies in getting that information directly from people does continue to underline the unique value of surveys. Some people say, oh well, surely they can get this information from other sources. I've even seen it suggested that social media could provide the answers. But there is a unique value, isn't there, in actually getting a statistically representative sample of people and speaking to them directly.

    SID
    It depends Miles, I think it absolutely depends on what the question is you're trying to answer. If you're trying to get some answers to a question where the answer can be obtained through administrative data sources, then you don't need a survey. Surveys are difficult to conduct and difficult to pilot and plan and extremely expensive to undertake. So you should only do a survey if you can't get the information from somewhere else. Therefore, you know, I do think that we need to be very, very careful in thinking through when we need to do surveys. Does that mean to say we don't need to do surveys? Absolutely not. There are reasons why you need to do surveys. It may be that you need to really spend some time identifying whether someone really is eligible for the questions you're going to ask, or you may want attitudes. I don't know how to get someone's attitude without asking them. And so there are reasons why you would want to do a survey, but I would argue that you should only do a survey when you cannot get the data from elsewhere. And you also mentioned social media. Social media is an incredibly interesting and important source of data. Now, I wouldn't necessarily say it was statistically representative, but we absolutely have to be flexible in what we call data. We have to be sure of the quality of those data and we have to be sure that we are really aware of what the population is that is represented by those data. So we are using many, many, many types of data now that we would not have used 50 years ago. We simply couldn't have used things like telephony data, things like card data, things like data from satellites, to address questions for which those data are the best way of providing answers.

    MF
    And there are some fantastic examples of that around the ONS. If you look at how we've changed prices over the last couple of years, again, the measurement of inflation, bringing in new data sources, most recently from the used car industry and from the rail industry as well, it all means that the estimates of inflation are now based on many hundreds of thousands of price points, where it used to be just a few thousand.

    SID
    It doesn't matter what the numbers are, frankly. It matters that you've got good coverage, it matters that you have the most appropriate method and that your data are as accurate as possible. And I do think it is incredibly important that we use a wider range of data sources. I think it's incredibly exciting what those data sources are, but we should only do so being unbelievably careful about what the metadata are that go with them, what the coverage is, why we are using them and whether or not they represent an improvement over what we could do before.

    MF
    Okay, so we've seen in the area of prices, the measurement of inflation, there's new innovative data sources coming from outside, coming from industry. What sort of an improvement does that represent in how we measure inflation, when it's such an important time for cost of living?

    SID
    Well, I think it helps because we have more accurate data. We have more timely data, we have data that are real. So on rail prices, we know what people pay as opposed to what the advertised price necessarily is, and I think that is important. And so being able to properly understand what the consumer is doing, and therefore what inflation is, is to me incredibly important. I would say that all this effort that we're putting in would not necessarily just be about prices. It is about whether we understand more about what is going on in the economy, and there are many more questions that we can ask of those data when you've got them than simply of some of the fixed price point data that we had previously.

    MF
    Now one massive change we've seen lately, and this is another area we've managed to improve coverage, is of course the private rental sector. It’s become much more important as we've seen house prices coming under pressure and mortgages under pressure by high interest rates and so forth. It's revealed a very interesting picture of long-term change, and also in more contemporary terms, what's actually going on with the economy right now.

    SID

    Oh 100%


    MF
    Talking about areas where we've been able to form a new view of what's really been going on. An area that attracts a particular commentary during the course of the year is expenditure on research and development. Regarded as a very important area of activity if you're talking about productivity, future economic growth...we substantially upgraded our estimates of R&D. What was the story behind that? Why was that necessary?

    SID
    Well it was incredibly important because we looked very carefully at our data, we looked very carefully at our samples, we looked at our coverage and we decided that we needed properly to bring in a much wider range of businesses. And we were reflecting very much those businesses from a very wide range of areas who were able and available to claim R&D tax credits, and therefore to be able to get a decent sample. The critical thing here is not only were we making good estimates, but we were able to understand much more about what, particularly for smaller tech and creative industry companies, was R&D. And I think that is something that we need to recognise, particularly in those smaller companies where there's a much greater flexibility about what people would call R&D.

    MF
    It’s a reflection perhaps that startups are the sort of firms that do R&D these days, and less so the sort of industrial behemoths with huge R&D departments. But there was an interesting change nonetheless, and obviously considerable improvement in measuring that very important area. This all, I guess, comes under the umbrella of future-proofing practices and systems, and it all came under a refreshed data strategy that we launched during the course of the year. What are the fundamental principles underlying that, and where is it taking us?

    SID
    I think, I mean, just where I've been coming from, it is to take a much more holistic view of what data are and how we really use the data which are most appropriate to answer the questions that we have. We recognise that the economy and indeed society are changing very quickly, and therefore we need appropriate data to be able to answer those questions. For example, if you look at employment, there are many, many people in our society who have three, four, even five jobs. We need data which enable us to find out what the distribution of the number of jobs people have is, what they're spending their time doing, and how that impacts on our understanding of the labour force.

    MF
    Worth perhaps recognising some of the particular areas where new data has also been able to shed new light, and I particularly think of the payments industry. Obviously digital payments happen very quickly. They provide almost a daily update on the state of consumer spending and, with it, the state of the wider economy. We've managed to strike up partnerships with a huge cross section of the payments sector. What is the particular value of that? And what do we say to perhaps other data providers who might wish to enter into similar arrangements?

    SID
    Well I think we’d say we do everything ethically, and with complete privacy, but at the same time in the public good. And that is, to me, incredibly important. And so understanding what the consumer is spending money on understanding what the consumer is not spending money on, and the transitions, is incredibly important to enable policy which impacts very positively on all of our fellow citizens. So we are very proud of those partnerships. We value them greatly. We don't take them for granted. And those data, entirely ethically provided, with great security but at the same time enabling us to understand what is going on at an early stage in the economy is incredibly important.

    MF
    And of course it’s worth restating, as mentioned already, that of course all of these data are anonymized and aggregated, and no individual would ever be able to identify themselves or be identified from that fast payments data which of course is helping to inform economics policy.

    MF
    Providing data to the people who do make policy and around government and to make sure that policies are really informed by evidence of course that is the major purpose behind the new Integrated Data Service, which was accredited this year under the Digital Economy Act. And that's enabling data to be shared around government in a way that simply wasn't possible before.

    SID
    80 datasets available now and indeed, that number going up more or less by the day. And one of the most important things here is that there are very few challenges which government faces which can be addressed simply by data from one department. Therefore, what we need to be able to do is to link data from different sources to enable us in a very granular way to be able to answer questions about topics for which the answer requires data from many sources. And the Integrated Data Service allows us to do that. It allows us to do it at pace and allows us to do it in a way which brings a wide variety of analysts to the party. And I think that, you know, this year was a major milestone in getting Digital Economy Act accreditation. And we will be looking to streamline the process of using it over the next year, as well as seeing more and more and more projects on it having successful results.

    MF
    And sharing between departments at the national level is important, but also it's been a long-term aim of the ONS to improve its coverage at local levels. And again, there's another important initiative kicked off this year, and that's the launch of ONS Local.

    SID
    Yes and I’d say that the two are linked. It doesn't matter whether you are at a national level or whether you are at a regional level, linked data are important, but we are very pleased, working with funding from our colleagues at the Department of Housing, Levelling Up and Communities, to have been able to place ONS staff in regions. So we're not talking about teams of people in Manchester or teams of people in Exeter, but we are talking about interlocutors in the southwest and northwest, for example, who can really work with the leaders there to ensure that we've got local data for local leaders to make local decisions. That's incredibly important because the questions that people wish to ask are different in different parts of the country and therefore we need to recognise that. So it is a good initiative, which I hope will bear fruit in 2024.

    MF
    And the importance of data in government has been underlined by a big initiative, which takes in everybody, not just statisticians and analysts, but everybody in the civil service, has been engaged in what's called the One Big Thing campaign to spend time learning about data that's important to the use of data. How has that initiative been going? The ONS has been a central part of that. How's it been going? How important is it?

    SID
    It is critical. We do not need every public servant to be able to be a brilliant statistician, but we need every public servant to be data literate. We need every public servant to be able to understand data and the best policy comes about when analysts and policymakers and potential beneficiaries work together. And that requires that you can have that data literate conversation. And so I think One Big Thing is a great thing.

    MF
    In fact, the need for people to better understand data became evident earlier this year, of course, when our GDP estimates were quite dramatically revised in the early part of the autumn, in particular the estimates for the big peak pandemic years, 2021 and 22. There was quite a reaction from some parts of the media and beyond, who reported that our original figures, because they had been revised so dramatically, were simply wrong. I mean, that's not the case. Revisions of course have always been an integral part of the process. Indeed the OSR, the statistics regulator, found as part of its review our approach to be, and their words were, appropriate and well managed. However, it also found the ONS could better communicate the uncertainty in those early estimates of GDP, and that's a learning point for the future.

    MF
    We saw particular attention recently for the natural capital outputs, measures of the natural environment, and they attracted a degree of media interest we haven't seen so far, helped by the fact we're able to bring it to life with an analysis of time spent in nature and so forth, and you spoke to BBC Countryfile about that particular work. What are your overriding thoughts on that release? Are we moving to the point where these kinds of measures are getting more exposure? Are they being recognised for their value?

    SID
    I thought the national Natural Capital stuff was brilliant. I've always thought, as I said last year, that we should put alongside GDP measures of the environment and measures of well-being, but you need a concise picture and that's where we're moving in the future.

    MF
    As we speak, we're heading into the bleak midwinter of 2023. The nation is doing all it can to avoid a seasonal bout of flu and the other viruses that traditionally do the rounds at this time of year. And that's seen a revival of our surveillance effort. The Winter Coronavirus Infection Study (WCIS). Tell us about that. What's the purpose of it and what's happening?

    SID
    Yeah, working very much for our colleagues at the UK Health Security Agency, who asked us whether we would be prepared to stand back up some of the work we do on surveillance of winter flus, COVID and other issues, and we were of course pleased and proud to be asked. We’re using a different strategy to the one we were using in the past. This is very much simply a mail-out of tests, enabling people to take a test and then us to make estimates, and at the moment the good news is that the estimates of positivity are relatively low. But the bottom line is we need to recognise that without some good hard data on those levels it's pretty impossible for government to plan, and so I think it’s a really exciting initiative. It's a smaller survey than the one in the past. It's a survey which will make national estimates rather than many regional estimates, but it's one that we think is extremely exciting, and builds on some of the work we've done in the past.
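
Editorial aside: to illustrate what an estimate of positivity from a batch of mailed-out tests might look like, here is a minimal sketch using a Wilson score interval for a proportion. The transcript does not describe the study's actual estimation method, so this is a generic textbook technique rather than the WCIS approach; the function name and the counts are hypothetical.

```python
import math


def wilson_interval(positives: int, tests: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    if not 0 <= positives <= tests:
        raise ValueError("positives must lie between 0 and tests")
    p = positives / tests
    denom = 1 + z**2 / tests
    centre = (p + z**2 / (2 * tests)) / denom
    half_width = z * math.sqrt(p * (1 - p) / tests + z**2 / (4 * tests**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts for illustration only: 20 positives in 1,000 returned tests.
low, high = wilson_interval(positives=20, tests=1000)
print(f"positivity 2.0%, interval {low:.1%} to {high:.1%}")

# The same positivity from a sample ten times smaller gives a wider interval.
small_low, small_high = wilson_interval(positives=2, tests=100)
```

The widening interval at smaller sample sizes is one reason a smaller survey would publish national rather than many regional estimates. A production survey would also weight responses to the population, which this sketch omits.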

    MF
    And now of course everyone knows how to self-administer a COVID test and that ability makes it much easier to run these big.

    SID
    Oh 100%. I do think we need to recognise the way in which the world moves on. And certainly, when we first set up the COVID infection survey in 2020, we were not aware of the extent to which people could self-administer. We learned pretty quickly, and that's why we were able to transition to self-testing, but I think we are in a world where we can do this at pace and provide estimates very, very quickly.

    MF
    Well, thank you very much for joining us. Great to have you with us again at the end of the year. If you could choose just three words to sum up your 2023?

    SID
    Exciting, full of change and high-quality statistics.

    MF
    And looking ahead to 2024, which pieces of work are you looking forward to most?

    SID
    The economy is changing quickly, society is changing quickly. We will continue to change and to be ever more effective. We've talked about some of the things we're bringing on board and looking forward to a brand-new website to improve our communication. And I think it's going to be a very exciting time.

    MF
    Professor Sir Ian Diamond, thanks very much for joining us.

    That's it for another episode of Statistically Speaking, you can subscribe to future episodes of this podcast on Spotify, Apple podcasts and all the other major podcast platforms and also follow us on X, previously known as Twitter, via the @ONSFocus feed. I'm Miles Fletcher, and from myself, our producer Steve Milne, and everyone here at the ONS, we wish you seasonally adjusted greetings, goodbye.

    ENDS

  • The ONS led the way informing the UK response to the Coronavirus pandemic. But what lessons can be learned and how can we best prepare not only ourselves, but the rest of the world, for the next pandemic?

    Transcript

    MILES FLETCHER

    This is Statistically Speaking, the Office for National Statistics (ONS) Podcast. I'm Miles Fletcher, and as we approach the darkest months of winter, we're revisiting COVID-19.

    Now the ONS doesn't do predictions, and we're certainly not forecasting a resurgence of the virus, either here in the UK or anywhere else. But pandemic preparedness has been the driving force behind two important pieces of work that we're going to be talking about this time. Looking beyond our shores, how well equipped now is the world in general to spot and monitor emerging infections? We'll hear from Josie Golding of the Wellcome Trust on that, including how even weather events like El Nino could affect the spread of viruses. We'll also talk to my ONS colleague, Joy Preece about the pandemic preparedness toolkit, a five-year project backed by Wellcome to create and develop resources that will help countries with health surveillance in the event of future pandemics.

    But first, and closer to home, a new UK winter surveillance study to gather vital data on COVID-19 is now well underway. Jo Evans is its head of operations. Jo, this is a brand new COVID-19 survey the ONS is running in partnership with the UK Health Security Agency (UKHSA). What is the new survey and what's it going to be monitoring over the winter?

    JO EVANS

    So this is now the winter COVID infection study. And we're going to be going out to, I think we've got 145,000 people signed up, and we're going to ask them to take a lateral flow test to see if they are testing positive for COVID-19. Then we'll ask them to tell us a little bit about how they're feeling, what symptoms they have and some other household information - what work do they do? Do they have caring responsibilities? And so on.

    MF

    So we're gonna be getting people to take a test, and everyone's familiar of course now with administering their own lateral flow test. That wasn't the case back in the early days of the pandemic, when it was a new thing for the vast majority of us. So they'll take a test that'll tell us whether they are positive or negative for COVID-19. And on top of that, we're going to be gathering data in the form of a questionnaire.

    JE

    That's right. And then this is a collaboration this time, so we'll be working with the UKHSA. I mean, we've worked with them on the COVID infection study before, but this time what we'll be doing is looking at those responses of how many people are telling us that they have COVID-19. And we'll be trying to understand that by where people live or their age group and so on, but we'll be sharing that information with UKHSA and they will then be looking at what the impact is on hospitals. So what they call the infection hospitalisation rate, how many people are going into hospital because they have COVID, so it'll really help us understand what pressures there are on the NHS over this winter period.

    MF

    And that will give us some inkling, once again, about how many people are infected but not actually displaying any symptoms?

    JE

    Absolutely. And we do ask people about their symptoms and if they tell us they test positive, we'll then be sending them a second questionnaire, a follow up, asking them to keep testing until they get two consecutive negative tests so that we can see how long they are testing positive, but we'll also ask them how long did their symptoms last and did they need to go and see a doctor, did they take any medication, so really trying to understand how they're experiencing that period of illness.

    MF

    So during those critical winter months, that'll give us some insight into what's really going on on the ground and in communities.

    JE

    That's right and we're running this study from November right through to March so that we can understand that, because COVID, unlike flu, is not a seasonal virus, but we know that the NHS really suffers through the winter with those increased pressures, with more people needing their services. And this is about understanding what's happening out there in the community, and what impact that is having on our healthcare services.

    MF

    Another very important aspect of that is we're going to be monitoring for people who say they're suffering the symptoms of what is popularly known as long-COVID, ongoing impacts of the virus, and that will fill a very important evidence gap, won’t it?

    JE

    Absolutely. We will in a follow up questionnaire be asking people how long they've had COVID for and whether they have long-COVID. And interestingly, in some research we did when we were designing the questionnaire, long-COVID sufferers told us that they know precisely what date their symptoms started and how long they've had it because of the impact it's been having on their lives. So we are hopeful that this study will provide some really useful information.

    MF

    So 145,000 people taking part. Has it been difficult to get as many people as that involved?

    JE

    Do you know what, we got halfway there within the first 48 hours, people were so keen to take part in this study. We've really been surprised about that.

    MF

    It's probably a reflection of the success of the profile that the original study had.

    JE

    I think so, people are really keen to do their bit here and get involved in this study. And we've had a lot of participants, particularly in the older age groups, who have signed up so we will have to do something that we call weighting of the data across the different age groups, but we do this all the time and we are also going out to those under 16s, right up to the over 70s.

    MF

    And as well as taking part in a very important public study, people get a COVID test for free and can see for themselves whether they've been affected.

    JE

    Yeah, I think that's one of the things people are keen to do, particularly over the winter periods when we're going to be mingling and visiting family, that reassurance really that you're going to test every month and find out whether or not you have COVID. I think we all want to make sure that we are virus free before we go and see our loved ones over Christmas, for example.

    MF

    Well, we're meeting to discuss this in mid-November. The first results are still a few weeks away, but how are things going? Have we got enough people? Are the tests out in the field yet?

    JE

    The tests are out in the field. I think we're looking to get two publications in before Christmas, so testing windows start next week. We're expecting around 25,000 people a week to take their tests and answer their questionnaire.

    MF

    And over the course of a month then, all 145,000, we hope, will have been covered?

    JE

    Yes, I mean 145,000 is a fantastic number, and if we get all of them taking their test kits each month, then yes, that number will be higher. But even if we were looking at a 50 or 60% response rate, that is excellent for a social survey.

    MF

    Yes, and all the time, what we've heard in other contexts, is that it is difficult to get people to take part in surveys, but certainly in this case people can see the need for it and have come forward in their thousands. It's possibly worth pointing out though that you do have to be selected to take part, that's very important isn’t it, that we've never looked for volunteers. We've selected households randomly and that approach, that's very important to make this a really, really reliable survey isn't it?

    JE

    Yes, and as soon as there was information about the study in the newspapers earlier this year, we had people ringing up and asking to take part and we've had to explain to them that we want a nice random sample so that we can have a fully representative study.

    MF

    So ONS will be producing the figures, then it's over to our colleagues in the UK Health Security Agency to interpret what that means from a public health point of view, and what response might be necessary.

    JE

    Absolutely. And they'll be producing some statistics as well. Looking particularly, as I said, at that infection hospitalisation rate.

    MF

    So are we expecting the virus to take off again, or is it just a precaution to be monitoring things in this way?

    JE

    When we started this, it was more about understanding if there would be that impact on the NHS over the winter. But then we did see back in September, a new variant, particularly in the US, and as you know, from looking at COVID over the past few years, when you see a new variant coming, sort of appearing in one country, you know that it will come here eventually. So, it's about keeping track of that really, although because we are doing lateral flow tests, we won't actually have information about what kind of variant people have, but it will just be to look to see whether we're seeing an increase in positivity in the community.

    MF

    Okay, so all eyes on the first result, and we wish you, and the team getting the survey together once again, every success on what is a highly valuable and important exercise.

    So we've heard how the new winter surveillance study is helping us track ongoing COVID infection here at home. But we're also using the experience the ONS gained during the last pandemic to prepare not only ourselves, but other countries around the world for another one. Josie, with that global perspective in mind, my first question to you, just to get us started, is what have been the biggest learnings, the biggest take homes if you like, for Wellcome from the pandemic? And what's your priority now as an organisation considering how best to respond to others?

    JOSIE GOLDING

    Thank you for having me today, I think this is good to be reflecting on COVID in the future. So the biggest take home message is, probably I can look at the positives and the negatives, so I'll go on the negatives first.

    So I think we had a lot of the tools for responding to outbreaks and bigger events but I think we weren't prepared to deal with such a massive pandemic as we saw with SARS-CoV-2. We had expected to prepare for something like influenza and of course we probably didn't use our imagination of how the impact would be so great, affecting people in so many different ways. I think we need to really use that imagination going forward, it’s about thinking through the variety of different impacts we could see across different populations. I think we've learned a lot on how we communicate with the public, with the key people who are involved, and take those lessons because I think we did struggle. I think globally, not just Wellcome as one of the actors on communicating the importance, and the push to be better prepared to respond to these pandemics.

    One of the successes, and I'll put this up from a Wellcome point of view, really was the true integration of research into the response. And you know, this has been building up for many years from the Ebola West Africa outbreak in 2014. And tested again, and tested and tested and refined, on how we do this across the small research community who are engaged in those relatively smaller outbreaks to now a complete game change on how people expect research to be integrated into outbreak responses through pandemics.

    So I think that's now set the new status quo, and before I had to convince people of the importance, I think the importance now speaks for itself.

    MF

    Yes, it was notable in the early stages of the pandemic that those countries, notably in East Asia, that have had experience of major respiratory viruses, and dealing with those from a public health point of view, did seem to be much better prepared than us in the West, who perhaps had underestimated the risks?

    JG

    It is absolutely true. You know, it is testing the system over and over and over again. So you knew who your stakeholders were, you knew how to get things done quickly and at speed. And I think that's the one piece we have to keep remembering that we can keep preparing, but you still need to keep testing the system to ensure that it works in practice. But through it all I would say, you know, one of the things that Wellcome has taken away from SARS-CoV-2 is really the belief that we can't predict exactly what's going to come next when it comes to emerging infectious diseases. We have to keep that in mind, but actually the way to test the system time and time again, is dealing with the health priorities right now. So things like antimicrobial resistance. We know this has been a growing threat for many years. It's had some setbacks through SARS-CoV-2 and the pandemic, you know, we need to really re-energise the community to really take this seriously and to finance and to conduct the research that's required. But there are other threats that are, you know, common health issues, common infectious diseases that countries are dealing with, and we should be integrating the readiness, haemorrhagic fevers, viral fevers, other viruses, whatever it may be, into how we deal with those everyday infectious diseases.

    MF

    And what's the legacy been from an analytical point of view of the first few years for the period that is now known as peak COVID? Have we got that to draw on now because we're seeing the virus continuing to emerge? We're seeing potential threats from new variants and possibly other viruses.

    JG

    I don't think it's evenly applied across the globe on taking advantage of the systems. The approaches that were built up during SARS-CoV-2, some countries are able to maintain some of the resources that have been built or pivot into other health priorities. But that is a bit of a gap that we are seeing. I'll give an example of what I think is a great statistic, you know, for pathogen genomic sequencing and how that was used to track variants and making that as close to real time as you could find through the accelerator and diagnostics working group that mapped out the capacity in countries to be able to conduct pathogen genetic sequencing. And I think at the time, this is going back to 2022, that 77% of the world's countries were able to conduct sequencing, and that's a massive game change for a tool that really was partly an add-on to how you would do some of the epidemiological research at the beginning of outbreaks. So I think being able to pivot that tool and make sure that these types of facilities and the training and the expertise that people have built up over time can be sustained, working with those communities to be able to identify what are the real use cases for pathogens. And so I think, yes, some of it has probably not been evenly distributed, but we could always be doing more to be able to ensure we can better understand the variants as they come about, but also, what does it mean for a variant, you know, what changes will that make, what impact will that have on our health?

    MF

    Here in the UK, with our partners, the UK Health Security Agency, we are preparing, as you well know I'm sure, to run a further study going into the winter. What is the role of studies like that? Are they uniform now across the world, or are there not similar surveillance programmes going on? Or do we remain a bit of a one-off in doing this in the UK?

    JG

    I don't believe that this is evenly spread around the world. We ourselves at Wellcome had made a decision to continue funding our SARS-CoV-2 work on the genomics as well as the characterization of these variants as they come out. And what difference does it make in people who've been vaccinated or with other health conditions? We know when we've engaged with the research community across a variety of countries around the world, it ends up being very novel that this research has continued to happen. So I do think there is a gap, and it is becoming more challenging for public health institutes, WHO and others, to gather this information to understand are the vaccines still effective when we have these new variants, are they more transmissible, and other impacts that we would assess for those new variants. So I do think it's becoming more limited, and so of course, we need to make sure that the data we do generate is of high quality.

    MF

    The focus has been very much on COVID, but of course as we've seen historically and in other countries, other viruses have emerged and have serious public health consequences in those countries globally. What other emerging diseases do we need to have our eye on at the moment?

    JG

    Since SARS-CoV-2 really picked up we've had a global mpox event that affected, you know, very select communities around the world, and is still ongoing, but not to the same level. We have the ongoing threat of avian influenza, we have El Nino upon us, which is likely to further impact the rates of cholera that we're going to see as well as impacting temperatures, so mosquito borne viruses and other types of arboviruses, potentially broader than that, so it is happening right now. I don't think we even need to sit back and think what it could be. And there are many events that we need to be preparing for. And particularly with something like El Nino as a particular weather event but thinking about the climate crisis, this is only going to grow. We need to really collect the evidence now to understand what difference it will make, what risks it will pose by experience, and geographical distribution further afield.

    MF

    Yes, can you unpack that a bit for us, because most people will be aware of El Nino as a meteorological phenomenon. How does that translate into public health impacts?

    JG

    The whole background and where we've been watching and waiting for more certainty and whether this was actually going to happen this year, but it's a very high, I think it's now greater than 90% certainty it's going to happen from this part of the year onwards, and it will vary depending on where in the region it will impact you for droughts or flooding. And of course we need to better understand, well, what impact would that have on cholera? Cholera is a prime example. While it's not directly linked to El Nino as it stands right now, we have seen such a change in the cholera distribution, Malawi being a great example where it's seeing rises in cases outside of the expected weather event. So you'd expect it in flooding season but you're seeing it more in dry seasons. So El Nino will make this worse potentially. It's being able to track it and understand. The issue we have with events like El Nino is that we don't have enough information on it. We need to be better from a researcher's perspective, we need to just understand the researchers in those countries that are likely to be affected, their opportunity to gather as much evidence about the impact of El Nino so that when it comes around again, we'll be better able to apply what we've learned now.

    MF

    Without wishing to sensationalise, what do you think the risks are of another big global pandemic of the sort we saw with COVID-19?

    JG

    I’m a virologist by training so I'm always thinking that viruses hold the opportunity for some of the greatest opportunity for change. I'm always hopeful that you know these risks like the SARS-CoV-2 are actually quite rare events, to see something take off and to be able to transmit that successfully to humans. We see there are many events where we have what people refer to as spillover events between animals to humans, but it takes quite a change and we don't fully understand what the change might be or why it adapts for people to be more susceptible. So I think it is a risk, it’s a known risk even for SARS-CoV-2 which could change drastically. It's a very early stage in understanding this virus and how it operates. So I think we just have to be prepared, to be continually preparing, for the event that it could happen. I think influenza is the greatest one that it would be surprising if we didn't have a global event for influenza of some kind in the next few years. We've been preparing for this for quite some time.

    MF

    Is that the one that was anticipated then? Because if you look back historically, and this was the big comparison of course that was made with the COVID pandemic, it was the so-called Spanish flu, wasn't it, after the First World War, which was a huge global pandemic.

    JG

    Yes, it has been the one that we've always focused on. And if we look at the way that we've managed to monitor the change in the evolution of the virus, and to build the infrastructure globally, to be able to do the research and track that within laboratories and share information on that to help inform the vaccine production, which is very seasonal, influenza has changed every year. This is decades in the making. It's been ongoing for decades, and still, you know, we still have problems with making sure we have enough vaccine at the right time in the right supply to be accessible to all. So even for something that we know is likely to come we still struggle to get to that level of preparedness and it takes a lot of effort and time and it will continue, hopefully SARS-CoV-2 will help evolve those structures and I know a lot of the interest has been to combine, where we can, with coronaviruses respiratory like illnesses in the future to make it more efficient, but it's a big undertaking to really map out what you can do for a single pathogen. So, we have to work to see where we can build in those efficiencies across multi pathogen approaches.

    MF

    So one response, and this is a project that you're working on with ONS and we're going to talk about now, and bring Joy in to explain to us, is the pandemic preparedness toolkit the ONS is developing alongside Wellcome. As I say, Joy, you're part of the team at ONS creating the toolkit. Just to take you from the very beginning, how did the ONS get involved in this and why is ONS well placed to facilitate this work?

    JOY PREECE

    Well, this was a proposal that we put together for Wellcome in the aftermath of, I think some of the early years of the pandemic response, and the of course well-known Coronavirus Infection Study (CIS) which I was part of, and I think what we really learned during COVID, during the big years, 2020 and 2021 in the UK, was how important really active data monitoring was when a disease is multiplying exponentially. That's really frightening stuff. You can't afford to just sit back and wait and see. So the key to any successful response has to be figuring things out like the reproduction number really, really quickly. You know, you needed to know how many people were affected, how fast infections were increasing, how the numbers of infections related to numbers of deaths, and we saw during COVID that it wasn't enough to just wait until people were already so ill that they were turning up in a healthcare setting, in a hospital or, you know, even worse. You had to have systems that could be producing those kinds of statistical insights early from a community setting. And so that's the unique approach that ONS really took here: linking up our statistical offices with the public health agencies and the decision makers, using our experience with surveys, with administrative data, with data modelling and data science, drawing on connections that we had with academics, with expert epidemiologists, to try and get answers to those important questions as quickly as we could. And I think the unique thing that the ONS and other statistical offices around the world can bring to this is the very fact that we aren't part of the public health systems. So we bring here in ONS that expertise in social research settings or community settings, and you know, even apart from numbers of infections, there are topics like employment patterns, travel and tourism, opinions and lifestyle habits, which tell you really important things about how people are interacting and behaving, which gives you the ability to do some really, really clever modelling of things like disease communicability. So that's kind of the background ONS brought to this from our experience during COVID. But, of course, we have experience as well supporting capacity building with overseas national statistical institutes.

    MF

    Now regular listeners to these podcasts will know we recently spoke to our colleagues in Ghana, about everything they're doing in partnership with our own, so there's countries like Ghana, who are very much part of this pandemic preparedness project as well.

    JP

    That's right. So what we are looking to do from the back of everything we learned in the UK is to go out and work with, initially we've identified that we want to be working with three different partner countries to co-create a toolkit that can be generalizable, and that can be accessed globally. And what's really important here is that it isn't about, oh, well here's a model that the UK used and therefore it's applicable to everybody because yes, we just heard from Josie that's not the case. Different countries have different contexts, different experiences of different diseases and have built different infrastructures and skills as a result. So what we really want to do is generate that pool of knowledge internationally and co-create a toolkit that allows countries, based on their unique context, to draw from it, and we're talking here of practical guidance, statistical methods, knowledge products, case studies, training materials. Really this is about capacity building to support that kind of infectious disease surveillance, but in a way that may look very different depending on the country’s context. So it's about an international community of practice here.

    MF

    Well, that's why it's a toolkit. Not a template for how to deal with a pandemic.

    JP

    That's absolutely right. So we're thinking about this under, I think, three headings. So data collection on one hand, statistics and modelling on another but crucially, also the relationship building between statistical producers and public health professionals, and that's really essential, because you want to help toolkit users build those relationships of trust between analysts and the public health decision makers. That needs to be already there. It needs to be there as a fundamental before a crisis hits because otherwise the opportunities for that kind of productive collaboration that ONS was able to do during the COVID pandemic just become so, so limited.

    MF

    And it’s about sharing of learnings as well, sharing of intelligence...

    JP

    No, absolutely, absolutely right. This is why I talk about it as a co-created toolkit. This is something that will be kind of jointly delivered in collaboration between ourselves and the three countries that we end up working with. We're going through a process right now to kind of identify volunteers and select countries that we'll work with, but also, as our first stage of that, we are launching with a couple of kind of lessons learned workshops where we're inviting statistical institutes from around the world and a number of large international NGOs and experts in the field to come and talk about their experience of, you know, disease surveillance, of COVID and pandemic response and of other disease response, and start drawing out, you know, what is there that we can find in common, what are the common challenges, what have been common enablers to make your situation better, what is there that collectively we identify as the key criteria for a toolkit that will have the most value to the most countries.

    MF

    And Josie, from Wellcome’s point of view, what are Wellcome’s ambitions for this piece of work?

    JG

    We're very keen to understand how that toolkit can be of value to others, but also think through what is that epidemic preparedness model? So how would you apply it in the future to whatever that disease may be? So we never know, maybe during the lifespan of this project, there will be opportunities for the countries to be able to test it out to see what works for them in real time. I mean, I hope there’s not another pandemic, but we have to just work on that assumption that as we go along, during this particular project, there could be something that might have to test it. But we'll see.

    MF

    There was of course much soul searching about the effectiveness of particularly early response to COVID in certainly this country, in the UK and in other Western countries. But, of course, looking back, and now we have the data, it was the global south that was disproportionately affected.

    JG

    It's fair to say, and I think there's still an unanswered question why some countries were affected more and some were less, and I think Joy has been very clear about the different contexts that countries operate in, but that includes also the populations as well and the other diseases they might have seen, what other health issues. I know, there's been less cases observed on the African continent and of course, that is down to the ability to test, but to a degree it is a different population structure that we're seeing. So yes, we want to make sure that these types of tools are equitably shared, and applied to whatever the health requirements are for their systems. And I think this is the exciting part of this project. And I guess my main kind of message that I repeat to everybody is the only way to prepare for a future pandemic or a future epidemic is simply to deal with the health issues that we have right now. So make sure that we're thinking about things like MMR, and thinking about the impact of climate and understanding better how it's changing the dynamics of infectious diseases such as mosquito borne, or other viruses or malaria, you know, the common issues that many countries are facing and just act now rather than just planning for something. I want to see some real tangible research and systems being built. I think that's why the ONS approach for this is important because it’s about just getting on with it. And not just, you know, coming up with a theoretical model, it is actually working with the countries to see how you're going to apply it now. So we have to just keep focusing on that it could be tomorrow. So just get going.

    JP

    Well, and I would just say aye because I completely agree Josie because it's very easy to get caught up in talking about a pandemic response. But of course, a pandemic response, you can only draw on the resources that are already there. There is no time in a crisis situation to be developing things that are substantially new. So what we're really talking about when we talk about a pandemic preparation is about supporting improved health statistics for all sorts of purposes. You know, data and modelling and communicating and understanding the statistical insight and actually having that really good disease, that has a multiplier benefit for a whole range of health outcomes. Whether or not we see a pandemic tomorrow, we should be planning, even if it doesn't happen tomorrow. And I think that's the critical thing in this, this isn't a once in 100 years. This is an event that is happening on a daily basis, when people are catching diseases and communicating diseases on a daily basis, and providing improved tools to support that has a benefit even in the absence of a large-scale event.

    MF

    And that's it for this episode of Statistically Speaking, next time, as the end of the year approaches, we'll be joined once again by the UK’s National Statistician. If you've got a question for us then please ask us via @ONSfocus on the X social media platform, or Twitter for us traditionalists.

    Thanks to all of our guests today and our producer Julia Short. You can of course subscribe to new episodes of Statistically Speaking on Spotify, Apple podcasts and all other major platforms.

    ENDS

  • In this episode we talk about the growth of data use in the media and the potential impact of misinformation on the public’s trust in official statistics.

    Navigating podcast host Miles Fletcher through this minefield is Prof Sir David Spiegelhalter, from the University of Cambridge; Ed Humpherson, Head of the Office for Statistics Regulation; and award-winning data journalist Simon Rogers.

    Transcript

    MILES FLETCHER

    Welcome again to Statistically Speaking, the official podcast of the UK’s Office for National Statistics, I'm Miles Fletcher. Now we've talked many times before in these podcasts about the rise of data and its impact on our everyday lives. It's all around us of course, and not least in the media we consume every day. But what or who to trust: mainstream media, public figures and national institutions like the ONS, or those random strangers bearing gifts of facts and figures in our social media feeds?

    To help us step carefully through the minefields of misinformation and on, we hope, to the terra firma of reliable statistical communication, we have three interesting and distinguished voices, each with a different perspective. Professor Sir David Spiegelhalter is a well-known voice to UK listeners. He's chair of the Winton Centre for Risk Evidence Communication at the University of Cambridge and was a very prominent voice on the interpretation of public health data here during the COVID pandemic. Also, we have Ed Humpherson, Director General of regulation and head of the Office for Statistics Regulation (OSR), the official stats watchdog if you like, and later in this podcast, I'll be joined by award winning data journalist and writer Simon Rogers, who now works as data editor at Google.

    Professor, you've been one of the most prominent voices these last few years – a fascinating few years, obviously, for statistics in which we were told quite frankly, this was a golden age for statistics and data. I mean, reflecting on your personal experience as a prominent public voice in that debate, when it comes to statistics and data, to be very general, how well informed are we now as a public, or indeed, how ill-informed on statistics?

    DAVID SPIEGELHALTER

    I think things have improved after COVID. You know, for a couple of years we saw nothing but numbers and graphs on the news and in the newspapers and everywhere, and that went down very well. People didn't object to that. In fact, they wanted more. And I think that has led to an increased profile for data journalism, and there's some brilliant ones out there. I'm just thinking of John Burn-Murdoch on the FT but lots of others as well, who do really good work. Of course, in the mainstream media there is still the problem of non-specialists getting hold of data and getting it wrong, and dreadful clickbait headlines. It is the sub editors that wreck it all just by sticking some headline on what might be a decent story to get the attention and which is quite often misleading. So that's a standard problem. In social media, yeah, during COVID and afterwards, there are people I follow who you might consider as - I wouldn't say amateurs at all, but they're not professional pundits or media people - who just do brilliant stuff, and who I've learned so much from. There are also some terrible people out there, widespread misinformation claims which are based on data and sound convincing because they have got numbers in them. And that, I mean, it's not a new problem, but now it is widespread, and it's really tricky to counter and deal with, but very important indeed.

    MF

    So the issue aside from - those of us who deal with the media have heard this a hundred times - I don't write the headlines, reporters will tell you when you challenge that misleading kind of headline. But would you say it’s the mainstream media then, because they can be called out on what they report, who broadly get things right? And that the challenge is everything else - it's out there in the Wild West of social media?

    DS

    Yeah, mainstream media is not too bad, partly because, you know, we've got the BBC in this country, we’ve got regulations, and so it's not too bad. And social media, it's the Wild West. You know, there are people who really revel in using numbers and data to make inappropriate and misleading claims.

    MF

    Is there anything that can be done? Is it the government, or even those of us like the ONS who produce statistics, who should be wading in more than we do? Should we be getting out there onto the social media platforms and putting people right?

    DS

    It's difficult. I mean, I don't believe in sort of censorship. I don't think you can stop this at source at all. But just because people can say this, it doesn't give them a right for it to be broadcast wide, in a way, and to be dumped into people's feeds. And so my main problem is with the recommendation algorithms of social media, where people will see things because it's getting clicks, and the algorithm thinks a person will like it. And so we just get fed all this stuff. That is my real problem and the obscurity and the lack of accountability of recommendation algorithms right across social media is, I think, a really shocking state of affairs. Of course, you know, we come on to this later, but we should be doing something about education, and actually sort of pre-empting some of the misunderstandings is something I feel very strongly about with my colleagues. You’ve got to get in there quick, and rather than being on the back foot and just reacting to false claims that have been made, you've got to sort of realise how to take the initiative and to realise what misunderstandings, misinterpretations can be made, and get in there quickly to try to pre-empt them. But that of course comes down to the whole business of how ONS and others communicate their data.

    MF

    Because when you ask the public whether they trust them - and the UK Statistics Authority does this every two years - you ask the public if they trust ONS statistics, and a large proportion of them say they do. But of course, if they're not being presented with those statistics, then they're still going to end up being misled.

    DS

    Yeah, I mean, it's nice to get those responses back. But, you know...that's in terms of respondents and just asking a simple question, do you trust something or not? I think it's good to hear but we can't be complacent about that at all. I’m massively influenced by the approach of the philosopher, Baroness Onora O’Neill, who really makes a sharp distinction between organisations wanting to be trusted and revelling in being trusted, and she says that shouldn't be your objective to be trusted. Your objective should be to be trustworthy, to deserve trust, and then it might be offered up to you. And so the crucial thing is trustworthiness of the statistics system and in the communications, and that's what I love talking about, because I think it's absolutely important and it puts the responsibility really firmly back to the communicator to demonstrate trustworthiness.

    MF

    So doing more as stats producers to actually actively promote data and get people to come perhaps away from the social platforms, and to have their own websites that present data in an accessible way, in an understandable way, where people can get it for nothing without requiring an expensive subscription or something, as some of the best of the media outlets would require.

    DS

    The other thing I'd say is there's no point of being trustworthy if you’re dull, as no one's going to look at it or take any notice, and other media aren't going to use it. So I think it's really worthwhile to invest, make a lot of effort to make what you're putting out there as attractive, as vivid and as grabbing as possible. The problem is that in trying to do that, I mean, that's what a lot of communicators and media people want to do, because of course they want people to read their stuff. But what that tends to do largely is make their stuff kind of opinionated and have a very strong line, essentially to persuade you to either do something or think something or buy something or vote something. So much communication has to do with persuading that I think it's just completely inappropriate. In this context, what we should be doing is informing people.

    In a way we want to persuade them to take notice, so that's why you want to have really good quality communications, vivid, get good people out there. But in the end, they’re just trying to inform people, and that's why I love working with ONS. I just think this is a really decent organisation whose job is just trying to raise the...to obviously provide official statistics...but in their communications, it's to try to raise the level of awareness, raise the level of discussion, and by being part of a non-ministerial department, they're not there, the comms department, to make the minister look good, or to make anyone look good. It's just there to tell people how it is.

    MF

    Exactly. To put that data into context. Is this a big number or is this a small number, right? Adjectives can sometimes be very unhelpful, but often the numbers don't speak for themselves, do they?

    DS

    Numbers never speak for themselves, we imbue them with meaning, which is a great quote as well from Nate Silver.

    MF

    And in doing that, of course, you have to walk the same line that the media do, in making them relevant and putting them into context, but not at the same time distorting them. There's been a big debate going on recently, of course, about revisions. And if you've listened to this podcast, which we'd always advise, and consumed other articles that the ONS has published, we've said a lot about the whole process of revising GDP, and the uncertainty that's built into those initial estimates, which although helpful, are going to be pretty broad. And then of course, when the picture changes dramatically, people are kind of entitled to say, oh hang on, you told us this was something different and the narrative has changed. The story has changed because of that uncertainty with the numbers, shouldn't you have done more to tell us about that uncertainty? That message can sometimes get lost, can’t it?

    DS

    Yeah, it's terribly important. You’ve got to be upfront. We developed these five points on trustworthy communication and the first one was inform, not persuade. And the second is to be balanced and not to have a one-sided message, to tell both sides of the story, winners and losers, positives and negatives. And then to admit uncertainty, to just say what you don't know. And in particular, in this case, “provisionality”, the fact that things may change in the future, is incredibly important to emphasise, and I think not part of a lot of discussion. Politicians find it kind of impossible to say, I think, that things are provisional and to talk about quality of the evidence and limitations in the evidence, which you know, if you're only basing GDP on limited returns to start with, on the monthly figures, then you need to be clear about that. And the other one is to pre-empt the misunderstandings, and again, that means sort of getting in there first to tell you this point, this may change. This is a provisional judgement, and you know, I think that that could be emphasised yet more times, yet more.

    MF

    And yet there's a risk in that though, of course the message gets lost and diluted and the...

    DS

    Oh no, it always gets trotted out - oh, we can't admit uncertainty. We can't tell both sides of story. We have to tell a message that is simple because people are too stupid to understand it otherwise, it's so insulting to the audience. I really feel a lot of media people do not respect their audience. They treat them as children - oh we've got to keep it simple, we mustn't give the nuances or the complexity. All right, if you're going to be boring and just put long paragraphs of caveats on everything, no one is going to read that or take any notice of them. But there are ways to communicate balance and uncertainty and limitations without being dull. And that's what actually media people should focus on. Instead of saying, oh, we can't do that. You should be able to do it. Good media, good storytelling should be able to have that nuance in. You know, that's the skill.

    MF

    You’re absolutely right, you can't disagree with any of that, and yet, in communicating with the public, even as a statistics producer, you are limited somewhat by the public's ability to get used to certain content. I mean, for example, the Met Office recently, a couple of years back, started putting in percentage of chance of rainfall, which is something that it hadn't done before. And some work on that revealed just how few people actually understood what they were saying in that, and what the chances were actually going to be of it raining when they went out for the afternoon’s work.

    DS

    Absolute nonsense. That, sorry, that's completely... I mean, I completely rely on those percentages. My 90-year-old father used to understand those percentages. Because it’s a novelty, if you are going to ask people what they understand, they might say something wrong, such as, oh, that's the percentage of the area that it's going to rain in or something like that. No, it's the percentage of times it makes that claim that it's right. And those percentages have been used in America for years, they're completely part of routine forecast and I wouldn't say the American public is enormously better educated than the British public. So this is just reluctance and conservatism. It's like saying oh well people don't understand graphs. We can't put up line graphs on the news, people don't understand that. This is contempt for the public. And it just shows I think, a reluctance to make an effort to explain things. And people get used to stuff, once they've learned what a graph looks like, when they see it again, then they'll understand it. So you need to educate the public and not, you know, in a patronising way, it's just that, you know, otherwise you're just being misleading. If you just say, oh, you know, it'll rain or not rain you're just misleading them. If you just say it might rain, that's misleading. What does that mean? It can mean different things. I want a percentage and people do understand them, when they've got some experience of them.

    MF

    And what about certainty in estimates? Here is a reaction we had to the migration figures that ONS published earlier in the summer. Somebody tweeted back to say, well estimates, that’s all very good but I want the actual figures. I want to know how many people have migrated.

    DS

    Yeah, I think actually, it's quite a reasonable question. Because, you know, you kind of think well can’t you count them, we actually know who comes in and out of the country. In that case it’s really quite a reasonable question to ask. I want to know why you can't count them. And in fact, of course ONS is moving towards counting them. It's moving away from the survey towards using administrative data to count them. So I think in that case, that's quite a good question to ask. Now in other situations, it's a stupid question. If you want to know if someone says, oh, I don't want an estimate of how many people you know, go and vote one way or do something or other, I want to know how many, well then you think don’t be daft. We can't go and ask everybody this all the time. So that's a stupid question. So the point is that in certain contexts, asking whether something is an estimate or not, is reasonable. Sometimes it's not and that can be explained, I think, quite reasonably to people.

    MF

    And yet, we will still want to be entertained. We also want to have numbers to confirm our own prejudices.

    DS

    Yeah, people will always do that. But that's not what the ONS is for, to confirm people's prejudices. People are hopeless at estimating. How many, you know, migrants there are, how many people, what size ethnic minorities and things, we know if you ask people these numbers, they're pretty bad at it. But people are bad at estimating all numbers. So no, it's ONS’s job to try to explain things and in a vivid way that people will be interested in, particularly when there's an argument about a topic going on, to present the evidence, not one side or the other, but that each side can use, and that's why I really feel that the ONS’s migration team, you know, I have a lot of respect for them, when they're changing their format or consulting on it, they go to organisations on both sides. They go to Migration Watch and the Migration Observatory and talk to them about you know, can they understand what's going on, is this data helping them in their deliberations.

    MF

    Now, you mentioned earlier in the conversation, education, do we have a younger generation coming up who are more stats literate or does an awful lot more need to be done?

    DS

    A lot more needs to be done in terms of data education in schools. I'm actually part of a group at the Royal Society that is proposing a whole new programme called mathematics and data education, for that to be put together within a single framework, because a lot of this isn't particularly maths, and maths is not the right way or place to teach it. But it still should be an essential part of education, understanding numbers, understanding data, their limitations and their strengths and it uses some numeracy, uses some maths but it's not part of maths. The problem has always been where does that fit in the syllabus because it doesn't, particularly at the moment. So that's something that every country is struggling with. We're not unique in that, and I think it's actually essential that that happens. And when you know, the Prime Minister, I think quite reasonably says people should study mathematics until 18. I mean, I hope he doesn't mean mathematics in the sense of the algebra and the geometry that kids do, get forced to do essentially, for GCSE, and some of whom absolutely loathe it. And so, but that's not really the sort of mathematics that everyone needs. Everyone needs data literacy. Everyone needs that.

    MF

    Lies, damned lies and statistics is an old cliche, it's still robustly wheeled out in the media every time, offering some perceived reason to doubt what the statisticians have said. I mean looking ahead, how optimistic are you, do you think that one day we might finally see the end of all that?

    DS

    Well my eyes always go to heaven, and I just say for goodness sake. So I like it when it's used, because I say, do you really believe that? You know, do you really believe that, because if you do you're just rejecting evidence out of hand. And this is utter stupidity. And nobody could live like that. And it emphasises this idea somehow, among the more non-data-literate, it encourages them to think that numbers they hear either have to be sort of accepted as God given truths or rejected out of hand. And this is a terrible state to be in, the point is we should interpret any number we hear, any claim based on data, same as we’d interpret any other claim made by anybody about anything. We’ve got to judge it on its merits at the time and that includes do we trust the source? Do I understand how this is being explained to me? What am I not being told? And so why is this person telling me this? So all of that comes into interpreting numbers as well. We hear this all the time on programmes like More or Less, and so on. So I like it as a phrase because it is so utterly stupid, then so utterly, easily demolished, that it encourages, you know, a healthy debate.

    MF

    We're certainly not talking about good statistics, we're certainly not talking about quality statistics, properly used. And that, of course, is the role of the statistics watchdog as we're obliged to call him, or certainly as the media always call him, and that's our other guest, Ed Humpherson.

    Ed, having listened to what the professor had to say there, from your perspective, how much misuse of statistics is there out there? What does your organisation, your office, do to try and combat that?

    ED HUMPHERSON

    Well, Miles the first thing to say is I wish I could give you a really juicy point of disagreement with David to set off some kind of sparky dialogue. Unfortunately, almost everything, if not everything that David said, I completely agree with - he said it more fluently and more directly than I would, but I think we are two fellow travellers on all of these issues.

    In terms of the way we look at things at the Office for Statistics Regulation that I head up, we are a statistics watchdog. That's how we are reported. Most of our work is, so to speak, below the visible waterline: we do lots and lots of work assessing and reviewing the production of statistics across the UK public sector. We require organisations like the ONS, but also many other government departments, to be demonstrating their trustworthiness; to explain their quality; and to deliver value. And a lot of that work just goes on, week in week out, year in year out to support and drive up the evidence base that's available to the British public. I think what you're referring to is that if we care about the value and the worth of statistics in public life, we can't just sort of sit behind the scenes and make sure there's a steady flow. We actually have to step up and defend statistics when they are being misused because it's very toxic, I think, to the public's confidence in statistics if they're subjected to rampant misuse or mis-explanation of statistics. It's all very well having good statistics, but if they go out into the world and they get garbled or misquoted, that I think is very destructive. So what we do is we either have members of the public raise cases with us when they see something and they're not sure about it, or indeed we spot things ourselves and we will get in contact with the relevant department and want to understand why this thing has been said, whether it really is consistent with the underlying evidence, often it isn't, and then we make an intervention to correct the situation. And we are busy, right, there's a lot of demand for work.

    MF

    Are instances of statistical misuse on the rise?

    EH

    We recently published our annual summary of what we call casework - that's handling the individual situations where people are concerned. And we revealed in that that we had our highest ever number of cases, 372, which might imply that, you know, things are getting worse. I'd really strongly caution against that interpretation. I think what that increase is telling you is two other things. One is, as we as the Office for Statistics Regulation, do our work, we are gradually growing our profile and more people are aware that they can come to us, that’s the first thing this is telling you; and the second thing is that people care a lot more about statistics and data now, exactly as Sir David was saying that this raised profile during the pandemic. I don't think it's a sign that there's more misuse per se. I do think perhaps, the thing I would be willing to accept is, there's just a generally greater tendency for communication to be datafied. In other words, for communication to want to use data: it sounds authoritative, it sounds convincing. And I think that may be driving more instances of people saying well, a number has been used there, I want to really understand what that number is. So I would be slightly cautious about saying there is more misuse, but I would be confident in saying there's probably a greater desire to use data and therefore a greater awareness both of the opportunity to complain to us and of its importance.

    MF

    Underlying all of your work is compliance with the UK code of practice for statistics, a very important document, and one that we haven't actually mentioned in this podcast so far…

    EH

    Shame on you, Miles, shame on you.

    MF

    We're here to put that right, immediately. Tell us about what the code of practice is. What is it for? What does it do?

    EH

    So the Code of Practice is a statutory code and its purpose is to ensure that statistics serve the public good. And it does that through a very simple structure. It says that in any situation where an individual or an organisation is providing information to an audience, there are three things going on. There's the trustworthiness of the speaker, and the Code sets out lots of requirements on organisations as to how they can demonstrate their trustworthiness. And it's exactly in line with what David was saying earlier and exactly in line with the thinking of Onora O’Neill – a set of commitments which demonstrate trustworthiness. Like a really simple commitment is to say, we will pre-announce at least four weeks in advance when the statistics are going to be released, and we will release them at the time that we say, so there is no risk that there's any political interference in when the news comes out. It comes out at the time that has been pre-announced. Very clear commitment, very tangible, evidence-based thing. It's a binary thing, right? You either do that or you do not. And if you do not, you're not being trustworthy. The second thing in any situation where people are exchanging information is the information itself. What's its quality? Where's this data from? How's it been compiled? What are its strengths and limitations? And the code has requirements on all of those areas. That is clarity of what the numbers are, what they mean, what they don't mean. And then thirdly, in that exchange of information, is the information of any use to the audience? It could be high, high quality, it could be very trustworthy, but it could, to use David's excellent phrase, it could just be “dull”. It could be irrelevant, it could not be important. And the value pillar is all about that. It's all about the user having relevant, insightful information on a question that they care about.

    That's, Miles, what the Code of Practice is: it’s trustworthiness, it’s quality and it’s value. And those things we think are kind of pretty universal actually, which is why they don't just apply now to official statistics. We take them out and we apply them to all sorts of situations where Ministers and Departments are using numbers, we always want to ask those three questions. Is it trustworthy? Is it quality, is it value? That's the Code.

    MF

    And when they've satisfied your stringent requirements and been certified as good quality, there is of course a badge to tell the users that they have been.

    EH

    There's a badge - the badge means that we have accredited them as complying with that Code of Practice. It's called the National Statistics badge. The term is less important than what it means. What it means is we have independently assessed that they comply in full with that Code.

    MF

    Most people would have heard, if they have heard of the OSR’s work, they'll have seen it perhaps in the media. They'll have seen you as the so-called data watchdog, the statistics watchdog. It's never gently explained, is it? It's usually ‘slammed’, ‘criticised’, despite the extremely measured and calm language you use, but you're seen as being the body that takes politicians to task. Is that really what you do? It seems more often that you're sort of gently helping people to be right.

    EH

    That's exactly right. I mean, it's not unhelpful, frankly, that there's a degree of respect for the role and that when we do make statements, they are taken seriously and they're seen as significant, but we are not, absolutely not, trying to generate those headlines. We are absolutely not trying to intimidate or scare or, you know, browbeat people. Our role is very simple. Something has been said, which is not consistent with the underlying evidence. We want to make that clear publicly. And a lot of time what our intervention does actually is it strengthens the hand of the analysts in government departments so that their advice is taken more seriously at the point when things are being communicated. Now, as I say, it's not unwelcome sometimes that our interventions do get reported on. But I always try and make these interventions in a very constructive and measured way. Because the goal is not column inches. Absolutely not. The goal is the change in the information that's available to the public.

    MF

    You're in the business of correcting the record and not giving people a public shaming.

    EH

    Exactly, exactly. And even correcting the record actually, there's some quite interesting stuff about whether parliamentarians correct the record. And in some ways, it'd be great if parliamentarians corrected the record when they have been shown to have misstated with statistics. But actually, you could end up in a world where people correct the record and in a sort of tokenistic way, it's sort of, you know, buried in the depths of the Hansard parliamentary report. What we want is for people not to be misled, for people to not think that, for example, the number of people in employment is different from what it actually is. So actually, it's the outcome that really matters most; not so much the correction as are people left understanding what the numbers actually say.

    MF

    Surveys show - I should be careful using that phrase, you know - nonetheless, but including the UKSA survey, show that the public were much less inclined to trust, in the words of the survey, politicians' use of statistics. And indeed, Chris Bryant the Labour MP said that politicians who have been found to have erred statistically should be forced to apologise to Parliament. Did you take that on board? Is there much in that?

    EH

    When he said that, he was actually directly quoting instances we've been involved with and he talks about our role very directly in that sense. Oh, yeah, absolutely. We support that. It will be really, really good. I think the point about the correction, Miles, is that it shows it's a manifestation of a culture that takes fidelity to the evidence, truthfulness to the evidence, faithfulness to the evidence, it takes that seriously. As I say, what I don't want to get into is a world where you know, corrections are sort of tokenistic and buried. I think the key thing is that it's part of an environment in which all actors in public debate realise it's in everybody's interests for evidence, data and statistics to be used fairly and appropriately and part of that is that if they've misspoken, they correct the record. From our experience, by and large, when we deal with these issues, the politicians concerned want to get it right. What they want to do is, they want to communicate their policy vision, their idea of the policy or what the, you know, the state of the country is. They want to communicate that, sure, that's their job as politicians, but they don't want to do so in a way that is demonstrably not consistent with the underlying evidence. And in almost all cases, they are, I wouldn’t say they're grateful, but they're respectful of the need to get it right and respect the intervention. And very often the things that we encounter are a result of more of a cockup than a conspiracy really - something wasn't signed off by the right person in the right place and a particular number gets blown out of proportion, it gets ripped from its context, it becomes sort of weaponized; it's not really as a deliberate attempt to mislead. Now, there are probably some exceptions to that generally positive picture I'm giving, but overall it's not really in their interests for the story to be about how they misuse the numbers. That's not really a very good look for them. They'd much rather the stories be about what they're trying to persuade the public of, and staying on the right side of all of the principles we set out helps that to happen.

    MF

    Your remit runs across the relatively controlled world of government, Parliament and so forth. And I think the UK is quite unusual in having a body that does this in an independent sort of way. Do you think the public expects you to be active in other areas, we mentioned earlier, you know, the wilder shores of social media where it's not cockup theories you're going to be hearing there, it's conspiracy theories based on misuse of data. Is there any role that a statistics regulator could possibly take on in that arena?

    EH

    Absolutely. So I mentioned earlier that the way we often get triggered into this environment is when members of the public raise things with us. And I always think that's quite a solemn sort of responsibility. You know, you have a member of the public who's concerned about something and they care about it enough to contact us - use the “raise a concern” part of our website - so I always try and take it seriously. And sometimes they're complaining about something which isn't actually an official statistic. And in those circumstances, even if we say to them, well, this isn't really an official statistic, we will say, but, applying our principles, this would be our judgement. Because I think we owe it to those people who've taken the time to care about a statistical usage, we owe it to take them seriously. And we have stepped in. Only recently we were looking at some claims about the impact of gambling, which are not from a government department, but from parts of the gambling industry. We also look at things from local government, who are not part of central government. So we do look at those things, Miles. It's a relatively small part of our work, but, as I say, our principles are universal and you've got to take seriously a situation in which a member of the public is concerned about a piece of evidence.

    MF

    Professor Spiegelhalter, what do you make of this regulatory function that the OSR pursues, are we unusual in the UK in having something along those lines?

    DS

    Ed probably knows better than I do, but I haven't heard of anybody else and I get asked about it when I'm travelling and talking to other people. I have no conflict of interest. I'm Non-Executive Director for the UK Stats Authority, and I sit on the regulation committee that oversees the way it works. So of course, I'm a huge supporter of what they do. And as described, it's a subtle role because it's not to do with performing, you know, and making a big song and dance and going grabbing all that attention but working away just to try to improve the standard of stats in this country. I think we're incredibly fortunate to have such a body and in fact, we know things are never perfect and there's always room for improvement of course, but I think we're very lucky to have our statistical system.

    MF

    A final thought from you...we’re at a moment in time now where people are anticipating the widespread implementation of AI, artificial intelligence, large language models and all that sort of thing. Threat or opportunity for statistics, or both?

    DS

    Oh, my goodness me, it is very difficult to predict. I use GPT a lot in my work, you know, both for sort of research and making inquiries about stuff and also to help me do coding I'm not very good at. I haven't yet explored GPT-4's capacity for doing automated data analysis, but I want to, and actually, I'd welcome it. If it's good, if you can put some data in and it does stuff - that's great. However, I would love to see what guardrails are being put into it, to prevent it doing stupid misleading things. I hope that that does become an issue in the future, that if AI is automatically interpreting data for example, that it's actually got some idea of what it's doing. And I don't see that that's impossible. I mean, there are already a lot of guardrails in about sexist statements, racist statements, violent statements and so on. There's all sorts of protection already in there. Well, can’t we have protection against grossly misleading statistical analysis?

    MF

    A future role for the statistics watchdog perhaps?

    DS

    Quite possibly.

    EH

    Miles, I never turn down suggestions for doing new work.

    MF

    So we’ve heard how statistics are regulated in the UK, and covered the role of the media in communicating data accurately, and now to give some insight into what that might all look like from a journalist’s perspective, it’s time to introduce our next guest, all the way from California, award-winning journalist and data editor at Google, Simon Rogers. Simon, welcome to Statistically Speaking. Now, before you took up the role at Google you were actually at the forefront of something of a data journalism movement here in the UK, responsible for launching and editing The Guardian’s data blog. Looking at where we are now and how things have come on since that period, to what extent do you reckon journalists can offer some kind of solution to online misinterpretation of information?

    SIMON ROGERS

    At a time when misinformation is pretty rampant, then you need people there who can make sense of the world and help you make sense of the world through data and facts and things that are true, as opposed to things that we feel might be right. And it's kind of like there is a battle between the heart and the head out there in the world right now. And there are the things that people feel might be right, but are completely wrong. And where, I think, Data Journalists can be the solution to solving that. Now, having said that, there are people as we know who will never believe something, and it doesn't matter. There are people for whom it literally doesn't matter, you can do all the fact checks that you want, and I think that is a bit of a shock for people, this realisation that sometimes it's just not enough, but I think honestly, the fact that there are more Data Journalists now than before...There was an EJC survey, the European Journalism Centre did a survey earlier this year about the state of data journalism. There are way more data journalists now than there were the last time they did the survey. It's becoming much more...it’s just a part of being a reporter now. You don't have to necessarily be identified as a separate data journalist to work with data. So we're definitely living in a world where there are more people doing this really important work, but the need, I would say it has never been greater.

    MF

    How do you think data journalists then tend to see their role? Is it simply a mission to explain, or do some of them see it as their role to actually prove some theories and vindicate a viewpoint, or is it a mixture, are there different types of data journalists?

    SR

    I would say there were as many types of data journalists as there are types of journalists. And that's the thing about the field, there's no standard form of data journalism, which is one of the things that I love about it. That your output at the end of the day can be anything, it can be a podcast or it can be an article or a number or something on social media. And because of the kind of variety, and the fact I think, that unlike almost any other role in the newsroom, there really isn't like a standard pattern to becoming a data journalist. As a result of that, I think what you get are very different kind of motivations among very different kinds of people. I mean, for me, personally, the thing that interested me when I started working in the field was the idea of understanding and explaining. That is my childhood, with Richard Scarry books and Dorling Kindersley. You know, like trying to understand the world a little bit better. I do think sometimes people have theories. Sometimes people come in from very sophisticated statistical backgrounds. I mean, my background certainly wasn't that and I would say a lot of the work, the stats and the way that we use data isn't necessarily that complicated. It's often things like, you know, is this thing bigger than that thing? Has this thing grown? You know, where in the world is this thing, the biggest and so on. But you can tell amazing stories that way. And I think this motivation to use a skill, but there are still those people who get inured by maths in the same way that I did when I was at school, you know, but I think the motivation to try and make it clear with people that definitely seems to me to be a kind of a common thread among most of the data journalists that I’ve met.

    MF

    Do you think that journalists therefore, people going into journalism, and mentioning no names, as an occupation...used to be seen as a bit less numerate, perhaps whose skills tended to be in the verbal domain. Do you think therefore these days you’ve got to have at least a feel for data and statistics to be able to be credible as a journalist?

    SR

    I think it is becoming a basic skill for lots of journalists who wouldn't necessarily consider themselves data journalists. We always said eventually it is just journalism. And the reason is because the amount of sources now that are out there, I don't think you can tell a full story unless you take account of those. COVID’s a great example of that, you know, here's a story where data journalists, I think, performed incredibly well. Someone like John Burn-Murdoch at the Financial Times, say, where they’ve got a mission to explain what's going on and make it clear to people at a time when nothing was clear, we didn't really know what was going on down the road, never mind globally. So I think that is becoming a really important part of being a journalist. I mean, I remember one of my first big data stories at the Guardian was around the release of the COINS database – a big spending database from the government - and we had it on the list as a “data story” and people would chuckle, snigger a little bit at the idea that there'll be a story on the front page of the paper about data, which they felt to be weird, and I don't think people would be snickering or chuckling now about that. It's just normal. So my feeling is that if you're a reporter now, not being afraid of data and understanding the tools that are there to help you, I think that's a basic part of the role and it's being reflected in the way that journalism schools are working. I teach here one semester a year at the San Francisco Campus of Medill. There's an introduction to data journalism course and we get people coming in there from all kinds of backgrounds. Often half the class are just, they put their hands up if they're worried about math or scared of data, but somehow at the end of the course they are all making visualisations and telling data stories, so you know, those concerns can always be overcome.

    MF

    I suppose it's not that radical a development really if you think back, particularly from where we're sitting in the ONS. Of course, many of the biggest news stories outside of COVID have been data driven. Think only of inflation for example, the cost of living has been a big running story in this country, and internationally of course, over the last couple of years. Ultimately, that's a data driven story. People are relying on the statisticians to tell them what the rate of inflation is, confirming of course what they're seeing every day in the shops and when they're spending money.

    SR

    Yeah, no, I agree. Absolutely. And half of the stories that are probably about data, people don't realise they're writing about data. However, I think there is a tendency, or there has been in the past, a tendency to just believe all data without questioning it, in the way that as a reporter, you would question a human source and make sure you understood what they were saying. If we gained one thing, and that thing is that reporters would then come back to you guys and ask an informed question about this data and dive into it a little bit more, then I think we've gained a lot.

    MF

    So this is perhaps what good data journalists are bringing to the table, an ability to actually sort out the good data from the bad data, and actually, to use it appropriately, to understand uncertainty and understand how the number on the page might not be providing the full picture.

    SR

    Absolutely. I think it's that combination of traditional journalistic skills and data that to me always make the strongest storytelling. When you see somebody, you know, who knows a story inside out like a health correspondent, who knows everything there is to know about health policy, and then they're telling a human story perhaps about somebody in that condition, and then they've got data to back it up - it’s like the near and the far. This idea of the near view and the far view, and journalism being the thing that brings those two together. So there’s the view from 30,000 feet that the data gives you and then the individual view that the more kind of qualitative interview that you get with somebody who is in a situation gives you. The two things together - that’s incredibly powerful.

    MF

    And when choosing the data you use for a story I guess it’s about making sound judgements – you know, basic questions like “is this a big number?”, “is this an important number?”

    SR

    Yeah, a billion pounds sounds like a lot of money, but they need to know how much is a billion pounds, or is it more of a rounding error for the government.

    MF

    Yes, and you still see as well, outside of data journalism I stress, you still see news organisations making much of percentage increases or what looks like a significant increase in something that's pretty rare to start with.

    SR

    Yeah, it's all relative. Understanding what something means relatively, without having to give them a math lesson, I think is important.

    MF

    So this talk about supply, the availability of data journalism, where do people go to find good data journalism, perhaps without having to subscribe? You know, some of the publications that do it best are after all behind paywalls, where do we find the good stuff that's freely available?

    SR

    If I was looking from scratch for the best data journalism, I think there are lots of places you can find it without having to subscribe to every service. Obviously, you have now the traditional big organisations like the Guardian, and New York Times, and Der Spiegel in Germany. There is a tonne of data journalism now happening in other countries around the world. I work on supporting the Sigma Data Journalism Awards, and over half of those entries come from small one or two people units, you know, practising their data journalism in countries in the world where it's a lot more difficult than it is to do it in the UK. For example, Texty in Ukraine, which is a Ukrainian data journalism site, really, and they're in the middle of a war zone right now and they're producing data journalism. In fact, Anatoliy Bondarenko, their data editor, is literally in the army and on the frontline, but he’s also producing data journalism and they produce incredible visualisations. They've used AI in interesting ways to analyse propaganda and social media posts and stuff. And the stuff happening everywhere is not just limited to those big partners behind paywalls. And what you do find also, often around big stories like what’s happened with COVID, people will put their work outside of the paywall. But um, yeah, data is like an attraction. I think visualisation is an attraction for readers. I'm not surprised people try and monetize that, but there is enough going on out there in the world.

    MF

    And all that acknowledged, could the producers of statistics like the ONS, and sister bodies around the world, could we be doing more to make sure that people using this data in this way have it available in forms that can be interpreted? Is there more that we can do?

    SR

    I mean, there was the EJC survey that I mentioned earlier, it’s definitely worth checking out because one thing it shows is that 57% of data journalists say that getting access to data is still their biggest challenge. And then followed by kind of like lack of resources, time pressure, things like that. PDFs are still an issue out there in the world. There's two things to this for me, on one side it's like, how do I use the data, help me understand what I'm looking at. On the other side is that access, so you know, having more kind of APIs and easy downloads, things that are not formatted to look pretty but formatted for use. Those kinds of things are still really important. I would say the ONS has made tremendous strides, certainly since I was working in the UK, on accessibility to data and that's a notable way, and I've seen the same thing with gov.us here in the States.

    MF

    Well it’s good to hear the way the ONS has been moving in the right direction. Certainly I think we've been tough on PDFs.

    SR

    Yes and to me it's noticeable. It's noticeable and you've obviously made a deliberate decision to do that, which is great. That makes the data more useful, right, and makes it more and more helpful for people.

    MF

    Yes, and at the other end of the chain, what about storing publishers and web platforms, particularly well you’re at Google currently, but generally, what can these big platforms do to promote good data journalism and combat misinformation? I mean, big question there.

    SR

    Obviously, I work with Google Trends data, which is probably the world's biggest publicly available data set. I think a big company like Google has a responsibility to make this data public, and the fact that it is, you can download reusable datasets, is incredibly powerful. I'm very proud to work on that. I think that all companies have a responsibility to be transparent, especially when you have a unique data set. That didn't exist 20 years earlier, and it's there now and it can tell you something about how the world works. I mean, for instance, when we look at something like I mean, I've mentioned COVID before, but it's such a big event in our recent history. How people were searching around COVID is incredibly fascinating and it was important information to get out there. Especially at a time when the official data is always going to be behind what's actually happening out there. And is there a way you can use that data to predict stuff, predict where cases are going to come up... We work with this data every day and we're still just scratching the surface of what's possible with it.

    MF

    And when it comes to combating misinformation we stand, so we're told, on the threshold of another revolution from artificial intelligence, large language models, and so forth. How do you see that future? Is AI friend, foe, or both?

    SR

    I work for a company that is a significant player in the AI area, so I give you that background. But I think in the field of data, we've seen a lot of data users use AI to really help produce incredible work, where instead of having to read through a million documents, they can get the system to do it for them and pull out stories. Yeah, like any other tool, it can be anything but the potential to help journalists do their jobs better, and for good, I think is pretty high. I'm going to be optimistic and hope that that's the way things go.

    MF

    Looking optimistically to the future then, thank you very much Simon for joining us. And thanks also to my other guests, Professor Sir David Spiegelhalter and Ed Humpherson. Taking their advice on board then, when we hear or read about data through the news or experience it on social media, perhaps we should first always ask ourselves – do we trust the source? Good advice indeed.

    You can subscribe to new episodes of this podcast on Spotify, Apple podcasts, and all the other major podcast platforms. You can also get more information, or ask us a question, by following @ONSFocus on X, or Twitter, take your pick. I’m Miles Fletcher, from myself and our producer Steve Milne, thanks for listening.

    ENDS

  • In this episode, we explore ONS’s work with other countries to raise the world's statistical capabilities.

    Transcript

    MILES FLETCHER

    Hello and welcome again to ‘Statistically Speaking’, the Office for National Statistics’ Podcast. I'm Miles Fletcher, and in this episode we're going international.

    Now it hardly needs saying that global issues, climate change, population growth, inflation, to name a few, are best understood and addressed with the benefit of good global statistics. So, to that end, the ONS works in partnership with a number of countries worldwide with the ultimate aim of raising the world's statistical capabilities. At one end of Africa, for example, a continent where it's deeply involved, that includes embedding state of the art inflation indices and other economic data in Ghana. On the other side of the continent, it's meant using AI and machine learning to track the movement of displaced populations in Somalia. How do you run a census in places where nobody has a permanent address?

    It's all fascinating work and here to tell us about it, Emily Poskett, Head of International Development at the ONS; Tim Harris of the ONS Data Science Campus’s international development team; and joining us from Accra, our special guest, Government Statistician of the Republic of Ghana and head of the Ghanaian Statistical Service Professor Samuel Annim.

    Emily then, to start give us the big overview if you would, set out for us the scale and the purpose of this international development work that the ONS is doing.

    EMILY POSKETT

    We work with countries around the developing world to support strong modern statistical systems wherever we see a suitable opportunity to do so.

    MF

    What form does that work take? Does it mean statisticians going out to these countries?

    EP

    Yes, it does, when that's the right way to go about things. So our work is usually through the form of medium-term partnerships with a small group of national statistical offices, or NSOs, from the developing world, and those partnerships are medium term over a number of years in order to build up a real understanding of the context in that country, that national statistical office's vision for modernisation and how the ONS can be of most help to achieve their own goals under their own strategy.

    That relationship will normally be led by a particular individual who spends time getting to know the context and getting to know the people, getting to know what ONS can do to help. A partnership might cover a range of topic areas from census to data science to leadership training to economic statistics, and the lead point of contact, the strategic advisor in many cases, will bring in the relevant experts from across ONS, and they'll work through virtual collaboration but also through on-site visits, and they will work out the best timing for those and the best delivery modality in order to ensure that the gains are sustained. Our primary focus really is to make sure that changes that we support in the partner organisation are sustainable, and the work that ONS does using the UK’s aid budget is really impactful and leads to long term change.

    We don't always work through direct partnerships, for example where we see opportunities to work alongside other organisations, so international institutions like the World Bank or other national statistics offices like Statistics Canada or Statistics Sweden, they might choose to bring us in to deliver small pieces of focussed technical assistance alongside their own programmes. One of our medium-term partnerships is with the United Nations Economic Commission For Africa (UNECA), and they work with all 54 countries of Africa, and they can choose to bring in our expertise alongside their own to target particular needs in particular countries. But I would say that 70% of our effort is through these medium-term partnerships.

    MF

    So the ONS is providing one part of a large patchwork of work, going on right across the developing world, but what is the ultimate purpose of that? What are all these countries trying to achieve together?

    EP

    Well, strong statistical systems are essential in all countries to aid effective planning and informed decision making. And this is even more important in developing countries where resources are often scarce and you're trying to use scarce resources to target a wide range of needs across the population. And that resource might include UK aid for example, and aid from other countries. The UK has been doing statistical capacity building for many, many years through different modalities, working with partners, and the ONS is just one implementing partner who can be called upon to provide that technical expertise. We're really proud to be a partner of choice for a number of developing countries and the ONS is seen worldwide as being a leader. We're really proud that countries like Ghana would choose to work with us, and we want to do our bit to help them to achieve their own strategy and their own goals.

    MF

    Well, this seems like an excellent moment to bring in Professor Samuel Annim. Our great pleasure, great honour, to have you with us professor. From your perspective, and what you're looking to achieve in the Ghana Statistical Service, how important, how useful has the work with ONS been for you?

    PROFESSOR SAMUEL ANNIM

    From the perspective of how it has been important for us, I mean, I look at it from several aspects. I got into office in 2019, a year after the ONS and GSS collaboration had been established. And when I joined obviously, I had a sense of what I wanted to contribute to the office. Partnerships that we've seen between National Statistical Offices over the years have always taken the dimension of statistical production partnerships, and what I simply mean by that is that they’re going in to help the service deliver on its core mandate. So for example, if price statistics are the priority, then that is the area you want to focus on, but our partnership with ONS took a different dimension. In addition to focusing on the traditional mandate of the Institute, which is the production of statistics, we really have over the period achieved some milestones from the perspective of transformation, which is of high priority to me, and secondly, from the perspective of injecting technology or contemporary ways of dispensing our duty as a National Statistical Office. So from an individual point of view, it has been beneficial to the mission that I have, and since then we have kept on working in the area of transformation.

    MF

    Listening to what you have to say there, it does sound as though some of the big challenges you face at the moment are not too dissimilar from the ones faced by ONS, all about modernising statistics, particularly using big data and new technology.

    SA

    Indeed, and I must say that it is a wave across all national statistical offices, because we are now trying to complement traditional surveys and censuses with non-traditional data sources i.e. Big Data, administrative data, citizens generated data and other geospatial resources. So collaborating is the key thing here, because this is new to the statistical community. So it's important we collaborate to learn how you are dealing with issues that are not consistent with the production of official statistics. Now as a global community, we are all thinking about how to use citizens generated science, I mean, getting citizens to provide us with data. And this is an area in which there isn't any National Statistical Office that can claim authority, because the approach and the processes are pretty not consistent with the guidelines for production of official statistics. So it's important to learn how countries are doing it and see how we can collaborate to get this done.

    MF

    Yes, in the last episode of our podcast, interestingly, we talked about the challenges of getting our citizens here in the UK to take part in surveys. Are Ghanaians friendly to what you're trying to achieve? Or are they perhaps sceptical as well and difficult to engage?

    SA

    I wouldn't say they're sceptical, I think they really feel part of it. And that is one of the strengths of citizen generated data, because if you package it in a way that it is more demand driven, rather than supply, you don't just go and tell them do this because I know how it's supposed to be done, but instead give them the platform to tell the National Statistical Office what their experiences are, provide them with platforms that they can easily engage so that they can feel part of the process and they really own the product. In our case, it is not a product that is owned by the statistical service but it is a product that is owned by the sub national agencies, and that is, as I said earlier, the beauty of citizen generated data. It is co-creation and co-ownership of the statistical product. So they are not sceptical, they are very receptive to it, and they are getting a better understanding of what we do as a National Statistical Office.

    MF

    Thinking internationally, thinking globally, what sort of shape do you think the world's statistical system is in now, as a result of partnerships like this or other developments, generally looking across Africa and looking beyond Africa, when we think about key issues, particularly climate change - how good is the statistical system now in tracking these very important changes, and the impacts they're having?

    SA

    We have as national statistical offices been very content with the traditional statistics - labour statistics, price statistics, GDP - and you do that either monthly, quarterly, or in some instances annually, and even the social indicators, I mean, it's only a few countries like the UK that have been able to do social indicators annually. For those of us in the Global South, a lot of the social indicators are being collected every five years, or every seven to eight years. So this was the way national statistical offices, up until about 2017 or 2018, were shaped. But with the data revolution that we saw around 2014, and since the World Development Report, the Data for Better Lives document, that came out in 2021, clearly, we now have to approach statistics from a different point of view.

    And this is simply asking the question, how do I contextualise the statistics beyond what international communities would be expecting national statistical offices to do? I mean, now we are doing everything possible to ensure that we have a monthly GDP, and this is something that we are also learning from the partnership with ONS, because we are aware that they are developing models to ensure that beyond GDP they have some indicators that would readily give us insight on economic performance.

    And related to the issue of climate change that you are talking about, Miles, it's one of the areas that you cannot simply dispense your statistics in that one area as a standalone National Statistical Office, because this is something that has a continental dimension, something that has a global dimension. And at the moment we have data sitting in different silos, and the only thing that we can do is through partnership, see how we can bring these datasets together to help us get a better understanding of issues around climate change.

    So going forward, in my point of view, if we really want to sustain the transformation that we are seeing as a global Statistical Office, the only way out is through partnership, is through collaboration. And one of the things that I'm putting on the table is that we better begin to measure partnerships. Because we've treated partnerships as a qualitative engagement. And really, nobody knows which partnerships are working and which are not working. So if we're able to measure it, we can more clearly see the benefits of partnerships, although we all hold the view that it is the way to go.

    MF

    Interesting what you said about how we've traditionally concentrated on those classical measures of economic progress, and notably GDP. You might be interested to hear that the charity Oxfam, the big NGO, was in the news here in the UK recently when they said that GDP was colonialist, and it was ‘anti-feminist’, because it ignored the huge economic value of unpaid work, which they said is largely undertaken by women.

    Well, whether you agree with that or not, it does perhaps highlight the need for going beyond GDP and producing these alternative, and perhaps richer, wider measures of economic progress and economic value.

    SA

    I mean, I clearly associate with that submission, and we are currently doing some work with the United Nations Development Programme on the National Human Development Report. And the focus of this report is exactly what you are talking about, Miles. We are looking at the current value of work, and we are looking at the future value of work. And we are going beyond the definition of who is employed, which strictly looks at whether the work that you are doing comes with remuneration or not, because once you broaden it and look at the value of work, you definitely have the opportunity to look at people who are doing unpaid work, and indeed their contribution to the progress that we are seeing as a human society, and the National Human Development Report has a sharp focus on this gender issue. They're going to look at that closely. And again, this is coming on the backdrop of an ongoing annual household income and expenditure survey that we are doing. So traditionally, government and international organisations would ask what is your employment and what is your unemployment rate? And then in this report, we tell them that we need to begin to look at those who are working, but we see they're not employed, simply because they are not working for pay or profit, and the proportion of people who are in there, and then once you disaggregate based on sex, age and geography, it's so revealing that we are losing a number of insights from the perspective of unpaid work. And so I fully subscribe to that view.

    MF

    That's interesting. Professor, for now thank you very much, and I hope you'll join the conversation again later, but we're going to broaden out to talk about, well, it’s actually a related development, Emily, talking about women and unpaid work, that's been another theme of ONS’s work with the UN Economic Commission for Africa.

    EMILY POSKETT

    There was a request put forward by national statistical offices around Africa to undertake leadership training, and this was part of the countries' modernisation vision. Countries recognise that in order to achieve modernisation, they need to have strong leadership. So they asked the UNECA to deliver leadership training and ONS partnered with UNECA to design and pilot this leadership training programme in a range of countries. As part of delivering that we noticed and recognised a lack of female leaders in a number of National Statistics Offices around the continent, and thought with partners about what we can do to help support that, so now as well as running a leadership training for the top tier of leadership in each organisation, we also run a women into leadership training for potential future female leaders from within the staff. And it's been really, really successful. Some of the feedback that we've had from leaders in those organisations is that they've seen their female staff becoming more confident, more outspoken, more ambitious, putting themselves forward for positions, putting their ideas forwards as well, and generally feeling more confident to contribute in the workplace. We're really proud of that success and hope to roll it out in many more countries around the continent.

    MF

    A country that's the other side of Africa in a number of important senses, and that is Somalia, which of course if you've followed the news to any extent over the last few decades, you'll know the serious turmoil that's affected that part of Africa, Tim Harris, bringing you in, what's been going on in Somalia that the ONS has been involved in, particularly when it comes to measuring population and population movement. Tell us about that.

    TIM HARRIS

    Well, as you say, Somalia is a very fragile context. It's been affected by conflict and climate change and environmental issues for many years. And that's made it very challenging to collect information, statistical information, on a range of things. But particularly on population, which is a key underpinning piece of statistics which any country needs, and in fact, there hasn't been a census in Somalia since the 1970s, almost 50 years now, but there are plans to do a census next year, with support from the UN Population Fund, UNFPA, and other various institutions in Somalia and development partners, as well as the Foreign, Commonwealth and Development Office. And Emily's international development team are also trying to form a partnership, or are forming a new partnership, with the Somalia National Bureau of Statistics. So there are plans to do a census next year, and we're really in the preparatory phase for that at the moment. And we're looking to see in our team how we can use data science, new techniques, new data sources, to try and help prepare to run that census.

    One of the particular issues in Somalia is that there are significant numbers of people who are displaced from where they usually live, by the conflict, by climate change. There's also been a drought for the last few years, and so there are hundreds of thousands, in fact millions, of people who are displaced from where they usually live. They tend to congregate in what we call Internally Displaced People Camps, or IDP camps. So they're not refugees, they haven't crossed an international boundary, but they are displaced from where they usually live. And these IDP camps tend to be quite fluid and dynamic. They're often in areas that are difficult to get to, so information on them is very difficult to obtain. They change very quickly, they grow, they contract, and a lot of them are on private land. So we're looking to see whether we can use new data techniques, and new data sources, to give us information about the broad scale of population in that area.

    MF

    And those new data sources are necessary presumably because it's very difficult to actually get out and physically see these people in those areas and count them physically.

    TH

    That's right, and they change very quickly. So if you're running a census, you want to know where your people are, so you can send the right number of enumerators to the right places, you can draw boundaries of enumeration areas and so on.

    MF

    You need an address register essentially, but these are people effectively without addresses.

    TH

    That's right. That's the way that you do it in the UK. It's not possible in many of these IDP camps in Somalia. So we're looking to see whether we can use high resolution satellite imagery, which we can task for a particular period of time, say in the next week or the next month. And we use that satellite imagery to see whether we can identify structures on the ground in these camps. And in fact, the UN Population Fund has been doing this in a very manual way for some time now. So someone looks at the satellite imagery, and they put a point on each tent, and they try and count the tents and the structures within these camps. That's obviously a very time-consuming way of doing it. So we're looking to see can we do it in a more automated way.

    So we've procured some satellite data and we've developed what we call training data. So in certain parts of the camps, we manually draw around the outlines of the tents, and we use techniques called machine learning. We show computers what areas are tents and what areas are not tents. And we try and train them using these algorithms to be able to predict in areas they haven't seen which areas are tents and which areas are not tents. So we're trying to develop models where we can use high resolution satellite imagery to predict the areas where there are tents, to produce numbers of tents, and in this way, we can help to estimate the broad numbers of people in these areas, and that can feed into the preparations for the census to help it run more smoothly and more efficiently.
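
    The train-then-predict idea described here can be sketched in a few lines of Python. This is an illustration only and not the model the team built: the two features per image patch (brightness and texture), the sample values and the nearest-centroid rule are all assumptions chosen to keep the example small.

```python
from statistics import mean

# Hypothetical hand-labelled training data: one (brightness, texture) pair
# per image patch, labelled by a person drawing around tents.
TRAINING = [
    ((0.90, 0.20), "tent"),
    ((0.85, 0.25), "tent"),
    ((0.80, 0.15), "tent"),
    ((0.30, 0.70), "not_tent"),
    ((0.25, 0.80), "not_tent"),
    ((0.35, 0.65), "not_tent"),
]


def fit_centroids(examples):
    """Average the feature vectors of each label to get one centroid per class."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(dimension) for dimension in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# "Training": learn from the labelled patches.
centroids = fit_centroids(TRAINING)

# "Prediction": classify patches the model has not seen before.
unseen_patches = [(0.88, 0.22), (0.28, 0.75), (0.82, 0.18)]
predictions = [predict(centroids, patch) for patch in unseen_patches]
tent_count = predictions.count("tent")
```

    A real model working on high resolution imagery would use far richer features and a more capable algorithm, but the two stages, learning from labelled areas and then predicting in unseen areas, are the same.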

    MF

    And you're trying to count an ever-shifting population, in that we've got seasonal variations going on, some people unfortunately being evicted, and then you've got a population that would be nomadic anyway.

    TH

    People in these camps, they’re a whole mixture of people, some who've been forced to move because of drought, some have been forced to move because of conflict, and as you say, there are large numbers of nomadic people as well. And they have tended to also congregate in these IDP camps in recent times because of the drought and other climate conditions.

    MF

    And it's thought there could be up to 3 million people at the moment living under those conditions in Somalia.

    TH

    That's right. I mean, that I think highlights one of the particular issues, in that the numbers are very uncertain. So there is some information from camp management administrative data, there is some information from some limited surveys that the Somalia National Bureau of Statistics has undertaken, but the estimates from those two different sources produce very, very different results. And so this is what we're trying to do, to see whether objectively we can count the number of tents, and therefore have some objective measure of at least the number of tents and structures. Obviously then we need to move to how many of those tents and structures might be occupied, and how many people on average might be in each of those tents and structures. But can we add something to the information context that produces some more objective measures, at least of the number of tents and the number of structures in those areas?
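
    The step from a tent count to a broad population figure is simple arithmetic: tents, multiplied by the share that are occupied, multiplied by the average number of people per tent. A minimal sketch in Python follows; the function names and every number in it are hypothetical placeholders, not estimates for Somalia.

```python
def estimate_population(tents, occupancy_rate, people_per_tent):
    """Population estimate = tents x share occupied x average people per tent."""
    if tents < 0:
        raise ValueError("tents must be non-negative")
    if not 0 <= occupancy_rate <= 1:
        raise ValueError("occupancy_rate must be between 0 and 1")
    if people_per_tent < 0:
        raise ValueError("people_per_tent must be non-negative")
    return tents * occupancy_rate * people_per_tent


def estimate_range(tents, occupancy_bounds, people_bounds):
    """Carry uncertainty through by evaluating the low and high assumptions."""
    low = estimate_population(tents, occupancy_bounds[0], people_bounds[0])
    high = estimate_population(tents, occupancy_bounds[1], people_bounds[1])
    return low, high


# Hypothetical inputs: 1,000 counted tents, between 50% and 100% occupied,
# with between 4 and 6 people per occupied tent.
low, high = estimate_range(1000, (0.5, 1.0), (4, 6))
```

    Reporting a range rather than a single figure keeps the uncertainty in the occupancy and household-size assumptions visible alongside the more objective tent count.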

    MF

    Well, that's the kind of cutting-edge stuff that the Data Science Campus is all about. But Emily, the ONS has been involved in other censuses in Africa over the longer term, hasn't it?

    EMILY POSKETT

    Yes, that's right, Miles. So we've been involved in a number of censuses around Africa, including Ghana, also Kenya, Rwanda and a number of other countries through our partnership with the UNECA. And we've been able to really support countries to move from using paper for data collection to using tablets for data collection, during what they call the 2020 census round. And between ONS and the UNECA, we've been able to support on a number of different aspects, including how to make the most of that tablet technology. So you don't just move from using paper to using tablets and do the same processes. There are a number of advantages to using tablets in terms of how you can monitor the quality of the data coming in in real time, and how you can speed up that data collection and that data processing, and we've been able to work with NSOs around the continent on that.

    We've also been using modern data matching techniques to support countries with their post enumeration surveys, which is a way of testing and improving the quality of the census. We've also been working with partners on using data visualisation and new techniques for improving the dissemination and user engagement with the products coming out of the census and therefore increasing the value of the census data products.

    MF

    It's interesting what you say about introducing tablet technology for data gathering in the field. To be honest, it’s not that long ago that the ONS actually moved to use that, rather than the traditional, what was once described as ‘well-meaning people with clipboards’ going around asking questions. And it strikes me that in developing and working with these partner countries, the sort of methods, the sort of technology being introduced, is not far behind where we're at really, is it?

    EP

    No, absolutely. The ONS experts that get involved in these projects really learn a huge amount from the partners that they're working with as well, because often the partners we're working with have far fewer resources to deliver on similar goals. So the staff have to be incredibly innovative and use all sorts of different techniques and resources in order to achieve those goals. And people coming from ONS will learn a huge amount by engaging with partners.

    MF

    Well, on the modernisation theme, the census is another area where we've been working with Ghana isn't it, Professor?

    PROFESSOR SAMUEL ANNIM

    That's correct. We had support in all three phases of the census engagement, that is before the data collection, during the data collection, and after the data collection. We were very clear in our minds that we were going to use tablets for the data collection. And one of the things that we didn't know, or struggled with, had to do with the loading of the materials onto the tablet for the data collection.

    Our original plan would have taken us about six months or four months to do that. And it wasn't going to be new for Ghana. We had other countries that had taken that length of time just to load the materials onto the tablet to enable the data collection exercise. And through the ONS and UNECA collaboration, we got technical assistance to provision the tablets in a much shorter duration. If I recall correctly, it took about six weeks to get all the items on the tablet. And we had been using tablets for data collection, but we hadn’t been able to do remote real-time data monitoring because we didn't have a dashboard. We didn't know how to develop it. And through the partnership we were able to get a dashboard. The benefit of that was that after 44 days of exiting the field, we were able to put out a preliminary report on the census because during every day of the census we had a good sense of what the numbers were, whatever corrections we had to make we were making them. So after 44 days of exiting the field, we were able to announce the preliminary result.

    MF

    Wow, you had a provisional population total after 40 days?

    SA

    44 days, yes.

    MF

    44 days! Well, that puts certain countries to shame, I think, but anyway, let's not dwell on that. That's very impressive, Professor. And there's another project you've been working on, which I suppose is close to your heart as an economist, and that's the production of CPI? Modernising that?

    SA

    Absolutely. Absolutely. One of the first things that happened when I took office in 2019 was as part of the partnership, I visited ONS to understand what they are doing and how the collaboration can be deepened. And one of the things that we explored, and that was the first time I had heard of it, was how to produce a reproducible analytical pipeline. And all that simply means is that if you keep on doing something over and over again, you should think about automating the process. And that is the relationship that we have when it comes to CPI now. We have completely moved away from Excel. When I got in I said in addition to Excel, let's use data, because when the process is not automated, and you have heavy dependence on human beings doing it, the likelihood of error is high. So we really bought into this and now we do our traditional ways, Excel, data, and then we do the reproducible analytical pipeline to compare the results. And ultimately, we're going to move away from the traditional XLS database and rely on this automated process. And this again, would allow us to hopefully reduce the length of time that it takes. So that is the extent to which we are modernising our CPI based on our collaboration.

    MF

    That’s very impressive. And so through the process of cutting the lag time of those regular indicators, you get a much timelier picture of what's going on in the economy.

    EMILY POSKETT

    This is one of the areas we will be working with a number of partners on. This idea of using new technology to deliver reproducible analytical pipelines and really, this is where national statistics offices around the world, but particularly those with low resources, can really utilise new technology to save time and improve quality. And this is something that we're really excited to be working with a number of different offices on, on a number of different topics, to really save human resources and ultimately improve quality.

    MF

    Tim, bringing you in...

    TIM HARRIS

    I think this really illustrates one of the other benefits of data science. We've talked a lot about the mobile phone records, call detail records, and their use in Ghana for producing mobility statistics, talked about using satellite imagery and machine learning in Somalia. But data science, and the tools of the digital age, can do a lot more of the basic underpinning work in statistical modernisation really well, and I think we really need to focus on that and see where we can benefit from that. And the work that Professor Annim talks about, about automating the CPI, I think that is really important. For that we can use the tools of coding, lessons from software engineering, like version control, and auditing processes, to really help us to get much greater efficiencies in these key statistical processes which any statistics office undertakes.

    And we've been very pleased to work with the Ghana Statistics Service on automating their consumer price index. I think that we're seeing that it's speeded up the process, it's reduced the scope for human error, it’s enabled us to put in quality assurance checks. The process has enabled us to produce much more transparent processes, and processes that can be maintained over time, because people can understand and see what's been done rather than things being hidden in a black box. So this process of automating statistical processes is really important.
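
    As a rough illustration of what an automated step with built-in quality assurance checks can look like, here is a minimal sketch in Python. It is not the Ghana Statistical Service pipeline: the record layout, the function names and the simple weighted average of price relatives are all assumptions made for the example, and a production CPI involves many more stages.

```python
def validate(records):
    """Quality assurance: fail loudly rather than carry bad inputs forward."""
    for record in records:
        if record["base_price"] <= 0 or record["current_price"] <= 0:
            raise ValueError(f"non-positive price for {record['item']}")
        if record["weight"] < 0:
            raise ValueError(f"negative weight for {record['item']}")
    if sum(record["weight"] for record in records) <= 0:
        raise ValueError("weights must sum to a positive number")
    return records


def price_index(records):
    """Weighted average of price relatives (current / base), scaled to base = 100."""
    total_weight = sum(record["weight"] for record in records)
    weighted_relatives = sum(
        record["weight"] * record["current_price"] / record["base_price"]
        for record in records
    )
    return 100 * weighted_relatives / total_weight


def run_pipeline(records):
    """Run every step in the same order each time, so results are repeatable."""
    return round(price_index(validate(records)), 1)


# Hypothetical basket, for illustration only.
basket = [
    {"item": "bread", "weight": 2, "base_price": 8.0, "current_price": 10.0},
    {"item": "rice", "weight": 1, "base_price": 20.0, "current_price": 30.0},
    {"item": "fuel", "weight": 1, "base_price": 50.0, "current_price": 50.0},
]
index_value = run_pipeline(basket)
```

    Because each step is a plain function kept in version-controlled code, the same inputs always give the same output, and anyone can read exactly what was done.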

    I think the way we've engaged with the Ghana Statistics Service also highlights what we're trying to do in terms of building capacity for people within statistics offices to do this work for themselves. So partly we've done some of the work to help them automate. But we've also tried to build the capacity of Professor Annim and his colleagues so that they can then do this work themselves and take it forward, and not only within the consumer price index, but also seeing how they can plan more strategically about how this work can be done in other areas of statistics production.

    MILES FLETCHER

    Emily, what are the priorities for the future of this international development work of the ONS?

    EMILY POSKETT

    So our priority for the next phase of this work is to continue with the partnerships that we have and to build new partnerships. So Tim mentioned that we are working towards a new partnership with the Somalia National Bureau of Statistics. We're also considering new partnerships in Tanzania and Zimbabwe, to add to the ones we already have in Ghana, Rwanda, Kenya, Namibia and with the UNECA. And that's just in Africa. We're also looking to see what we can do to support in other regions. We have a partnership with Jordan, and a new one with the Palestinian Central Bureau of Statistics, and we're looking to do more in that region and beyond into Asia and the Pacific as well, but also looking to consolidate the kind of topics which we've worked on previously. So we've mentioned census, data science, women into leadership training, open SDG platform, and we're also looking to do more in new topic areas. So we're looking to do more in climate and environment statistics. We think this is a really important area that we're looking to do more in, in geography and geographical disaggregation of data.

    And I think we're looking to do more really on the usability of statistical outputs and dissemination of statistical outputs. I think a number of our partners do a really great job of collecting data, but there's a lot more that can be done to make use of new technology to better disseminate and improve the use of that data. So we're ambitious in the reach that we have with our small budget, but we want to make sure that we don't lose sight of sustainability, and by spreading ourselves too thinly, we could reduce the sustainability of the work that we do, and I think we're forever trying to balance off those two things.

    MF

    Professor Annim, perhaps I could give you the last word on this. How do you see the future of collaboration between the ONS and Ghana?

    PROFESSOR SAMUEL ANNIM

    We really want to push the collaboration beyond the two statistical agencies, and let me indicate that that’s started already. One of the things that we want to achieve is more utilisation of our data. I mean, we are fine with the production of it. We are technical people. We can continue to improve on it. But what I see with this partnership is to scale our relationship as two national statistical offices. Our relationship should be scaled up to the data users. So we don't want to just sit as two statistical offices, improving the production of statistics, but really getting into the realm of the utilisation of statistics, and that is where we need to bring in other government agencies, based on what ONS and GSS are nurturing.

    MF

    There you have it, statistics are important, but it's outcomes that really matter.

    That's it for another episode of ‘Statistically Speaking’, thanks once again for listening.

    You can find out more about our international development work, read case studies and view our ambitious strategy, setting out the ONS’s vision for high quality statistics to improve lives globally, on the ONS website, ONS.gov.uk, and you can subscribe to new episodes of this podcast on Spotify, Apple podcasts and all the other major platforms.

    You can also get more information, or ask us a question, by following @ONSFocus on Twitter.

    I'm Miles Fletcher and our producer at the ONS is Alisha Arthur. Until next time, goodbye.

    ENDS

  • In this episode we chat to members of the ONS Social Survey Collection Division about the importance, and challenges, of getting the general public to take part in crucial surveys that help paint a picture of what life is like across Britain.

    Transcript

    MILES FLETCHER

    Welcome again to ‘Statistically Speaking’, the official podcast of the UK’s Office for National Statistics. I’m Miles Fletcher.

    Now I don’t know about you - but it seems hardly a moment passes these days when we are not being asked to feed back. How was our service today? Are you satisfied with this product? Please fill in this short survey. Your responses matter.

    Well, forgive the natural bias, but today we’re talking about surveys that really do matter.

    ONS surveys – some of which are the very largest conducted regularly in the UK – don’t just inform economic and social policy, though they are hugely important to it. The data they gather also represent a public resource of immense and unique value.

    But persuading people – some unaware, some sceptical and even hostile, others just very busy – to take part in them is a growing challenge for statistical institutions worldwide.

    In this episode then we’ll be discussing how the ONS gathers often personal data from members of the public right up and down the country.

    Taking time out of their day to answer my questions, and to explain why it’s absolutely crucial that you participate in our surveys if you get the opportunity to do so, are Emma Pendre and Beth Ferguson, who head up the ONS’s face-to-face Field Operations;

    and sharing their own personal experiences of life on other people’s doorsteps we have two of the ONS’s top Field Interviewers, Tammy Fullelove and Benjamin Land.

    Welcome to you all.

    Emma, if I come to you first – give us an idea of what exactly the field community in ONS is. Who are you and what do you do?

    EMMA PENDRE

    The Social Survey Collection Division is the largest division in ONS. We primarily collect data from households either online, face to face or by telephone using computer assisted interviewing, and also work at air, sea and rail ports collecting data from passengers. All the data collected is used to produce quite a number of our key ONS publications which help to paint a picture of what life is like in the UK. And these can include things like estimates of employment and unemployment, how we measure inflation, how we measure migration, and a key topic of interest at the moment is the cost of living. So while most of ONS relies on the data that we collect for our outputs and statistical bulletins, the statistics that we particularly generate also support research, policy development and decision making across government and other private sector businesses as well.

    MF

    Now Beth, bringing you in here, when it comes to household surveys, presumably someone's deciding which households are going to be approached to take part. Who makes those decisions and how is it done?

    BETH FERGUSON

    So I'm not going to pretend to understand the clever people in the statistics team who work out how we get the right people to cover a broad spectrum of society. But yes, that's done by the sampling team and they choose a random sample for the surveys.

    MF

    And that's generated presumably from using the electoral roll.

    BF

    It's generated from something called PAF, which is the Postcode Address File. I'll have to confirm exactly what that stands for. Yes, but essentially, it's a list of addresses across England, Scotland and Wales.
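
    Drawing a random sample from an address list can be sketched in a few lines of Python. This only illustrates the basic idea of simple random sampling: the address list, sample size and seed below are made up, and the actual ONS sample designs are not described in this conversation.

```python
import random


def draw_sample(addresses, sample_size, seed):
    """Simple random sample without replacement; the seed makes the draw repeatable."""
    if sample_size > len(addresses):
        raise ValueError("sample_size cannot exceed the number of addresses")
    rng = random.Random(seed)
    return rng.sample(addresses, sample_size)


# Hypothetical address frame standing in for a real address list.
address_frame = [f"{number} Example Street" for number in range(1, 101)]
sample = draw_sample(address_frame, 10, seed=2023)
```

    Every address on the list has the same chance of selection, which is what allows results from the sampled households to stand in for the wider population.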

    MF

    And when it comes to the passenger survey, it's a question of stopping what we hope will be a representative random sample of people as they pass through those ports.

    BF

    Yes, it is. Yeah. But at the moment we're currently working on departures and arrivals. So yes, it's a random sample of individuals stopped and asked questions.

    MF

    But to make the data really representative and really valid, of course, we've got to be covering the whole of the country. The country in this case being Great Britain. How do we ensure that that coverage is working day in day out?

    BF

    That's our role as the kind of management of the face-to-face field interviewers. Different surveys are done over different frequencies. So we've got the Labour Force Survey and the transformed Labour Force Survey, which addresses are issued for on a weekly basis, and those surveys are delivered on a weekly basis. And then we've also got our other longer, more detailed financial surveys that we're issued with a quota for on a monthly basis. So our job is to make sure we've got the right people, in the right places, to knock on the right doors, to get hold of those members of the public and, you know, encourage them to agree to complete surveys for us.

    MF

    And luckily for us we’re joined by two of those “right people” here today. Tammy and Benjamin, welcome to our humble podcast. Now you are both at the sharp end of our survey data collection, working as field interviewers. I'm obviously really interested in what you do day to day, but first off tell us how you got into this line of work. What was the attraction for you Tammy, how did you become a field interviewer?

    TAMMY FULLELOVE

    So prior to working for the ONS - I'd never worked in the public sector before, I've always worked in the private sector - and I've actually got a finance background. But then after being on maternity leave, having a young family, seeing the job advertised, and the flexibility, working with people in a very varied job sort of pulled me to it to apply, to be honest. And that was seven years ago, and I can honestly say I enjoy every single day I'm out in the field. It's great.

    MF

    And Benjamin how about you, what was your background?

    BENJAMIN LAND

    Well, I've done a variety of hospitality jobs in the past. I then applied to work on the Census at the start of 2021. And my manager at the time, she had worked previously for the ONS on the basket of goods figures, and she recommended it as a really great place to work. It's funny how the timing happened: I saw a vacancy for a field interviewer, which I applied for. And then I started in May 2021. So almost two and a half years ago now.

    MF

    Okay, so you've both got quite a bit of experience already under your belt. I was wondering of both of you, is there such a thing as a typical day for a field interviewer?

    TF

    I can honestly say no, every day is completely different. Depending on the area where you go into, where you may be working, streets apart, houses apart. You never know, whatever door you knock on, who can be behind that door, which makes every day completely varied, especially with the studies that you may be interviewing for, that they can be very different with the content. So yeah, no two days are ever the same.

    BL

    I totally agree with Tammy. It varies. My week has a sort of flow to it. So I tend to get out quite a lot at the start of the week to visit various addresses. If it’s LFS they change every week. On the financial surveys it's monthly so you've got longer to familiarise yourself with the area. We tend to have a team meeting most Tuesday mornings just to check in and see how we're doing. And then obviously interviews are scheduled around respondents' timetables so that can be any time up to sort of eight, nine o'clock at night and sometimes Saturdays, if that's when they're available.

    MF

    Going out to people's houses on a daily basis, you no doubt encounter a wide variety of people. That must have led to one or two amusing episodes.

    TF

    I've had occasion where people will answer the door in not the most suitable attire, shall we say, for public viewing. I don't know how much further to go into this, but yeah, definitely opening the door in towels which have fallen off and dressing gowns which haven't been completely covered. It definitely happened a couple of times over the past few years.

    MF

    Perhaps that's what they mean by raw data.

    Beth, if I can come back to you, are there particular surveys which are considered to be especially important for us to be speaking to people in their homes in the way we’ve just been talking about? Ones that perhaps can’t be carried out in other ways.

    BETH FERGUSON

    It’s the more detailed financial surveys. So we've got the Family Resources Survey, the Living Costs and Food Survey, the Survey of Living Conditions, and the Household and Assets survey. They are quite long, more detailed surveys. The Living Costs and Food Survey, that requires the respondent to complete an interview, but then they also have to get hold of all their receipts of any expenditure for a two-week period and annotate them and hand those over to the interviewer. So it's quite a detailed, involved survey. The Household and Assets survey, again depending on how many people are in the household, can, you know, take up to two hours to complete, and asks lots of detailed financial questions around savings and pensions and other things. If you're in the home, you can ask them to get the documents, support them to review the documents, make sure that they're actually giving the right information which, if they were to go online and do it themselves, there's no guarantee that they would get the right detail that we're actually looking for.

    MF

    So it's quite an intensive experience really, isn't it compared with simply asking someone to tick the boxes on a webpage? And I guess it very much depends on building a personal rapport with the survey participants?

    BF

    Absolutely. And that's the key. That's the key to a really successful interviewer is that ability to build rapport in a snapshot on the doorstep. You know, before they've had the opportunity to give a polite no, no thank you or sorry, not today. They reckon it is approximately 10 seconds on the doorstep to get that engagement and build that rapport, and then maintain that through what can sometimes be quite a lengthy interview. Keep that friendliness, that rapport going so that the person being interviewed remains engaged and keen to do it.

    MF

    Now Tammy, you’ve already told us about your previous financial background. Do you find that helps you when you're collecting data on economics or topics around money?

    TAMMY FULLELOVE

    Yes, I do. Like Beth’s already mentioned, a couple of our financial studies go into people's income and expenditure. So having that sort of background I feel does help me, especially when they're speaking about what benefits they receive, what sort of things they pay out. It definitely does sort of give me the edge I do feel.

    MF

    That’s great, because it’s no secret is it Beth that the ONS, like other statistical organisations around the world, are finding it increasingly challenging to get people to take part in surveys.

    BETH FERGUSON

    Yeah, absolutely. I think it's got more and more challenging. Pre-pandemic it was getting more challenging, but the shift during and post-pandemic has been quite significant in terms of the number of people willing to do surveys for us.

    MF

    A shift in what direction?

    BF

    Fewer members of the public are willing to actually do surveys for us. Now whether that's because there's less trust in the government or actually, because of the pandemic, everybody's working from home and time is more limited. But no, it's definitely harder to get a response now.

    MF

    What techniques do we use then to try and change people's minds to get them to take part?

    BF

    At the moment we're doing a lot of work, certainly with the face-to-face field community - we're calling it a Respondent Engagement Programme. So looking out for clues and signs from, you know, when you approach the doorstep in the area, identifying the kind of things that may be key to them. Our statistics on things like CPI and RPI and, you know, the change in cost of food - that being constantly in the news gives us, kind of like, a lever to start an open conversation on the doorstep, particularly when we're looking at the financial surveys.

    EMMA PENDRE

    And also Miles. It's worth noting that all the surveys are voluntary, so the offer of incentives such as vouchers in exchange for the time taken to complete a survey will also continue to be significantly influential in maintaining our response rates.

    MF

    Absolutely Emma - offering people a small incentive has actually been proven to work hasn’t it, and I guess in cost terms, it's better to spend some money on that rather than wasting it on chasing people who are never going to take part.

    EP

    Yes, that's right. The vouchers are very significant. They do help maintain our response rates. And again, being in a cost of living crisis at the moment, our respondents see them as very helpful.

    MF

    But even with incentives, and as Beth has suggested, there’s still a reluctance by some people to be involved in our surveys. Coming to you Tammy and Benjamin, as our people on the front line every day - upon your shoulders falls the responsibility for persuading people in many cases to actually take part. Do you have a standard approach, or do you tailor what you do according to particular circumstances?

    TAMMY FULLELOVE

    We definitely have a doorstep introduction, which has to cover a few different points to make sure respondents are aware of the confidentiality of the answers that they will be providing. But I do believe having a smile as soon as they open the door is the biggest thing - you're obviously trying to get them on board and trying to get them to either go online to complete the study or to make an appointment if they can't do it there and then to do the interview. It definitely has to be tailored I think, depending on who answers the door and obviously what reasoning they would like to help complete the study. Whereas some people as soon as you knock on the door, they've had the letters, they're waiting for you. They really want to help. So yeah, it definitely does depend on who's behind that door and obviously why they would like to help the Office for National Statistics.

    MF

    We live in a suspicious age and some people might think that there's something fishy afoot.

    BENJAMIN LAND

    That's the challenge, Miles: people often initially think it's a scam. I turn up with my badge and they're like, “Oh, you are real.” And it's about taking the time to explain to people, once we've done the doorstep introduction, that it’s not a scam and it is legitimate, valuable research that we're carrying out and it certainly impacts everyone.

    MF

    I can imagine how tricky it must be to convince people sometimes, but you strike me as someone who isn’t likely to be put off by that.


    BL
    Yes, yeah, I love a challenge. There was one lady last summer and every time she was like, “Oh, I'm busy. I've just come back from holiday, can you pop round the next week?” And it got to the point where she's like, “I'm decorating my house.” I said, “Ma'am, that's fine. I'll come and help you decorate your house if you complete this survey.” And she's like, “Oh, you're so persistent.” I managed to get an interview and I was really pleased about that. So there's a little, you know, a little win in the bag.

    MF

    Well done, though I should point out that painting and decorating is not officially one of the ONS’s services for getting people to take part in surveys. Tammy have you got experiences like that?

    TF

    Yeah, I've never got into painting and decorating, I'm gonna admit that. But it is a great feeling when the first time you knock on the door people don't want to help they're too busy, especially now post-COVID, with the amount of people working at home. So like Benjamin said, you're interrupting a Teams call. You're interrupting them doing some work. So you have to get over that first hurdle. But, you know, making that appointment, and sometimes they will make the appointment but then they either won't answer the phone, or they won't be in when you turn up, which can be frustrating. But yeah, when you actually do complete that study and they do feel like, you know, they have helped and you've gone above and beyond to secure that interview, it is definitely a great feeling. So maybe I should be offering painting and decorating services, maybe that would help.

    BL

    Don’t take my tricks. No, the sense of achievement or, like Tammy says, you do get people that break appointments, you know, due to personal circumstances, and you somehow have to chase people and encourage them, but when you do secure the interview, and you get the data, there's something about when something’s hard won you value it more.

    MF

    Yes, indeed. But how many people have heard of the ONS would you say?

    BL

    A lot of people now, because we were quoted a lot during COVID, regularly on the news... I read the newspaper, I appreciate not everyone does, but a lot of the data in newspapers will state that it's been sourced from ONS.

    MF

    That recognition factor has helped you on the doorstep. Do people get that the ONS is an impartial organisation operating at arm's length, certainly from ministerial government?

    BL

    No, no, I think we’re often tarnished with the same brush as the TV licencing people that come round, especially in certain areas where I knock on doors. You know, we're met sometimes with hostility, to put it politely.

    MF

    Clearly some persuading to be done in a wider sense there as well. But is that your experience too Tammy?

    TF

    Yeah, they do believe that we are a government body and that we are influenced by a particular minister, or by the government that's in power at the time. If people are very anti-government on the doorstep it does create that hostility as the first sort of part of your introduction.

    MF

    Do you try to talk them around on that?

    TF

    Yeah, exactly like Benjamin said really, you know, we don't have a minister in control of us. We are separate to the government. Everything is private, confidential. We don't share the information. You know, there are a few different things that you have to try on the doorstep to try and get that buy-in from the respondents.

    MF

    Those of us who live and breathe statistics, of course, we wouldn't need to be persuaded to the value of taking part, but the challenge is to convince the whole population, or at least a representative sample of the whole population. It seems well removed from everyday life for a lot of people, but how many people do you think get it in terms of the value of statistics, you know, particularly economic indicators and high-level population data?

    TF

    I think it's great if you can get to an area and you know that statistics, whether it's been through some sort of government funding, have helped in the area. So you can say to someone on the doorstep, well, this is the reason why this school was built or this doctor's surgery, or this park, or some sort of local information. That really does help to sort of say why it is important for them to provide the information. But on the flip side, students who do research looking at the ONS data, which they might be using in their own work, and people who work in sort of the public sector, I think, do understand to a degree how important it is. But then, I think the vast majority, and I think Benjamin will help me with this, don't really understand why we’re collecting this or what benefit the information could have to them and where they live.

    BL

    That's right. I think a lot of people have a global sense of it, but they don't understand the impact it has on their life. I work quite a lot in Bournemouth, and there's a big student population as we've got Bournemouth University and the Arts University College, and a lot of the students do actually know or use the ONS data. I was actually at a student house yesterday in Winton and that makes my life much easier if I can link it to their own studies.

    MF

    In covering the whole of the country, of course, that means covering areas, which in statistical language are hard to reach communities - that’s the phrase that's used. And frankly, of course, that often means areas of considerable social deprivation. Emma... How do we target those areas in particular, does that require extra attention or special techniques?

    EMMA PENDRE

    So our vision is to be fully inclusive by design. So that ensures that both the data and our workforce are fully representative of the population that we serve. The pandemic actually opened up opportunities and challenged how we have historically done things in the ONS. So a specific example here is around one of our key data sources, which is the Labour Force Survey. Before the pandemic we would write to addresses randomly selected from our database of all UK households, and invite people to take part, and then knock on their doors to follow up if we didn't hear back from them. During the pandemic, when face to face interviews became impossible, we had to rely on people responding to the letter and taking part in the telephone interview. We saw pretty quickly that this was leading to bias in the responses, with particular demographics, such as the older population, being more likely to respond, whereas we were less likely to hear from people who sort of rented their properties. We knew we needed to speed up the work already underway to improve the survey. So fast forward two years and we have now transformed the Labour Force Survey, making it an online first survey which is now supported by telephone collection where needed. We've proven that to make the survey inclusive and reduce bias we also need to be knocking on doors. So households invited to complete the survey from November 2022 now might get a visit from a field interviewer who encourages them to complete the survey online or via a telephone interview, and we call this mode of field work “knock to nudge”.

    MF

    So in other words, it's not enough just to send somebody a letter inviting them to take part - that's likely to go unheeded. But a friendly face at the door and a little bit of gentle persuasion, can have a really useful effect.

    EP

    Absolutely. Right.

    MF

    And this is very important, because the ONS has committed with the Inclusive Data Task Force to make a special effort to ensure everybody is represented in official statistics, and field communities have been involved in that work.

    Tammy, you operate in an area that's quite ethnically diverse. How do you bridge barriers in communities where English perhaps is not the first language for a significant number of people?

    TAMMY FULLELOVE

    So in the North West, we do have a number of regions where it's densely populated, very different cultural diversity, I suppose obviously, London would cover the same. And we do rely on interviewers who speak second languages, who can then translate the languages of the people on the doorstep to go through the interview, or even just to help on the actual doorstep to speak to people and advise what the study is about and to make the appointments.

    MF

    And Beth... when it comes to choosing field workers I guess it's very important as well that you've got people who are not only representative in the general sense of those communities, but actually have got some understanding, some feel, for the people they're dealing with.

    BETH FERGUSON

    That's part of the skill of a field interviewer. And I guess it comes from the fact that we've got interviewers from all different backgrounds, but it also comes as they learn the role, understanding which areas you know are going to be more challenging, where you're going to have to put a bit more effort in and understanding that actually... as an interviewer you can knock on a number of doors and, you know, you know who's going to be easy, you can get interviews relatively easy from various different sections of society. And you know that's going to be easy, but you also know that if you're going into an area that's more deprived, you're going to have to put more effort in, you know, and for some interviewers it comes immediately and, for others, it's learned over a period of time, where those more challenging areas are, what's actually going to work, what's going to resonate with the people behind the door that, you know, you're going to need to get that interview from to make sure the data is representative of everyone.

    MF
    Now, Emma, let's talk a little about the future of the field community, because obviously we hear so much now about big data and the ability to discover and gather insights from that. A mountainous array of data sources that can now give us rapid, fast data, covering just about every topic you can think of. But, nevertheless, the ONS sees value in continuing to run these very large, and very personal surveys face-to-face and over the phone.

    EMMA PENDRE

    Yeah, social surveys will continue to have an essential role to play in ONS’s future, but also as part of a joined up data acquisition approach as well. I don't feel it's any longer a competition between whether we use surveys or other data sources. We have now come to realise that we actually need to work together and complement each other. So surveys are still fundamental in collecting the data that other sources cannot provide. And whilst new types of data sources are allowing us to more rapidly take stock of what's happening in our society and economy, they can't tell us everything or provide insights on things like personal opinions, attitudes, or exactly how people might be feeling at a given point in time. That will only ever be possible from talking to people.

    MF

    And on that note, can I just say that it’s been a pleasure talking to all of you today.

    [OUTRO MUSIC]

    That’s it for another episode of Statistically Speaking.

    Thanks once again for listening, and also thank you for taking part in our surveys. Without you all the incredibly valuable information we get from our surveys – which help to inform better decisions by your local council, for instance – would simply not exist.

    If you haven’t yet had the opportunity to take part, and you get a knock on the door in future from one of our field interviewers, please do answer and take the time to respond.

    And if you happen to be in the Bournemouth area of course, and need some painting and decorating doing, then Benjamin’s your man!

    You can subscribe to new episodes of this podcast on Spotify, Apple podcasts and all the other major platforms. You can also get more information by following the @ONSfocus feed on Twitter.

    Special thanks to producers Steve Milne and Julia Short.

    I’m Miles Fletcher and until next time... goodbye.

    ENDS

  • In this episode we discuss how the ONS has been working to transform the way we count the population, using new datasets to give more accurate, timely, and detailed measurements.

    On 29 June 2023, the ONS will be launching a public consultation on its proposals for a transformed population and migration statistics system. Understanding user needs will be essential evidence in making its recommendations to Government on the future of population statistics.

    More detail available at: www.ons.gov.uk

    To explain more about the public consultation, and answer your questions, the ONS is holding a series of free events in July 2023:

    National Statistician’s launch event, London, 4 July 2023. (Online attendance also available)

    National Statistician’s launch event, Cardiff, 6 July 2023. (Online attendance also available)

    Launch webinar, 13 July 2023. (Online only)

    You can also watch our transformation journey video, which is also available with British Sign Language (BSL), and in Welsh, with BSL.

    TRANSCRIPT

    MILES FLETCHER

    Welcome again to ‘Statistically Speaking’, the official podcast of the UK’s Office for National Statistics. I’m Miles Fletcher and this time we're looking at the future of our population statistics. How best to count all of the people, all of the time, and provide the most valuable information on changing characteristics that can drive excellent research and sound public policy. All of that is the subject of a major consultation exercise that's running during the summer of 2023. It's all about the Office for National Statistics proposals to create what's described as a sustainable and future proof system for producing essential statistics on the population.

    Joining me to unpack all that and explain how you can get involved in the consultation process is Jen Woolford, Director of population statistics here at the ONS. And we're joined once again by Pete Benton, Deputy National Statistician.

    Pete in a previous episode, you described how the once in a decade census has been the bedrock of our population statistics for a very long time, but now it looks like some pretty fundamental change could be on the way?

    PETE BENTON
    Well, that's the question. What does the future hold? We've been doing a census for over 200 years now, once a decade, and it paints a beautiful, rich picture of our population that's fundamental to planning all of the services that we use: health care, education, transport, they all depend on the number and type of people living in a given area. But the question is, can we get more detail from other data sources every year, and might that mean that we don't need a census in 2031 because we've got enough? That's the question that we are now talking about.

    MF
    Okay, so before we go into the detail of how we might achieve that, then paint a picture for our listeners. When we talk about population statistics, what are they exactly? And why are they so important and to whom?

    PB
    Well, in between censuses, we estimate the total population, by age and by sex, and we do it nationally and we do it for local authorities. We estimate migration, how many people have moved into the country and how many people have moved out, and also how people move around the country because that affects the population in any given area. And of course, we also do surveys that give us top level, national level statistics about all kinds of things, whether it's the labour market or our health. These are things that the census asks about and gives us detailed information on for small areas; surveys kind of paint a top level picture in between times.

    MF
    So to date, how have we gone about getting those numbers, and how good has that information been?

    PB
    So the census gives us the baseline once every 10 years. And we take that and we add births, we subtract deaths, we make an estimate of international migration. And we use that to adjust the data and we make an estimate of migration around the country, and that gives us those population estimates and those migration statistics.
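
    [Editor's illustration, not part of the recording: the roll-forward Pete describes (start from the census baseline, add births, subtract deaths, adjust for migration) can be sketched in a few lines of Python. The function name, the parameter names and every figure below are hypothetical, chosen only to show the arithmetic; the ONS's actual method is considerably more detailed.]

```python
def roll_forward(baseline, births, deaths,
                 international_in, international_out, internal_net):
    """Roll a census baseline forward by one year for a single area.

    All names here are illustrative assumptions, not ONS terminology.
    internal_net is moves into the area from elsewhere in the country
    minus moves out of it, so it can be negative.
    """
    natural_change = births - deaths
    net_international = international_in - international_out
    return baseline + natural_change + net_international + internal_net


# Hypothetical local authority with a census count of 100,000 people.
estimate = roll_forward(
    baseline=100_000,
    births=1_200,
    deaths=900,
    international_in=500,
    international_out=300,
    internal_net=-100,
)
print(estimate)
```

    [Repeating this step year after year compounds any error in the migration figures, which is why the estimates need to be reset against a fresh benchmark from time to time.]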

    MF
    So to do that you need, or you’d have had to have drawn on something like the census, that universal survey of the whole population.

    PB
    That's right. The census is the benchmark by which we reset the system once a decade. But of course, after nine years, that information is getting more out of date and we do a census again, 10 years on to reset those statistics. And again, give us that rich picture. The question we're looking at now is how much can we get in between times? And how much do we then still need all the detail that a census would give us once a decade?

    MF
    So Jen, the world has moved on in those decades since the census in its present form has been going. You would think there's an opportunity out there to transform how we go about counting the nation. Give us the background to that.

    JEN WOOLFORD
    So we've been looking over decades to bring more and more data together to improve our population statistics. So Pete talked about how we look at the movement of people between censuses both in and out of the country and between different areas. And for some time now, we've been using what we call administrative data to understand those movements in the population. But now we have access to lots more data than we have in the past, and it gives us lots of opportunities to change how we're producing population statistics. So back in 2014, government first set out its ambition for us to build a population and migration statistics system with administrative data at its heart. In 2018, we published a white paper, which set out our plans for a digital first census in 2021. But also that we should be making a recommendation to government about what the future of population statistics looks like, and that that recommendation should be based on a public consultation. And that's the consultation that we are going to be launching at the end of June.

    MF
    The challenge, therefore, is to come up with something at least as good, if not preferably better, but without using a census.

    JW
    Absolutely. And people's needs are changing. So whatever we do has to respond to whatever the user needs are of the day. So in the past, where maybe populations didn't change so much at a local level so quickly, then having a census once a decade that gave you that detail, that detail would still be quite relevant 10 years later. But the population is changing so rapidly now that that decade old data can quite quickly become out of date. And an example of where this could be a problem for us and for policymakers is if we look at the COVID pandemic. During the pandemic, we saw really localised outbreaks of COVID infections, and we really wanted to understand what was going on in those areas and what the characteristics of people in those areas were to try and understand what might be leading to those outbreaks. But we didn't have census data, the 2021 census data, then; we were having to go back to what those areas looked like in 2011. So by transforming what we do, and having more up to date information about those local populations, it would have given us a much better idea of what might have been driving those local outbreaks.

    MF
    And there was another example perhaps during the pandemic when the government was trying to work out what proportion of the population had been vaccinated at local level, relying on population statistics that, because they were backed up by the census, were subject to quite significant margins of error.

    JW
    That's right. So if you want to know what proportion of people in an area have been vaccinated, you need to know how many people are in that area in the first place. And if you're looking at a vaccination rate that's really high, say kind of 90%, that 10% is what's important, the 10% that aren't vaccinated. Now, you might only have a 5% error in your population estimates. But that could mean that you're thinking you've got 15% of the population to look at rather than the 10%.
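
    [Editor's illustration, not part of the recording: Jen's point about a small error in the denominator can be worked through with made-up numbers. The function name and all figures are hypothetical. The sketch assumes the count of vaccinated people is known exactly and only the population estimate is off.]

```python
def unvaccinated(vaccinated, population):
    """Return (count, share) of people not vaccinated.

    Hypothetical helper for illustration only: 'population' is whatever
    figure is used as the denominator, whether true or estimated.
    """
    count = population - vaccinated
    return count, count / population


vaccinated = 900            # assumed to be recorded exactly
true_population = 1_000     # what the area really contains
estimated_population = 1_050  # the same area, over-estimated by 5%

true_count, true_share = unvaccinated(vaccinated, true_population)
est_count, est_share = unvaccinated(vaccinated, estimated_population)

print(f"true:      {true_count} people, {true_share:.1%}")
print(f"estimated: {est_count} people, {est_share:.1%}")
```

    [With these figures, a 5% over-estimate of the population turns 100 unvaccinated people into an apparent 150, so the group of interest looks half as large again. That is 15 per 100 of the true population, or about 14% when expressed as a share of the inflated estimate.]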

    MF
    Pete, we've heard this term admin data (administrative data) already. And in that we're talking about all the information that gets collected whenever someone engages with public services, tax bills, benefits, going to the dentist, that kind of thing. Now, presumably that information has been collected for quite some time. So why is it only in the last few years that we're really starting to see and begin to use the potential of that data?

    PETE BENTON
    It takes time to develop the methods for doing it. So we've put a lot of effort into understanding the data sources and understanding the quality of the statistics that result so that we can be clear what we can and can't do, and that we can show that to the people that use the data to make decisions in order to understand the quality of what they're getting and give us their views of that.

    MF
    Can you think of some examples of administrative data already being used effectively in official statistics, the sort of things that the ONS produces?

    PB
    Well we've always used them actually, when we produce our population statistics. We estimate the local population using the number of people registered with a GP and how that changes over time. So it’s not new, it's just that we're expanding what we might be able to do here to try and get so much greater benefit every year, to improve decision making every year for all of our public service planning.

    MF
    And the opportunity, as Jen has already suggested, to link that data to understand how different groups, down to really quite small groups and local level and by different characteristics, are being affected by certain issues.

    PB
    That's right. Different datasets tell us different things. So there are datasets that tell us about educational achievement and there are datasets that tell us about household income, for example. And by bringing those together, we can understand the implications of education for outcomes such as household earnings, so we can really start to tie together the kind of public services that we get and the outcomes that we get as households.

    MF
    Now the possibility of all this, of course, of being able to bring all this data into one place, is a very exciting one from an analytical point of view, but from the point of view of the public and individual citizens at the same time, you could see why some people might be concerned about this, both from an ethical and a security point of view.

    PB
    Well, when you think about it, this is nothing new for ONS. We've been doing a census for over 200 years and we keep those data safe we always have done, and we also do surveys every year of households on very sensitive topics. Some of them are people's experiences of crime or their health for example, and we do surveys of businesses to understand the economy and produce our statistics about GDP and inflation. Those data are all sensitive, and we keep them all very securely. So in one sense, there's nothing new here. We are good at this. We know how to keep data secure. It's all anonymized. So there is never anything published that identifies an individual and even within ONS, the analysts only get to see anonymous data.

    MF
    And very important to state, is it not, that it's not a question of building up pictures of individuals. It's always from a statistical point of view. It's the numbers we're interested in and not the people.

    PB
    Absolutely! We don't care about Peter Benton or Miles Fletcher, we care about the picture it paints of the nation. It's the statistics that come from it. And we are absolutely strict about confidentiality.

    MF
    Jen, other countries of course are wrestling with this as well and adopting and trying new kinds of systems. What's been the experience internationally?

    JEN WOOLFORD
    So you're right, lots of countries are looking at new and innovative ways to create the population statistics bringing lots of different sources together. We all operate in slightly different contexts. So in Scandinavia, for example, they've been producing population statistics like this for a long time. But those are countries that have population registers, which means their context is very different from ours. And to be absolutely clear here, we're not looking at building a Population Register. We're looking at creating statistics from bringing together different data sources. And there are a number of countries who are in the same position as us. So for example, Australia and New Zealand, and they are looking to try and develop similar systems for producing population statistics as we are and we're working very closely with those countries to share our learning and to share the methods as we're developing them so that we're all learning from each other.

    MF
    So talking about the potential of these new data sources, including all the administrative data, can you give us some examples of what we're not doing that we might be able to do much better in future?

    JW
    There are a number of advantages and improvements we can make for greater use of data. Firstly, in the existing system, we use the census to benchmark our population estimates. So in between censuses, we estimate population change with births and deaths and migration, but we tend to get a bit of a drift in those population estimates. So we use the census then to benchmark it and bring those estimates back in line. With this new system, we're looking at not just estimating the change but also estimating the number of people at a point in time, so that hopefully will reduce that drift that we get in population estimates and mean that over the 10 year period, our estimates are more accurate. The other thing that can happen between censuses is you can get quite a lot of change in local areas and the data we have doesn't reflect that change, because it's based on the previous census. So an example here could be that the conflict in Ukraine has led to a number of Ukrainian refugees moving to England and Wales since we conducted the census. So in some areas, the makeup of the population there will have changed significantly since we conducted the census. And in our existing system, we wouldn't be able to pick that up. With our new system, we'd be able to pick up that localised population change much more quickly than we can at the moment.

    MF
    And presumably that would be of enormous benefit for local authorities, where everyone would be trying to provide services down to local level, because you've got a much more up to date picture of how many people are there, and we saw recently when the census results were published, some local authority areas have experienced big changes in population.

    JW
    Absolutely. The other thing to be aware of with the census is that it was conducted during the pandemic and it was conducted during a period of lockdown. What we saw was that people moved out of some of the metropolitan areas during that period of lockdown, whether that's back to parental homes for students or for young members of the workforce. So the populations in those metropolitan areas will have changed quite rapidly as the country opens back up and as people move back into those metropolitan centres. The approach that we're taking now should be able to pick up that change much more quickly, not just the numbers of people, but also the characteristics of people who are moving within the UK.

    MF
    And how does this benefit individual citizens? What's this going to mean for the public generally?

    JW
    So better data means better decisions. It means that better planning can be made for things like school places, better planning for public transport, where to put hospitals, where to put sports centres. All of these decisions are based on our data about the population and by having better data, you'll have better decisions.

    MF
    And you’ll be able to target services and be able to target spending as well on a much more short term basis, rather than having to make decisions coming along into the future when circumstances could be changing.

    JW
    Absolutely. Or the decisions might still be long term, but you'll be able to monitor the impact that those decisions are having much more closely than you can at the moment.

    MF
    So Jen, is there anything we won't be able to get from such a system? We've heard some people suggest, for example, that we wouldn't be able to get that very small level data, the street level data that's so useful from a census, and survey purists point to the census as a great way of capturing social history.

    JW
    We're always faced with trade offs when we make decisions about things like our methods, or anything in life, and there are likely to be trade offs here. What we've done to date is we've done lots of research that shows that there's bags of potential here with what we can do with administrative data and the understanding of the population we can get from administrative data. There are still outstanding questions for us. So there are some characteristics, for example, people who provide unpaid care, that isn't available from administrative data and we still need to work out how we will provide that level of data. The census gives such a wealth of information about things like ethnicity where we get down to really granular classifications of ethnicity, it may not be possible to do that with administrative data. However, on the flip side, we can produce statistics that we didn't get from the census using administrative data. So on the 2021 census, we didn't collect information about income. But we've published research that shows that we can get down to small area estimates of household income using a combination of administrative data. We've also published research which shows that we can produce the kind of variables that we do get from the census. So we've published research on ethnic group and also on housing stock, types of housing, and we've also managed to get to linking different admin data together so that we can look at income by ethnic group, and housing type by ethnic group. So producing what we call multivariate statistics through linked administrative data. We still have a programme of research to really understand how far we can replicate what we get out of the census. But the consultation that we're about to launch is really about understanding whether what we can demonstrate and deliver with administrative data answers user needs. 
    And if it doesn't answer some of our user needs, what are those needs, so that we can then plan our future research to make sure we're focused on the right things.

    MF
    And of course, it's genealogists - people who love to trace family trees - who find the census data so valuable.

    JW
    Absolutely. And in the existing system census data is archived for 100 years and then made available to genealogists and others to really explore their family history. In the new system we have a wealth of data that we could be using to understand the population and we need to work with genealogists to understand exactly what it is that would be useful for us to archive for future posterity. So although that's not the focus of the consultation, genealogists are very welcome to respond to the consultation and let us know more about their needs, or we'll have future conversations to make sure that we're clear on what the need is here and how we can best answer it.

    MF
    And that's what the consultation is, to a large extent, all about.

    JW
    Absolutely.

    MF
    And it's important to understand that these proposals haven't just been whipped out of thin air; a considerable amount of work has already gone into getting us to this point in time, hasn't it? Can you talk through some of the research that's already happened and some of the evidence that has been provided to suggest that a new and transformed system might well be the way forward?

    JW
    Yes, this has been a long programme of work where we have focused on two different types of research. One is around improving our estimates of the population and being able to get to small area population estimates more frequently than we can at the moment. And the other is around the characteristics of the population. So what can we say about ethnicity or employment down to local areas. On the first of those, we've done a lot of work talking to local authorities about the estimates that we've produced and their understanding of our outputs and whether they match with what they see on the ground. We have compared what we get through administrative data to the figures that we got from the 2021 census. So lots of work comparing the outputs and talking to our users about how credible those outputs are. We're also looking at how can we improve our estimates of migration, in particular international migration, and we've been working very closely with the Home Office and the data that they hold to understand more about the flows of people in and out of the country and the reasons for those flows. So people coming as international students, people coming to work, people coming along humanitarian routes, and we've built already lots of improvements into our migration statistics using administrative data and we've got lots of plans going forward for even more improvements that we can deliver there. We also have an expert panel, the methodological assurance review panel, who quality assure our methods. So these are people who are real experts in statistics and methodology, who have looked at the detail of the methods that we're using to produce those outputs and check that those are sensible and the best methods that we could be using.

    MF
    So to sum up then Jen, how far ultimately could this new system take us?

    JW
    Well, the sky's the limit, really. As more and more data become available, there's more and more we can do. As our methods improve, as our computing power improves, there's more and more we can do to really understand the population, its characteristics, how it moves around. So this is going to be an ongoing programme of work for years to come.

    MF
    So Pete, tell us then about the specifics of the consultation. Who is it for and what do we hope to get out of it?

    PETE BENTON
    Well, it's for anybody who would like to respond. We, in particular, want to hear from people who use the statistics to get their view on the balance between all that detail that the census gives us once a decade compared with the frequency of having more information every year, and we want to understand people's perspectives on those trade offs, but anybody is welcome to respond to it. And of course, this is just the continuation of a conversation that we've been having for years. We're continually talking to the big stakeholders, the big users of our statistics across government, in local government, in the commercial sector, to understand their needs for statistics. So this is a culmination of a conversation that's been going on for years.

    MF
    Okay, so when does the consultation start? And how exactly do people go about taking part?

    PB
    Well, it'll be an online consultation. It'll start in June and it will end in October.

    MF
    Okay. So the consultation completes in the autumn. Big question - what happens then?

    PB
    So we will take a good look at all those responses, we will understand what people have told us, and then 12 weeks later we will put out our response to that consultation summarising what we've heard. Following this, the National Statistician will make recommendations to government based on all of ONS’ research and the findings of the consultation to put administrative data at the core of a transformed population and social statistics system, and that recommendation will also consider the future of the census arrangements.

    MF
    So there you have it, a one in a million opportunity – or more pedantically, one in 59.6 million, given that’s the accurate population of England and Wales according to the last census - to share your views on an incredibly important piece of work.

    Consultation opens on June 29th and runs through to the end of October. If you'd like to find out more about it and all of our transformation plans for population and migration statistics, you can do so by visiting the ONS website: www.ons.gov.uk. Or you can attend one of the free in-person and online consultation events that the ONS has organised in July, details for which you can find on this episode's podcast page, as well as online through our social media channels and the ONS website.

    Thanks to Jen Woolford and Pete Benton for taking us through everything today. And thanks as always to you for listening.

    You can subscribe to new episodes of this podcast on Spotify, Apple podcasts and all the other major platforms. You can also get more information by following the @ONSfocus feed on Twitter.

    I’m Miles Fletcher and from myself, and our producer Steve Milne, thanks for listening.

    ENDS

  • In this episode of Statistically Speaking we shine the spotlight on local data and look at how good statistics for small areas make for better targeted policy interventions, and more effective use of valuable public resources.

    Transcript

    MILES FLETCHER

    Welcome again to Statistically Speaking, the Office for National Statistics podcast. I'm Miles Fletcher and in this episode we're talking about local data for local people - How good statistics for small areas make for better targeted policy interventions, and more effective use of valuable public resources.

    We're going to explore, for example, how new data sources are helping to precisely calibrate economic circumstances in local communities. How we may even be able to calculate the GDP of your street or village. Now many economic forces are of course global, but some of the solutions to issues like competitiveness, productivity and inequality might begin on our doorsteps.

    As ever, we have the cream of ONS expertise here on hand, this time in the shape of Emma Hickman, Deputy Director of the ONS sub national stats division, and Libby Richards, Deputy Director for UK wide coherence and head of an important new initiative called ONS Local, which we'll be hearing about in full. Also joining us is Stephen Jones, Director of Core Cities UK. Its aim is to promote the role of our great cities in creating a stronger fairer economy and society.

    So Emma, to set the scene for us first then please explain precisely if you would, the value of really good local stats.

    EMMA HICKMAN
    So the needs are multiple, really. I think the most important thing is that we are seeing a huge increase in locally targeted policymaking, and that’s at a range of different levels across government. So in central government, we see the Department for Levelling Up, Housing and Communities really wanting to think about how they target policies that are going to help to level up the country, but equally what we're also seeing is an increase in devolution, which is giving more power to local areas and local policymakers. And so it's really also important that they have the statistics and the data that they need and the evidence that they need to make really, really good decisions for their local areas. And they can do that in a really powerful way because they also have knowledge of their local areas. And then finally, you know, for citizen uses of our data and statistics, it's really one of the inclusive data principles that people are able to see themselves in the data and that they feel that the data and the statistics that we're producing as an office represent them. And so having statistics and data available at geographies that are very meaningful to people is hugely helpful in making sure that as a country, right across the UK, we are reflective of the experiences of a really wide range of people and, you know, local economies and end users, and understand how they're experiencing that as well.

    MF
    I guess one of the fundamental principles here is that it's local knowledge. It's all very well, and everybody thinks they know their local area, but to understand all local areas, we need comparable statistics and data produced to consistent standards.

    EH
    Yes, absolutely. And that's, I mean, that's one of the key challenges I think we'll probably come to talk about a little bit later, but you know, absolutely. And that's really about understanding, you know, where are the inequalities within regions, as well as between regions? I think we have a lot of information available about regions, but actually, we also know that some of the inequalities that people really feel are much greater within regions than between them, and being able to draw that out of data and statistics in a comparable way I think is really important for helping policymakers and decision makers to understand where best to target resources.

    MF
    Stephen, from a policy perspective, describe the demand for local data at the moment, what sorts of policy solutions are policy makers coming up with and how are those best informed by really good data?

    STEPHEN JONES
    I think it covers all branches really of policymaking. I think as Emma was saying, the need for really understanding and having a kind of quantitative basis for what's happening in a place is actually absolutely crucial for designing policy, whether that's policy about trying to make the economy grow, whether that's policies aimed at trying to reduce disadvantage and challenge facing individuals, whether that's policy about delivering the most effective and efficient public services in the right places at the right times. All of those things, whether that's done in public or private sector, need to be built on a good evidence base, good understanding. I think the other thing I would add is that the richness of local data means you can contextualise and understand. You know, a number on its own doesn't mean a huge amount, but if you know that you are 10% higher or 20% lower than your neighbouring place, or a city of the same size, it's those kinds of contextual dimensions that really help nuance and finesse your policymaking.

    MF
    And it does come back to that question of trust in data then, to make those comparisons in a really reliable and meaningful way. Which I guess is where the ONS, the Office for National Statistics, where we come in. Now Libby, tell us about ONS Local. This is an initiative which is all about making sure that that really high quality data is available for the policy makers.

    LIBBY RICHARDS
    ONS Local is our advisory service that is staffed by ONS analysts who are based in every nation of the UK and every region of England. And the idea is that we are here to help local policy makers, regional observatories, and lots and lots of different users of sub national data to really understand the enormous offer from ONS in terms of local data. Having said that, it's also very much about those working relationships as well. Stephen's talked a lot about context and understanding the nuances, and so understanding the situations and challenges that are happening locally is absolutely key to ONS Local helping local areas understand that context better.

    MF
    The big ONS surveys, of course (I typically think about the Labour Force Survey), have over a very long period of time carried a great wealth of local data that obviously gets lost in the national headlines that these data releases generate. But is it a question of getting better value out of what the ONS is already creating, or actually about sourcing new data from different sources?

    LR
    It's a bit of both, very much. Being able to take people through what we already have when understanding their questions, particularly when multiple local areas are asking the same question, that's really maximising what ONS already do. However, Emma's side of the house in particular, less so the regionally and nationally distributed ONS Local, is really about developing those new statistics, getting into how we get down to hyper-localised, sort of 400 to 1200 household, building block data that then allows people to build those areas that mean something to them. Emma, I don't know if you want to chip in?

    EH
    Yeah, very happy to. There's two strands I think to that, Miles. I think there's one which is about, you know, how do we make the most of survey data and new administrative data sources together to enable that level of granularity? And then the second part is, actually, when we talk about administrative data, that might not really mean things to lots of people. That's data that is collected for a different purpose, but collected on a very, very routine basis. And there are actually a fair number of new sources of that kind of data that we're able to get into the ONS.

    MF
    That's interesting. Can you give us an example of that?

    EH
    So, I say relatively new. I mean, I think ONS have had this data for quite some time now. But in order to get the level of granularity that we need on Gross Value Added statistics, for example, which is a measure of productivity, we use HMRC’s VAT data for businesses and then we can link that to our survey data and think about how we can then apportion estimates down to the level of geography that we need, knowing that the survey is the place where we've been able to ask the question that we really want to know the answer to, and then we can use the other data to model some of the other granularity that we need. The other thing is we've been really successful in using card payments data throughout the pandemic to inform the government's response. And we've recently successfully acquired a really exciting new data source from Visa. It's aggregated, so there's absolutely no way of identifying people in the data, but they've aggregated it at a really granular level of geography for us. So again, it would be in the region of probably hundreds of households, but actually that's granular enough for us to get some really, really good insights into how, you know, consumer spending is playing out in the local economy. And there are all sorts of applications for that, that we're really excited to be able to start taking forwards now that we've got that data in the office.

    MF
    So just with those three very important data sources, suddenly we're creating right down to that very micro level, as you say, 400 to 1200 households really quite a full picture of local economic activity.

    EH
    And the really exciting thing about that is that people can then build their own geographies as well from that. So you know, traditionally in statistics, we tend to produce data at the level of an authoritative boundary like a local authority, but actually you might really want to know about, I don't know, West Midlands Metro, for example. They extended the line a few years ago, and you might really want to know about local economic activity around that, and actually, that's not going to be captured in the sort of administrative boundaries. And so having the data at that level of granularity really allows people to build a geography that is the sort of area of interest or importance to them in some way.
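
    Building a custom geography from building blocks, as described here, amounts to picking the blocks that make up the area of interest and summing their values. The sketch below is purely illustrative: the block codes, the values and the function name are all made up, and real small-area data would need further care (for example, disclosure control).

```python
def build_area_total(block_values, chosen_blocks):
    """Sum a statistic over a user-defined set of building blocks.

    block_values maps a (hypothetical) block code to its value;
    chosen_blocks is the set of block codes that make up the custom area.
    """
    missing = sorted(code for code in chosen_blocks if code not in block_values)
    if missing:
        raise KeyError(f"no data for blocks: {missing}")
    return sum(block_values[code] for code in chosen_blocks)


# Hypothetical block-level values (arbitrary units of economic activity).
block_values = {"B001": 120, "B002": 80, "B003": 45, "B004": 200}

# A custom area, say the blocks alongside a new tram line, ignoring official boundaries.
tram_corridor_total = build_area_total(block_values, {"B001", "B003"})
print(tram_corridor_total)
```

    Because the blocks are small, the same data can be regrouped into any area a user cares about without waiting for a new official boundary.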

    MF
    Creating a GDP of your street or village.

    EH
    Indeed.

    MF
    Okay, that's the project for now, but it comes with some pretty significant challenges. It comes back to this problem of comparability, doesn't it, particularly if you're looking across the UK context there. We've got different government structures, we've got some devolved areas, we've got areas and we've got big metropolitan authorities as well. How difficult is it to be able to standardise and to make uniform the data right across that rather complex government picture?

    EH
    Incredibly so. To the point where we don't necessarily aim for uniformity. It's very much about how we make sure that we're able to tell stories that are coherent and consider that UK wide angle when thinking about the nations, but also thinking about how you enable that comparability. That's very tricky. And the more and more devolution happens, the more and more difficult that actually can become, particularly when you're looking, for example, at health data, where it is a devolved policy area across the four nations. But actually, if you live on the border, let's say between Wales and England, you may well be getting your health care on the opposite side of the border from which you live, and therefore you've got to be able to have an opportunity to consider that.

    MF
    There's the issue then of course of samples as well. And the more local you go, of course the less representative your sample is going to be.

    EH
    Absolutely. And that gets particularly tricky. Even at a nation level where we're thinking about Scotland, Wales or Northern Ireland, for example with the Opinions and Lifestyle Survey, it's actually quite difficult to find out what that looks like for Northern Ireland. And ideally, we'd want to be able to get more granular than the nation level, but sample sizes make that really tricky to still be representative. And so either we'd need to expand the survey to get that level of granularity, or we have to actually say the best we can do is this.

    MF
    Yes, because there is only one wholly universal survey, of course, and that is the census, and that only happens once every 10 years. I recall when we were running the big COVID infection survey at the height of the pandemic, even with a massive data gathering operation like that, we could still only end up getting it down to sub regional level, which is, what, units of half a million people. So it does show, doesn't it, how important it is to make the most of that admin data, which can be extremely comprehensive sometimes.

    EH
    I, you know, completely agree with you there, Miles, on administrative data and how important it is to be able to think about innovative ways to combine that data with our survey data to get a more granular level of information. I talked a bit earlier about estimates of gross value added, and I can say that's a measure of productivity and it feeds into the largest component of GDP in local areas. What we were able to do there, as I mentioned earlier: we took HMRC’s VAT tax data, which is collected for all businesses that pay VAT, and we were able to link that to a data set that ONS hold called the interdepartmental business register. The information that's held on that is all of the information about business structure, so it has a VAT reference in there so we can link it to HMRC data. But the most important information on there for us was actually where the local units are. So for example, Tescos will have a headquarters somewhere, but you probably have a Tesco Express quite close to where you live, and that's one of the local units. So it tells us where the local units are and their postcodes, and it also tells us how many employees work in those local units. And so we can make an assumption like productivity for all employees in the organisation is the same, and then we can look at what the productivity for that firm is at top level and divide that by the number of employees to say, well, actually, if all employees are equally productive, this local unit has a productivity measure of this much, and then we can aggregate that back up again to the sort of area. So you know, it's really key to be able to understand those methods, but there are some other challenges as well, and I can probably come back to those.
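
    The apportionment just described (take the firm-level figure, assume every employee is equally productive, share it out by local-unit headcount, then add up by area) can be sketched as below. This is a simplified illustration under that stated assumption, not the ONS's production method; the function name, area labels and numbers are all hypothetical.

```python
from collections import defaultdict


def apportion_to_areas(firm_value, local_units):
    """Split a firm-level total across areas in proportion to employment.

    local_units is a list of (area, employees) pairs for one firm.
    Assumes, as in the description above, that all employees are equally productive.
    """
    total_employees = sum(employees for _, employees in local_units)
    if total_employees == 0:
        raise ValueError("firm has no recorded employees")
    per_employee = firm_value / total_employees
    area_totals = defaultdict(float)
    for area, employees in local_units:
        # Each local unit gets its headcount share; units in the same area are summed.
        area_totals[area] += per_employee * employees
    return dict(area_totals)


# Hypothetical firm with a total value of 1,000,000 and three local units.
units = [("area_A", 50), ("area_B", 30), ("area_A", 20)]
result = apportion_to_areas(1_000_000, units)
print(result)
```

    Repeating this for every firm and summing the area totals gives a local figure, with the equal-productivity assumption doing most of the work.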

    MF
    That's fascinating stuff. I mean, you could point to a certain, perhaps a certain enterprise, a certain employer, that is considered to be, you know, fundamental to a local economy. But this way, you can actually really precisely quantify what that importance is.

    EH
    And I think that's one of the challenges, because actually as an office, we don't want to be disclosing the productivity of any single firm or any single business, because that is personal information. So one of the things that we've had to do in very local areas where there are what we call dominant businesses or dominant organisations, who have like most of the productivity for that area, is we've actually, you know, I'm gonna be honest, we've sort of masked it a bit. And so we've averaged a few local areas together so that you still have a building block level of data, you still have a building block so you can build a bigger area, but you don't actually have any businesses that are considered dominant within the statistics that we produce. That's taken quite a complex algorithm to be able to achieve. I won't go into too many details, just to say that it is a consideration and a challenge where we've had to really innovate to be able to publish that information.

    MF
    It's important to stress, isn't it, that all the usual principles of non-identification and confidentiality apply in this work as much as they do anywhere else across the ONS?

    EH
    Yeah, absolutely.

    MF
    Give me a couple of examples of some specific bits of work that you've been doing then. There's been an analysis of towns and out of town locations particularly and how local employment growth is happening outside of town and city centres.

    EH
    My team over the last couple of years have been doing a whole series of analysis of towns in particular. Like I say, that's a geography that people can really relate to, you know, lots of people live in a town or a city. And that's something that's a bit more understandable than maybe a local authority and is a bit closer to them than the region, for example. Our recent analysis on towns and out of town locations, when we looked at employment growth, I think has some quite important findings actually for transport planning. For example, what we found is that actually employment growth is not happening the most in town centres, it's happening more and faster within two kilometres of the edges of a town, of the town boundaries. And so what we think might be happening is that employment growth is actually happening in industrial parks that are situated on that cusp between town and rural areas. And when you're thinking about, you know, how people might travel to work, for example, I think it's really, really important to have those insights so that we're not just planning transport routes, for example, that go into town centres.

    MF
    And what other insights have we been generating?

    EH
    So another recent piece was a new piece of analysis on the nighttime economy. So I think lots of people will think about the nighttime economy as being predominantly about bars and restaurants and obviously, you know, there will have been a really, really big impact on those sorts of industries during the pandemic. But in fact, what we find is that actually the nighttime economy in rural areas is surprisingly busy, and that's because we also have a nighttime economy that is around health and health care. Nurses, for example, working night shifts and that sort of thing. And then the other aspect to it is warehousing and transport as well. There's often an overnight element to that, too. And again, having that understanding of how that plays out in different parts of the country is really, really useful. We originally did it just for London, interestingly, and then we've done this new analysis looking at the whole country, which was really interesting. Other things produced quite recently as well are an expansion of job quality indicators of work across the UK, which is important because if you just look at employment numbers, you get a sense of who's employed and who's unemployed in terms of characteristics of people, but what you don't get is how good the job quality is for those people. And actually, job quality is probably quite important for a lot of individuals in terms of how good they feel about going into work and how productive they are, and all of those kinds of things.

    MF
    That also informs the understanding, doesn't it, of why some people have opted out of employment in recent years.

    EH
    Absolutely. And it also can tell us about things like how many people are working part time who want to be working full time for example. Or vice versa, you know, so there's kind of like a measure of underemployment in there. It tells us a little bit about what percentage of people are working on zero hours contracts versus permanent contracts, all those kinds of things, I think are quite, you know, sort of quite important.

    MF
    Some other developments well worth pulling out as well. I think we've been able to produce a very interesting picture of comparative housing affordability down to quite local level.

    EH
    Yes, I think our main housing affordability release goes down to local authority level, but we have produced actually a range of housing affordability statistics. The local authority one that we published recently has probably been the most comprehensive. We're also doing a lot of work on the housing data that's collected through the census to understand dwellings and their characteristics as well. You know, how many dwellings are occupied versus unoccupied, and how that varies by different parts of the country. Housing affordability in particular tells us about how people's earnings relate to what they spend on housing, and obviously that has a huge impact on, again, you know, people's disposable income at the end of the day. So I think it's certainly an important one.

    MF
    So lots of fresh insights are coming from the ONS and local statistics, but it's important to point out that a lot of this you could be doing for yourself if you're so inclined, and we've brought forward a tool which is much more exciting than the name implies, actually. It's called the Sub National Indicators Explorer. Libby, can you explain how that operates, and some of the really interesting insights that you can generate with it?

    LR
    So the Sub National Indicators Explorer is something that we know, and have known for a while, that users desperately want. So often, if you are trying to understand a particular place, you have to go to lots of different sources to actually find information about one area. So for example, if you want health you have to go to one place. If you want to find out about education, you have to go to another and find your area and then collate that yourself. What the Sub National Indicators Explorer allows you to do is bring together all of those relevant indicators into one place, so you can find your local authority and compare it with, say, up to three others across more than 40 different metrics, ranging from gross median pay right the way through to healthy life expectancy. And so you have this incredibly useful tool where you go, I want to know everything about place x, and you get it all in one place. Our intention is to develop that a little bit further and eventually head into some of the developments that have come out recently around the census, where you can build your own maps, build your own areas and flexibly bring different data together. Alongside that, we've also been thinking about how else we might be able to compare areas, and the team have recently done an analysis that clusters local areas together under metrics similar to, and including some of the same from, the Sub National Indicators Explorer. So that explores places that are statistically similar using things like regional growth metrics, and we can see what different parts of the country could potentially learn from each other. They might be facing similar challenges, and therefore benefit from getting beyond their local area to join up with other areas across the country, and this also gives some really weird, potentially interesting insights.

    MF
    Yes, which shows that despite the north-south divide, about which we continue to hear a great deal, some places in the North and South have a great deal in common with each other.

    LR
    Indeed, and actually places in the south, for example, may be very different from each other. So Portsmouth, down on the south coast, can look a lot more like places in the Northeast than possibly other areas on the south coast. Portsmouth is in a cluster of higher connectivity but lower health and wellbeing, whereas neighbouring Havant is in a much higher health and wellbeing and moderate educational performance cluster, and you can see this all over the place. So for example, Newcastle upon Tyne is actually very similar to the New Forest and Havant and, in fact, so are York and Great Yarmouth. And so they're actually disparate across the country, but mostly situated in particular areas. However, if Havant or the New Forest is facing a particular problem, maybe going and having a chat with York might actually be quite helpful, depending on the problem.

    MF
    That seems an excellent moment to bring in Stephen Jones, Director of Core Cities. Stephen, the local picture, of course, is much more complex than that old cliche about the north-south divide. But what work are you doing with the ONS, and with others, to produce a really informed picture which policymakers can then act on to deal with these issues of localised deprivation, economic disadvantage and so forth?

    SJ
    Firstly, we're doing a piece of work as Core Cities with the Royal Society of Arts called the Urban Futures Commission, looking at the long-term potential and trajectory of our biggest cities in the UK. Within that, you know, the question of why UK cities relatively underperform compared to their international peers in the developed world is quite a well-established problem that's decades old. What some of the new data available is allowing us to really get a better handle on is why that is the case and what is happening. For example, a fairly recent new release of fixed capital formation data, so investment data, at a local authority level, split by the different asset classes, that the ONS have produced is really helpful in bringing an understanding and a kind of richness to both public and private investment. We can see that our big cities outside of London have relatively lower levels of public and private investment, particularly if you then strip out real estate investment. So investment in capital and business intangibles, those things are particularly low. So not all of our core cities, the total investment in Greater Manchester most recently was about 9,000 pounds per head; in central London, it's 55,000 pounds per head. If you go down to Newcastle, I think it's down to 3,000 pounds per head. You know, that's a dramatic difference in levels of public and private investment.

    MF
    Does having much more reliable local data perhaps hold with it the promise that the policy interventions that result from it can therefore be much more effective?

    SJ
    So completely. You know, one of the things that I'm quite excited about in terms of using the local GVA data that Emma was talking about as a new release is that there's been a whole host of different policy interventions over the last 10, 20, 30 years trying to create economic activity within zones and areas, and what was being said about the ability to build your own geographies, I think, has real potential in it. So whether it's the enterprise zones of the Heseltine era or the enterprise zones of the George Osborne era, whether it's freeports policy more recently, whether it's transport-led regeneration schemes around new road junctions or new rail stations, whether it's the role of universities, science parks, investment in innovation zones, or, as the government recently announced in the budget just a few weeks ago, the question of investment zones, all of these policies (they are some of the national ones; there are many more when you think locally) are attempting to create concentrated economic activity within certain locations. One of the main criticisms in a policy sense is that that activity will just get displaced from elsewhere: that the business that is currently located three miles up the road will move to within the zonal boundary to gain the benefits and advantages that are being offered there. Well, we'll be able to tell whether that's true or not by actually looking to see whether the areas nearby have reducing GVA compared to the areas that are growing. And being able to properly evaluate policy interventions over the last 30 years, to really then decide whether it is worth pursuing policies like the investment zone announcement of recent weeks, or whether we should actually be trying other approaches, I think that kind of insight is going to be incredibly valuable.

    MF
    Indeed, and perhaps also with data at a much lower level and much more micro local level as well, perhaps much smaller, more precisely targeted interventions might be what's called for.

    SJ
    Exactly, and I think that, again picking up some of what Emma was saying earlier, some of this data is a tool for local authorities. This has huge potential: exactly where are the jobs located? Are they in the town centre? Are they in the business park on the edge of town? What time of day is that activity happening? Is it shift patterns, or is it concentrated in the sort of nine-to-five? When we know these things, whether you're sitting there working out your local plan and where you're going to zone your new employment land, whether you're working out if you're going to offer any business rate incentives in a business improvement district, or working out what time of day you need to have your trading standards officers available, these kinds of day-to-day planning decisions, or when you're trying to think about what your refuse collection plans and patterns are, those things that local authorities are doing in just managing public services, bringing together those different aspects and having that sort of insight to know what's happening, when, and what's most effective will just make our policies more efficient. And in a world where public finances are constrained, particularly so for local authorities, and have been for a while, being able to use the funding that is available more efficiently in the delivery of those services, I think, is hugely beneficial. The other thing that I'm interested in, and I think it is an area where we as Core Cities can work with the ONS and others going forward, is how do we take more advantage of the administrative data that is held locally? So if you think of an average local authority, they have huge amounts of data about that area, whether that's through council tax data on collections, arrears and council tax discounts, whether that's through business rate data, whether that's through library card membership, planning applications, the list goes on.
    Obviously, for the same reasons as we've talked about, the need for protecting individuals and protecting data confidentiality, some of that data, you know, we'll need to be careful about how we use, but at the moment it's largely sitting there on databases being underexplored. If we can get to a world where we can start matching some of that data with some of the data sources that the ONS are making available, and then matching it with data sources, such as Emma was talking about, that the private sector can bring to the table, like Visa and others, I think it's in bringing those sorts of insights together that you can actually really develop the rich pictures. I can see, Libby, you would like to come in, so I might just pause there.

    LR
    Yeah. I was just gonna say Stephen there mentioned utilising locally held local data alongside national level local data, sort of your ONS data, your government department data, and actually that is one of the things that we're really hoping that ONS Local can help with, by having people locally with very good relationships with those individuals in local government, local authorities and regional observatories. Actually, if we can pull together their administrative data with what we have at the national level and help with some of that analytical insight, because we're also aware, as Stephen said, that local governments are constrained in resources, then if ONS can help in that analytical insight, even better that we can help along the way.

    MF
    So Emma, an exciting vision of the future there and the possibility to be really improving local and regional policy interventions. What's coming next?

    EH
    The really big exciting development that I just wanted to mention is the opportunity for collaboration. I think ONS as an organisation are on the cusp of opening up the Integrated Data Service more widely, and actually, we've been working really, really closely with that team over the last couple of years or so to understand what a good data asset would look like for subnational, and to start to make sure that we can do some of the data engineering to make that microdata available. When I talk about microdata, I'm talking about response-level information from surveys, available in a secure and safe way and also in a way that's easily linkable, so that you can easily pick up something about health and something about quality jobs and link them together in that service and do the analysis that you were talking about. That's one of the most exciting developments, I think, that's on the horizon in terms of how we'll be able to collaborate and use and share data more widely, keeping in mind that privacy aspect. So you know, the idea is that all of that data is anonymised before it goes into the service, and then things will be really strictly controlled through it. But there is that opportunity for those wider collaborations. I don't know, Libby, whether you wanted to come in a little bit on some of the other future developments as well.

    LR

    Yes, so over the last 9 to 10 months we have co-designed the ONS Local service, going out across the country, doing round tables, getting people together in the room, putting forward our vision of what ONS Local might look like but very much saying “tell us why we’re wrong, what doesn’t work for you, tell us what we’re missing”. So really building that service with our users, and we’re really beginning to fly now that we have people across the country. Other bits of new work also on the horizon include new data looking at the effect of place on geographic mobility across towns and cities, so we can follow those trends as people move around the country, which can help us build pictures of places and track educational outcomes and workforce trends by area, at a level that we’ve not been able to do in the past. We’ve also talked a lot today about the Gross Value Added (GVA) data, and that obviously focuses on businesses. The next innovation for those sorts of granular statistics is looking more at the households aspect, and therefore allowing more targeted policymaking for those bespoke areas, and understanding those hyper-local effects that are so important at the moment, particularly when considering all those devolution aspects.

    MF

    Some insight there on the work underway here to ensure people across the UK see themselves in our data. Many thanks to our guests today: Emma Hickman, Deputy Director of the ONS sub national stats division; Libby Richards, Deputy Director for ONS Local and UK wide coherence; and Stephen Jones, Director of Core Cities UK.

    I'm Miles Fletcher and thank you for listening. If you've got a question or comment about these ONS podcasts, you can find us on Twitter @ONSfocus. You can also subscribe to new episodes of the podcast on Spotify, Apple Podcasts and all other major platforms.

    Many thanks to our producer for this episode at the ONS, Alisha Arthur. Until next time, goodbye.

    ENDS

  • In this episode, we focus on a powerful example of when the numbers alone are simply not enough. The most recent Census has told us how many people have some form of disability but to really understand the nature of those disabilities and the needs of people reporting them we need to do a lot more work.

    Guiding us through this work are Helen Colvin, joint lead for Census and Disability Analysis at the ONS; Shona Horter, Head of Qualitative Research at the ONS Centre for Equalities and Inclusion; David Ainslie, Principal Analyst in the Analytical Hub of ONS; and Matt Mayhew, Senior Statistical Officer in the Policy Evidence and Analysis Team.

    Transcript

    MILES FLETCHER

    Hello and welcome again to another edition of Statistically Speaking, the Office for National Statistics podcast.

    In this series, we've spent a lot of time explaining how statistics can brilliantly illuminate important issues, and this time we're focusing on a powerful example of when the numbers alone are simply not enough.

    The most recent census has told us how many people have some form of disability and where they live. It's a good place to start of course, but to really understand the nature of those disabilities, and the needs of the people reporting them, we need to do a lot more work and that work is the subject of today's discussion.

    Here to guide us through it we have Helen Colvin, joint lead for Census disability analysis at the ONS; Shona Horter, head of qualitative research at the ONS Centre for Equalities and Inclusion; David Ainslie, Principal Analyst at the analytical hub of ONS; and Matt Mayhew, senior statistical officer in the policy evidence and analysis team.

    Helen to start with you, I mentioned the census there and those numbers showing us the scale of disability as defined by Census. Is it fair to say that census remains the sort of statistical bedrock of our understanding of disability - the single most important source?

    HELEN COLVIN
    Yes that’s right. I'd agree with that. So it's the main source that covers the whole of our population. So it's the best truth that you have, if you like, of what our population is like, and the proportion of disabled people within our population.

    MF
    And these were people, responding in their households, to the question which said what precisely?

    HC
    It said: Do you have any physical or mental health conditions or illnesses lasting or expected to last 12 months or more? And if people answered yes to that, they were asked: Do any of your conditions or illnesses reduce your ability to carry out day to day activities? A lot, a little, or not at all.

    MF
    What did you have to answer to that to be classified as disabled?

    HC
    To be classified as disabled: if you answered that you had a long term condition which affects your day to day activities a lot or a little, then we regarded that person as disabled. And the reason for that is that at ONS we measure disability against the Equality Act definition of disability, and that really identifies somebody as disabled if they have a long term condition and if it limits their day to day activities. And we do that so that we're able to report against the progress on the Equality Act in the UK.

    MF
    And the key element, it would seem - obviously we're talking about disability - is your ability to do day to day tasks and a sustained limitation.

    HC
    That's right, that needs to be... to be disabled under the Equality Act there needs to be a long term thing which affects you for 12 months or more. And it needs to be something which does impact your ability to carry out day to day activities. And that's really something that is arguably focusing on the medical model of disability, so it focuses on how you can't do things because of your impairment rather than because of the environment around you.

    MF
    Now that question is slightly different from the one asked in 2011. Why was that changed?

    HC
    So in 2011, we asked a very similar question, but we did remove a prompt which asked people to include problems specifically related to old age, and this really was about bringing it more in line with the Equality Act, which doesn't have that emphasis. Problems related to older age are still classified as disability, but it wasn't making that the same kind of focus of the question. Another part that we changed was to remove the word disability because, of course, disabled means different things to different people. And we tried to measure it slightly more objectively by using our own definition rather than asking for people's own opinions on whether they were disabled. And this time we also included mental health within the question, and we think that that could have influenced the rises that we saw among younger people.

    MF
    But how big an influence do we think that was?

    HC
    So in census 2021, we did see an increase among younger people being classified as disabled compared to 2011. And this did stand out particularly for females slightly more than males. We think there was also possibly a real change in population at that time down to the pandemic, with more people showing signs around depression and mental health problems, particularly at the period that the census was conducted.

    MF
    And there remains of course, underlying all this, the fact that this is census data. This is people's own assessment of their ability. How is that benchmarked, perhaps, against other sources?

    HC
    So it's obviously a different measure from other sources. Other data might be more medically based, so GP records, that kind of thing, which is more based on actual conditions as opposed to disability.

    MF
    And do we think that some people perhaps consider themselves disabled who might not be defined as disabled under other circumstances?

    HC
    Absolutely. I think disability means different things to different people, and some people who might be regarded as disabled under the Equality Act specifically wouldn't want themselves to be looked at that way. And conversely, some people who may not be captured by that definition may want themselves to be, so there are many different ways you can conceptualise and define disability. So this is one way to try and do that and to measure disability in a slightly more objective way.

    MF
    And being defined as disabled within that census definition that you've set out for us, how does that match against other criteria of disability, perhaps when it comes to gaining access to benefits or services?

    HC
    So it has a different definition and a different way of being assessed. So for instance, if somebody wanted to access benefits, then there's a completely different threshold and set of criteria that they would need to meet through Work and Pensions.

    MF
    There's a tension there, isn’t there, perhaps between people who answered in the affirmative on this on the Census but then wouldn't qualify as disabled in the eyes of officialdom, for want of a better word?

    HC
    Possibly, but we don't have that data within the ONS, or the DWP benefits data for this kind of use, to look at the match between our definition and the DWP assessment criteria.

    MF
    So you’ve shown us a complex picture there, tell us about the harmonisation work that's been going on across ONS to really develop and refine our understanding of disability as a concept.

    HC
    Yeah, so there's an ongoing programme of work taking place to review the current harmonised standards and update them so that they can be more aligned with the current conceptualisation of disability, impairments and conditions, and to try and ensure that they really relate to and reflect people's experiences. There's been a programme of research and engagement to find out the ways in which the standards are not currently performing, and what some of the key issues and gaps are, and that's due to be published at the end of March. And then the next step will be to outline in detail the plan over the coming year. So far, the engagement activities have included speaking to data users, a variety of different organisations, government departments and charities, really including everyone across the spectrum who would use and engage with those harmonised standards, to understand a bit more about the needs and, like I said, the priorities and gaps. And then the next step will be undertaking research to think about how best we can change and update those standards so that they, like I said, are really reflective and current. And one thing in particular that needs to be looked at is adding neurodiversity as a potential category. So at the moment that's not currently listed within the impairment categories, and feedback has been that many people who are neurodiverse don't identify with the current categorisation and wording that's used. So that will be really important going forward.

    MF
    And that’s also an important reflection of the constantly changing perception of what disability is in society, and from that, the challenge of assessing and measuring it. Helen, on the Census, we've recently published our results, as we've already mentioned in this discussion, but would you like to unpack those for us? We know that the number of disabled people went up since 2011.

    HC
    Yes, that's right. So the number of disabled people went up, but the actual overall proportion of disabled people in the population fell. And it's important to state that we standardise this data. That's a statistical method which enables us to compare like with like, so it accounts for the different population age structure between 2011 and 2021. So in 2021, we saw a slight fall in the proportion of disabled people in the population. It's currently 18% in England, falling from 19% in 2011. And in Wales, it's now 21%, falling from 23% in 2011.

    MF
    And what were the drivers of that? That's a fascinating find.

    HC
    That’s right. So some people might be slightly surprised by that, but it is a small decrease which we might expect to find in a population where people are living longer and healthy life expectancy is improving. And there may have been other influences such as the pandemic. So asking people how they feel about their health and disability during the pandemic may have affected how they responded as well.

    MF
    But how does the data break down by region, and by age and by gender?

    HC
    So for gender, we saw that females were more likely to be disabled than males. And we had a particularly interesting finding around older people. So there was a big decrease in older people being disabled in 2021 compared to 2011. And that was particularly true among those who were limited a lot by their disability. Obviously, we've talked about the question change where we removed a prompt which said to include problems related to old age, so that may have reduced the number of older people thinking of their conditions as a part of a disability. But we did see that that data was the same for the health question which preceded it as well. So we do think it's a real change in the population. And another aspect of that may have been due to Coronavirus. So we did very sadly see a lot of deaths among disabled people during COVID. But that wouldn't fully account for the changes that we've seen. So we think there's also an improvement in the health of older people more generally as well.

    MF
    Oh, that's a reflection of the healthy life expectancy that we've discussed in other podcasts already, perhaps over and above the COVID factor that you mentioned.

    HC

    Yes, that's right.

    MF

    A greater prevalence of disability among younger people, and that was very much reflected, perhaps unsurprisingly, in deprived areas.

    HC
    Yes, that's right. So the change that we saw for younger people, again, was stronger for females than for males. It was true for both genders, but females saw slightly higher proportions of disability than males. And that had increased particularly in the 20 to 24 age group, and the surrounding age groups, and that corresponds with some other analysis we've done where we found higher proportions of people with mental health problems, such as depression, in those age groups. And we have the same outcome for health in general as well, where there is a correlation between that age group showing poorer health and more disability.

    MF
    So overall, is disability remaining fairly static from census to census?

    HC
    That's right. Well, we have seen the numbers of people have increased, but the proportion of the population has stayed reasonably static. There are small falls, which does tally with the kind of improvements in health, but overall it is showing the sorts of trends that we would expect. But we do see one in five people in the population as disabled, which is quite stark and does make us remember that we really need to think about how to reduce the inequalities for this population. You mentioned just now about deprivation and deprivation among younger people, and that was an interesting finding we've had from the census data as well. It's not really a surprise to see that in deprived areas more people are likely to be disabled. But what we also found is that that occurs for younger age groups. So younger people in deprived areas are more likely to be disabled, across all of the age groups, than those in less deprived areas.

    MF
    That's the strength of the census of course, that you can get that really, really local picture of where disabled people are, as well as their overall numbers.

    HC
    Yes, and the Index of Multiple Deprivation enables us to understand those areas that are more deprived or less deprived, so that we can look at those at a more aggregate level as well.

    MF
    Helen, thank you for taking us through the insight, fascinating insight, produced by the Census. But Shona, there is much more to the ONS’ work on understanding disability. Could you set out some of that for us?

    SHONA HORTER
    Yeah, of course. And I can start by just giving some brief background. There was an independent group of experts who were convened, following the request of the National Statistician in 2020, to look at the inclusivity of data and evidence across the UK more broadly, and to make recommendations as to how we can make a step-change to really ensure that everyone counts and is counted within data and evidence. That programme of work identified disability as one key area where we really need to ensure that questions and concepts are accurately reflecting the experiences of individuals. They also identified the need for more qualitative approaches as part of this. So, we need that alongside our quantitative data. We also need to be really speaking to people and understanding their lived experiences.

    MF
    Because statistics are numbers, and to really understand people's experience of disability, we need to hear from them directly.

    SH
    Exactly, exactly. And the qualitative can also help us to understand the how and the why beyond the numbers, so we can understand more about the lived reality of people's experiences, the barriers that people face in daily life and people's views as to what could help to improve things going forward. But also, where we might see patterns in the data, we can actually look at what the social context is, and what's happening on the ground, that might be shaping those experiences. So it's a really, really important thing that we include alongside our statistics.

    MF
    So what sort of patterns have we been seeing from the data? Helen?

    HC
    So in ONS we collect quite a range of data that captures different disabled people across some of the different data sources that we collect. One of the main surveys that we do is the Annual Population Survey, which captures people across the UK. Every year we collect data from about 320,000 people. And the picture that we're getting from that data is, unfortunately, that disabled people tend to fare less well across the things that we measure. So for instance, they're less likely to be happy, they're less likely to see their life as worthwhile, life satisfaction is poor and they're likely to be more anxious than non-disabled people. And we've also seen from other surveys, like the Community Life Survey, that disabled people are more likely to feel lonely. So these are all not positive outcomes. But some of the more positive ones that we have seen are around education data, for instance, which is showing that the proportion of disabled people with a degree has been steadily climbing since 2014, and the proportion of those who have no education has been steadily falling. It's not in line with non-disabled people, so disabled people are still less likely to have degrees than their non-disabled counterparts, but it's still a positive trend that we do see. That does unfortunately then feed into things like employment data, which we'll talk about more shortly, with disabled people less likely to be employed. They're also less likely to own their own homes and more likely to live in social housing. And when we look at the Crime Survey for England and Wales, we also see that disabled people are more likely to experience things like antisocial behaviour and problems with nuisance neighbours than non-disabled people. So it's unfortunately not a positive picture when we look at the data more generally for disabled people.

    MF
    Nonetheless, that statistical picture fleshes out quite considerably the understanding we get from the census.

    So far, we've discussed a variety of different insights on the outcomes for disabled people, but we haven't looked at their experience in the workforce and how being disabled can come with additional costs. David, what are our data telling us about that experience in the workplace and the restrictions as well as the opportunities?

    DAVID AINSLIE
    So data from the Labour Force Survey shows in the last three months of 2020 to the latest data, and considering just working age adults, about half of disabled adults are in employment, so that's around 5 million disabled adults. So, this compares with about 8 in 10 when you consider non-disabled adults. The gap in rate between these two groups has decreased slightly over the last decade. In 2013, the earliest comparison we can make, 4 in 10 disabled adults were employed compared with around three quarters of non-disabled adults. Some analysis from the Department for Work and Pensions suggests there's a range of factors that contribute to why this gap has decreased only slightly in the last decade, the largest factor probably being that overall disability prevalence itself has increased over the last decade. This tends to suggest that it is more a case of people in work becoming disabled than of disabled people becoming employed. And there are other factors too, like overall changes in the size of the working population and general employment trends over the period.

    MF
    So this is more a question really, of people being able to hang on to their jobs despite having a limiting condition?

    DA
    So yes, to an extent. There are some quite stark findings in analysis of longitudinal data from the Labour Force Survey, and this has suggested that disabled workers tend to move out of being employed over an annual period at around twice the rate of non-disabled workers, so about 9% compared to 5%. By contrast, disabled people not in employment tend to move into being employed over a 12 month period at around a third of the rate of non-disabled people. So, 10% versus 27% here.

    MF
    So what's the evidence of the ability to work from home encouraging more disabled people into the workforce?

    DA
    So the general trend of a slightly closing employment gap has actually stalled a bit since the start of the COVID pandemic. More research is definitely needed here to see what the impact of the pandemic has been, as well as to look at whether there's an impact of an increased ability to work from home. It's worth noting that the patterns of occupations that disabled people and non-disabled people tend to work in look a little different. So in the latest data, again from the Labour Force Survey, working disabled people were less likely to be working in things like management and professional occupations than non-disabled people, but more likely than non-disabled people to be in occupations that might have been shut down during the pandemic or those where you might have had to work closely with people. So occupations such as caring and leisure, or sales and customer service type occupations. This will of course have had some impact on people's ability to work from home. Lastly on this, of people who are in work at the moment, the latest data from our Opinions and Lifestyle Survey shows that a fairly similar proportion of disabled and non-disabled adults report working from home, hybrid working or indeed travelling to work.

    MF
    So not a fantastic picture perhaps for those with lifetime limiting conditions. But what do the data tell us about pay between disabled and non-disabled people? Matt, what is the earnings gap?

    MATT MAYHEW
    So there's a similar sort of negative picture here as well. So the latest data we have from the Annual Population Survey, from 2021, showed that disabled employees on average were paid around 14% less than non-disabled employees. The gap appears to have widened slightly since 2014, when it was about 12%. If you tally this in pounds and pence terms using the latest data, the average pay was around £12.10 per hour for disabled adults, compared to about £14.03 per hour for non-disabled adults.
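
    As a rough check of the arithmetic behind these figures, the sketch below recomputes the gap from the two hourly rates quoted above. It assumes the gap is expressed as a percentage of non-disabled pay; that definition and the function name `pay_gap_percent` are illustrative assumptions, not taken from the ONS methodology.

```python
# Hourly pay figures quoted in the transcript (pounds per hour).
DISABLED_PAY = 12.10
NON_DISABLED_PAY = 14.03


def pay_gap_percent(group_pay: float, reference_pay: float) -> float:
    """Pay gap as a percentage of the reference group's pay.

    Assumed definition for illustration: (reference - group) / reference.
    """
    return (reference_pay - group_pay) / reference_pay * 100


gap = pay_gap_percent(DISABLED_PAY, NON_DISABLED_PAY)
print(f"Pay gap: {gap:.1f}%")  # prints "Pay gap: 13.8%"
```

    Under that assumption the result rounds to the "around 14%" quoted in the discussion.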

    MF
    And David, what evidence do we have about how different types of disability affect your chances of being in work and earning money?

    DAVID AINSLIE
    So the definition of disability of course covers a wide range of people impacted by a whole variety of physical and mental health conditions and to different extents. So the averages we've described cover a really broad group. But we've explored the Labour Force Survey data further, we've looked at both employment rates and pay by a variety of factors that are important to these things, such as the severity of disability, the number of and types of health conditions people report having, as well as some other things. But for example, in the latest data, the lowest employment rates were seen among those who reported they had severe or specific learning difficulties, autism or any mental health conditions with employment rates around about 25 to 30%. At the other end of the scale, disabled adults whose impairments were due to hearing, or skin conditions or allergies, had the highest employment rates, around 60 to 70%.

    MF
    And is this a similar picture with pay, Matt?

    MATT MAYHEW
    So when we look at pay, there's a similar sort of variation. So the largest pay gap is for disabled employees reporting autism as their main health condition, who have been paid on average about a third less than non-disabled employees without any health conditions. The next largest pay gap is for those with depression, at 18% less. By contrast, on the other side of the scale, for those reporting difficulties with seeing, there was actually no pay gap observed between the two groups. And in fact, we observed that those with difficulties in hearing were on average paid about 5% more than non-disabled employees. Severity of impairment is important too. So the pay gap between disabled people who are limited a lot in their day-to-day activities and non-disabled adults is about 20%, whereas for those whose condition limits them a little, it is about 12%.

    Lastly, it's important to remember, when considering both pay and employment, that disabled and non-disabled people have patterns of other characteristics creating differences between them, such as age, sex, where they live, and the type of occupations they have, and these are quite often quite different. To look at this on the pay side we have modelled what the differences between the average pay of a disabled and non-disabled employee might look like if the two groups had the same patterns of personal characteristics, such as similar ages and the same sex distributions, and job characteristics, such as doing the same jobs, to see what effect that has. When we do this, the differences in the average pay between disabled and non-disabled employees are narrower but still persist between the two groups.

    MILES FLETCHER
    Shona, it's all very well to talk about a superimposed definition of disability, as statisticians tend to bring to issues, but our qualitative work has actually thrown up some fresh insights on what sort of characteristics make up disability?

    SHONA HORTER
    Yeah, and we've undertaken two quite large qualitative studies over the last year. One looking at the experiences of disabled adults accessing and engaging with activities, goods and services in the private sector and the other looking at the educational experiences of children and young people with special educational needs and disabilities. And through both those studies, we found a really huge variation in people's experiences of disability or impairment. Many people have invisible impairments or conditions. People can have multiple impairments and comorbidities that can interact. So for example, physical conditions can be accompanied by mental health difficulties. And there can be a cyclical relationship that people describe as being catch-22. If you're struggling with your physical health, that can exacerbate mental health, which can then cycle back into other challenges. So it's really important to understand that variety of experience and also how that can link to individuals' identity. So there can be a real process of adjusting to a diagnosis or to living with an impairment or condition and that can require adjusting one's own identity. And it can also be different if you've recently acquired a condition or if you've had it for your whole life. And that's also linked to the social, so a lot of people spoke about anticipating judgement of others, being seen as different and the vulnerability that can come with that, and the anticipated stigma. So there are lots of reasons why people won't necessarily identify as disabled, or reasons why people may not identify with a specific condition, illness or impairment. And some young people also talked about diagnostic labels that they saw themselves as having, such as dyslexia, autism or ADHD, that could help to explain their learning and support needs. But actually, many really didn't want to have labels that were used to justify any kind of different treatment or different opportunities, and some rejected the label altogether. So it's really just understanding that nuance and the multifaceted nature of people's experiences and identity around this.

    MF
    And you made an interesting point there about disability that perhaps is invisible to others, unlike some forms of disability, which then people benefit from a recognition of that, but then others perhaps suffer from not having the disability recognised.

    SH
    Yeah, and that can be really complex. So we're working on a paper on that very topic at the moment that we're planning to publish soon. There's another layer there: on the one hand, people describe being able to potentially hide having a condition or impairment, which means that you might be able to avoid anticipated stigma, or discrimination, or a kind of negative reaction from the public. But also, often people describe needing to disclose their condition or impairment in order to access the support that means they would then be able to access and engage with different areas of life. So there can be quite a lot of vulnerability around that, and that can also intersect with other characteristics. So people described the additional layers of vulnerability linked to gender, linked to ethnicity, that can form multiple layers of discrimination.

    MF
    And thereby we get to a much more useful understanding of, say a wider pattern of disadvantage generally.

    SH
    Yeah, and also really the need to understand, I mean, what came through so clearly in both of these studies was the need to be flexible and to listen to and understand people's different needs.

    MF
    So this qualitative work is providing a much richer understanding of a wider pattern of disadvantage of which disability is but one factor.

    SH
    Yeah, and I think these studies have both really highlighted the importance of having flexibility in listening to and understanding people's different needs. What we saw from both young people's perspectives and the perspectives of adults was really just that need for additional support that means that you can fully and meaningfully participate in education and daily life in a range of different activities, goods and services. One of the particular difficulties, I think, around invisible impairments that the findings presented was what people described as a need to prove their worthiness or prove their legitimacy of access to additional support. And that could bring an additional challenge, and also an additional level of stigma and vulnerability, in that many people described facing backlash when, for example, having a lanyard or a badge that indicated the need for additional support. There's that disbelief around that as well. So I think it was really emphasised through all our studies how important it is that there's increased awareness about people's different experiences and needs and that there's increased understanding and support for that put in place.

    MF

    And so the knowledge base continues to grow. At the time of recording this, we're beginning to be able to step back and look at the pandemic period. What do we understand of that at the moment, in terms of its impact on disabled people at the time of the pandemic, and perhaps some of the lasting impacts and effects of it?

    SH

    Yeah, and I can comment briefly on the qualitative research there and Helen, it would be good to hear more about the quantitative findings on this. We found a really mixed picture in terms of experiences through the pandemic. So in some ways, the increased move to online services, groups, and education classes really opened up the world for people and could facilitate and support access and engagement and also could really support social contact. So many people described being able to join groups and networks that they hadn't previously been able to join as a result of, for example, physical access barriers. So that was really, really positive. But on the other hand, there was a real disadvantage for those who are digitally excluded. And we know that disabled people are disproportionately digitally excluded. Also in other ways the pandemic contributed to increased fear and increased isolation, so people talked about having increased social contact online, but that didn't necessarily replace the real in-person activities and social contact. Some also found it quite difficult to engage and focus with the online format, particularly describing the cognitive demands that could be difficult for those with memory, learning, understanding or concentration impairments in particular, and real difficulties in accessing the required support. So SEN support was said to be quite slow to return to schools, and some children and young people with physical needs couldn't return to school safely without that support. So there was a kind of lasting impact there as well.

    MF

    And what's the future of the qualitative side of this work as well? Will that continue alongside the regular sources of overarching data?

    SH

    Yeah, I think it's important to continue to think about where we have gaps in our evidence and understanding and where there are questions that we can address qualitatively. And also considering having more mixed methods approaches, so where we can consider including qualitative components alongside our statistics to better understand some of these particular questions in more detail. So for example, one of the qualitative studies I've just mentioned was around the additional cost of disability. So, many people spoke about the additional financial costs, like added premiums for insurance or for medical equipment compared to similar equipment that might be purchased by somebody who's not disabled, and we know that that's likely to become ever more difficult with the cost of living crisis. So that's an area that could warrant further exploration. I think it's just continuing to think about what are the key questions, evidence gaps and needs for us to be addressing to ensure that we can inform policy and practice and make sure that our data is speaking to people's experiences and needs.

    MF

    Thank you, Shona. Helen, is there anything you'd like to add on that?

    HELEN COLVIN

    Yeah, absolutely. I think the work that Shona’s team have been doing around qualitative research with disabled people is so valuable for helping to really illustrate the findings that we tend to get from the aggregated data that ONS has been more traditionally producing. So during the pandemic those qualitative interviews really showed and highlighted the day-to-day situations that disabled people were living with, and alongside that we also ran a survey on the opinions survey about disabled people's access to products and services, and the data that we had from that very much supported the sort of experiences that people were reporting to Shona. So we found that disabled people had more difficulty accessing products and services during the pandemic than non-disabled people, and much of that appeared to be down to difficulties using transport, having places to rest and with practical things like crossing the roads, crossing footpaths, and moving around buildings, which were evident in some of the sort of lived experience quotes that Shona’s team were picking up on as well.

    MF

    How do statisticians go about the whole question though of deciding whether it is disability that’s the driver here, or whether there are other factors that ought to be considered first?

    HC

    Yeah, good question. So we know that in a lot of the outcomes that we look at disabled people tend to fare worse than non-disabled people. This isn't likely to be down to just being disabled per se. We think it's more likely that no single factor is going to explain the kind of outcomes that we see. It's more likely that there is a range of disadvantages experienced by disabled people which lead to this difference between non-disabled and disabled people.

    MF

    Overall then, an awful lot of work is being done, a wealth of data and understanding. Where's this work going now? And how can it inform really effective interventions and policies and services to support disabled people?

    HC

    As we mentioned, we measure disability against the Equality Act, and that's really crucial for us to be able to understand how as a country we are performing in terms of monitoring and reducing inequalities for disabled people. The data that we collect and publish then feeds directly to policy makers. So one of our key stakeholders for my team is the Cabinet Office, and we work with their Disability Unit. We make sure that they have the information they need to make policy, but we are also thinking about local authorities monitoring that policy impact, monitoring the outcomes in their area and thinking about what they can do in terms of planning to support disabled people. But also our data is used widely by the third sector and lobbyists to hold government to account, to look at these inequalities and say, what's happening here? What are you doing about this? And of course, it's really valuable for citizen users and the public interest to know what the situation is for disabled people in our population.

    MF
    Well, that is it for another episode of Statistically Speaking. Thanks very much to all our guests for another fascinating discussion. For news about these podcasts, and the work of the Office for National Statistics, and to comment or ask us a question, please find us on Twitter at @ONSFocus. That’s it from me, Miles Fletcher. Our producer at the ONS is Julia Short. Thanks for listening, and until next time, goodbye.

    ENDS

  • With news headlines proclaiming the UK has ‘narrowly avoided a recession’, we decode the ‘r’ word and explain why this sometimes misleading term is one the ONS is often cautious to avoid. We get the lowdown on GDP (Gross Domestic Product); discuss whether its time as the yardstick for measuring the success or failure of the world’s economies is coming to an end; and hear how the ONS is already looking well ‘Beyond GDP’ and introducing broader measures of social wellbeing and the environment to provide us with a more holistic view of how society is faring.

    Joining Miles are ONS Director of Economic Statistics, Darren Morgan; Chief Economist, Grant Fitzner; and Director of Public Policy Analysis, Liz McKeown.

    Links

    Latest GDP data
    Measures of National Well-being
    Beyond GDP

    Transcript

    MILES FLETCHER

    Welcome again to Statistically Speaking the official podcast of the UK’s Office for National Statistics. I'm Miles Fletcher, and this time, we're going to talk about a very famous and long running statistic that’s still regarded as the single most important economic indicator of them all. I'm talking of course about GDP (Gross Domestic Product), the expansion or contraction of which is the yardstick against which the success or failure of the world's economies is measured. It's been around a long time, since around the time of the Second World War, in fact, but is its pre-eminence now coming to an end? GDP misses some things out - that which matters, as was once memorably claimed. So we'll be talking about how the ONS has been updating GDP to keep it relevant and developing new complementary measures of economic and social wellbeing that could perhaps, in future, supplant GDP itself. And in the current economic climate, we cannot avoid the R word. What exactly is a recession? How much does it actually matter, if it's only a technical one? Is it the difference between economic disaster and salvation? Spoiler alert, it really isn't.

    Anyway, we have a panel of top ONS folk to explain it all: Darren Morgan is director of economic statistics production and analysis, Grant Fitzner is Chief Economist and director of macro-economic statistics and analysis, and also with us is Liz McKeown, Director of Public Policy Analysis, who is leading the drive towards these broader measures on social and economic welfare.

    Welcome, everyone.

    Darren to start with you. You are responsible for the production of the UK’s GDP estimates. So let's start by reminding ourselves what precisely it measures, it's basically seeking to put a value on all economic activity over a given period.

    DARREN MORGAN

    Yeah, so we look at GDP and we measure the economy in three different ways. First of all, we do it via what you call the output approach, and most simply, that's everything that's produced in the economy, and that can be cars rolling off the production line, that can be a lawyer providing advice as a service, and it can be public services as well. So surgeries, GP appointments and so on. So everything we produce in the economy. We also look at measuring the economy as everything that is spent, so that could be you and I as households, spending money in the shops or on leisure activities. It can be businesses spending money on goods and services. And it can also mean the government spending money, so everything we spend as well. And the third way we measure GDP is the income approach, which is basically everything that's earned in the economy. So for us in terms of households that's wages and salaries, for businesses it’s profit, for example. So we measure everything we produce, everything we spend and everything we earn, and in principle, they should all add up.

    MF

    And you're boiling it down then, a vast amount of data flowing into the ONS, boiling it all down to one single indicator.

    DM

    We do, and we do that by approaching thousands and thousands of businesses asking them about their performance. We speak to thousands of households about their behaviour. And we also use a lot of data already available within government, so what we call administrative data - data that already exists. And we bring all those different data sources into the building, we look at it and we confront it, and we come up with ultimately, as you suggest, a single number on the growth of the economy.

    MF
    What's changed in the collection of data now? How timely a process is this?

    DM

    So in the UK, we've got one of the timeliest measures of the economy in the world. And we are only one of two countries who produce a monthly measure of the economy, so we do it much more quickly, and obviously it is completely different to how we did it, say, even 10 or 15 years ago. We collect most of our data now from businesses online. Whereas previously we used to send a questionnaire to them, they used to write on the questionnaire and they would send it back to us, and that could take a week or weeks to do. Businesses can fill the form in now sat at their desk online, do it very quickly and it reaches us straight away.

    MF

    And you mentioned administrative data as well. So that's coming from other parts of government. What are the main sources there? How is that gathered?

    DM

    So that's correct. So what we try to do is minimise the burden on businesses and households, so some businesses may have to complete a tax return to HMRC for example. So we are able to use that information and bring it in, so that's one example. Pay As You Earn, people who use pay as you earn systems, will be well aware that we use that in our labour market numbers. But we use lots of different sources that are already available across government, and we reuse them for statistical purposes, like I said, to provide better estimates, because that data tends to be very good, but also to minimise the burden, as I said on households and businesses at the same time.

    MF

    And what is the coverage, in terms of what's included, how has that evolved in recent years?

    DM

    So in a way, in terms of what we call the boundary, the economic boundary, that has actually stayed very similar over a long period of time. It is very traditional in terms of the boundary we measure. So, like I said, it's sort of business activities, household activity and government activity. But it is along those lines about how much is produced, how much is spent, how much is earned, but the boundary for the economy has been very similar for 50 years.

    MF

    Nevertheless, there are some things included in GDP which might surprise some people. For example, in the most recent GDP release we talked about the fall in the number of pupils in classrooms in the last quarter of 2022.

    DM

    The public services was actually a really key indicator for the number that we published for December, and we saw a fall in the number of GP appointments, a fall in the number of operations, less vaccinations being given because the autumn booster campaign tailed off. And we also saw lower attendance in schools, because in the lead up to Christmas not so many pupils will go into school as we normally see. And the reason why we measure that, as you can imagine we measure teacher salaries, doctor salaries, we measure how much is invested in the health service, how much is invested in schools, and obviously those schools and hospitals buy goods and services. So, it's a really important part of the economy. So of course we measure the goods and services that they produce as well. It's a really important part of the economic measurement for GDP.

    MF

    And I think I’m going to use it to motivate my children in the mornings as well. When they go off to school I’ll be reminding them of their contribution to our economic performance.

    DM

    They certainly are. So it's a really good way to get them through the school day, Miles.

    MF

    But there's a serious point underlying this, and there's a bit of a propaganda point for the ONS here as well, because we are actually taking real measurements of public sector activity, and it's been said that some countries just make broad assumptions about that activity. What do we do that other countries don't?

    DM

    You’re absolutely right, Miles. And that became most marked during the lockdowns, during the COVID pandemic. So we measured, if I can give schools and education as an example, we actually measured how much education was being provided to pupils during a lockdown, whether that was face-to-face in schools, or whether it was remote learning, or whether unfortunately, in some cases, there was no learning at all. We measured that directly, whereas perhaps some other countries basically measured the number of pupils. So as you can imagine, the number of pupils is the same whether they are getting taught or not. So in the pandemic we showed a sharp fall in education during some of the lockdowns, but we've seen a faster recovery in the years that followed. Whereas if you look at other countries, their measurement of education has been far more stable over the most recent years because the number of pupils doesn't really change.

    MF

    They are pretending that the schools were open, when in fact, they weren’t. Anyway, that's just part of this enormous data gathering operation, bringing in all this data, and it takes around about six weeks to produce the preliminary estimate, which you say is among the quickest of the estimates, but of course that's only part of the story, isn't it?

    DM

    That's pretty quick, six weeks, but we do produce an estimate for all three measures: we produce a measurement of how much is produced, how much is spent, and how much is earned at that point in time. So we do that, but obviously, we only have so much data at that point. You know, we have quite a lot of data actually, because those surveys are very timely, but not everything.

    MF

    As a percentage, it's about 40% isn’t it?

    DM

    That's correct. But obviously our data collection doesn't stop at that point. We continue to bring new data in. And that's why we publish the latest estimate, which covers more detail, more granularity, different parts of the economy. And that additional data that's brought in allows us to do that at a later stage.

    MF

    You have a couple more months to produce that one, and that's based on pretty much all of the data we're going to get.

    DM

    Yeah, it's over 90% at that stage, it’s about 90%. So yes, between the first estimate and the second estimate, we do get a lot more data in.

    MF

    And therein lies what some people might say is one of the weaknesses of GDP, particularly when making quick assumptions about the economy. There's a trade-off here isn't there, between wanting to know broadly where the economy is going, and making really, really hard and fast assumptions about what's happening. And therein lies the whole issue of revisions, revising GDP. Now, it's important for everyone to understand that when the ONS revises GDP, it's not correcting its mistakes, is it?

    DM

    What you’re describing there, Miles, is a classic tension in statistical production. So we could say to everybody, our users, no, we're not going to publish anything until we get all that data, all that 90% of data. But to do that, you're going to have to wait about 80 days. Or what we could do is produce an earlier estimate based on less data, but still a really good estimate, and you could have that 40 days quicker, 50 days quicker. So you know, there's that tension between timeliness and quality. And I think the way we do it, I think it's brilliant. We publish two estimates initially, and that’s for the quarter. The one that's a bit quicker based on less data, and the one later based on more data content. But what we do to help our users is we have a really detailed revisions analysis between those estimates, so people can look and judge typically, how often and how much is that data revised when we publish. So they have the full information in front of them to make judgments if they have to. And I think we strike the right balance taking that approach.

    MF

    What is the ONS’ track record in doing this? Because have there been occasions perhaps, as has been suggested, sometimes that the early data can be misleading, and in fact, the economy might be heading in the opposite direction.

    DM

    So if you're looking at revisions analysis, it's pretty good, you know, between the first estimate and that second estimate, and so revisions are typically very small, and importantly, unbiased: they're equally likely to be a revision up or a revision down, and that's really, really important. I think when a real spotlight is shone on revisions, that’s when the economy is around zero. You know, if you have a 0.1 revision, which is a small revision, and your economy is going along at 0.8, 0.7%, whether it’s 0.7, 0.6 and so on, people go ‘Ah, so what?’. But if the economy is going around zero, or 0.1 or –0.1, that 0.1 revision can change the sign, and people get very excited about that. But actually, it's a 0.1 revision, and that's when the spotlight is really, like I said, shone on the revisions performance.

    MF

    As it was in our most recent estimate of quarterly GDP, the final quarter of 2022 when there was a big fat zero in terms of growth. Now, that led to headlines in some very respectable media organisations that went UK narrowly avoided recession. Well, did we?

    DM

    So we did technically, yes, we did. Absolutely. Because it wasn't negative. Our Q3 estimate of the economy was for a fall, so if Q4 had fallen for economic growth, then under the definition of a technical recession, which is widely recognised as two consecutive quarters of negative growth, yes, we would have been in a technical recession. But I think you've just highlighted how it makes sense to look more broadly at the economy because whether it was 0, or –0.1, 0.1, how different really was the economy at that point in time? I would say the economy was broadly flat.
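
    The rule of thumb mentioned here, two consecutive quarters of negative growth, can be written down in a few lines. The sketch below is only an illustration: the function name `in_technical_recession` is hypothetical and the growth figures are invented to show how a change of 0.1 in a single quarter flips the label.

```python
def in_technical_recession(quarterly_growth):
    """Return True if the two most recent quarters both show negative growth.

    `quarterly_growth` is a list of quarter-on-quarter growth rates in
    percent, oldest first. Zero growth does not count as negative.
    """
    if len(quarterly_growth) < 2:
        return False
    return quarterly_growth[-2] < 0 and quarterly_growth[-1] < 0


# Illustrative figures only: a fall followed by a flat quarter...
flat_quarter = [0.1, -0.2, 0.0]
# ...and the same series if the final quarter were revised down by 0.1.
revised_down = [0.1, -0.2, -0.1]

print(in_technical_recession(flat_quarter))
print(in_technical_recession(revised_down))
```

    The two series differ by a tenth of a percentage point in one quarter, which is why the label alone says little about how different the economy really was.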

    MF

    Because if you're beholden to this idea of a technical recession, a couple of months down the line we might say hang on, our better estimate based on 95% of the data says actually it was just slightly down, and therefore the headline writers say, Oh, we were in recession after all.

    DM

    Exactly. I think that just highlights, again, being sensible in terms of how you look at the economy overall, because really the economy, if it's a 0.1 revision, if that's what happens in a few weeks' time, is the economy fundamentally different to what it is at that moment? I would suggest not, but you're right, I would imagine that it would get splashed that the UK is now in recession, and coverage will be significant because of that.

    MF

    And it's fair to say that in the past these technical recessions, there was a double-dip recession wasn't there about 10 years ago, that made a lot of headlines at the time. It's not in the figures anymore.

    DM

    No, it's not. It's been revised, and that period of our economic history was when we were around that flat period for the economy. So the revisions have been relatively small in that period, but you're right, we were in recession and because we had revisions from later data, we no longer were. And as you suggested people got very excited about that. But really, Miles, the economy was in exactly the same position as it was in our first estimate.

    MF

    So a strong message there listeners, when you hear people talk about a technical recession, bear in mind, that may not be what it sounds like. In fact, it probably almost certainly isn't.

    DM

    Good advice, Miles.

    MF

    Grant, to bring you in on this then, from an economist's perspective, it's fair to say then that in fact, there's no definition of a recession that's really official or formally accepted anywhere. It's certainly not something that the ONS talks about.

    GRANT FITZNER

    No, I mean, ultimately, it's a matter of judgement. And of course, economists spend a lot of time arguing about these things. In fact, it was so bad in the US that academic economists, as part of the National Bureau of Economic Research, set up a committee to discuss and agree on when business cycles were, well when recessions started and when they ended, so that when they were comparing their research they were all working off a common framework. Now, that sounds great, but the problem of course is with these being academics, they looked at a wide range of data, and they typically took several years after a recession had occurred before they would put definitive dates on it. Now, that's fine if you're publishing economic history, but if you're a journalist or indeed if you're working at the Office for National Statistics and you want to have an idea of what's going on now, you need something that's a bit closer to real time, and that does, as Darren said, involve a degree of judgement. But I think it's fair to say that the common sense understanding of a recession is a prolonged and significant downturn in economic activity. So not just one or two quarters, and not just a 0.1, but actually something a bit more substantial, as indeed we've seen in the 70s and the 80s, and of course, in the global financial crisis that kicked off in 2008. So they typically last for a while, and they do have quite a significant impact on the economy, households and business.

    MF

    In fact, that’s a lot more serious isn't it, than the definition that's used as a sort of working rule of thumb, which is two consecutive quarters of economic contraction. In fact the origins of that are very murky, really, nobody actually seems to know precisely where it came from. One of President Nixon’s speech writers seems to be the main suspect.

    GF

    Well, possibly, but it has been more widely used. I think journalists need something quick and simple to understand, and I guess this meets the bill. But imagine if you had a –0.1 in one quarter and then a –0.1 in the next, and then they were subsequently revised away, I don't think anyone would seriously call that a recession. And just the point about the length as well, if you look at the 70s, 80s, or 90s, recessions typically last about three years. That's how long it took for the level of economic activity to get back to the pre-recession levels, and indeed for the global financial crisis that kicked off in 2008, it took four and a half years before growth was back at pre-recession levels, so an incredibly long time. And I think just looking at the pandemic and the impact that that had in 2020, it's a very different set of events. We had two negative quarters and then the economy started to recover after of course, a very large fall. Now that's unusual. And of course that was because of this shock of the pandemic and lockdowns. Whereas typically, these things take quite a bit longer to kind of work their way through the system.

    MF

    And if you look at the path of GDP on the time-series graphic on the ONS website, it really goes off a ski slope doesn't it, really quite dramatically as the pandemic starts and then kind of sharply recovers, and then it's kind of clawing its way back now.

    GF

    That's right. And so things are often slower than we may be used to in recent years. And to give you an example of that, at the moment, we have the Bank of England raising rates quite aggressively so interest rates have gone up, mortgages have gone up, businesses are facing higher costs of borrowing, but the labour market still looks pretty robust. Now historically, if you look at past recessions, there's always a bit of a lag between, for example, central bank tightening or some sort of supply shock and for that to work its way through in terms of employment, business profitability, and so forth. So these things often take longer than people expect. Now, I'm not saying of course, that that means we're in a prolonged economic downturn. I mean forecasters differ as to how severe and how long the current period of economic weakness is likely to be and indeed, people disagree on whether we may even enter recession this year. It's that close.

    MF

    But we'll know if we’re in a significant downturn, a genuine recession or whatever label we want to apply, when it happens, but at the moment we seem to be in sort of somewhere in between. Disappointing though that might be for headline writers.

    GF

    And the sort of things that you would typically look at would be more businesses going out of business, so business liquidations, weak retail spending, which of course we have seen, driven by the big increase in the cost of living over the past six months, and significant increases in the level of unemployment. Those are three of the things that you would typically look at. Possibly also weaker industrial production is often associated with recessions as well.

    MF

    So does that suggest then, talking about the action being in those other indicators, does that period for the economy, perhaps an economy on the cusp of growth and contraction, does that highlight one of the major limitations of GDP as a measure? How seriously do economists regard it now? Does it remain that big, totemic bellwether of economic success or failure?

    GF

    Well it is a broad and pretty comprehensive measure, so it does include income, expenditure and output. So a lot of what you would typically consider economic activity, but of course it doesn't cover everything. It doesn't cover anything produced in households, and at the moment it doesn't properly capture what's going on in the natural environment. So it's certainly not broad enough to cover every kind of activity that produces something of value. And it typically focuses on things that can be measured or quantified, or have a value ascribed to them. So the market sector is the largest part of the economy that we measure through gross domestic product, because there's also the non-market sector, public sector, charities, etc. They are a bit harder to measure. One of the interesting differences between the UK approach and some other countries is that we spent quite a bit of time trying to measure not just how much we spend on health and education, but as Darren said, what actual activity, what outputs, are we getting from that investment?

    MF

    Yeah, I mentioned at the top of the podcast, there's this famous quote from Robert Kennedy, of course, famously US Attorney General and then presidential candidate. He actually said the problem with GDP is it does not allow for the health of our children, the quality of their education or the joy of their play. It doesn't include the beauty of our poetry or the strength of our marriages, intelligence of our public debate or the integrity of public officials, etcetera, etcetera. It seems to me that the demand then for more holistic measures of well-being or progress, in fact goes well beyond economics, but is there more that economics can contribute? And what is the ONS doing towards that?

    GF

    Yes, there is more that we can do. And indeed, we have been doing that. So we've created a series of what we call satellite accounts, which measure either different parts of the economy or activity, or indeed measure things that are currently outside of what we call the national accounts. So for example, we've been publishing at the ONS for quite some time now an annual series of natural capital accounts, which try to convey what's been produced out there in the environment. Clean air, for example, is an output of trees and vegetation and parks. We try and put estimates around those. Now, of course, there's some challenging methodological issues about how you measure some of these things, but I think we've had quite some success in actually putting some values around those. And at the international level, the current system of national accounts was devised back in 2010. There's quite a lively, if indeed statisticians can have a lively debate, around what the next system of national accounts will look like, which is due to come in 2025. And one of those very issues is do we start to bring the environment more into those measurements.

    MF

    So not quite the beauty of our poetry but certainly the landscape, the value of our environment.

    GF

    Exactly. And I suppose the other misconception about GDP is people often see it as a measure of well-being. It was never really designed to play that role. It's a measure of economic activity. Now, of course, there’s a clear link between economic activity, prosperity, and well-being, but they're not the same concept.

    MF


    So in order to be more inclusive, and to fully reflect activity in its broadest sense, we're having to go much further than that. And a bold initiative in that direction, started more than a decade ago now, was the national well-being programme launched by the then Prime Minister David Cameron.

    Liz McKeown, the national well-being programme was not taken wholly seriously. I recall at the time it was dubbed Cameron's Happiness Index, and the idea that we could dump GDP and inflation and so forth was taken with some mirth. Ten years on, how far have we come in developing alternative measures like that, and how seriously have they been taken?

    LIZ MCKEOWN

    I think we've come a long way, but perhaps it's worth us looking back to those days of 2010 and what we did then. We wanted to know what matters most to people. And we went out and asked them and we had over 34,000 responses to that debate. And that allowed us to start measuring well-being for the first time as a national statistical institute. That debate, understanding what really mattered to the public, getting those responses, allowed us to develop 10 domains of well-being. These are the things that people were saying really mattered to how they felt as individuals, as a community, and you know, ultimately as a nation. And the domains that we developed there were personal well-being, they were our relationships, our health, what we do, where we live, our personal finances, our education and skills, the economy, governance, and the environment. And under those 10 domains, we developed a number of measures, both objective and subjective, which allowed us to begin to get to that question of how are we doing as the UK in a more holistic way than economic measures can do alone.

    MF

    And what story has that told over the years? How were we doing? How are we doing?

    LM

    I think it opens a new lens and allows us to think about that quite differently. Perhaps I could take an example of how we thought about well-being during the pandemic. There we were wanting to understand what's the impact of lockdowns more broadly, and we could use well-being measures to help us understand that. We could see how personal well-being and levels of loneliness were, you know, really negatively impacted during the lockdown, and then we could see the improvements as we came out of them. We could see how that differed by how men and women were doing. We saw during the pandemic women's well-being falling below men's for the first time, and so we could understand a different dimension of how society was reacting to one of the big issues of our time.

    MF

    And when we ask people how happy they are, they tend to give quite a positive response, don't they?

    LM

    Well, I think it's important to say that well-being goes beyond just asking people how happy they are. So personal well-being does look at people's happiness, it looks at their levels of anxiety, and it looks at how satisfied they are with their life and how worthwhile they think the things in their life are. But the broader concept of well-being is understanding how people are doing across these domains that I mentioned earlier.

    MF

    Now this isn't just suddenly what's been going on in the UK, there's something of a global movement to broaden out our approach to measuring not just personal well-being, but economic well-being as well. And an important part of that is the UN's Sustainable Development Goals. And put quite simply, it's a global initiative to find out if the world is becoming a better place, and to set targets and then policies from that.

    LM

    Yeah, absolutely Miles. And I think it reflects doesn't it that people do want to understand progress in that multi-dimensional way. They want to understand not just how we're doing economically, but actually what the impact on our environment is, what the impact on our society is. And those indicator-based approaches, be they the well-being measures that we've developed here in the UK, be they the Sustainable Development Goals, they're allowing us to take that broader check on progress or sort of multi-dimensional check on progress and allows us to see things that we couldn't see if we were only looking at the core economic statistics that you were discussing with colleagues earlier.

    MF

    Now on GDP day when the ONS produces its quarterly estimates of economic performance in that traditional sense that we talked about with Darren, there are two important publications that do get slightly overlooked on the day but are well worth highlighting now. And the first of those is one entitled quality of life in the UK. Sounds intriguing. Tell us about that.

    LM

    These two publications we added to the mix on GDP day last year, and why did we do that? I think we really wanted to reflect how important it is that we look at progress in that multi-dimensional way that I was talking about earlier. That we give people the chance to see not just what the latest economic data is telling us, but we are also looking at how life is going for people in the UK, and that's where the quality of life in the UK publication comes in.

    MF

    Break down the elements for that if you would, tell us what sort of narrative it's providing at the moment about our quality of life.

    LM

    Yeah, so this is a publication that every quarter looks across those 10 domains of national well-being: personal well-being, relationships, health, what we do, where we live, personal finance, the economy, education and skills, governance and the environment. It looks at the measures we have under those domains and says well, what news have we got from the last quarter. And I won’t go through all that here, I encourage you to go and have a read of it, it makes interesting reading. But for example, on the personal well-being side, we have seen in the last quarter a drop in the percentage of adults who've seen very high levels of life satisfaction and happiness. There's been a decrease in that. So that's one to watch, and one to keep an eye out for. But the publication goes across the 10 domains and yeah, as I said Miles, well worth a read.

    MF

    An interesting alternative view as well at a time when the classic economic data was showing a big zero reading. In fact, there's another aspect in which an awful lot is going on, and obviously a downward trend there in some respects, at least.

    LM

    Absolutely. And users are telling us that they want to understand what's going on across the country in a more holistic sense and understand a bit more about our societal measures, but also about our environmental measures. And I guess that sort of takes us on to the other publication that we put out on GDP day on climate change insights. And if you take all those three publications as a whole, so the quarterly GDP figures, the quality of life in the UK and the climate change insights publication, you're basically allowing the public and policymakers to look and understand, okay, what's the latest developments in the economy? What's the latest developments in society and people's well-being and what's the latest environmental developments? And it's allowing us to begin to answer that question, how is the UK doing, in a much more holistic way than we've been able to before.

    MF

    So I guess what I'm taking away from this lightning tour of a fascinating and extremely diverse environment, is that when you see headlines saying the economy is neither growing nor contracting, there's a much, much bigger story out there and there's a much bigger story to be learned by looking at the ONS data.

    LM

    That's exactly right. And we're not standing still either as an office as well. We want to make sure that what we're measuring is still what matters most to people. As I said, that's how we started the well-being programme in the first place by going out to the nation and asking them what matters most. That was over a decade ago, and obviously, a lot has changed over the last 10 years. So it felt like a good time to take that step back and think, are we still measuring the best things to measure in our well-being programme, and the National Statistician kicked off a review of those measures back in October. So we're working through that at the moment and in the spring we’ll be presenting some recommendations for how we can do this even better in the future.

    MF

    And where do you think this is going to lead? Do you think GDP might be toppled off its perch and we'll be able to produce one big comprehensive indicator that would bring in all that economic activity as well? Is that where we're headed?

    LM

    I think GDP will always be an influential statistic. As a measure of the productive economy there are huge strengths to it. And strengths are continuing to increase as it becomes, as I think Darren mentioned earlier, more timely, better quality. So GDP is important and will remain important for ONS. But we also know that looking at progress more broadly than GDP is more important than ever to members of the public who want to understand how we're doing, but also to policymakers who are looking at future policies and providing statistics and insights that help both the public and policymakers to make the best possible decisions. That is what we are, as a national statistical institute, all about. So GDP, important, but actually having a full range of data and statistics and insights that go beyond that. That's where the future is.

    MF

    Darren, as the person responsible for producing GDP, that's a challenge for the future then?

    DARREN MORGAN

    That’s right and I think Liz summed it up really well. I think GDP is important, but it's not everything.

    MF


    Well thanks very much to all our guests for a fascinating discussion there, and we'll put links to some of the ONS publications we discussed in the programme notes for further reading.

    I'm Miles Fletcher. And thanks for listening to Statistically Speaking. You can subscribe to new episodes of the podcast on Spotify, Apple podcasts, and all the other major podcast platforms. With thanks to our producer Steve Milne, it's time to say, until next time, goodbye.

  • Miles explores how data linking can help tackle cross-cutting issues in an increasingly uncertain world, and how the ONS’ new Integrated Data Service will provide a step-change transformation in how researchers will be able to access public data.

    Joining him are ONS colleagues Bill South, Deputy Director of Research Services and Data Access; Jason Yaxley, Director of the Integrated Data Programme; and award-winning researcher Dr Becky Arnold, from the University of Keele.

    TRANSCRIPT

    MILES FLETCHER

    Welcome again to Statistically Speaking - the Office for National Statistics Podcast. I'm Miles Fletcher and in this episode, we're going to step back from the big news-making numbers and take a detailed look at an aspect of the ONS which is less well known, but arguably just as important.

    The ONS gathers an awful lot of data of course, and much of it remains valuable long after it's been turned into published statistics. It is used by analysts and government, universities and the wider research community. So we're going to explain how that's done and look at some really interesting and valuable examples of how successful that has been to date. And we're also going to hear about a step-change transformation that's now underway in how public data is made available to researchers, and the future potential of that really important, exciting process. Our guides through this subject are Jason Yaxley, Director of the ONS’s integrated data programme, Bill South who is Deputy Director of the Research Services and Data Access Division here at the ONS, and later in the podcast we’ll hear from Dr. Becky Arnold who is an award-winning researcher from Keele University.

    Right Bill, set the scene for us to start with then, we are talking here about the ONS Secure Research Service, take it from the top please. What is it? What's it all about? What does it do? What do we get from it?

    BILL SOUTH

    Hi Miles, thank you. Yes, the Secure Research Service, or the SRS, is the ONS’ trusted research environment. We've been running now for about 15 years, and we provide secure access to unpublished de-identified micro data for research that's in the public good. So in terms of numbers, we hold over 130 datasets, we've got about 5,000 researchers accredited to use the service and about 1,500 of those would be working in the system at any given time on about 600 live projects.

    MF
    So what sort of data, what is stored and what's made available? Is this survey responses?

    BS
    Traditionally the SRS has held most of our ONS surveys. So that's the labour market, business...all of our surveys really. In the last four years, thanks to funding we've received from Administrative Data Research UK (ADRUK), we've been able to grow the amount of data we hold, so now we've increasingly got data coming from other government departments. And we've got more linked datasets that enable us to offer new insights into the data.

    MF
    And so these are people's responses to survey questions and people's records, as well as data that are held by other departments?

    BS
    Indeed, yes, the data coming from other departments is often administrative data, so not from surveys but more admin data.

    MF
    And a lot of the value in that is in being able to compare and to link this data to achieve different research insights?

    BS

    Absolutely. I mean, a good example of that is a dataset that's been added in the last year or so where our ONS census data from 2011 was linked to educational attainment data from the Department for Education into a research dataset called Growing up in England (GUiE). And it's hugely important because we have a lot of rich information from the census but you know, linking that with the educational attainment data offers new insights about how kids do at school, and how that's linked to the characteristics of their background.

    MF
    So you use the underpinning of census to provide a really universal picture of what's going on across that particular population, and therefore gain some insight into how people have achieved educationally in a way that we wouldn't have done before. Of course, all this and the power of it is clear in that example, but a lot of people might think, oh my gosh, they must know an awful lot about me in that case. Tell us about how privacy and anonymity are protected in those circumstances.

    BS
    Yeah, absolutely. It's a central part of our operation, and clearly the word secure in the name is key there. So we follow a five safes principle which underpins everything we do. The five safes are safe people, so that anyone who uses the SRS has to be trained and go through an assessment to be accredited by us to use the environment. Once they're accredited, they then have to apply to have a project that's running in the system, and that gets independently assessed. There are a number of checks around whether it's ethically sound, whether the use of data is appropriate, but the key thing really is around the public good. So all research projects that happen in the SRS have to be in the public good and there's a commitment to be transparent. So every project that happens in the SRS, there's a record which is published on the UK Statistics Authority website. The third safe is around the settings, so it's a very controlled environment where people access the data. The fourth safe is around the data, so although we've said it's record-level data, it's de-identified. Names and addresses, any identifiers are stripped out of the data before researchers can access it. And the final safe, the final part of the researcher journey if you like, is around outputs. What that means is we do checks to ensure that when any analysis leaves the environment no individual or business can be identified from the published results.

    MF
    So in essence, you must convince the ONS that you are a bona fide researcher, and you also have to convince them that what you're doing is definitely for the public benefit.

    BS
    That's right. And the other thing that's worth noting is that the SRS, like a number of other trusted research environments across the country, has been accredited under the Digital Economy Act to be a data processor, which means we go through a rigorous assessment process around the security, the environment, but also our capability to run it. So that's our processes, our procedures, whether our staff are adequately trained to run the service. That's a key part of that accreditation under the Digital Economy Act.

    MF
    So, on that point then about anonymity, you can drill right down to individual level, but you'll never know who those individuals actually are or be able to identify them?

    BS
    That's right. Researchers typically will run their code against the record level data, but when they've got the results of the analysis, there are clear rules that say you won't be allowed to take out very low counts. So that means like our published outputs, there's no way of identifying anyone once the research is published.
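
    The output check described here, not releasing very low counts, can be sketched as follows. This is only an illustration of the general idea: the function name `suppress_low_counts`, the threshold of 10 and the example table are all assumptions for the example, and the actual SRS output rules are not stated in this conversation.

```python
def suppress_low_counts(table, threshold=10):
    """Replace any count below `threshold` with None before release.

    `table` maps a category label to a count. The default threshold is a
    hypothetical value chosen for this sketch, not an official rule.
    """
    return {
        category: (count if count >= threshold else None)
        for category, count in table.items()
    }


# Invented counts by occupation group, for illustration only.
raw_counts = {"Group A": 250, "Group B": 3, "Group C": 41}
released = suppress_low_counts(raw_counts)
print(released)
```

    The researcher's code still runs against the record-level data inside the environment; only the aggregated, checked table leaves it.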

    MF
    And the SRS has built up over the years a good reputation for actually doing this effectively and efficiently.

    BS
    Yes, I think that's fair to say. We have a good reputation, and the service is growing in terms of the number of datasets and the number of projects and the number of people using it. So, I think that speaks for itself.

    MF
    Okay, let's pull out another I think powerful example of why this facility is so important and that comes from the recent COVID pandemic. Many listeners will be aware that the ONS ran a very, very large survey involving upwards of 100,000 people providing samples, taking COVID tests, and they were sent off to be analysed creating an awful lot of community level data about COVID infections, and we in the ONS then publish our estimates and continue to do so as we record estimates every week of fluctuating infection levels. But behind all that work, there were expert researchers in institutions around the country who were doing far more with that data. And the SRS was fundamental to delivering the data to them. Tell us about how that operated Bill, and some of the results that we got out of it.

    BS
    Yeah, sure. I mean, the COVID infection survey that you refer to there, that dataset is available for accredited researchers to apply to use, and they have done, but we've also brought in a number of others, about 20 COVID related datasets are in the SRS, so things around vaccination or the schools infection survey, mortality, etc.

    So since the start of the pandemic we've had over 50 projects that have either taken place and completed, or are currently underway, in the environment. Some of those are directly using the COVID related dataset. So looking, if you like, at the health impact, but there's also projects that are looking at, if you like, non COVID data, economic data or education data, that are projects dedicated to understanding the impact of COVID.

    MF
    What sort of insights have we seen from those?

    BS
    In terms of those using the COVID related data there's been analysis to highlight the disproportionate impact of the virus on ethnic minorities, that went on to inform a number of government interventions. Another project assessed the role of schools in Coronavirus transmission. We had another project that was run specifically on behalf of local authorities to inform their response to the pandemic that offered insights into the risks between occupations. Also research into footfall in retail centres and how business sectors were affected by the pandemic. So a really huge range of things. There were other research projects looking at the impact and you know, an example there was a project that looked at learning loss. So, kids not being in school for that sort of 2020 to 2021 academic year. Similarly, the Bank of England ran a project looking at the financial stability of the UK during the pandemic period. So hopefully those examples give you this sense of the range.

    MF
    An incredibly impressive array of projects, all underpinned by that big survey, the likes of which the ONS has a unique ability to run: a survey run across the United Kingdom, with people answering questionnaires as well as providing samples. And don't take our word for it, I mean, it was reported in the Daily Mirror no less. A researcher who benefited from that data described the COVID Infection Survey as, when it came to the pandemic, one of the most valuable resources on the planet. So that's a powerful example of the research value that can be extracted through the secondary uses of data gathered by the ONS.

    Anyway, enough of blowing our own trumpet, the service has been running a very successful award scheme that recognises the achievements of external researchers Bill. Tell us about some of the projects that have been recognised in that.

    BS
    It’s worth mentioning I think also that we've got case studies on our website, the Secure Research Service website and the ADRUK website, which show in a little bit more detail the impact some of these research projects have had, but like you say, we also hold an annual Research Excellence Awards, which is great. We have different categories of awards where people can submit their project and explain where their research has been published and had an impact. And like I said, we get a lot of nominations and reviewing the applications, which I did last year, it really emphasises the breadth and quality of the research taking place in the SRS.

    MF
    Check those out then if you're interested in learning more about those projects, some of the examples that Bill mentions and winners of the Research Excellence Awards, of course, one of whom I'm very pleased to say joins us now and that's Dr. Becky Arnold from the University of Keele, who took home the cross-government analysis award for her team's work on controlling the spread of COVID-19 in vulnerable settings in a project undertaken at the UK Health Security Agency.

    Becky I guess that's but another example of the kind of secondary uses of the COVID infection data. Welcome to the podcast. Please tell us all about that.

    Dr. Becky Arnold
    Yeah, very, very glad to. So first thing I want to talk about essentially is what a vulnerable setting is. And that was really key to the sort of cross governmental aspects of this because vulnerable settings are settings like care homes, hospitals, prisons, schools, where you have a lot of quite often vulnerable people in a really dense environment where COVID can sort of spread and get out of control really quickly. And if we want to define a testing policy for that, so our testing policy being perhaps everybody takes like three LFT tests a week, or maybe one monthly PCR test, but also other factors, like what's your isolation policy? So, if somebody is infected with COVID, how many days do they have to be isolated for? Do they need a negative test to be released? What is your outbreak policy in these institutions, if you know that there's an outbreak going on? It's this really, really complicated thing. And you know, for government policy, you need a testing regime to try and keep COVID under control in these settings. But there's a few difficulties with that. The first thing is that the settings are all really different. So, when I just mentioned about the cross governmental thing, it meant interacting with lots of different departments, lots of different data sources to try and understand these particular settings and their particular characteristics. The really, really critical point I want to make is that the whole project was about trying to understand what that testing policy should be. And the best testing policy in one setting may not be the best testing policy in another setting, because when we're trying to give advice to policymakers and policy departments about what testing strategy you should use in an institution, you don't want to just pull that out of the hat. You don't want to just go oh, I think this many LFT tests a week. We want to give data-driven, informed, evidence-based advice. 
    So essentially, what this project was looking at was all of these different settings in a lot of detail, looking at the demographics within them and their particular vulnerabilities. So, care home residents are particularly vulnerable, as are people in prison. They're more clinically vulnerable than people of the same age that are not in prison. And we looked at a bunch of different aspects: how people interact in these different settings, how infection spreads in these different settings. And from that, essentially, we created a model where you can simulate the spread of COVID in these different settings under different testing strategies. So, you can answer questions like if we use ‘x’ testing strategy versus ‘y’ testing strategy, what is the likely impact going to be on the number of people that die, the number of people that need hospitalisation, how many of those people that go to hospital are going to need intensive care, which often comes with long recovery and sometimes permanent impacts on people's lives. So, there are huge things to consider. And actually the point of this project was to study these environments and try and make something which can provide that evidence to inform decision making.
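
    As a rough illustration of the kind of comparison described here, the sketch below runs a very simple deterministic outbreak model for one closed setting under two testing strategies. It is not the team's model. The `Strategy` class, the `simulate` function and every parameter value (transmission rate, recovery rate, test sensitivity, fatality rate) are assumptions invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """A hypothetical testing strategy; all values are illustrative."""
    name: str
    tests_per_week: float  # routine tests per person per week
    sensitivity: float     # chance one test flags an infectious person


def simulate(strategy, population=100.0, initial_infected=1.0, days=120,
             beta=0.3, gamma=0.1, fatality_rate=0.05):
    """Return (total_infected, expected_deaths) for one closed setting.

    beta, gamma and fatality_rate are made-up inputs, not estimates.
    """
    susceptible = population - initial_infected
    infectious = initial_infected  # infected and still mixing with others
    # Share of infectious people detected (and isolated) on a given day.
    daily_detection = min(1.0, strategy.tests_per_week / 7.0) * strategy.sensitivity
    for _ in range(days):
        new_infections = min(susceptible,
                             beta * susceptible * infectious / population)
        recovered = gamma * infectious
        # Isolated people are assumed to stop transmitting entirely.
        isolated = daily_detection * (infectious - recovered)
        susceptible -= new_infections
        infectious += new_infections - recovered - isolated
    total_infected = population - susceptible
    return total_infected, fatality_rate * total_infected


if __name__ == "__main__":
    for strategy in (Strategy("no routine testing", 0.0, 0.0),
                     Strategy("three LFTs a week", 3.0, 0.7)):
        infected, deaths = simulate(strategy)
        print(f"{strategy.name}: {infected:.1f} infected, "
              f"{deaths:.1f} expected deaths")
```

    Changing inputs such as `beta` or `fatality_rate` is the sort of adjustment that would let a sketch like this stand in for a different setting or a different disease, although a real model would also need the interaction patterns and demographics discussed above.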

    MF
    This was data being gathered, presumably then in institutional settings up and down the country and then being collected centrally and made available to you at a single point of contact?

    BA
    It would have been very nice if that was the case. Because we're looking at so many different settings we were kind of scrambling around quite a lot just to try and identify what datasets were available and to sort of gather them together. And also there were so many different types of data that we needed to drive this. So firstly, like you say, the health outcomes data, in some cases, there were specific datasets available for certain institution types, but we weren't always able to get access to those for various reasons. But there were also considerations like the sort of data that was published every day, so there's sort of a nationwide aspect. Another data type we were looking at is how people interact within these different settings. For that we used an awful lot of literature review. We spoke to people that work in the settings. We spoke to people that work in care homes, we spoke to care home franchise owners to understand their staffing policies and things related to that. We also spoke to government departments like the Department of Justice. So, it was a lot of different data sources all sort of gathered together for the various aspects of this project.

    MF
    This model you’ve created, what's its future? Perhaps in different scenarios that might arise in the future.

    BA
    The model was very, very carefully constructed to be as flexible as possible, at the time with potential future COVID variants in mind, but because of that, it means it's very adaptable to different infectious diseases. So if you change just a few input parameters, like the mortality rates, you know, the infection rate, a few factors like that, it's quite easy to transform this model to simulate the spread of other infectious diseases. So, things like flu, which has a big impact on care homes every year, and it has the potential to be used to better understand how to combat that. But another thing that I think is very useful about this model is it has the ability to help us game plan for potential future pandemics, because I think it's fair to say that governments around the world when COVID came along, were kind of caught by surprise, or wrong-footed, sort of without a game plan of how to respond. And as we know, the early stages, whether it's a single pandemic or an individual outbreak, it's those early stages which are really, really critical. With this sort of model, we can game plan, you know, what response should we give if we have a future pandemic with these properties? Say we've got this transmissibility, it's got this mortality rate, we have tests that cost this much and they give you this accuracy. In that scenario, what should we do? And to be able to do that research upfront and to have some sort of game plan in mind so that if and when future pandemics come along, we are better prepared and can respond efficiently and quickly to try and have the best outcomes possible. So that's something I think is really exciting for the future of this model.

    MF
    Okay, that's beautifully explained, thank you very much indeed.

    Bill, so we've heard from Becky about how the data that she had to access had to come from many different places, but I guess that might have been an impediment to actually producing a model as rapidly in the pressing circumstances of the pandemic as it could potentially have been achieved. Does that suggest then that while the SRS has achieved on its own terms, a great deal, nevertheless, there have been limitations, and perhaps it's time to be doing this kind of data sharing across the public sector in a much bigger and better way?

    BS
    Yes. When I look at the sort of challenges and limitations around the SRS, I think there's probably three things, one of which is the ability to get the data sharing moving as fast as we need to meet this sort of policy need. The second area would be around the fact that actually the SRS is ageing technology now, and although it's performed really well, and especially during that sort of pandemic response we talked about earlier, it's fair to say it has struggled to cope with some of the really sort of heavy processing requirements that have come out of that sort of COVID response. Some of the modelling required was much larger than the traditional sort of research projects we might have had in the SRS. And then the final thing is around some of the processes that we described earlier, that sort of five safes framework. All of our processes and rules apply to users, regardless of their sector. What that means is for government analysts who are seeking to access government data, working on government systems to inform government policy, there's a feeling that we could do things faster. Only 25% of our user base is government analysts at the moment, you know, I think that's something we certainly could improve to build that area of the service.

    MF
    Building the service then for the future is where Jason comes in, Jason Yaxley. As the director of the new Integrated Data Service, we've heard about potential, we've heard about the opportunity to do more in future. Tell us then about the Integrated Data Service, which promises to expand the amount of data available to researchers to speed up the delivery of it and to really produce a huge step-change or transformation in the ability of researchers to do this kind of work in the future. Is that a fair expectation?

    Jason Yaxley
    Hi Miles, pleased to be here. Yes, I think it's a very fair expectation. So I have the pleasure of being the programme director for the Integrated Data Programme, which will deliver the Integrated Data Service and the ONS is the lead delivery partner for all of government to deliver a transformation both in how government uses data, but also the underpinning technology that enables us to analyse and use that data much more quickly. And so that's a reason why we're one of the key enablers of the government's data strategy and why I view this very much as a transformation rather than just another big data lake where lots of government data goes and we can't really get into it. So, it's a really exciting opportunity. We're in the sort of middle stage of the programme where we have a service that is built and now we have to sort of grow it and expand it and get more data to really enhance its functionality, but it's a really exciting time. A really great job to have.

    MF
    And in terms of scale, what's the difference between IDS coming in, the Integrated Data Service, compared to the old, if I can put it that way, Secure Research Service?

    JY
    When it comes to the SRS, it is brilliant at what it does, but its technology is starting to age and that is causing limitations. And I think what makes the Integrated Data Service sort of a step-change and perhaps unique across government falls into sort of four broad categories. There's the enabling infrastructure itself, which will be state of the art cloud-based, there is the data which will be much more friction free and will be quicker and easier to access data, use data, share data. It will enable data visualisation in a way that's never been done before. And rather than having to do individual agreements to link one bit of data to a different bit of data, what we will have here is a service for people that will be scalable, repeatable, standardised, which makes it much much easier on a regular basis to link and index and then do research against much larger datasets much more quickly and produce faster results, which is going to be a huge benefit to the public good through the lens of better more informed and evidence-based policy decision making, that has much more statistical and analytical evidence that sits underneath it.

    And so we're transforming both the data access itself and the technology that enables that, but also the sort of almost the cultural lens through which we work together. We share information to simplify it. I really want to stress the IDS is keeping all the really good parts of SRS around the five safes, around the de-identification of data, protecting that data and ensuring that you know, public concerns about how government holds and uses data are entirely met.

    MF
    That's an obvious question isn't it, if this is happening much more widely on a much bigger scale, how are those safeguards that we heard about from Bill going to be protected? How are they going to persist, and the same level of protection be provided?

    JY
    2023 is a big year for the programme, particularly March when we hope and we're aiming to receive our own Digital Economy Act accreditation in the same way that the SRS has. So we will carry forward the same safeguards that SRS has used so successfully, as I say around the five safes around, how users are accredited, but through technology and through the service that we operate, to streamline and simplify that, particularly for government users using government data. So this is about that cultural journey as well as that technological journey. Very central to what we're doing is the security of data, the protection of data, you know, we have to convince all of the Chief Technical Officers and all the data analysts across Whitehall that we are as safe and as secure as we could possibly be. So that they'll be comfortable with us having access to that data.

    MF
    And the potential is that most UK government data will be made available and accessible to researchers?

    JY
    And that's the end game. Absolutely. As I say, we're on a journey at this point. Again, 2023 is important to us. We've just brought in what we're calling super early adopters, which are strategic experienced government analysts from both Whitehall departments and the devolved administrations, particularly Welsh Government right now, and we have brought census 2021 data into the system very early. And so we're already working with government analysts to start to do early exploratory projects that unlock the information and the power of the census data against certain government priorities, for example, around the economy or around energy, and particularly, we're working with Welsh Government to look at what is the impact of the recent economic situation on the Welsh farming community and how can we analyse the industry against the information that we hold in the census data and other data sources to find outcomes of what's happened in say, the last 10 years between the two census datasets.

    MF
    So what happens next, what are the next steps on this? And particularly what's the message to researchers who think that they would like to be involved in this project?

    JY
    2023’s really big steps are, as I've just mentioned, DEA accreditation, and we reach the next level of maturity for our functionality also in March, which means in the rest of 2023, having had these two points in time, we’ll be in position to unlock the full sort of power of IDS. We will be wanting to encourage particularly more government researchers. Our aspiration is that every government professional analyst will be registered on and be able to use the service. We will accelerate our pipeline with Whitehall departments with data that we want to bring in. And over the life of the programme we will want to transition SRS itself, and its data and its users into IDS unlocking for those users as I say, the enabling technology of data visualisation, the speed and the pace, the scale. So, I at the moment feel that what we have is a huge warehouse with one corner that has data in it but the potential to fill it with as much data as we can in a way that is linked and matched and indexed. So that you can do much greater analytical research than hitherto has been possible. Just to illustrate that, the way I like to think of it is there are a lot of people both in government and in academia that can do point to point linkage between dataset A, dataset B, and then run some research against it. And you can think of that perhaps as a ferry crossing a river from point A to point B on the other side. What helps visualise why IDS will be different is to think of us as a bridge and a road that goes over the river and so we can have multiple streams of traffic. We can have a much greater flow of information and research and all the agreements only have to be done once and then it's just repeatable from there. And that's one of the reasons why I'm so excited to be working with the colleagues on the programme and colleagues across government and academia to deliver the transformation which we aim to complete by March 2025. So we still have some way to go to fully exploit all of the technology and get all the data in, but we're on our way.

    MF
    In the meantime however, there are a couple of examples already out there that listeners might care to check out for themselves if they haven't already. The first of which is the climate statistics data dashboard, creating a one-stop shop if you like for statistics on climate change related topics, bringing together data from around government, you can see it at climate-change.data.gov.uk and another one is the violence against women and girls data dashboard that's vawg.GSS-data.org.uk, which has been created as an important part of the government's 2021 tackling violence against women and girls strategy. And of course, the very popular and widely used COVID dashboard which continues to be available as well. So real living examples of the Integrated Data Service already serving the public benefit.

    Becky, if I could bring you back in again, if we're able to deliver on this and the warehouse as Jason described, it becomes bursting with data from right across government sources, presumably then in the future, the kind of work you told us about your award winning work during the pandemic will become that much faster, much easier to execute.

    Dr. Becky Arnold
    Yes, it really, really would. And I also can't overstate the integration value of it: having things in the same place and linked just saves so much time, because trying to track down what data is available and then trying to combine it all together is such an undertaking. Having that sort of delivered there, sort of knowing what is available in a much more accessible way, and being able to use it much more readily would vastly, vastly speed up the sort of research that I did. So it would be hugely, hugely valuable.

    MF
    Perhaps some of those listening to this Becky might be surprised actually at how difficult it has been to access public data like this in the past, and that government departments haven't collaborated in making it available in a single place.

    BA
    One of the biggest difficulties in doing the research I did was trying to get access. Just trying to find what datasets are out there is also a really, really big time sink and the idea of these all being integrated together and much more findable in a way that they aren't now is really, really exciting because it means that if you know what data there is you can use the most appropriate data for what you're trying to use, rather than trying to cobble together what you know exists and you can get your hands on. So integrating this all together in one place where it's findable. It would be a huge, huge win for the sort of research like what I did - or what my team did a lot more accurately. Another factor on that as well is the linking. It is so difficult if you've got different datasets compiled for completely different purposes by different departments - trying to combine those together is really hard. Even if they are about the same sorts of people, the same sorts of things. So having datasets that are already integrated would be a huge, huge step forward in trying to use that data as effectively as possible for the sort of research to drive evidence-based decision making in policy, which I think is something that is so important, and it's something I'm really passionate about.

    MF
    Becky, thank you very much for joining us. And thanks also to Jason Yaxley, and to Bill South for taking us through this important topic.

    I'm conscious that we've approached it largely through the perspective of researchers. The whole issue of data ethics and how public good is assessed is something we've tackled in a previous podcast - do please listen to that and hear about the work of the data ethics committee as well because obviously, confidence in these kinds of initiatives, public trust in these kinds of initiatives, depends very much on people understanding the ethical framework under which this work goes on. That's another big topic we will return to in the future, no doubt, and we will also track progress in the ongoing development of the Integrated Data Service and the progress of some of the fantastic research projects that have already resulted from this kind of work, and the potential ones, very excitingly, in future too.

    I’m Miles Fletcher, and thanks once again for listening to Statistically Speaking. You can subscribe to new episodes of this podcast on Spotify, Apple podcasts and all the other major podcast platforms.

    Our producers at the ONS are Steve Milne and Alisha Arthur. Until next time, goodbye.

    ENDS

  • National Statistician, Sir Ian Diamond, joins Miles in a slightly festive episode of Statistically Speaking, to look back on some of the highlights and challenges for the ONS in 2022 while gazing positively, but objectively, towards 2023.

    TRANSCRIPT

    MILES FLETCHER

    Hello, and as another statistical year draws to an end you join us for a slightly festive episode of Statistically Speaking.

    I'm Miles Fletcher and with me this time is the national statistician himself, Sir Ian Diamond. We're going to pick out some of the key stats from another momentous year. Talk about some of its highlights and the challenges faced by the Office for National Statistics. We’ll gaze positively, but objectively, into 2023 and Sir Ian will be answering some of the questions that you our listeners wanted us to ask.

    Ian, welcome once again to Statistically Speaking.

    IAN DIAMOND
    First, thanks very much for that introduction. And can I offer festive greetings to all of your listeners?

    MILES
    Yes, it's come around again quickly, hasn't it? So much to talk about from the past year, but let's kick off with a very big number in every sense, and that's 59,597,542

    IAN
    ...is the population of England and Wales according to the census, and one, which I have to say is one of the greatest censuses that has ever been undertaken. And it's just an absolute thrill to commend my colleagues who have worked so hard to deliver it but also to every citizen of England and Wales who filled in those forms in 2021, and of course, those in Northern Ireland as well.

    MILES
    Now, you had to press the button, both on the decision to have that field operation go out in March 2021, against the backdrop of the pandemic, and then of course, to sign off on the results. How difficult were those decisions?

    IAN
    Well, I'm not going to say it was difficult Miles, I mean, it was a difficult decision, but if you surround yourself with all the information, so before we took the decision to go with a 2021 census, we looked at all the upsides, all the downsides. We measured the risks. We looked at the cost of delaying and we looked at the chance that we would get a decent count, and whether people were looking like they were now prepared to fill in forms, which have a whole set of risks. Was there an algorithm that told us what to do? I'm afraid there isn't an algorithm at the end of the day, I had to make a decision. I made that decision in collaboration with my colleagues. It was a decision we took together, and I think in every way it was the right decision. And it was a real privilege for me to work with the team in March and April, as we looked at the numbers, and for the first time, and I think it's a really important milestone, that for the very first time we shared our results with the local authorities. I have always believed that you need to involve the people on the ground to sense check the numbers and so for the first time ever, we invited local authorities to be part of the quality assurance process. So we contacted them under a nondisclosure agreement. You have access to the numbers, let's have a conversation and then we can co-create the numbers so that we all feel comfortable and local authorities to their great credit, really embraced this opportunity to co-create what was a great piece of work. We believe that helped, that the numbers that we were able to produce, we felt we had much more traction. And so it really was a national effort to produce those numbers. And I'm very proud of them.

    MILES
    In hindsight, and of course, it's easy to look at things in hindsight, but did you think it helped that essentially there was a captive audience?

    IAN
    Not at all. I completely disagree. I think the reason for the high numbers wasn't a captive audience. Let's remember that a very high proportion of the population were not able to lock down, they had to go out to work. I think that we got high numbers for three reasons. Number one, engagement. A massive programme of engagement with different communities, which really, really, really meant that people in different communities of our country understood why we were asking, what the reasons were, in a way that perhaps hadn't happened before, and critically to say to people, if you give us your data we're not going away. We'll be back. And there's now a programme of going back and sharing those data for particular communities with them. So that's the first reason. The second reason was, I've always said that censuses are nine tenths logistics and one tenth statistics and I felt that the logistics here were absolutely right. And moving to an online first model was incredibly important, it made it very easy for people to respond. You could respond on your way to work on your mobile phone. That's an awful lot easier than having someone knock on your door with a big form. And so I think that worked. And then a final piece was after the day having really good management information, which really enabled us to understand where our coverage was higher and lower, and then to target our field workers in a way that we've never been able to before. Historically when I did censuses, for example the 1981 census, every enumerator had a small area, they walked around, they found people within that area. But we were able to say right, we need more people in a particular area, less people in another area, so we were moving them around, maximising the resources and maximising the count.

    MILES
    Okay, so what do you think are the biggest takeaways on the data we've released so far?

    IAN
    I think some of the work around the ageing of our country is really important, but not just the ageing of our country because let's be honest, ageing is associated with demand for services. And what we show very clearly is a changing geography of ageing. Now, that's an ongoing situation. So if you look at the proportion of over 65s, it's a very different proportion of over 85s and so there is clearly a new internal migration which gives in some areas, for example, mid Wales and Cambridgeshire, a new demographic to think about for services over time. So here's a really interesting point about the geography of ageing, while noting that some of it is pretty traditional, the south coast of England remains a place with high levels of older people. Seaton in Devon, with the highest proportion of people over 90 in the country, is an area which already knows that it has a high demand for services. Other places will be coming along, and I think that’s the first thing to say. The second thing I would note Miles is the changing demographic of where people were born. And certainly we are able to reflect some of that in the work but also again to look at the geography of where different people are living. And that's important. And also, for the first time ever, we have asked questions on veterans, and I think that was a really, really interesting piece of information. I must admit that the age distribution initially looks a little surprising, because for men, almost everybody is a veteran over the age of 80 because of national service, and that goes down, but we now have the ability to identify both the geography and the age distribution of veterans and it was noticeable that the highest proportions of veterans tended to be in places with military bases, Richmondshire, in Yorkshire, which is near to Catterick or Portsmouth near the Navy areas.
    That says to us, and I'm not saying it's surprising, that people who have been in the military tend to end up staying around the areas where perhaps they have been based, but actually being able to do that and then following that up with a survey, a survey of veterans to understand their circumstances and the services they need, and also their families, I think is really super important. And I have to say that that survey, which went out after the results of the census were published, and which we were able to launch on the same day with the Minister for Veterans' Affairs, Johnny Mercer, has been an incredibly successful survey. Great response. And we're just in the process now of analysing those data. And that's something to look out for in the new year.

    MILES
    And plenty more census data still to come, of course.

    IAN
    Well, yes. And of course, the data will be available now for analysis by anyone. And that's really exciting.

    MILES
    Well worth pointing out as well. Okay, here's another big number for you. 11.1%

    IAN
    Is inflation.

    MILES
    That was the figure in October, it's recently dropped down to 10.7.

    IAN
    You don't really understand inflation until you actually get down to what's driving it and what the components are. And so, we spend an enormous amount of time looking at the components to understand them. So this drop to 10.7% in the most recent data is driven by a reduction in fuel costs, with fuel prices going down, I mean it's still too expensive don't get me wrong but they're going down a bit, and at the same time that has been offset by increases in alcohol prices at hotels, restaurants, and pubs. And so all put together, yes it’s a drop, but not an enormous drop, and still a significant rise compared with the same month last year.

    MILES
    Now there's been a fascinating and very public debate over the cost of living of course, and particularly over the relevance and validity of headline inflation measures, CPI or CPIH. A preferred measure on the one hand, and on the other hand, the actual experience of people seeing the cost of their weekly shopping shooting up much faster than the official rate, which is just an average of course, would suggest.

    IAN
    I think it's an important point. I had a very good conversation with a number of influencers in this area. And I think it is important to recognise that what one is asked to do, and we are statutorily responsible for producing an inflation statistic that is an average at the end of the day, and it's based on a basket of goods, and that basket gets changed every year to reflect buying patterns. So with a pandemic, we were more relaxed, Miles, and you would be sitting opposite me just wearing a jumper instead of a three-piece suit; it means that we took men's suits out of the basket this year, but that's an average. The question that people have asked is: does that average reflect what's going up for all groups of society? What about those people who are at the poorer end of society and whose budget only allows them to buy the least priced goods? And that's why we put together a least price index, one that's based on what might be called the value goods that supermarkets sell. And if we look at those we found that the average price there was not unlike the overall inflation, but again, an enormous amount of heterogeneity on the various prices. The highest increase in the most recent products was for vegetable oil, of course, driven by the issues associated with Russia and Ukraine and the difficulties of the Ukrainian farms which drive so much of that area. On the other hand, beef mince and orange juice went down relatively. So there was heterogeneity, inflation was high, but let me be very clear, not unlike the overall inflation in the country as a whole on the average.

    MILES
    The important point here being that everyone's rate of inflation, of course, is slightly different and we have a means now of allowing people to find out exactly what their personal rate is, don’t we?

    IAN
    For those people who want to have a really close look at their budget, there is the personal inflation calculator, which people can use, and that personal inflation calculator has been massively used. We had a very good partnership with the media - BBC, The Guardian - for it to be widely available. And indeed, in the first 24 hours or so of it being available on the BBC website, over a million people used it - over a million people accessing ONS data.

    MILES
    And you can find that out of course by visiting ons.gov.uk and calculating your own personal rate of inflation there.

    Of course, when we think about money, we inevitably think about work and that brings us on to the figures around the labour market. And one rather sombre area of the Labour Force Survey that's been the focus of again, a lot of attention this year, is the increasing number of people deemed to be economically inactive, perhaps very often because of long term sickness. Now, what do you make of that?

    IAN
    Economically inactive is not just people who are on sick, I mean there has been a steady move initially from those over age 50 to inactivity, and that means that they are reporting that they are not in work, nor are they looking for work. We've called it a bit of a flourish, that flight from the labour force of the over 50s is a real trend and a real worry for the economy, given the skills that those people hold, and we've done two surveys of the over 50s to understand why they have left the labour force and what might tempt them back in. 500,000 over 50s leaving the labour force, though it's only a very rough indicator, if you don't replace them somehow, and with every 100,000 people being around 0.1 of GDP full time equivalents, and that's 0.5 on the GDP. It's as simple as that. The other point I would make that I think is important is another real concern for the labour force. Just in the last few weeks we have started to see just a hint of an increase in inactivity amongst the 16 to 24s. That is important because if it were to continue it is normally an indicator of challenges in the labour force and when 16 to 24s are saying I don't have a job and I'm looking for one it tends to be because there isn't one around. And so I do think that there is an issue again for us to keep a laser focus on these numbers as we go into 2023.
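The rule of thumb quoted above can be written out as a short calculation. This is only a sketch of the arithmetic as spoken, not an ONS method: the function name is hypothetical, and reading "0.1 of GDP" as 0.1 percentage points is an assumption.

```python
# Rough rule of thumb from the interview: every 100,000 people (full-time
# equivalents) leaving the labour force is worth about 0.1 on GDP.
# ASSUMPTION: "0.1" is read here as 0.1 percentage points of GDP.
GDP_EFFECT_PER_100K = 0.1


def rough_gdp_effect(people_leaving: int) -> float:
    """Scale the per-100,000 rule of thumb to a given number of leavers."""
    return round(people_leaving / 100_000 * GDP_EFFECT_PER_100K, 2)


# The 500,000 over 50s mentioned in the interview.
print(rough_gdp_effect(500_000))
```

As the speaker stresses, this is a very rough indicator: it assumes the leavers are not replaced and treats every leaver as a full-time equivalent.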

    MILES
    Okay, so we've mentioned GDP and of course, there's been a lot of focus again on the level of GDP and whether the economy is in so-called recession or expanding or whatever. Let's not get into that in any great detail now, but it's worth pointing out that alongside GDP, the ONS has been trying for some time now to broaden its focus on what matters in terms of wellbeing, both socially and economically, and to produce a more comprehensive picture of what's going on, aside from that very raw, basic GDP estimate. Can you tell us a little bit about what's developed on that front this year?

    IAN
    I think that's a really interesting point. We, like national statistical institutes in other parts of the world, have been saying, well, actually, there is much more to our gross domestic product than just what comes strictly from the economy. And so we have been working on the environment and natural capital and building that into our overall estimates. And we're now also working on some things that I have been thinking about for a long time and I'm very excited that we are going to be able to work on that. And that is to look in many ways at the human capital that we have, and how that is being effectively used. If you are spending six hours a day, shall we say, caring for your elderly parent and perhaps your grandchildren, then are you being productive or not? And of course, the answer is you're being incredibly productive. Or if you are, as a neighbour of mine is, working a couple of evenings or a couple of afternoons a week at a homeless shelter in Somerset, then are you being productive in that volunteering? 100% yes. And so I think it is important that we build these extra pieces in now. Is this point about human capital new? Well, the great, famous Nobel Prize winner Richard Stone wrote in his Nobel lecture about this and made some suggestions, but at that time I would submit that it was actually quite hard to build the models in the way that one would want to. One could do the algebra, but it would kind of drop out after a while. Whereas now with numerical estimation, we can really move forward in an effective way and I'm looking forward to 2023 being a year when we really push forward with those models, and really build the human capital. And most importantly, alongside that, the wellbeing. Wellbeing is a much more complex indicator, and we have a consultation out at the moment which I see coming to fruition in 2023 around the measurement of our wellbeing.
We talk about the increasing proportion of elderly and I think it is also important to think about that in the context of how people are ageing. Now, let me just give you a statistic, Miles. If I looked in 1951 at the age at which men had a 1% probability of dying, that'd be about 50. If I looked at it now, it’s 65. So 65 is the new 50. And you can look at things in all kinds of ways like that, but that original idea is that of the great demographer James Vaupel. And this 65 is the new 50 is absolutely brilliant, but, and this is the nub of this, it needs to be healthy ageing. It comes back to that point about inactivity: what are we doing to enable people to feel that they can age healthily and therefore be productive, whether that is through traditional paid employment or through other routes such as volunteering? That's something we will be spending a lot of time over the next little while estimating.

    MILES
    You mentioned ageing, and on the topic of health, 2022 saw the introduction of what some may view as the GDP of health, and that is the Health Index for England. Another important piece of work that's been going on here.

    IAN
    What the Health Index allows us to do is to get down to the local levels and we've got a pilot with colleagues in Northumberland, Director of Public Health up there, to go down to sub local areas. And I think the important thing to recognise is the geographical difference here in levels of health. It's interesting to look at the national level, but we need to look at the geography. Expectations of life at birth for men in Glasgow City are 14 or 15 years less than expectations of life for men in places like Westminster and Kensington and Chelsea, you know, that's a real issue. When I worked in Scotland, the Director of Public Health for Grampian region put out some statistics which showed that within Aberdeen the difference between two wards, probably seven or eight miles apart, was 16, a full 16 years. Those are the kinds of differentials that I think we need to understand more, we would all agree it is a priority to reduce those inequalities in health. And it seems to me there is a challenge for us to understand that and to reduce those inequalities.

    MILES
    Okay, so we've talked about health, personal wellbeing, economic wellbeing as well. Now there's an additional element of attention for the ONS now, and that's been the environment and particularly monitoring progress towards net zero emissions by 2050 and to help with that ONS has contributed to the official climate change portal, which you can view at climate-change.data.gov.uk. Here's a statistic from that, in 2021 84% of our energy still came from non-renewable sources.

    IAN
    And that's what we need to continue to measure. And clearly the focus on energy and energy supply has increased this year as a result of the conflict in Ukraine. And we over the next while need to make sure that we have very accurate data on sources of energy. And our job is to monitor that in an effective and efficient way. And we will do that.

    MILES
    Now, we mentioned to some of our podcast listeners, we'd be speaking to you today and asked them to come up with their own questions on topics they'd like to put to you.

    So let's kick off with this one from Professor Athina Vlachantoni, from the University of Southampton no less, who asks: What's the most intriguing number or statistic you've come across during your time as national statistician?

    IAN
    One of the most interesting I would have to say, was the very first number that we got from the COVID infection survey, because we had to look at it very, very, very carefully, to make sure going back to an individual level, to look at the amount of virus in each positive case, so that we were sure that we did not have a high number of false positives. And what that showed, and when we linked it in with our questions about symptoms, was the number of asymptomatic cases. And I found that really, really interesting. On a lighter note, the data that we get from credit card and debit card sales. On July 21, I think it was in 2021, “Freedom Day” as it was called, when people were able to go to the pub we saw a spike in sales in pubs but we were also able to identify whether those sales were in person or online. We've been monitoring online sales during the pandemic very carefully. And I was really surprised to see a spike in sales in pubs with the person not present. I was wondering whether there were people down the street, you know, with very long straws. Of course, what I hadn't realised is that in some pubs now, you can get an app for your beer and it arrives as if by magic at your table. And so it was a learning experience for me that it was possible for large numbers of people to enjoy a drink, while apparently not being at the pub.

    MILES
    Well, that's a lovely example of fast digital data contributing towards incredible insight, which the ONS is now able to access. But actually it leads nicely on to our next question which comes from Sam Smith, from Cambridge, who asks: What are the longer-term opportunities and threats to the public from the use of safe settings and the Integrated Data Service? Now that's a question that’s essentially about security and the ethical use of data for the public good.

    IAN
    Sam, that's a really super question and something that we're absolutely passionate about. Firstly, using data to impact positively on the lives of our fellow citizens is what we're here for, and therefore we recognise at all times that we use data with the implicit permission of the public. So the first answer I would say to Sam is that we are absolutely committed to public engagement and transparency, to make sure people know what we're doing and how we're doing it. And we don't just talk about data, but what are we going to use it for, and how is it going to be used and can you find out how it has been used. These are really, really important questions and public engagement and involving the public in our decision making is important. Secondly, when we build something like the Integrated Data Service, we are very, very careful about the security and we work very closely with the top security people across government to make sure that we have the highest levels of security so that all the data doesn't need to be in one place. We are able to bring the data we need from different places so that we're not, if you like, moving large amounts of data around and forming data lakes, that is not what we do. Thirdly, we are very, very careful about how people can use the data and how they can access the results. So we work very carefully to make sure that those results give people no way to impact on privacy, and our data can only be used by approved people, and the projects on which they work have to go through an ethical committee and have to go through a research approvals panel.
We call this process “the 5 safes” and we believe that that does enable us to be able to look any member of the public in the eye and say that we are taking every precaution with your data, but at the same time, the proof of the pudding has to be in the eating and the public have to be able to see, I would argue, how those data have been used and how there are real concrete examples of how the lives of them or their fellow citizens have been improved by the use of linked administrative data.

    MILES
    Final question. This comes from Jennifer Boag from Scotland - clue there - and she asks: Do you have confidence that the work being done to retrieve Scotland’s census will give us reliable UK wide statistics, so that Scotland's data will be comparable with the rest of the UK?

    IAN
    Well, thanks, Jennifer, for that. A census is a process and we are seeing that our colleagues in Scotland working on the Census have now got the ability to use the data they collected as well as the coverage survey, and now the administrative data, to be able to bring those three sources together into a reliable estimate of the population. I would just like to thank Professor James Brown and the international steering group for the very hard work that they've been putting in providing very strong steers on what should we do. And my position at the moment is that we can expect, if everything goes well, to see some reliable Scottish data during 2023. And we at the ONS are working extremely hard to make sure that we can roll forward our data in a way that means that we will have the 22 best estimates for the whole of the UK which we can put our hand on heart and say that we trust. We're not there yet. I believe we can get there. And I will do everything in my power to ensure that we do.

    MILES
    Data from Scotland on the way then, and more data from England and Wales still to come, but also in 2023 a decision on whether the UK, Scotland, England, Wales and Northern Ireland perhaps, will have censuses in future?

    IAN
    Well not a decision on all four because undertaking a census isn't independent that Scotland and Northern Ireland will take their own view, as will Wales. Currently we do the census for Wales with our colleagues in Wales, but at the end of the day it is a Welsh Government decision for that to happen. We in the ONS will be making a recommendation to our board and through them a recommendation to Parliament as to whether we believe that we can produce regular population estimates and the multivariate data that comes with them in a way that means that we will not have need to have another census in 2031. I mean, I would say that we're able to do this and there's an enormous amount of work going on. And that's a real major breakthrough because while I'm passionate about censuses and a census is an incredibly beautiful and wonderful thing, I would have to say that it is out of date as soon as you've done it, and therefore being able to have regular estimates would be a breakthrough rather than simply rolling forward and we can't hide from the fact that as you roll forward and you get further rolling forward, it becomes much more difficult at the local area level to make those estimates. And so I am really excited about that decision and will be consulting during 2023 on where we have got to, which of course also brought about a statutory responsibility to see whether we can make local estimates of average income, and we will continue to look at that as well. So I think it's an exciting 2023 with regard to the future of the census.

    Miles, it's been a real pleasure. Thank you very much, and I look forward to another opportunity to join this podcast in the future. Thank you.

    MILES
    Well, that's it for another episode of Statistically Speaking and if you're one of the people who collectively browsed the ONS website 21,809 times on Christmas Day last year, rest assured that this year you'll be able to access every single one of our podcasts from 2022 directly from the homepage now on the ONS website.

    And as always, you can subscribe to future episodes on Spotify, Apple podcasts and all the other major podcast platforms. Do also please follow us on the @ONSfocus Twitter feed.

    I'm Miles Fletcher and from myself, our producer Steve Milne, and the whole of the Office for National Statistics, have a very Merry Christmas.

    ENDS

  • Miles is joined by colleagues from the Health and Life Events team to explore how data is good for our health.

    Within the diagnosis: the Health Index, dubbed “the GDP of health”; the impacts of Covid-19 as well as an ageing society; and the increasing importance of linking data from numerous sources to generate complex insights that inform decision-making.

    TRANSCRIPT

    MILES FLETCHER

    Welcome again to Statistically Speaking the Office for National Statistics podcast. This time we're taking the pulse of the nation's health and exploring the role of public data in making it better. Of course, we would say that statistics are good for you. We recommend at least five a day, but more seriously, what do the ONS figures say about the state of our health now? And what are we doing to create new and better statistical insights to support a healthier population in future?

    With us to examine it all are ONS colleagues, Julie Stanborough, Deputy Director of Health and Life Events, Neil Bannister, Assistant Deputy Director of Health Analysis and, later in the podcast, Jonny Tinsley, Head of Health and Life Events Data Transformation.

    Julie, to start with you. The World Health Organisation defines health as a state of complete physical, mental and social wellbeing and not merely the absence of disease or infirmity. Now, the ONS has begun a major project that seeks to capture the key elements of that in one place and to a certain extent in one single number. Can you explain what that is, and what it's all about?

    JULIE STANBOROUGH
    Yes, so that will be the Health Index, and as you say, it is kind of regarded as the GDP of health. And at its simplest, it allows the health of England and local authorities to be tracked over time, which allows greater understanding of the relationships between the drivers of health and health outcomes. So the index starts in England in 2015. And we've got data up to 2019, which is available online, but we're going to be publishing 2020 figures very shortly.

    MF
    So tell us about the nuts and bolts, what are the data sources here and how have they been put together?

    JS
    We've got a huge number of different data sources that go into the Health Index. We've grouped them into three different themes, that we have healthy people, healthy lives and healthy places. And we use data sources from within ONS, but also from across government, and more broadly, to give that really in-depth breadth of all the data that goes into health.

    MF
    What sort of factors, what sort of elements are we looking at? People living without serious health conditions?

    JS
    Yeah, so it's a whole range of things. For example, looking at child poverty through to access to green spaces, life expectancy, a whole range of different factors which contribute to whether a particular area is deemed to have a high health index or a low health index.

    MF
    Is there particular value - because you can understand wanting to understand disparities at local level and we'll talk about that a bit later - in boiling it down to a single reading, a GDP? That's a very ambitious thing. How useful, how relevant, is that figure going to be? Is it something that in future we will look to regularly and take as seriously as a big number like GDP?

    JS
    I'd really hope so. And I think because health is so complex, if we can boil it down to one number and be able to track that over time, at a national level, or at a local level, that really helps people understand what's going on and helps them to engage, but equally because it has all the different data sources in there, it allows those policy makers in local authorities to be able to go into that data and explore what really is happening in their particular area.

    MF
    More than simply measuring the outputs or successes of the health services, it's about understanding a much wider range of factors as well as the environment in which people live and their socio-economic position as well.

    JS
    That's right. I mean, there are so many different aspects to it. And that's why the Health Index has so many different data sources in there. But because of that complexity, it makes it really difficult for people to understand what they should be doing to improve the health in their areas. So you need that breadth, but then the ability to aggregate it up into a single number helps with the accessibility.

    MF
    So the index will provide this big reading of this multi factor estimate of health but perhaps it'll be the case that it isn't so much what the index says at any given time, but how it changes over time, that'll be its real value.

    JS
    That's right. And it's being able to track that at a national level. And at a local level. We're going to be publishing 2020 results, but we're going to have to be quite careful with those results because it'd be the first year with the pandemic and so we'd expect to be seeing some changes as a result of the pandemic. But equally, some of the data collections will have changed as a result of not being able to interview people in the same way because of lockdown. So we're going to have to monitor that data over 2020 / 2021 and further to really see the impact of the pandemic.

    MF
    And provide also perhaps some measure of people's changing economic circumstances at a time when there's so much concern obviously around the cost of living.

    In the meantime, because this project the Health Index is still in its relative infancy of course we have a wealth of other data already that the ONS generates and brings in from elsewhere and works with. Of course the number one indicator of a nation's health is our life expectancy - how long we might be expected to survive. Tell us what's been happening - the broad picture - as far as life expectancy is concerned.

    JS
    Life expectancy, if I just explain what that is, is a statistical measure which estimates the average number of years a person can expect to live. So male life expectancy at birth in the United Kingdom for the years 2018 to 2020 was 79 years, and that compared to 83 years for females. And during the past two decades life expectancy has grown, but much faster growth appeared in the noughties, and during the 2010s we've seen life expectancy pretty much slow right down and flatten.

    MF
    As well as this obviously the key measure of life expectancy. There's another important dimension here and this is particularly relevant if we're talking about health and that of course is healthy life expectancy because it's all very well to be alive, but if you have not got a great quality of life, well that brings all sorts of other issues and it brings problems for the health service as well of course. Tell us about healthy life expectancy. What is that as a statistic, how is that measured? What are the characteristics that inform healthy life expectancy?

    JS
    It's slightly different to life expectancy. Healthy life expectancy is a measure of the average number of years someone can expect to live in good health or free from limiting illness, and in 2018 to 2020 male healthy life expectancy at birth in the UK was 63 years, which meant that you had 16 years of life in not good health. In contrast for females, they had 64 years of healthy life expectancy, which meant that they had 19 years of life in not good health.
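The gap quoted above is simply life expectancy minus healthy life expectancy. A minimal sketch of that subtraction, using only the rounded 2018 to 2020 UK figures given in this conversation (the dictionary layout and function name are illustrative, not an ONS data format):

```python
# Rounded UK figures at birth for 2018 to 2020, as quoted in the interview.
FIGURES = {
    "male": {"life_expectancy": 79, "healthy_life_expectancy": 63},
    "female": {"life_expectancy": 83, "healthy_life_expectancy": 64},
}


def years_not_in_good_health(sex: str) -> int:
    """Expected years lived in not good health: LE minus healthy LE."""
    row = FIGURES[sex]
    return row["life_expectancy"] - row["healthy_life_expectancy"]


for sex in FIGURES:
    print(sex, years_not_in_good_health(sex))
```

Because the inputs are rounded to whole years, the published gaps may differ slightly from a calculation on unrounded estimates.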

    MF
    That's fascinating and obviously begs the question, has that period of healthy life expectancy been going up in line with overall life expectancy, or have people simply been living longer in poor health?

    JS
    Yeah, so between 2011/13 and 2018/2020, for both males and females, there was no improvement in healthy life expectancy.

    MF
    That goes some way to explaining some of the current pressures on the National Health Service.

    JS
    That's right. I mean, if you've got more people that aren't in good health and have limiting conditions that's going to have increasing pressure on our health services and our GP services.

    MF
    And it does mean also that people are dying from different things, and they might have died younger from different conditions. They're living longer, but perhaps in poorer and poorer health in many cases, and in the end, actually dying from different causes. What are the data saying?

    JS
    So there's a range of different factors which are associated with a healthy life expectancy, and things that you'd probably think yourself. So when we looked at areas across the country with the lowest healthy life expectancy, 29% of males aged 30 to 49 smoked compared to just 17% of those that were in the highest healthy life expectancy areas. So smoking is clearly one of the drivers. We've also looked at whether people are overweight, and more than one in eight children in the lowest healthy life expectancy areas became overweight between entering primary school and starting secondary school. In contrast, those in the highest healthy life expectancy areas, it was just one in every 10. So there's a number of different factors there that we can see are driving it.

    MF
    If any justification was needed on why public health campaigns tend to concentrate on issues like obesity and smoking that's starkly revealed in the numbers.

    So that's the big picture. That's what's happening at a national level. But tell us about the differences from place to place because the local variations are quite significant too, aren’t they?

    JS
    That's right. So to illustrate those geographical variations, Ribble Valley in Lancashire is ranked the healthiest out of 307 local authority areas in England, and that's using the Health Index.

    MF

    And the least healthy?

    JS

    So we do have all those rankings, but we do try to not think about the scores in a sort of ranking capacity. The whole point of having this information put out there is for local authorities to be able to compare themselves with similar local authorities or their nearest neighbour and see how different aspects of health are given the different policy initiatives that they're implementing in their local areas.

    MF
    Because lo and behold, whenever these league tables – and I do emphasise that we don't claim them to be league tables, they're often seen as such - when they appear of course, people want to know where is top. Whereas, surprise surprise, normally it goes with socio-economic status doesn't it. To put it bluntly, the better off areas see the highest life expectancy and healthy life expectancy?

    JS
    Yes, that's right. And even for those areas, you'd want them to be perhaps comparing themselves to other similar areas with the same sort of socio demographics and then to think about where different aspects of, whether it's smoking prevalence or childhood obesity, how are those different areas responding, what are the policies that they're putting in place to try and improve those statistics.

    MF
    Because again, it's not a matter of stating the obvious, which is self-evident, isn't it? Health outcomes tend to be better in more prosperous areas. This has been well known for some time, although we opened a local paper the other day writing up some of these numbers and saying certain towns in the West Midlands have been named and shamed as having the worst health locally. This is emphatically not about naming and shaming areas, neither is it about stating the obvious. As you say it's about informing better health outcomes, so resources can be better targeted.

    JS
    That's right. I was actually looking at a Coventry Marmot City review, and they have been using a whole range of different public health measures to try and improve the outcomes in that area. And one of the key measures they use is healthy life expectancy. They're comparing the outcomes after a number of years in their area to what's been going on nationally. So it's helping them benchmark the initiatives that they've been putting in place.

    MF
    As with the overall Health Index itself, it sets the standard doesn't it. Puts in numbers what is clearly self-evident, but useful numbers because they give you that sense of the scale of the issue at the local level. That's at least as far as England is concerned, but also we've been working with the devolved administrations around the United Kingdom as well, and what do we know about that picture?

    JS
    So on the Health Index, that's actually one of the areas that we were looking to expand. So the Health Index at the moment covers England – we would really like to develop them for Scotland, Northern Ireland and Wales and then create a UK wide one as well. So that's something that we're looking to develop in the future.

    MF
    That's a work in progress, and a ‘watch this space’ then for forthcoming publications, both of the Health Index and of data being compared across the UK as well.

    So Neil, people are living longer, but with that experiencing a whole range of health conditions. Tell us what we're picking up in the data and what's changing.

    NEIL BANNISTER
    That’s right Miles. So age is a very important social determinant of health and an ageing society places a big burden on the health and social care systems in the country. Recent analysis from the 2021 Census showed that nearly one in five people in England and Wales is now over 65, with the fastest increase happening in the 85 plus age group. So there really is a fundamental kind of growth in the ageing population, and that leads to increases in certain disease types. So for example, we know that being in an elderly age group you can experience being more disabled and having more multiple chronic and complex health conditions as well as there being an increase in dementia and Alzheimer's disease. So for example, with dementia, we know that around about 900,000 people in the UK have been diagnosed with dementia and by 2025 it is expected to reach around about 1 million people in the UK. In terms of how we look at it from our data within ONS, we know that 12.5% of all deaths that we record are caused by dementia and Alzheimer's, and it is the leading cause of death in the 80 plus age groups within England and Wales.

    MF
    That is a relatively recent development.

    NB
    That's right. So that's happened really over the last three to five years. We've seen this increase in dementia and Alzheimer's as a leading cause of death in England and Wales.

    MF
    And is the rate of increase showing any sign of abating?

    NB
    Well, if you take away the COVID pandemic period, no, it doesn't. It looks like it's actually on track to continue to be the leading cause of death and with the new figures that we have in from Census showing there is an ageing population, and the age is increasing, we would expect there to be a continued increase in the number of deaths from dementia and Alzheimer's and, as I said, the number of diagnoses as well.

    MF
    Yes, that's a stark finding and something you'd suspect we're going to be hearing quite a lot more about.

    NB
    It's not just within the UK that this is occurring though. When you look across other economically developed countries, at the data from the OECD for example, we can see that Japan, Italy and Greece - these are countries with well-known elderly populations - have a very high prevalence of dementia. Out of the 44 OECD countries, the UK is 15th highest in terms of the prevalence of dementia, which is comparable to where Denmark is as well.

    MF
    And that speaks loudly to some of the challenges the health system is going to face in future, and the social care sector as well, which is already under pressure in some respects. Tell us about the potential impacts there, what are we seeing?

    NB
    What we found during the pandemic is that there are big gaps in data around social care statistics and being able to understand that population within our society and that group in society.

    MF
    Is that because the sector is diverse, and it's sprawling and it's uncertain and in places it's quite informal?

    NB
    Absolutely. There are different types of social care. There's social care that happens within care residences, and there's also social care that happens within the home. There’s a big private industry there as well as the public sector being involved. And trying to pull together information across that diverse and complex landscape is very difficult.

    MF
    What are we doing to try and close some of those gaps?

    NB
    So we're working very closely with the Department of Health and Social Care. They have a large programme of work to try and collate data and improve data collections across the piece. What we've been doing is looking at particular areas. So we're trying to understand more about self-funders - individuals who fund their own social care, as opposed to those who have the state fund it for them. Another area we're looking at is the workforce in social care, which is very hard to track over time, to understand the size and scale of that workforce.

    MF
    And this is just part of a much wider body of work going on across the ONS to try and shed new light on health inequalities in particular.

    NB
    Yes, that's right. So we are going to be using the Census, the 2021 census data, to really look in more detail at social care once that data becomes available. But what we have been able to do though, during the COVID pandemic, is use the 2011 census data to link to other sources to really understand how, for example, the COVID pandemic had impacts across a number of different groups in society. We were able to produce statistics for the first time looking at the impact that COVID had on particular ethnic groups, on religious groups, and on the disabled groups in society.

    MF
    And what did we discover about the unequal impacts of COVID?

    NB
    Yeah, so when we're looking at ethnicity for example, since the start of the period where the Omicron variant was more prominent, we found that among males the Bangladeshi ethnic group had the highest rate of death involving COVID-19, compared with the white British group. And we also found that for females, the Pakistani ethnic group had the highest rate of death involving COVID-19, which is 2.5 times higher than that of the white British group.

    MF
    On the topic of ethnicity, was it factors such as the nature of the occupations undertaken by those groups, or perhaps socio-economic status, living conditions and so forth? Or was there something, by the very nature of their ethnicity, that was actually contributing towards higher mortality? Have we got to the bottom of that?

    NB
    It's very hard to know that, Miles. What we've done is some complicated modelling to understand it, and we've taken into account certain socio-demographic and economic factors, but we still do find that certain ethnic groups have a higher rate of death, even when taking into account those factors. One thing it could be, but we don't know the detail yet, is how people in those ethnic groups live, in terms of having multi-generational households for example, and maybe that was a contributing factor. But to understand that in more detail, much more work needs to be done.

    MF
    Another area where research remains in progress. And also more recently, we've gone into partnership with one of the world's great philanthropic organisations to try and uncover what's going on behind some of these inequalities.

    NB
    That's correct. So there's a piece of work that we're doing with the Wellcome Trust and the Race Equality Foundation. There are different sources of ethnicity data within the health system and also within our Census data, and what we know is that the quality of how that data is recorded varies. What we're doing with the Wellcome Trust is to really understand the quality of the data across the different sources, so we can provide a better understanding of the analysis that can be done with those data sources. That is really important, as it's the data itself, and the quality of it, which is a fundamental building block of any analysis that we can undertake.

    MF
    It's quite hard to disentangle the effects of the pandemic at the moment, and it's probably worth discussing those. Are we in a position yet to know how life expectancy has been affected by COVID?

    NB
    At the moment, we have some indication. So the last publication we produced for healthy life expectancy covered the period of 2018 to 2020, which includes a period of the COVID pandemic, and that did show that there has been a drop in healthy life expectancy in England, Wales and Scotland. But what we don't know for certain yet is the full impact of that, because we haven't had the data to analyse for the entire pandemic period. And that's work that's ongoing within the office.

    MF
    So we will in due course then be able to get a much better understanding to what extent life expectancy might have been impacted by long COVID. But in the meantime, other ONS data suggest that a lot of people at least say or think they are suffering long term effects from it.

    JULIE STANBOROUGH
    That's right. We estimate over 2 million people in the population are experiencing long COVID. And it is self-reported long COVID. So we collect this data from the COVID Infection Survey, which was started at the beginning of the pandemic and people are reporting whether they're experiencing a whole range of different symptoms, which are associated with long COVID. And we've been monitoring that on a monthly basis to see whether those numbers have been increasing or decreasing and which types of people in the population are more likely to be experiencing long COVID.

    MF
    And what’s been the pattern of those numbers?

    JS
    It’s actually been broadly stable, a slight upward trend but broadly stable over time, and you would hope that over time it will start to drop down, but we're not in that situation at the moment.

    MF
    So the data at the moment is seeming to suggest that - so obviously, we know a lot of people have been infected - a lot of people seem to be suffering symptoms for a protracted period afterwards, but at least as far as the data are concerned, they will tend to imply a lot of those people are getting better.

    JS
    So we measure whether people are experiencing long COVID after a set number of weeks. So there's a significant proportion of people that are still experiencing long COVID at least 12 months after their first infection – it is a small group but it is a significant number of people. And of course it has impacts on their ability to go about their day-to-day lives: looking after family, going to work, studying. So it does have a significant impact on people.

    MF
    And what sort of effects are they reporting?

    JS
    So it can be a range of things from fatigue, breathing difficulties to perhaps more severe symptoms. So a whole range of different symptoms.

    MF
    What further analysis are we doing on the impacts of COVID generally? We've explored differences in ethnicity, other characteristics as well. Tell us a little about that work and what’s up next for this programme of research.

    JS
    As you say, yes, we've done a whole range of different analysis to support the response to the COVID pandemic. A lot of the analysis that we have produced has gone into the COVID Insights tool, which is on the ONS website. And that brings together a range of different data and analysis around hospitalisations, infections and deaths but also tries to put it into a sort of societal context, in terms of wellbeing, and employment as well. It's actually one of the most looked-at tools on the ONS site.

    MF
    So even though the pandemic subsides - as at least we hope it will - a lot of work will continue to assess its full impact.

    JS
    That's right. It will be trying to understand in more depth what happened during the pandemic as well as monitoring the long-term effects, either on employment or in terms of people experiencing long COVID.

    MF
    The ability to link data to provide complex insights, of course, is such an important area of research at the moment.

    And that brings us to Jonny Tinsley. Jonny this is very much your area of expertise.

    And with that in mind, tell us about the Public Health Data Asset. What is that and how does that bring together data in that very useful way?

    JONNY TINSLEY
    During the pandemic, data became incredibly important to understand what was going on, and a lot of data sources in the health space exist. The NHS collects an awful lot of information about people, and a lot of other organisations, including the NHS themselves, produce analysis of that data. But one of the things that is unique to ONS is its access to non-health data and in particular, the Census data. By bringing in some of that health data from the NHS, which we’re able to do for statistical purposes, we were then able to link that with the Census 2011 data and also our mortality data and create what we call the Public Health Data Asset. And what that effectively gives us is a huge cohort of people that were here in 2011, at the Census, which we are then able to analyse in combination with that mortality and health data. Given that it is such a huge cohort of people, it allows us to have quite a lot of power in the statistics we can produce, and to pick up some of the differences that Neil was talking about. Because the Census data includes things like ethnicity, religion, and disability status, we're able then to look at differences across those groups for things like COVID-19 mortality.

    MF
    So we can track essentially, as I understand it, we can track what's happened to individuals' health over that period of time, from the information supplied to Census and from their interactions with the NHS and other public services?

    JT
    To give a specific example, the Census effectively allows us to separate out the different groups. For example, it shows that these people are of this ethnicity whereas those people are of that ethnicity, and then for those groups we will know which people have died, when, and what the cause was, in particular during the pandemic obviously, whether that cause was COVID-19. And then the main thing the health data has allowed us to do so far is look for what we would call comorbidities. So who has pre-existing conditions that put them at risk of poor outcomes from conditions such as COVID-19 in this particular case.

    MF
    And that will help the health services to be more predictive of the sort of conditions people are likely to face?

    JT
    Yes, to a degree. So some of that was already known. But what it allows us to do is if a particular ethnic group tends to suffer from certain conditions more than another, by taking those co-morbidities into account, we can do what we would probably, in layman's terms, call ‘control’ for them in the models, and therefore effectively discount them. And if any differences still remain after that, between different ethnic groups, then something else must be going on. And as Neil says, one factor for example could be the multi-generational households impacting how likely it is for transmission to happen.

    MF
    And as well as differences by ethnicity and other characteristics it allows this to be done at a very, very local level as well, because of the sheer scale of these databases.

    JT
    From a data point of view, that's a really interesting question, because we have the COVID Infection Survey, which Julie mentioned earlier, and whenever we do a survey we try and make it as representative as possible. So obviously it's not everyone. It might still be several hundred thousand people, as it is with the COVID Infection Survey, but ideally it's made to be as representative of the population as possible, such that if 2% of the sample are infected with COVID-19 that week, it probably means that 2% of the whole population of England and Wales also are. But one of the downsides to doing a survey is that, even if you have a large number of participants, when the statistics you want to produce are at a really low level it isn't as good, because the number of people in a small area that are actually in your study can be really, really small. And it can also make it more difficult to pick up these differences between groups, such as ethnic groups. Whereas when you've got the Census data, because you've got a much larger sample of people in your study, it gives you more statistical power and you can pick up the differences more easily, and produce lower-level statistics more easily. The downside, because there's always upsides and downsides when it comes to data quality, is that the 2011 Census data is now somewhat out of date. And the cohort, the kind of study population we had available to us, by definition excluded anyone who's been born since 2011, and anyone who's immigrated to the country since 2011. Because you know, they weren't here for us to pick up in the Census in 2011. So there's some work we've done to think about just how representative what we're calling the Public Health Data Asset is in terms of who is included and who isn't, compared to the people that are actually here in the country right now and have been during the pandemic, if that makes sense.
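
    As a rough illustration of the point about small areas, here is a minimal Python sketch of how the uncertainty around a survey estimate grows as the sample shrinks. It assumes a simple random sample and the normal approximation, which is a simplification and not the method used for the survey itself; the function name and both sample sizes are hypothetical, chosen only to echo the "2%" and "several hundred thousand" figures mentioned above.

```python
import math


def proportion_margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of n people (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)


# Illustrative figures only: 2% of the sample infected, once for a
# large national sample and once for the few respondents who happen
# to live in one small area (both sample sizes are assumptions).
p = 0.02
for label, n in [("whole survey sample", 300_000), ("one small area", 200)]:
    moe = proportion_margin_of_error(p, n)
    print(f"{label}: n={n}, estimate={p:.1%} +/- {moe:.2%}")
```

    With the same 2% estimate, the margin of error for the small area comes out many times wider than for the whole sample, which is why a much larger linked cohort makes lower-level statistics easier to produce.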

    MF
    Potentially, this is an incredibly valuable resource, but primarily who is it for and what are they going to use it for?

    JT
    To an extent it was for the general public of course, but also for important stakeholders - decision makers during the pandemic like the Scientific Advisory Group for Emergencies (SAGE). Listeners will probably be familiar with people like Chris Whitty, the Chief Medical Officer, and these sorts of decision makers were finding everything we were doing really useful, and at times were even commissioning us to produce particular statistics or analysis using this really powerful dataset that we had available to us.

    I mean, one aspect of this that I think is really important to talk about is data protection, confidentiality and ethical uses. So to be clear, we take our responsibilities when it comes to the Census data really seriously and, as many people know, we don't release Census data until 100 years after it's been collected. And when we get in other data for statistics that's of a sensitive nature, like the health data, we have all sorts of processes in place to secure that in our secure data systems and ensure that only highly trained, security-cleared staff can access it. So primarily, we're talking about substantive ONS employees who are specially trained, using this data for statistical purposes. We will then use the data to produce the statistics that our users are telling us they need the most.

    MF
    Yes, it's well worth emphasising the data protection side there because obviously we are talking about vast amounts of highly sensitive data. And if anyone's interested in finding out more about how we approach those issues at the ONS, do please have a listen to our podcast on data ethics, where we explore that topic in some detail.

    Overall then, a lot of ground-breaking work going on, a lot of new data coming in and we have to say that it has actually been picking up some prestigious awards.

    JT
    That's right Miles. Some of the work we've talked about has won a number of awards, probably the most prestigious being the RSS Campion award for official statistics.

    MF
    After hearing a lot of really quite sombre detail over the course of our conversation today, it's good perhaps to end on a relatively upbeat note. At least people can be assured that so much work, so much research, is going on to try and anticipate some of these problems before they manifest fully. And we hope of course to contribute to improving health outcomes over the longer term.

    So that's it for this episode of Statistically Speaking, I'm Miles Fletcher. Thanks very much to our guests Julie Stanborough, Neil Bannister and Jonny Tinsley. Thanks very much to you for listening once again.

    You can subscribe to new episodes of the podcast on Spotify, Apple podcasts and all other major podcast platforms. Thanks again to our producer at the ONS Steve Milne for this episode and, until next time, goodbye.

    ENDS

  • In this episode of Statistically Speaking Miles is helped with his enquiries by Meghan Elkin and Billy Gazard from the Office for National Statistics, as he investigates how we use data to get valuable insights into the impact of crime on modern society.

    Along the way he debunks common misconceptions; learns how the nature of crime continues to evolve; and uncovers the work being done behind-the-scenes to make crime data more inclusive.

    TRANSCRIPT

    MILES FLETCHER

    Hello, and welcome again to ‘Statistically Speaking’ the Office for National Statistics podcast. I'm Miles Fletcher and in this episode, we're going to be investigating crime.

    What is the statistical evidence that despite the impression you might have got from the media, overall crime in England and Wales has actually been falling? Or is it the case that the nature of crime has simply changed and we're more likely these days to be targeted online than in the streets, and what in any case is the value of understanding the overall level of crime when that term captures such a wide and varied range of social ills and harms?

    Helping us with our enquiries today are Meghan Elkin, head of the ONS centre for crime and justice, and Billy Gazard, head of acquisitive crime and stakeholder engagement.

    Meghan, so much to talk about in the many and varied crime figures that ONS produces, but let's focus first on where those numbers come from. In this case, there are two major data sets and the first and arguably the most significant of those, statistically at least, is a very large survey and it's not information gathered from the police or government. It's information that comes directly from people and their experience of crime. Tell us all about that.

    MEGHAN ELKIN
    That's correct. So the best source we have for measuring crime is the Crime Survey for England and Wales and this is a massive undertaking. We interview around 34,000 people aged 16 and over each year, and over 2,000 children, and we really appreciate everyone who takes the time to respond to our survey as it helps us to produce these important figures. As you said, crime covers a wide range of offences and there's no perfect source, but the crime survey has had an established methodology over a long period of time, which really helps us to get a good idea of the trends and changes in society that people are experiencing.

    MF
    Give us a sense of the scale of this operation. Is it one of the biggest surveys the ONS runs?

    ME
    It is. I would say that we are consistently speaking to 34,000 people each year, and what's probably different to most surveys is that we have children as part of the response as well. So when we go to a household, we'll interview an adult, so someone aged 16 and over, to ask about their experiences. If there are children aged 10 to 15 in their household, we'll also ask if one of them would be able to complete our children's survey so that we get a picture of the crime that they're experiencing as well.

    MF
    And what is the particular value of speaking to people face to face in their homes like that?

    ME
    I mean, the real value of the crime survey for measuring the trends is that it doesn't matter if people have reported what they've experienced to the police or not, so unlike police recorded crime, it doesn't have that impact. And so we can ask people about their experiences in the last 12 months. We'll also ask them questions about their attitudes towards crime related issues such as the police and amount of security that they have, and for the most sensitive questions rather than being asked by the interviewer directly, we'll give someone a tablet so that they can complete those questions privately themselves to ensure that confidentiality and confidence in telling us such sensitive information.

    MF
    That’s taken the survey into some quite new areas in recent years, hasn't it? Would you like to talk about some of those developments? You mentioned, and this is highly unusual and of course a very sensitive area, the ability to actually speak to children as well. Tell us to what end that work has been directed...

    ME
    So for children in particular, we've been working closely with a number of stakeholders to understand what's most useful for us to ask children. So we do collect their general experiences of crime in the last 12 months, but we also ask them about their experiences online and that's provided some really useful data about children's lived experiences about being bullied and whether that's happening at school or online, but also the behaviours and activities that sometimes could be quite risky that they're taking part in online. And that's given some new information into that sector that we had just not understood before, and has been really useful in shaping policy and understanding how children can be better protected online.

    MF
    So this is quite an intensive encounter with the ONS data gatherer as they're sitting down for about 40 to 45 minutes or so. But how are the people selected? And how do you go about ensuring that they're a good representative sample and that we're not missing out important sections of the population, which, on a subject like this, of course, it's very important to get a really accurate picture of how people are experiencing crime at that grassroots level.

    ME
    So we use a postcode address file, basically a list of addresses to sample from, so households are chosen at random to ensure that we've got a representative sample for England and Wales. That's why it's really important and we really appreciate people responding to the survey because that's how we ensure good quality data, by getting that good, rounded sample.

    MF
    So there's a lot of rich data coming out of the crime survey, but by its nature, it doesn't cover some of the more serious offences does it?

    ME
    No that's true, particularly the higher harm but lower volume crimes, for example knife crime, those don't appear in the survey very often. And so we look to other data sources for those. It also excludes crimes that are often termed “victimless”, such as possession of drugs, which again, we then measure through different sources.

    MF
    And that is where the other major data source starts to become more relevant. We're looking at very serious offences particularly, including murder and rape. Those offences are covered by the police and their recording of crime. Tell us about the value of that data, and how that contributes to the wider understanding of crime.

    ME
    So the police record all the crimes that are reported to them and those are fed into us via the Home Office as a record of police recorded crime. And it has lots of advantages as a data source in that for some crime types, it is a good measure. And unlike the crime survey for those crime types, it can be very good at looking at short term trends. So particularly through the pandemic it was helpful for some of those crime types where we know that it's a better measure. But we also know that there are a lot of crimes that people don't report to the police and that's where that source of data struggles the most, particularly for really hidden crimes. Rape would be one of those crimes, where relatively few people do report that to the police so it doesn't appear in the numbers as much. But the police figures are subject to changes in recording practices. So when new offences are introduced that obviously changes how the count is put together, but also it’s impacted by police activity and how they record and that also will change the numbers. When you see increases in police recorded crime, for example, it doesn't necessarily mean that crime has gone up. And that's part of our work at ONS to unpick and understand what's going on there. But it does have benefits as you say, for some of the higher harm but lower volume crimes that we see, homicide it records very well, and for knife crime it's our best measure. So there's definitely a place for it as a data source still.

    MF
    So two major data sources contributing to this bigger picture. And what has that bigger picture been showing us these last few years?

    ME
    Well when we look across trends in general, actually, over time, crime has been decreasing since the mid 90s, and has been more flat in recent years. So the crime survey estimated around 20 million offences in 1995. And we've seen that decreasing over time and our latest data shows that it's around 5 million offences. And that's when you're using a comparable estimate. So the overall picture is very much that crime sits much lower than it used to in the mid 90s. And that's not just a pattern that we've seen in England and Wales. It's a pattern that's reflected across other countries, across Europe and America. And lots of people have tried to understand what's really driven that long-term change. More recently we have seen some decreases, some of them very much linked to the pandemic. But now as we look and compare before the pandemic to our most recent data, we have still seen some decreases. I think it's always important to point out that while total crime is a useful measure and reflection, it's only when you really start digging into the individual crime types that you can start seeing some trends that just get averaged out when you look at the total.

    MF
    Yes, you need to understand what kind of offences we're talking about. And if we talk about that long term picture, isn't it the case that we saw, coming out of the 1980s into the 1990s, turn of the century, violent crime decreasing, damage to property and so forth and theft from cars. Was that the broad trend that we saw?

    ME
    Yeah, so we've seen decreases in that time period across a number of crime types. One of the most popular explanations of the overall pattern there is the “security hypothesis”, which is very much built on the widespread improvements we've seen in security devices which have prevented crimes from happening and caused that decrease. So you mentioned vehicles there. Vehicle theft has decreased, most likely due to things like improvements in central deadlocking systems and electric immobilisers, those security measures that have improved so much. But we have also seen decreases in violence across that time as well.

    MF
    Threat to property is one thing of course, but yes, personal safety and our well-being on the streets is of course a major factor as well. Talk us through the trends on that because if you rely entirely on the news media for your understanding of violent crime, you probably think that things are in a pretty desperate situation.

    ME
    So when we look back over that long-term picture again, the estimate that we have from the crime survey for violence shows that there were around 4.5 million offences in 1995, and that compares to 1.2 million in the most recent data. Obviously, we've talked about the limitations to the crime survey data for understanding violence, but the more serious crimes within this type that we don't see in the crime survey are at much lower levels. They are lower volume, thankfully, and we have seen some patterns there of variation during the pandemic.

    MF
    Another important development these last few years, of course, has been getting a much better understanding of the nature and extent of child abuse, an area of huge sensitivity and massive public concern. Can you talk a little about the work that's been going on in that area?

    ME
    So we've been conducting a feasibility study over the last few years to look at whether a measure of prevalence of child abuse could be estimated. A few years ago we put together a compendium of statistics on child abuse to help people understand the levels of child abuse and the nature of child abuse being experienced in our society. But the major gap in that evidence base is a prevalence level for what's being experienced now by children. We do in the Crime Survey for England and Wales ask people about the experiences they had as children. So we ask that of adults and that gives us some insight, but it's still not helping policymakers understand what's actually happening in society today. So we've been conducting lots of research to understand the challenges of asking children such sensitive questions, and how we might be able to overcome those. And that work has been going really well. We're now at the stage of looking at what questions could actually be asked and the safeguarding that would need to wrap around that survey to look after the children completing it. So we're working very closely with DfE and Ofsted and schools to understand how that might best work going forward. So that's the next stage of that project.

    MF
    What has that experience and that engagement brought to this highly sensitive topic?

    ME
    We work very closely with the NSPCC, who have been extremely supportive of the project and how it's developing and helping us understand the safeguarding procedures that we might be able to use with a survey, and the support that we can give children and the different ways of doing that. There's a careful balance of helping children feel they are able to open up and tell us about experiences while also then safeguarding them and managing that challenge of confidentiality. And the NSPCC and others like them, obviously have great experience of being in this place and supporting children that we can then take on board to make sure that we do the survey in the best way possible.

    MF
    That's going to remain an important piece of work for the future. If there's one really important use of all this data, it is to understand the risks that any of us face of becoming the victim of crime at any given time. Billy, what are the numbers saying about that?

    BILLY GAZARD
    So I think it's quite a complicated picture. When we're talking about all the crime that the crime survey measures, for example, just under one in five people would have experienced a crime in the last 12 months according to the latest data, but obviously that varies across different crime types. So for example, fraud, about one in 12 people would have experienced fraud in the last 12 months, whereas offences such as violence, only about 2% of the population would have experienced a violent offence in the last 12 months.

    MF
    That overall is kind of reassuring, I guess, but nevertheless, those are significant sections of the population.

    BG
    Yes, I agree. That still translates into a lot of people experiencing that crime. So obviously, it's really important that we continue to monitor levels of violence moving forward to see how that changes over time.

    MF
    And if you break it down by geography, I guess of course, in some areas, those risks, particularly of violence and crimes against property are going to be much higher?

    BG
    This is looking at the national picture, but there will be variations at geographical levels, as well as by lots of different characteristics. For example, we know that younger people are at more risk of experiencing violence than older sections of the population.

    MF
    So that's the overall picture, but Meghan the risks might be rather different if you happen to be female.

    MEGHAN ELKIN
    There are some crime types that disproportionately affect women and girls compared to men and boys. Say for example, we estimate 1.6 million women aged 16 to 74 suffered domestic abuse in the last year and that one in three women over the age of 16 were subjected to at least one form of harassment in the last year. So there again, there is that variation in crime types that people are experiencing. And when we look at measures around domestic abuse, again, the Crime Survey for England and Wales is our most trusted measure. And as I reflected earlier, those are the crime types where we actually give respondents a tablet so that they can complete those questions confidentially. And actually, that posed us a particular challenge during the pandemic where our face-to-face interviewing had to stop and we moved to telephone interviews, and we managed to make that switch very quickly to be able to keep getting the crime estimates that were needed to understand society. But we did think there was a risk in asking people on the telephone those really sensitive questions about experiences of domestic abuse and sexual assault, and the concerns around confidentiality and respondent safeguarding were just too great for us to be able to ask those questions. So for a period of time, we weren't collecting that information. When the survey returned to the field, though, we went back as early as we could so that we could start collecting those important topics again. And we now have the first data from those for domestic abuse since before the pandemic started. Now that we've started to get the face-to-face survey back into publication, some caution needs to be taken for interpreting those results. Because of how the survey's come back, there are some challenges to quality and so again, we need to be a bit cautious in interpreting them, but it's so important that we've got those figures back.
    And actually what we see from the crime survey is that there's been no change in the prevalence of domestic abuse in the most recent data when compared to before the pandemic. But this is an opportunity to show how we use multiple data sources to really understand what might be going on in society and what people are experiencing. Because while the crime survey has now shown no change in the prevalence of domestic abuse, we have seen increases throughout that time in police recorded crime data. And we've also seen increases in data that we collect from charities. We work closely with a range of charities in the domestic abuse space to understand the changes that are happening to their services and the demand they're seeing, but also to help us understand the nature of abuse. So during lockdowns for example, we saw a 22% increase in calls to the National Domestic Abuse helpline for the year ending March 2021, so there was definitely that increase in demand. But now combining that with the crime survey evidence that we haven't seen an increase in prevalence, actually that helps us understand that maybe that increase in demand from charities primarily came from a lack of other coping mechanisms and people reaching out in different ways to get the support they needed during that difficult time.

    MF
    And that would seem to be a very valuable example of using other data sources than police recorded crime to get an accurate picture of what's going on, because of the simple reluctance that so many people have in reporting these experiences when they happen to them.

    ME
    Yes, that's true. I mean, the evidence we have is that one in five victims of partner abuse in the last year would have told the police, and that just shows how hidden these crime types are. And that's true when you look into sexual assault as well, where one in six tell us that they would have told the police about what happened to them. And that's an area where we haven't used charity data before to help us understand sexual assault, but it's something we're working on at the moment to be included in next year's publication.

    MF
    Are there other areas of offending where we could possibly get a better picture than we currently have at the moment?

    ME
    So harassment is also an area that in initial work on violence against women and girls, we found that actually there wasn't as much data as we thought there might be to help understand that situation. So we used some questions on the opinion survey last year to help us understand levels of harassment at a very basic level, for lack of a better description. But we've now introduced new questions on the crime survey as well to help us understand that topic. But again, I think that's going to be one where when that data becomes available to us to analyse we'll be able to start looking at other groups and organisations we'd like to work with to understand better that situation and support our efforts to make our statistics more inclusive. Working with stakeholders really helps us to look for these new data sources and new insights, to really understand the scale and nature of crime that people are experiencing.

    MF
    So Billy, just as the recording of crime evolves over time, so does what we consider to be a criminal offence. Tell us about the offences of the past, and what acts were regarded as criminal in their day and are no longer.

    BILLY GAZARD
    Yes I think it's important to remember that our laws are always changing to reflect the concerns of our society as it evolves. Something that was criminal hundreds of years ago might sound pretty absurd today. So for example, playing football used to be an offence in mediaeval times, there was punching the ball as well as kicking the ball and deaths were not uncommon.

    MF
    That's not a well-known fact!

    BG
    It actually became an offence in 1388 and wasn’t repealed until 1845.

    MF
    And were any people prosecuted? I think we know what the sentence was...

    BG
    The sentence for breaking this law was actually six days in prison. There are no stats on how many people were punished for breaking this law or how many people were put in prison for that. Another offence that seems absurd today, but for which we do have some stats on how many people were actually prosecuted, is witchcraft. This actually became an offence in 1542 and it wasn't repealed until 1736 and during this time 500 witches were put on trial and over 100 of those were executed.

    MF
    Grisly stuff! History has moved on, and of course these days we're dealing with some very 21st century phenomena, and that of course is the growth of online crime, cybercrime and phishing scams. Tell us about the emergence of that type of offending. What has happened over the last few years and what is the position now?

    BG
    So what we've seen with fraud and computer misuse offences is very different to what we've seen with other types of offences. But unfortunately, we've only been starting to measure fraud and computer misuse offences on the crime survey since 2017. So we don't have the same long standing time series that we have for the other crime types, and this is obviously because a lot of online crime, this is a fairly new phenomenon, so we've taken our time to really develop those questions and now they are on the survey. What we have seen over the last five years since we started recording these offences is that those offences have stayed fairly flat over that time period. Over the pandemic however, we did see an increase in fraud and computer misuse offences during that period and we think that's probably to do with people spending more time at home and spending more time online. And what we did see in terms of fraud, we saw that the proportion of fraud incidents that were cyber related increased up to almost two thirds, from about 50% before the pandemic. So it suggests that actually, a lot of the rise in fraud offences that we did see were because of a rise in cyber related fraud rather than offline fraud.

    MF
    One popular conception is that it's mainly elderly people who are the targets of this online crime, but that's not actually the case is it?

    BG
    No. And definitely, when we look at our data, that's definitely not something that we're finding. Actually what we find is that adults aged 75 years or older are actually less likely to be victims of fraud. It's those in working age groups, adults aged 25 to 44 for example, who are more at risk of receiving phishing messages, those employed and those living in less deprived areas are much more likely to receive those messages. And that might be to do with fraudsters targeting those groups because they know that they have more disposable money. So definitely older people are at less risk than the working population.

    MF
    So there is something in this argument perhaps that crime generally has moved online?

    BG
    I think there’s definitely an argument that a lot more crime is happening online, and we're definitely seeing that with fraud incidents. We have less data on other crime types, though obviously the internet and the act of being online can be used across many crime types. For example, harassment, stalking, these are other offences that people can use online tools to commit. And that's something that we're always trying to improve on the crime survey, to introduce those types of questions so that we can get a better understanding of how online tools are being used to commit crimes.

    MF
    So online crime is a relatively recent development, but crime and offending of course, continues to develop unfortunately and go in different directions. Tell us about other developments that the ONS has got in hand to try and either capture new types of offending or perhaps just get a better insight on more established patterns of crime and harm.

    BG
    In the Crime Survey for England and Wales (CSEW), we ask people living in private households lots of questions about their experiences with crime so we can produce an estimation of how much crime that group of people is experiencing. That's about 98% of the population of England and Wales. But what the survey doesn't cover is people who do not live in private households. This covers, for example, people living in residential care settings, or homeless people, students living in student halls. So although this is only about 2% of the population, these groups have very different experiences of crime. And it's really important that we also try and capture their experiences so that we can provide information for policymakers to take action on the crime that those groups are experiencing. One of the things that we're trying to do is produce a publication looking at crimes experienced by non-household populations as well. So we're currently doing some work investigating what other data sources are available that we can use to shine the light on it at ONS and share that information alongside what we do with household populations. So we're going to be going out talking to various stakeholders, talking to different data holders to see how can we work together to bring a picture of all the data that we have and better understand what the risks are for these groups in terms of experiencing crime and how can we bring all of that together.

    MF
    So as crime continues to evolve you can count on one thing, the ONS will continue to measure it, and explore it, and hopefully contribute to solving it.

    Thanks very much to Meghan Elkin and Billy Gazard. I'm Miles Fletcher, and you've been listening to ‘Statistically Speaking’. You can subscribe to new episodes of the podcast on Spotify, Apple podcasts and all other major podcast platforms.

    Our producers at the ONS are Steve Milne and Alisha Arthur.

    Until next time, goodbye.


  • In this episode Miles is joined by Professor Luciano Floridi of Oxford University; Simon Whitworth of the UK Statistics Authority; and Pete Stokes from the ONS to talk about data ethics and public trust in official statistics.

    TRANSCRIPT

    MILES FLETCHER

    Hello, I'm Miles Fletcher, and in this episode of Statistically Speaking we're exploring data ethics and public trust in official statistics. In 2007, 15 years ago to the very day we are recording this, the UK Parliament gave the Office for National Statistics the objective of promoting and safeguarding the production and publication of official statistics that serve the public good. But what does, or should, the “public good” mean? How does the ONS seek to deliver it in practice? Why should the public trust us to act in their interests at a time of exponential growth in data of all kinds? Where are the lines to be drawn between individual privacy and anonymity on the one hand, and the potential of data science to improve public services and government policies to achieve better health outcomes, even saving lives, on the other?

    Joining me to discuss these topics today are Simon Whitworth, Head of Data Ethics at the UK Statistics Authority, Pete Stokes, Director of the Integrated Data Programme here at the ONS, and Luciano Floridi, Professor of Philosophy and the Ethics of Information and Director of the Digital Ethics Lab at the Oxford Internet Institute.

    Professor let's start this big concept with you. What do you think Parliament meant when it said that the ONS should serve the public good in this context?

    LUCIANO FLORIDI

    It might have meant many things, and I suspect that a couple of them must have been in their minds. First of all, we know that data or information, depending on the vocabulary, has an enormous value if you know how to use it. And, collecting it and using it properly for the future of the country, to implement the right policies, to avoid potential mistakes and to see things in advance - knowledge is power, information is power. So, this might have been one of the things that they probably meant by “public good”. The other meaning, it might be a little bit more specific...It's when we use the data appropriately, ethically, to make sure that some sector or some part of the population is not left behind, to learn who needs more help, to know what help and when to deliver it, and to whom. So, it's not just a matter of the whole nation doing better, or at least avoiding problems, but also specific sectors of the population being helped, and to make sure that the burden and the advantages are equally distributed among everybody. That's normally what we mean by public good and certainly, that analysis is there to serve it.

    MF

    So there's that dilemma between using the power of data to actually achieve positive outcomes. And for government, on the other hand, being seen as overbearing, or Orwellian, and spying on people through the use of data.

    LF


    That would be the risk that sometimes comes under the term “paternalism”, that knowing a lot about your citizens might lead to the temptation of manipulating their lives, their choices, their preferences. I wouldn't over-emphasise this though. The kind of legislation that we have and the constraints, the rules, the double checking, make sure that the advantage is always in view and can more easily be squeezed out of the data that we accumulate, and sometimes the potential abuses and mistakes, the inevitable temptation to do the wrong thing, are kept in check. So yes, the State might use the government’s political power, might misuse data, and so we need to be careful, but I wouldn't list that as my primary worry. My primary worry perhaps, would be under-using the data that we have, or making mistakes inadvertently.

    MF

    Do you think then, perhaps as a country, the UK has been too cautious in this area in the past?

    LF

    I don't think it has been too cautious, either intellectually or strategically. There's been a lot of talking about doing the right thing. I think it's been slightly cautious, or insufficiently radical, in implementing policies that have been around for some time. But we now have seen several governments stating the importance of that analysis, statistical approaches to evidence, and so on. But I think that there is more ambition in words than in deeds, so I would like to see more implementations, more action and less statements. Then the ambition will be matched by the actions on the ground.

    MF

    One of the reasons perhaps there might have been caution in the past is of course concern about how the public would react to that use of data. What do we know of public attitudes now in 2022, to how government bodies utilise data?

    LF

    I think the impression is that, depending on whom you ask, whether it is the younger population or slightly older people my age, people who lived in the 50s versus my students, they have different attitudes. We're getting used to the fact that our data are going to be used. The question is no longer are they going to be used, but more like, how and who is using them? For what purposes? Am I in charge? Can I do something if something goes wrong? And I would add also, in terms of attitude, one particular feature which I don't see sufficiently stressed, is who is going to help me if something goes wrong? Because the whole discussion, or discourse, should look more at how we make people empowered, so that they can check, they have control, they can go do this, do that. Well, who has the time, the ability, the skills, and indeed the will, to do that? It's much easier to say, look, there will be someone, for example the government, who will protect your rights, who you can approach, and they will do the right thing for you. Now we're getting more used to that. And so, I believe that the attitude is slightly changing towards a more positive outlook, as long as everything is in place, we are seeing an increasingly positive attitude towards public use of public data.

    MF

    Pete, your role is to make this happen. In practice, to make sure that government bodies, including the ONS, are making ethical use of data and serving the public good. Just before we get into that though, explain if you would, what sort of data is being gathered now, and for what purposes?

    PETE STOKES

    So we've got a good track record of supporting research use of survey data, that we collect largely in ONS, but in other government departments as well. But over the last few years, there's been an acceleration and a real will to make use of data that have been collected for other purposes. We make a lot of use now of administrative data, these are data that are collected by government not for an analytical purpose but for an operational purpose. For example, data that are collected by HMRC from people when they're collecting tax, or from the Department for Work and Pensions when they're collecting benefits, or from local authorities when they're collecting council tax - all of those administrative data are collected and stored. There's an increasing case to make those data available for analysis which we're looking to support. And then the other new area is what's often called “faster data”, and these are data that are typically readily available, usually in the public domain, where you get a not so deep insight as you'd get from a survey or administrative data, but you could get a really quick answer. And a good example of that from within the ONS is that we calculate inflation. As a matter of routine, we collect prices from lots of organisations, but you can more quickly do some of that if you can pull some data that are readily available on the internet to give you those quicker indicators, faster information of where prices are rising quickly and where they're dropping quickly. There's a place for all of these depending on the type of analysis that you want to do.

    MF

    This is another area where this ethical dilemma might arise though isn't it, because when you sit down with someone and they've agreed to take part in the survey, they know what they're going in for. But when it comes to other forms of information, perhaps tax information that you've mentioned already, some people might think, why do they want to know that?

    PS

    When people give their data to HMRC or to DWP as part of the process of receiving a service, like paying tax for example, I think people generally understand what they need to give that department for their specific purpose. When we then want to use this data for a different purpose, there is a larger onus on us to make sure that we are protecting those data, we're protecting the individual and that those data are only being used ethically and in areas of trust, specifically in the public interest. So, it's important that we absolutely protect the anonymity of the individuals, that we make sure where their data are used, and that we are not using the data of those data subjects as individuals, but instead as part of a large data-set to look for trends and patterns within those data. And finally, that the analyses that are then undertaken with them are explicitly and demonstrably in the public interest, that they serve the public good of all parts of society.

    MF

    And that's how you make the ethical side of this work in practice, by showing that it can be used to produce faster and more accurate statistics than we could possibly get from doing a sample survey?

    PS


    Yes, exactly, and sample surveys are very, very powerful when you want to know about a specific subject, but they're still relatively small. The largest sample survey that the ONS does is the Labour Force Survey, which collects data from around 90,000 people every quarter. Administrative datasets have got data from millions of people, which enables you to draw your insights not just at a national level and national patterns, but if you want to do some analysis on smaller geographic areas, administrative data gives you the power to do that when surveys simply don't. But, any and all use of data must go through a strict governance process to ensure that the confidentiality of the data subjects be preserved. And not only will the use be clearly and demonstrably in the public interest, but also, will be ethically sound and will stand up to scrutiny in that way as well.

    MF

    And who gets to see this stuff?

    PS

    The data are seen by the accredited researchers that apply to use it. So, a researcher applies to use the data, they're accredited, and they demonstrate their research competence and their trustworthiness. They can use those data in a secure locked-down environment, and they do their analysis. When they complete their analysis, those can then be published. Everybody in the country can see the results of those analyses. If you've taken part in a social survey, or you've contributed some data to one of the administrative sources that we make available, you can then see all the results of all the analyses that are done with those data.

    MF

    But when you say it's data, this is where the whole process of anonymization is important, isn't it? Because if I'm an accredited researcher, am I getting to see names and addresses, or people's personal, sensitive personal information?

    PS

    No, absolutely not. And the researchers only get to see the data that they need for their analysis. And because we have this principle, that the data are being used as an aggregated dataset, you don't need to see people's names or people's addresses. You need to know where people live geographically, in a small or broad area, but not the specific address. You need to know someone's demographic characteristics, but you don't need to know their name, so you can't see their name in the data. And that principle of pseudonymisation, or the de-identification of data, before they're used is really important. When the analyses are completed and the outputs are produced, those are then reviewed by an expert team at ONS, and so the data are managed by us to ensure that they are fully protected, wholly non-disclosive, and that it's impossible to identify a member of the public from the published outputs.

    MF

    Historically, government departments didn't have perhaps the best record in sharing data around other bodies for the public benefit in this way. But all that changed, didn't it, a few years back with a new piece of legislation which liberalised, to an extent, what the ONS is able to do.

    PS

    So, the Digital Economy Act, passed in 2017, effectively put on a standard footing the ability of other departments to make their data available for researchers in the same way that ONS had already been able to do since the 2007 Statistics and Registration Service Act. It gave us parity, which then gave other departments the ability to make their data available and allow us to help them to do so, to take the expertise that the ONS has in terms of managing these data securely, managing access to them appropriately, accrediting the researchers, checking all the outputs and so on, to give the benefit of our expertise to the rest of government, in order that the data that they hold, that has previously been underutilised arguably, could then be fully used for analyses to develop policies or deliver services, to improve understanding of the population or cohorts of the population or geographic areas of the country, or even sectors of industry or segments of businesses, for example, in a way that hasn't previously been possible, and clearly benefits the country overall.

    MF


    So the aim here is to make full use of a previously untapped reservoir, a vast reservoir, an ocean you might even say, of public data. But who decides what data gets brought in in this way?

    PS

    We work closely with the departments that control the data, but ultimately, those departments decide what use can be made of their data. So, it is for HMRC, DWP, the Department for Education, it’s for them to decide which data they choose to make available through the Secure Research Service (SRS) or the Integrated Data Service (IDS) that we run in ONS. When they're supportive and recognise the analytical value of their data, we then manage the service where researchers apply to use those data. Those applications are then assessed by ONS first and foremost, we then discuss those requests and the use cases with the data owning departments and say, do you agree this would be a sensible use of your data?

    MF

    Is there an independent accreditation panel that reports to the UK Statistics Authority Board, that assesses the request to use the data is in the public interest, that it serves the public good?

    PS

    The ethics of the proposal are also assessed by an independent ethics advisory committee, whether it's the National Statistician's Data Ethics Advisory Committee or another. There's a lot of people involved in the process to make sure that any and every use of data is in the public interest.

    MF

    From what we know from the evidence available, certainly according to the latest Public Confidence in Official Statistics survey - that's a big biannual survey run by the UK Statistics Authority (UKSA) - I guess for that, and other reasons, public trust remains high. The survey said 89% of people that gave a view trusted ONS, and 90% agreed that personal information provided to us would be kept confidential. But is there a chance that we could lose some of that trust now, given that there is much greater use, and much greater sharing, of admin data? It should be said that it doesn't give people the chance to opt out.

    PS

    I think one of the reasons that trust has remained high is because of the robust controls we have around the use of data. Because of the comprehensive set of controls and the framework that we put around use of data that protects confidentiality, that ensures that all uses are in the public interest. And another important component of it is that all use of data that we support is transparent by default. So, any analyst wanting to use data that are held by ONS, or from another department that we support, we publish the details of who those analysts are, which data they're using, what they're using them for, and then we require them to publish the outputs as well. And that transparency helps maintain public trust because if someone wants to know what their data is being used for, they can go to our website or directly to the analyst, and they can see the results tangibly for themselves. Now, they might not always agree that every use case is explicitly in the public interest, but they can see the thought process. They can see how the independent panel has reached that conclusion, and that helps us to retain the trust. There's a second half of your question around whether there is a risk of that changing. There is always a risk but we are very alive to that, which is why as we build the Integrated Data Service, and we look to make more and more government data available, we don't take for granted the trust we've already got, and we continue to work with the public, and with privacy groups, to make sure that as we build the new service and make more data available, we don't cross a line inadvertently, and we don't allow data to be used in a way that isn't publicly acceptable. We don't allow data to be combined in a way that would stretch that comfort. And this is that kind of proactive approach that we're trying to take, that we believe will help us retain public trust, despite making more and more data available.

    MF

    Professor Floridi, we gave you those survey results there, with people apparently having confidence in the system as it stands, but I guess it just takes a couple of negative episodes to change sentiment rapidly. What examples have we seen of that, and how have institutions responded?

    LF


    I think the typical examples are when data are lost, for example, inadvertently because of a breach and there is nobody at fault, but maybe someone introduced the wrong piece of software. It could be a USB, someone may be disgruntled, or someone else has found a way of entering the database - then the public gets very concerned immediately. The other case is when there is the impression, which I think is largely unjustified, but the impression remains, that the data in question are being used unjustly to favour maybe some businesses, or perhaps support some policies rather than others. And I agree with you, unfortunately, as in all cases, reputation is something very hard to build and can be easily lost. It's a bit unfair, but as always in life, building is very difficult but breaking down and destroying is very easy. I think that one important point here to consider is that there is a bit of a record as we move through the years. The work that we're talking about, as we heard, 2017 is only a few years ago, but as we build confidence and a good historical record, mistakes will happen, but they will be viewed as mistakes. In other words, there will be glitches and there will be forgiveness from the public built into the mechanism, because after say 10 or 15 years of good service, if something were to go wrong once or twice, I think the public will be able to understand that yes, things may go wrong, but they will go better next time and the problem will be repaired. So, I would like to see this fragility if you like, this brittle nature of trust, being counterbalanced by a reinforced sense of long-term good service that you know delivers, and delivers more and more and better and better, well then you can also build a little bit of tolerance for the occasional mistakes that are inevitable, as in everything human, they will occur once or twice.

    MF


    Okay, well, touching my mic for what would in effect be my desk, I can say that I don't think ONS has had an episode such as you describe, but of course, that all depends on the system holding up. And that seems a good point to bring in Simon Whitworth from the UK Statistics Authority, as kind of the overseeing body of all this.

    Simon, how does the authority go about its work? One comment you see quite commonly on social media when these topics are discussed, is while I might trust the body I give my data to, I don't trust them not to go off and sell it, and there have been episodes of data being sold off in that way. I think it's important to state isn't it, that the ONS certainly never sells data for private gain. But if you could talk about some of the other safeguards that the authority seeks to build into the system.

    SIMON WHITWORTH

    The big one is around the ethical use of data. The authority, and Pete referred to this, previously back in 2017, established something called the National Statistician's Data Ethics Advisory Committee, and that's an independent committee of experts in research, ethics and data law. And we take uses of data to that committee for their independent consideration. And what's more, we're transparent about the advice that that committee provides. So, what we have done, what we've made publicly available, is a number of ethical principles which guide our work. And that committee provide independent guidance on a particular use of data, be they linking administrative data, doing new surveys, using survey data, whatever they may be, they consider projects from across the statistical system against those ethical principles and provide independent advice and guidance to ensure that we keep within those ethical principles. So that's one thing we do, but there's also a big programme of work that comes from something that we've set up called the UK Statistics Authority Centre for Applied Data Ethics, and what that centre is trying to do is to really empower analysts and data users to do that work in ethically appropriate ways, to do their work in ways that are consistent with those ethical principles. And that centres around trying to promote a culture of ethics by design, throughout the lifecycle of different uses of data, be they the collection of data or the uses of administrative data. We've provided lots of guidance pieces recently, which are available on our website, around particular uses of data - geospatial data, uses of machine learning - we've provided guidance on public good, and we're providing training to support all of those guidance pieces. And the aim there is, as I say, to empower analysts from across the analytical system, to be able to think about ethics in their work and identify ethical risks and then mitigate those ethical risks.

    MF


    You mentioned the Ethics Committee, which is probably not a well-known body, independent experts though you say, these are not civil servants. These are academics and experts in the field. Typically, when do they caution researchers and statisticians, when do they send people back to think again, typically?

    SW

    It's not so much around what people do, it's about making sure how we do it is in line with those ethical principles. So, for example, they may want better articulations of the public good and consideration of potential harms. Public good for one section of society might equal public harm to another section of society. It's very often navigating that and asking for consideration of what can be done to mitigate those potential public harms and therefore increase the public good of a piece of research. The other thing I would say is being transparent. Peter alluded to this earlier, being transparent around data usage and taking on board wherever possible, the views of the public throughout the research process. Encouraging researchers as they're developing the research, speaking to the public about what they're doing, being clear and being transparent about that and taking on board feedback that they receive from the public whose data they're using. I would say that they're the two biggest areas where the committee provides comments and really useful and valuable feedback to the analytical community.

    MF

    Everyone can go online and see the work of the committee, to get the papers and minutes and so forth. And this is all happening openly and in a comfortable way?

    SW


    Yes, absolutely. We publish minutes of the meetings and outcomes from those meetings on the UK Statistics Authority’s website. We also make a range of presentations over the course of the year around the work of the committee and the supporting infrastructure that supports the work because we have developed a self-assessment tool which allows analysts at the research design phase to consider those ethical principles, and different components of the ethical principles, against what they're trying to do. And that's proved to be extremely popular as a useful framework to enable analysts to think through some of these issues, and I suppose move ethics from theory to something a bit more applied. In terms of their work last year, over 300 projects from across the analytical community, both within government and academia, used that ethics self-assessment tool, and the guidance and training that sits behind it is again available on our website.

    MF


    I'm conscious of sounding just a little bit sceptical, and putting you through your paces to explain how the accountability and ethical oversight works, but can you think of some examples where there's been ethical scrutiny, and research outcomes having satisfied that process, have gone on to produce some really valuable benefits?

    SW


    ONS has done a number of surveys with victims of child sex abuse to inform various inquiries and various government policies. They have some very sensitive ethical issues that require real thinking about and careful handling. You know, the benefits of that research have been hugely important in showing the extent of child sex abuse that perhaps previously was unreported and providing statistics to both policymakers and charities around experiences of child sex abuse. In terms of administrative data, yes, there are numerous big data linkage projects that have come to ONS and have been considered by ONS, in particular, linkage surveys that follow people over time. Linkages done over time provide tremendous analytical value, but of course need some careful handling to ensure that access to that data is provided in an ethically appropriate way, and that we're being transparent. So those are the two big things I think of, both of which we are approaching in an ethically appropriate way. And being able to do them in an ethically appropriate way has really allowed us to unleash the analytical value of those particular methods, but in a way that takes the public with us and generates that public trust.

    MF

    Pete, you are part of the organisation that in fact runs an award scheme to recognise some of the outstanding examples of the secure use of data?

    PS


    We do, and it's another part of promoting the public benefit that comes from use of data. Every year we invite the analysts who use the Secure Research Service (SRS), or other similar services around the country, to put themselves forward for research excellence awards. So that we can genuinely showcase the best projects from across the country, but then also pick up these real examples of where people have made fantastic use of data, and innovative use of data, really demonstrating the public good. We've got the latest of those award ceremonies in October this year, and it's an open event so anybody who is interested in seeing the results of that, the use of data in that way, they would be very welcome to attend.

    MF


    Give us a couple of examples of recent winners, what they've delivered.

    PS


    One of the first award winners was looking at the efficacy of testing that was done for men who may or may not have been suffering from prostate cancer, and it analysed, when a person was given this test, how likely the result was to be accurate, and therefore whether they should start treatment. The research was able to demonstrate that actually, given the efficacy, it wasn't appropriate to treat everyone who got a positive test, because there was risk of doing more harm than good if it had persisted, which is really valuable. But this year, we'll be seeing really good uses of data in response to the pandemic, for example, tying this back to the ethics, when you talk about the use of data made during the pandemic in retrospect, it's clearly ethical, it's clearly in the public interest. But, at the start of the pandemic, we had to link together data from the NHS on who was suffering from COVID which was really good in terms of the basic details of who had COVID and how seriously and sadly, whether they died, but it missed a lot of other detail that helps us to understand why.

    We then linked those data with data from the 2011 Census where you can get data on people's ethnic group, on their occupation, on their living conditions, on the type and size of the family they live with, which enable much richer insights, but most importantly, enabled government to be able to target its policy at those groups who were reluctant to get the vaccination to understand whether people were suffering from COVID due to their ethnicity, or whether it was actually more likely to be linked to the type of occupation they did. Really, really valuable insights that came from being able to link these data together, which now sounds sensible, but at the time did have those serious ethical questions. Can we take these two big datasets that people didn't imagine we could link together and keep the analyses ethically sound and in the public interest? That’s what we were able to do.

    MF

    That's certainly a powerful example. But before we pat ourselves on the back too much for that survey I mentioned, some of the research we've been doing at the ONS does suggest that there is nevertheless a hardcore cohort of sceptics on all of this. Particularly, it is suggested, among the older age groups, the over 55’s in particular. I mentioned the social media reaction you see as well. Kind of ironic you might think, given the amount of data that big social media platforms and other private organisations hold on people.

    Professor, do you think there's a paradox at work there? People are apparently inclined not to trust public bodies, accountable public bodies, but will trust the big social media and internet giants? Or is it just a question of knowledge, do you think?

    LF


    I think it might be partly knowledge, the better you know the system, who is doing what, and also the ability to differentiate between the different organisations and how they operate, under what kind of constraints, how reliable they are, etc, versus for example, commercial uses, advertisement driven, etc.

    The more you know, and it happens to be almost inevitably the younger you are, the more you might be able to see with a different kind of degree of trust, but also almost indifference, toward the fact that the data are being collected and what kind of data are being collected. I think the statistics that you were mentioning seem to be having an overlapping feature. A less young population, a less knowledgeable population, is also the population that is less used to social media, sharing, using data daily, etc. And is also almost inevitably a little bit more sceptical when it comes to giving the data for public good, or knowing that something is going to be done by, for example, cross referencing different databases.

    On the other side, you find the slightly younger, the more socially active, the kids who have been growing with social media - and they are not even on Facebook these days anymore, as my students remind me, Facebook is for people like me - so let's get things right now, when it comes to Tiktok, they know that they are being monitored, they know that the data is going to be used all over the place. There is a mix of inevitability, a sense of who cares, but also a sense of, that's okay. I mean data is the air you breathe, the energy you must have, it's like electricity. We don't get worried every time we turn the electricity on in the house because we might die if someone has unreliably connected the wires, we just turn it on and trust that everything is going to be okay. So, I think that as we move on with our population becoming more and more well acquainted with technology, and who does work with the data and what rules are in place, as we heard before, from Simon and Pete, I mean, there are plenty of frameworks and robust ways of double checking that nothing goes wrong, and if something goes wrong, it gets rectified as quickly as possible. But the more we have that, I think the less the sceptics will have a real chance of being any more than people who subscribe to the flat earth theory. But we need to consider that the point you made is relevant. A bit of extra education on the digital divide, which we mentioned implicitly in our conversation today. Who is benefiting from what? And on which side of the digital innovation are these people placed? I think that needs to be addressed precisely now, to avoid scepticism which might be not grounded.

    MF

    I hope through this interesting discussion we've managed to go some way to explaining how it's all done, and why it's so very important. Simon Whitworth, Pete Stokes, Professor Luciano Floridi, thank you very much indeed for taking part in Statistically Speaking today.

    I'm Miles Fletcher and thanks for listening. You can subscribe to new episodes of this podcast on Spotify, Apple podcasts and all the other major podcast platforms. You can comment or ask us a question on Twitter at @ONSFocus. Our producer at the ONS is Julia Short. Until next time, goodbye.

  • David Freeman and Nicola White join Miles to discuss how the Office for National Statistics (ONS) tracks employment and pay across the UK.

    Transcript:

    Hello and welcome again to Statistically Speaking, the Office for National Statistics podcast. In this episode, we enter the world of work and clock on for a shift with the ONS labour market team. We'll explore how they keep track of employment and pay across the UK and find out how the figures we hear so much about in the news should really be interpreted. At your service, are employees of the month, our head of labour market and household statistics David Freeman, and later on his colleague, senior statistician Nicola White.

    David, let's start with the basics. And one common misconception you still hear around the official statistics on unemployment is that they're based on the number of people claiming out of work benefits. And so, the theory goes therefore, that they're subject to manipulation in some way. But to be absolutely clear, the figures don't come from any other government department. This is data that comes from the ONS talking directly to real people, in their tens of thousands.

    DAVID FREEMAN
    That's absolutely right, Miles. The bulk of the information that we publish as part of our labour market statistics comes from something called the ‘Labour Force Survey’. This is one of our big household surveys, and every three months we sample 40,000 households across the UK. And we go and we interview the people in those households about their labour market status. So, are they working, are they not working. We also gather a lot of information about the people in those households, what age they are, whether they have got a disability, what ethnic group [they belong to], which gives us a rich picture of the UK labour market.

    MILES FLETCHER

    And by the standards of any survey, any regular survey, that's a huge sample isn’t it. I know we don't go in for superlatives, but it's possibly the biggest household survey regularly undertaken of any kind?

    DAVID FREEMAN
    I think it is the biggest one in the UK, outside of the Census of course, and again, through the data that we use, we’ll learn about the labour market, but the data will also feed into things like population estimates. So quite a wide range of uses, but its core purpose is really trying to measure the UK labour market.

    MILES FLETCHER
    And it's that time spent with people to gather a whole raft of data from them, and at scale, that can give a localised picture, which is so important too.

    DAVID FREEMAN
    Absolutely, we get a lot of information from the Labour Force Survey, either by age groups, by country of birth, also by regional level, and we have an annual version of the Labour Force Survey where we put the data together across a longer time period, which means we can get data down to things like local authority levels as well which is important for local government.

    MILES FLETCHER

    And how do we choose people to take part?

    DAVID FREEMAN

    It’s a totally random process. So we have access to the postcode directory for the UK, which is effectively a list of all the households in the UK, and we take a random sample of those. However, we make sure within taking that sample that we're representative across the country. So within each local authority area, we've got enough people to be able to give us a robust estimate of what's happening there.

    MILES FLETCHER

    You stay in the survey a little while, don’t you?

    DAVID FREEMAN

    You do, that's right, and that's one of the strengths of the Labour Force Survey. If you're selected to take part, you are in there for what we call “five waves”. So if you're selected in January, we'll also come back and talk to you again in April, July, October and the following January. And that's important because not only do we find out what people are doing now, as you say we find out how people have changed, and whether they have moved into employment, out of employment, how have their circumstances changed. And that gives a deep insight into how people are flowing through the labour market and changing over time.
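
    The five-wave pattern described above can be sketched in a few lines of Python. This is only an illustration of the quarterly schedule as David describes it, not ONS code; the function name `wave_months` and its parameters are hypothetical.

```python
from datetime import date


def wave_months(first_interview: date, waves: int = 5, gap_months: int = 3) -> list[date]:
    """Return the first day of each month in which a household is interviewed.

    Assumes, as described in the transcript, five waves spaced three months apart.
    """
    schedule = []
    for wave in range(waves):
        # Count months from the start of the first interview year, then roll over years.
        months_elapsed = first_interview.month - 1 + wave * gap_months
        year = first_interview.year + months_elapsed // 12
        month = months_elapsed % 12 + 1
        schedule.append(date(year, month, 1))
    return schedule


# A household first interviewed in January is revisited in April, July,
# October and the following January.
for interview in wave_months(date(2022, 1, 1)):
    print(interview.strftime("%B %Y"))
```

    Because the same household appears in consecutive quarters, answers from one wave can be compared with the next to see who has moved into or out of employment.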

    MILES FLETCHER

    So, big sample, lots of data coming in. When it comes to the analysis though, essentially, we group people under three big categories. Now the first of those is employment. It sounds self-evident, but what is the definition of an employed person?

    DAVID FREEMAN
    To be employed is to be someone who has done paid work in the reference week, so when we interview people we’ll say, what were you doing in the week before we're interviewing you? They are considered employed if they have done paid work for a minimum of one hour in that week. So the bar is, you could say it’s quite low, in terms of one hour of work a week. But we have looked, and not that many people work that little in a week – less than 3% of people work less than five hours. So, as well as people who get paid, we have a couple of other areas as well. We cover people who are employees, so employed by a company, the self-employed, people in government training schemes and people who work for their family business and might not get a wage packet but benefit from working for that business.

    MILES FLETCHER
    What is the average number of hours that employed people do?

    DAVID FREEMAN
    Overall, the average is around about 31 hours a week, and that does differ between if you're full time or part-time. So if you're full time, then the average is around 36. If you're part time, the average is around 16 hours a week.

    MILES FLETCHER
    Okay, so that's a working week. Now who is unemployed? Technically speaking.

    DAVID FREEMAN
    The technical definition of unemployed, there are three elements to it. Firstly, you've got to be not employed, so not doing any paid work. But you must also have been actively seeking work in the previous four weeks. So that means applying for jobs, going to interviews, looking through listings, etc. And finally, you must be available to start work in the next two weeks. So you have got to be available to start a job within the next fortnight after we interview you. Again, another international definition used across the world to define who's unemployed.

    MILES FLETCHER
    And how long do you have to be unemployed to be classed as long-term unemployed? Because that's a very important category to understand as well.

    DAVID FREEMAN
    To be considered long-term unemployed, a person must have been in that position for a year or more.

    MILES FLETCHER
    What's the average time that people are currently spending unemployed?

    DAVID FREEMAN
    It's a bit hard to say, we don’t have a technical average time, but the majority of people who are unemployed have been unemployed for less than six months. So people moving into unemployment after having recently lost a job or moving through unemployment to get to a job. And it's just under 1 in 3, who have been unemployed for more than a year.

    MILES FLETCHER
    So if you don't satisfy either of those two definitions, you're not doing any kind of paid work and you're not actively seeking it in the way you've described, where does that leave you?

    DAVID FREEMAN
    Well, that leaves you in a third group that we call the “economically inactive”. And so these people are not in work, and are either not actively seeking work, or are unavailable to start work. So you can be looking for work and not available, and you'd be economically inactive, or you might be available and not looking, and again, you'd be economically inactive there. And the sort of people included in this category are the sort of people who may be looking after family or home, they are stay-at-home parents, or they have caring responsibilities that mean they can't work. They might have a long-term illness or disability which means they are not able to work, or they may have retired. It's the people who aren't working and are not looking or available for work.
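
    The three categories described above can be written as a simple decision rule. The sketch below is only an illustration of the definitions as David gives them in this episode, not the ONS's own processing code; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    EMPLOYED = "employed"
    UNEMPLOYED = "unemployed"
    INACTIVE = "economically inactive"


@dataclass
class Respondent:
    # All field names are hypothetical, chosen for this sketch only.
    paid_hours_in_reference_week: float = 0.0
    on_government_training_scheme: bool = False
    works_unpaid_in_family_business: bool = False
    sought_work_in_last_four_weeks: bool = False
    available_to_start_in_two_weeks: bool = False


def classify(r: Respondent) -> Status:
    """Apply the three definitions in order: employed, unemployed, inactive."""
    # Employed: at least one hour of paid work in the reference week,
    # or on a government training scheme, or working for a family business.
    if (
        r.paid_hours_in_reference_week >= 1
        or r.on_government_training_scheme
        or r.works_unpaid_in_family_business
    ):
        return Status.EMPLOYED
    # Unemployed: not employed, actively seeking AND available to start.
    if r.sought_work_in_last_four_weeks and r.available_to_start_in_two_weeks:
        return Status.UNEMPLOYED
    # Everyone else who is not working is economically inactive.
    return Status.INACTIVE


def is_long_term_unemployed(status: Status, months_unemployed: int) -> bool:
    """Long-term unemployed means unemployed for a year or more."""
    return status is Status.UNEMPLOYED and months_unemployed >= 12


# Looking for work but not available to start: economically inactive.
print(classify(Respondent(sought_work_in_last_four_weeks=True)).value)
```

    Both conditions must hold for someone to count as unemployed, which is why a person who is looking but not available, or available but not looking, falls into the economically inactive group.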

    MILES FLETCHER
    One contentious area under this definition of the economically inactive is a group that swells and contracts according to the economic cycle, and it’s that group of people who are unable to work and are collecting benefits. What do we understand about that group at the moment?

    DAVID FREEMAN
    That group as you say, it does change over time. And the reason for that is that the number of people on benefits depends on the rules around those benefits. So, over the years we have published something we call the “claimant count”. This counts people claiming benefits where the main reason they're claiming is that they're out of work.

    MILES FLETCHER

    And that used to be the main measure of our unemployment, as it was understood.

    DAVID FREEMAN
    You’re absolutely right. If we go back to the early to mid 90s, it was the lead measure. But at that point the rules around the benefits were such that the official unemployment count and the benefit count were about the same. However, when we moved to Jobseeker's Allowance in the late 90s, the rules changed on benefits. So fewer unemployed people qualified for the benefits, and the two measures did diverge there.

    MILES FLETCHER
    It's been said that there's a very large group now who are on out of work benefits alone, and that is hidden unemployment?

    DAVID FREEMAN
    Some of these people will be unemployed if they're out of work, and actively seeking or available to work. However, out of work benefits will also include people who we would class as economically inactive. Such as people who have a long-term illness or disability that prevents them from working. They'll be getting out of work benefits because they're not working, but because they're not able to look for work, or not actively looking for work, we wouldn't count them in our unemployment statistics. So yes, there are a lot of people on out of work benefits, more than we would count as unemployed. But not all these people would fit that definition of unemployed that we use.

    MILES FLETCHER
    But nonetheless a very important indicator when you're thinking about how people might be helped into work.

    DAVID FREEMAN
    That's right. Yeah, and it indicates what that potential workforce could be. But obviously, some of these people may need some help to get themselves into a position where they're able to look for work and gain employment.

    MILES FLETCHER
    Okay, well that, briefly explained, is how the headline measures - you might like to call them your classic ONS measures of employment and unemployment - work. But one criticism that you might care to make about this system is that it takes a while to process and the numbers when they come out...there's a bit of a lag isn’t there.

    DAVID FREEMAN
    There is a little bit of a lag, again because of the size of the sample, the amount of data we have to process and the fact that we have to make sure we're getting enough responses in. There’s about a six-week lag between the end of the period we're looking at and the data being published into the public domain.

    MILES FLETCHER
    So in order to speed things up a bit, and to have a timelier indicator of what was happening with employment, and this came in very useful with the arrival of the pandemic, we've been using faster sources of information to supplement the headline employment figures. Can you talk us through that? What progress has been made and how useful these other sources of data have been?

    DAVID FREEMAN
    Yeah, so probably the biggest one that we've been using throughout the pandemic has been the count of people from the real time tax information held by HM Revenue and Customs (HMRC). So this is a big database that HMRC hold, and it contains information about everyone on a payroll. So if you are on a pay as you earn scheme, all your information is collated in HMRC for the purposes of calculating your tax. At the end of 2019, we started working with HMRC on publishing regular data from that system, counting the number of people on payroll schemes and how much they're earning. The benefits of this are that it is a complete count of people on the pay as you earn scheme, so it gives us lots of information, meaning we can analyse smaller levels and small groups of people without impacting on the confidentiality of the data. When the pandemic started, we worked with HMRC to see if we could speed the data up, because previously it was at the same sort of pace as the Labour Force Survey, so about six weeks, and we managed to move to what we call a flash estimate. This means we can publish the data for a particular month within three weeks of the end of that month, which is so much faster and was a real benefit at the beginning of the pandemic, getting information quickly about what was happening to employees on tax schemes.

    MILES FLETCHER
    And that was vital wasn't it, to inform the policy response to the pandemic when it arrived. Because you know, waiting a few weeks could have been too late for a lot of people.

    DAVID FREEMAN
    It could have been, and this is a big step forward in using this local administrative data in the labour market, and we've carried on doing that flash estimate. And as well as that we've been, over the pandemic period and up to the present day, adding more and more information from the pay as you earn tax data. So, we now produce data at local authority level, and we also do it by region and industry. So, lots of information much more quickly than we can get it from our survey data.

    MILES FLETCHER
    You could say we've got the best of both worlds now. We've got the rich data coming out of the Labour Force Survey. But on the other hand, we've also got the much quicker data coming hot off the systems of HMRC to give that flash picture as you described it.

    DAVID FREEMAN
    One of the things that has really developed over the pandemic is having this extra data, and it provides a very, very rich picture. And when you put it together, you do get a very, very good picture of what's happening in the economy. I mean, the next step is to try and actually bring these data sources together. So linking data from the tax system to survey data, and trying to exploit even more, the benefits of having these sorts of information available.

    MILES FLETCHER
    Do you think we'll get to the point where we replace the survey completely? Or will it continue to have that very important central role?

    DAVID FREEMAN
    I think surveys will always have a central role. The tax data is brilliant, but it does only cover employees, so it doesn't cover the self-employed, government trainees or people working for their family business. Also, the level of information we get from the Labour Force Survey is much bigger than we get from administrative data. On the tax system, we merely have information that's relevant to people paying tax. So that means we don't get a lot of the information that we get from the Labour Force Survey - whether someone's got a disability, what their ethnic group is, what their nationality is - and these are all important variables in terms of informing government policy and giving a picture of what's happening in the UK.

    MILES FLETCHER
    You mentioned that the tax data was a development that was already in progress before the pandemic, but it was sped up given the urgency of that situation, but other sources of data have been coming in as well?

    DAVID FREEMAN
    Another big source of data that we've been working with over the pandemic period has been the online job vacancies data from a company called Adzuna, who we've been partnering with over the period. And this has been another big step forward in calculating the number of vacancies in the UK economy. The data we are getting is really, really timely, so we can take a download of data on the Friday, and we’re publishing it the next week. And the information you're getting in an online job vacancy means we can look at things like where the vacancy is, so what geography it’s located in, and some indication of the skills or the occupation of that vacancy as well.

    MILES FLETCHER
    Obviously, if you think about impacts of the pandemic for quite a period, over the last two years, when you add it all up, we spent a lot of time chained to our laptops, in many cases, working from home. How has that rubbed off on the workforce now, and what do we think is the lasting impact of the working from home trend?

    DAVID FREEMAN
    Certainly, on the latest data we've got, it does look like there's been a bit of a shift in terms of the number of people who work at home on a regular basis. Prior to the pandemic, fewer than three in 10 people had ever worked from home at any point, whereas if you look at the most recent data, around 35% of people are working from home regularly. So that's around 1 in 3 people who are now doing some work at home during the working week.

    MILES FLETCHER
    So that's a huge change and we reckon that is, to some extent, showing signs of lasting?

    DAVID FREEMAN
    It does look like it is lasting. Home working doesn't necessarily work for everyone. When we did the analysis, there were quite a few professions or occupations where homeworking is relatively low. That’s particularly in the caring occupations, retail, catering and construction, where it's hard, if not impossible, to work from home.

    MILES FLETCHER
    We'll have to see how that develops over the months ahead. But another phenomenon that was spotted as we emerged from the pandemic was what's been called ‘The Great Resignation’. Over 50s apparently disengaging with the labour market, and that I guess, is them going from employment in large numbers into the ‘economically inactive’ category? What do we know about that?

    DAVID FREEMAN
    You're absolutely right. This is something we've seen particularly in the last 12 months, people over 50 are moving out of the labour market into economic inactivity. Some of these people are retiring, so particularly the over 60s, most of those people are retiring. However, for the people aged 50 to 59, a lot of them are retiring for health reasons. They've developed a long-term illness, which again may be related to COVID, which is preventing them from carrying on with work. And this is having an impact on the overall labour market because the employment rate is still lagging behind where we were pre-pandemic, and a lot of that is down to these people moving outside into economic inactivity.

    MILES FLETCHER
    That's an important factor because other ONS statistics tell us that there were some 800,000 people who report, or we estimate, are suffering the effects of long COVID. So that would be a big factor in this, one might think, and it really isn't a question then of people having had a taste of being at home all the time and thinking, “Oh I just don't want to go back to work. Let's call it a day now”.

    DAVID FREEMAN
    You're right. So the older people aged 60+, again, particularly people who have got a private pension and won’t rely on the state pension, it is that retirement. But say for those 50-59s, while some of them are retiring early, there are people who believe themselves too ill to work.

    MILES FLETCHER
    And what do we understand then from our lifestyle survey? About how people's patterns of leisure and work have changed?

    DAVID FREEMAN
    There are a few things to think about. Again, will the people who have moved out of the workforce want to go back into the workforce? Looking at those over 60, only about 18% of those want to go back and will consider returning to work. Whereas those in their 50s, just over half would consider returning to work, but looking for a job that suits their skills and would suit their lifestyle. So, people wanting more flexible work and something that will fit around their caring responsibilities as well.

    MILES FLETCHER
    So overall, how do we think the UK did in terms of dealing with a pandemic? And particularly its impact on the labour market compared with other countries? Did they see these kind of impacts as well?

    DAVID FREEMAN
    It's quite interesting when you look at the impact of the pandemic across different countries. In terms of the UK, we have a very similar pattern to the rest of Europe. We saw a drop off in employment rate at the start of the pandemic and then gradual increases. But that drop off in employment was about 2 to 3% of the employment rate, and that's in stark contrast to the USA and Canada where the pandemic impact was much greater in terms of falling employment - about nine to 10 percentage points of the employment rate. Moving onto inactivity, what seems to be the difference is the coronavirus job retention scheme in the UK, and similar schemes across Europe, kept people linked to their job and in employment, rather than moving into unemployment. Unemployment remains, again in the UK and across Europe, relatively low. But all countries, including the USA and in Europe as well, saw an increase in the level of inactivity during the pandemic.

    MILES FLETCHER
    So overall the UK not too exceptional really, in how governments responded to the impacts of a pandemic, and how those effects played out on the labour force.

    DAVID FREEMAN
    Not very different at all at the beginning of the pandemic. We're seeing a little bit of a difference now, and we touched on it earlier in terms of economic inactivity, is that the UK employment rate is still a bit below where it was pre pandemic, whereas the EU and USA and Canada, they've got back to about where they were at the beginning of 2020. This links to the over 50s moving out of the workforce. We're still a little bit behind other European countries at the moment.

    MILES FLETCHER
    And explains perhaps why the over 50s are the subject of particular research, extra research going on now to understand what's really going on there.

    DAVID FREEMAN
    Yeah, absolutely. Because that does seem to be the difference between us and the rest of Europe.

    MILES FLETCHER
    Okay, well I mentioned earlier on, the richness of the data that we get from the Labour Force Survey, and when you delve into the data, you get to explore some quite interesting topics. And one of them we uncovered the other day was that even in 2022, there are still some jobs that are dominated by one gender. Tell us about that.

    DAVID FREEMAN
    Yeah, so this is a really interesting thing. We do put out regular data, where we go right into the detail of some of the occupations. And it is interesting when you look at the sort of gender split in some of these jobs. So, there are a few jobs where we have hardly any women at all doing them, so that includes ship officers and metal workers, and at the other end of the spectrum, we've got very few men who say they are dancers or choreographers.

    MILES FLETCHER
    You might be less surprised to hear that pipelayers tend to be all male, but also veterinary nurses are almost exclusively female.

    DAVID FREEMAN
    That's right. And again, if you look at other occupations, that are predominantly female, they are things like midwives, school secretaries, PA’s and secretaries, child minders, nursery nurses and medical secretaries. And then if you go to the occupations that are predominantly male, they’re very much in the construction space, so carpenters, bricklayers, electricians and plumbers.

    MILES FLETCHER
    How do we classify people into jobs? We don't just listen to how people describe themselves. You have to fit into some classification, don't you? How does that work?

    DAVID FREEMAN
    Well, we have got a classification, it's called a ‘standard occupational classification’, and that gets updated regularly. The latest version was updated in 2020. And the way we classify people, when we do the interviews as part of the Labour Force Survey, we ask them what their occupation is. And then we take that description, and we match it onto our list of occupations. There are hundreds of potential occupations. We've got a computer programme that helps: when you put the description in, it'll narrow it down to a few options, and then the interviewer can pick the most suitable of those options to match what the person has told us.
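
    [Editor's illustration] The matching step described above can be sketched in a few lines of Python. This is only a rough sketch, not the ONS's actual programme: the occupation list is a tiny made-up sample, the function name `suggest_occupations` is hypothetical, and the fuzzy matching uses Python's standard `difflib` module as a stand-in for whatever the real system does.

```python
import difflib

# Hypothetical, tiny stand-in for the occupation list.
# The real classification has hundreds of entries.
OCCUPATIONS = [
    "Veterinary nurses",
    "Nursery nurses",
    "Midwives",
    "Carpenters",
    "Bricklayers",
    "Electricians",
    "Plumbers",
    "Data analysts",
    "Coffee shop workers",
]


def suggest_occupations(description, occupations=OCCUPATIONS, limit=3):
    """Narrow a free-text job description down to a few candidate titles.

    The interviewer would then pick the most suitable candidate by hand.
    """
    # Compare in lower case, but return the titles as originally written.
    lookup = {title.lower(): title for title in occupations}
    matches = difflib.get_close_matches(
        description.lower(), list(lookup), n=limit, cutoff=0.4
    )
    return [lookup[match] for match in matches]


if __name__ == "__main__":
    # A misspelt description still leads to a short list of options.
    print(suggest_occupations("vetinary nurse"))
```

    The point of the design is that the programme only shortlists; the final choice stays with the interviewer, as described in the interview.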

    MILES FLETCHER

    And that makes the figures internationally comparable. Again, you can't tell the Labour Force Survey, well, I'm an image consultant. They'd have to find a way of matching that against one of the definitions, and I see we were asked the other day whether ‘Social Media Influencer’ was a classified job, it turns out it isn’t. They're either marketing associates, or actors and presenters, it turns out. These classifications, they're reviewed every 10 years or so aren't they, perhaps the next update will recognise a job like that.

    DAVID FREEMAN
    If it grows in terms of importance and the number of people doing it, it's quite likely it could end up with a classification. I mean, the latest update started including programmers as a separate job description. They were lumped in with other things in earlier classifications, again because of a growing occupation.

    MILES FLETCHER
    It's quite a good test this. If your mum asks you if you've got a proper job yet, and you can point to the standard occupational classification, I think that'll answer the question for her quite satisfactorily, wouldn't it? By the way, recent additions are coffee shop workers, not surprisingly, given the huge growth in coffee serving establishments. What other ones have been officially designated recently?

    DAVID FREEMAN
    Lots of jobs linked around the internet and web development and website development as well. You go back 15 or 20 years and it didn't even exist. And things like ‘Play Workers’ as well, with the use of child minding and child play facilities, they’re also new additions to the list.

    MILES FLETCHER
    So, working in the gig economy, you know, the hours might be irregular, you might be on a zero hours contract, but nevertheless, chances are your job is officially recognised.

    DAVID FREEMAN
    Almost certainly, even if your job may not have an official designation, you would still be fitting into the framework somewhere.

    MILES FLETCHER
    And it might be worth noting since we're sitting in the ONS, that data analysts have only been recently recognised as an official classified occupation.

    Well, just as important as finding out what people do is the whole question of how much they get for doing it. And who better to talk to about that than our Head of Earnings at the ONS, Nicola White. How does the ONS find out what's on people's salary cheques every month?

    NICOLA WHITE
    We use several surveys to estimate wages. So, one is a monthly survey, which gives us the latest picture of what's happening, and the other is once a year, and this allows us to measure not only weekly earnings but also annual earnings, hourly earnings and it enables us to also look at detailed characteristics such as age, sex, region and occupation. It's a much richer data source.

    MILES FLETCHER
    Again, this is a big national level, thousands and thousands of people.

    NICOLA WHITE
    For the monthly survey, we ask businesses to provide us with the number of employees in their business, and then what they're paying out in wages that month, and then we just calculate the average weekly earnings. The annual survey is slightly different. It's filled in again by businesses, but we ask for a selection of employees so that we can collect the additional data that we require.

    MILES FLETCHER
    So, we're not just trusting people to come clean about how much they're earning because I wonder if people might be concerned about what the tax authority might say.

    NICOLA WHITE
    As we collect this from businesses, we think the quality of the data might be much better than if individuals were giving us the data.

    MILES FLETCHER

    For statistical purposes, what is the average wage in the UK?

    NICOLA WHITE

    So, the average weekly earnings for all employees at the moment is around 565 pounds a week. Then if we include bonuses into this, it increases it to around 600 pounds a week.

    MILES FLETCHER

    And what’s been the trend recently?

    NICOLA WHITE

    It's been quite difficult to interpret earnings recently given the pandemic, and one reason for that is that COVID has impacted the workforce. So many workers were on furlough or had their hours reduced during 2020 and 2021. And this meant that people saw their earnings fall, pushing down weekly earnings, but in the following year, fewer people were on furlough and hours returned to normal, so then weekly wages were higher. That year-on-year comparison was quite difficult to interpret. And adding to that, the actual makeup of the workforce during 2020 and 2021 changed, and because our statistic is an average, this will impact on the average. During the pandemic we saw that lower paid people were at a greater risk of losing their jobs. So where fewer lower paid people were in the workforce, this increased average earnings. The way I like to think about it is by thinking about height. So, if the shortest person in the room leaves, the average height of those remaining will rise, but no one in that room has got taller, have they? It's just the makeup of the people in the room that has changed the average. So if you think about that in terms of earnings, if someone's paid less than the average earnings per week and they then lose their job, other things being equal, average earnings will increase, and this was quite prominent during 2020 and 2021. But we're now seeing things return to normal levels.
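
    [Editor's illustration] The "shortest person leaves the room" effect is easy to check with a little arithmetic. The sketch below uses invented weekly pay figures, not ONS data, and the function name `average_after_lowest_leaves` is made up for the example.

```python
from statistics import mean

# Invented weekly pay (in pounds) for five workers; not real data.
weekly_pay = [300, 450, 600, 750, 900]


def average_after_lowest_leaves(pay):
    """Return (average before, average after) when the lowest paid worker leaves."""
    before = mean(pay)
    remaining = sorted(pay)[1:]  # drop the single lowest value
    after = mean(remaining)
    return before, after


if __name__ == "__main__":
    before, after = average_after_lowest_leaves(weekly_pay)
    # Nobody's pay has changed, yet the average rises.
    print(f"Average before: {before}, average after: {after}")
```

    With these made-up figures the average moves from 600 to 675 pounds a week even though no individual has had a pay rise, which is the composition effect described above.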

    MILES FLETCHER
    Shaking out that furlough effect, if you like. Compared to pre pandemic levels, how do we stand now?

    NICOLA WHITE
    So at the moment, when we compare the latest pay with pay at this time 12 months ago, we're seeing increases in regular pay, and in total pay, which is regular pay plus bonuses. And we're seeing some high bonuses that have been paid out, particularly in March this year, when we normally get the bonus months. We're seeing levels we haven't really seen before.

    MILES FLETCHER

    And what's been driving that then?

    NICOLA WHITE
    The main sector contributing to this is the finance and business services sector, and within here are financial and insurance activities. That's banking. It's not unusual for these sectors to see large bonus payments, and they're just continuing to be quite large, although we did see some smaller bonuses paid during the pandemic. We've then seen this rise to levels we haven't really seen before.

    MILES FLETCHER
    And how disproportionate is the effect of these city slickers getting Ferraris?

    NICOLA WHITE
    If you look at the data split by private sector and public sector, you'll see the public sector has very minimal bonus payments, whereas it is all being driven by the private sector, and in particular the finance and insurance activity sector.

    MILES FLETCHER
    Any other sectors in which people have been getting bonuses?

    NICOLA WHITE
    Yes so there are other sectors such as manufacturing and construction and wholesale and trade. They've also been seeing quite large bonuses, particularly in March.

    MILES FLETCHER
    And that's perhaps a reflection of the shortages of appropriately trained and skilled workers in those industries, and employers are having to shell out extra to get people in.

    NICOLA WHITE
    Yeah, so bonuses are a way of retaining staff, and that will not impact on basic pay. They were not included in pay rises, but it’s a way to keep staff from moving on.

    MILES FLETCHER
    Overall then, of course real pay has suddenly become a talking point again. For years and years when inflation was relatively low it was a concept that wasn't discussed that much. Now inflation has gone back up and people are concerned about the real value of their earnings. Just talk us through how we measure that, and why it's so important.

    NICOLA WHITE
    Yes, we do produce a real average weekly earnings estimate which adjusts for inflation. So here we look at the growth rates of wages, and we then adjust this by the latest inflation rates. So as you've just said, inflation is currently very high, so it is having a big impact on real wage growth rates. Following the recent increases in inflation, pay has now clearly fallen in real terms, both including and excluding bonuses. Excluding bonuses, real pay is now dropping faster than at any time that we've seen since records began in 2001.
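
    [Editor's illustration] A rough sketch of what "adjusting for inflation" means for a growth rate. The formula below is the standard textbook one and is an assumption here, not necessarily the exact ONS method; the growth and inflation figures are invented, and `real_growth` is a hypothetical helper name.

```python
def real_growth(nominal_growth, inflation):
    """Convert nominal pay growth into real pay growth.

    Both arguments are annual rates written as fractions (0.04 means 4%).
    Textbook formula: (1 + nominal) / (1 + inflation) - 1.
    """
    return (1 + nominal_growth) / (1 + inflation) - 1


if __name__ == "__main__":
    # Invented figures: pay up 4% over the year, prices up 8%.
    change = real_growth(0.04, 0.08)
    print(f"Real pay growth: {change:.1%}")
```

    Whenever prices rise faster than pay the result is negative, which is what "pay has fallen in real terms" means.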

    MILES FLETCHER
    What's the benchmark for the rate of inflation that the ONS uses?

    NICOLA WHITE
    So, we use the CPIH version of inflation. And that's what we adjust our estimates by.

    MILES FLETCHER
    Because the ONS believes that's the most reliable? If we were to take RPI, which of course we don’t recommend, the real pay situation would look even more pronounced.

    NICOLA WHITE
    Inflation as measured by CPI is, at the moment, slightly higher than CPIH. This would have an even bigger impact on growth and real growth rates if we were to use CPI, which is often used by the Bank of England.

    MILES FLETCHER
    So Nic, another issue in recent years, of course, has been the gender pay gap, about which we've heard a great deal. And it's important to explain, isn't it, that it's not the difference between men and women getting different pay rates for doing the same work, because that's been illegal for some time. This is about women as a group being paid less than men as a group. How does the ONS measure that, and how have things been changing?

    NICOLA WHITE
    We use our annual survey to measure the gender pay gap, and what we do is we calculate the difference between the average hourly earnings of men and women as a proportion of men's average earnings. For example, we'd say that the gender pay gap currently is at 7.9%. What this means is that women earn 7.9% less on average than men. If we had a negative gender pay gap, for example, negative 4%, this would mean that women earn 4% more on average than men. As you just said, it's not a measure of the difference in pay for the same job. It's a measure across all jobs in the UK.
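
    [Editor's illustration] The calculation described above, written out as a short Python sketch. The hourly earnings figures are invented purely so that the arithmetic reproduces the 7.9% and negative 4% examples mentioned in the interview; `gender_pay_gap` is a made-up function name.

```python
def gender_pay_gap(men_hourly, women_hourly):
    """Gap between men's and women's average hourly earnings,
    expressed as a percentage of men's average hourly earnings."""
    return (men_hourly - women_hourly) / men_hourly * 100


if __name__ == "__main__":
    # Invented averages: men 20.00 pounds an hour, women 18.42.
    print(f"{gender_pay_gap(20.00, 18.42):.1f}%")   # positive gap: women earn less
    # Invented averages: men 20.00 pounds an hour, women 20.80.
    print(f"{gender_pay_gap(20.00, 20.80):.1f}%")   # negative gap: women earn more
```

    A positive result means women earn less on average than men; a negative result means they earn more.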

    MILES FLETCHER
    But that’s all men, compared to all women. But if you start to break it down, then a slightly different pattern emerges, doesn't it?

    NICOLA WHITE
    Yeah, that's right. It's interesting to look at this by age group, because there's a clear difference for those aged over 40 and those aged under 40. Those full-time employees under 40 have a gender pay gap of around 3%. And for those aged over 40, this is around 12%. And this reflects the type of jobs and the fact that women have had children at that age.

    MILES FLETCHER
    So, it’s those family responsibilities, taking people out of their careers?

    NICOLA WHITE
    And maybe working more part-time. It's very much at the younger ages when the gender pay gap isn't as big, but as you go into those older age groups it does become more prominent.

    MILES FLETCHER
    And perhaps there is an occupational skills divide as well?

    NICOLA WHITE
    Yes, there is. So looking at the gender pay gap by occupation, the biggest gap is for processing and machine operatives, which is at 16.2%. Women earn 16.2% less on average than men, which probably you'd expect because these jobs are generally held by men. But if we look at the other end of the scale, so we'll look at the largest negative gender pay gap, this is in the occupation of secretarial and related, where women earn 7.4% more on average than men. So, the occupations kind of tie in with the kind of jobs that men and women do tend to do.

    MILES FLETCHER

    If I knew someone for whom the world of statistics had just become too exciting and they had to go work in a less dynamic field, but were out to make a bit more money, what should I recommend they do?

    NICOLA WHITE
    Okay, for full-time employees, the highest paid occupations are chief executives and senior officials, and they're paid around about 90,000 pounds per year. The lowest paid occupation for full-time employees is playworkers, which includes teaching assistants, child minders and nannies, and these are paid around 14,000 pounds per year. But if you want to look at all employees, the highest paid occupation is still the same group, which is chief executives and senior officials. But the lowest paid occupation changes here, and it's more school mid-day and crossing patrol occupations. And these have a median of around 3000 pounds per year. And this is because many of these jobs are part-time.

    MILES FLETCHER
    So that's what's going on with pay. But what's the current situation with employment in the labour market overall then? Suffice to say, David, it's complicated really, isn't it?

    DAVID FREEMAN
    A very accurate description. I think complicated, or a very mixed picture, at the moment. As we touched on earlier, there are a lot of people who have removed themselves from the labour market and gone into economic inactivity, particularly in the over 50s age group. So that means it's held unemployment down a bit. There's also a record number of vacancies, which you would normally say is good news, but it's been at a record high for quite a while, so over 1.3 million vacancies, and that for the first time is slightly more than the number of unemployed people. So that means companies are struggling to fill the jobs that are available, particularly in things like the health sector, hospitality and the retail sector.

    MILES FLETCHER
    So that speaks of skill shortages then, doesn't it? Employers need people, but they haven't got the right people.

    DAVID FREEMAN
    Yes, if you haven’t got the right people, or not people in the right areas of the country, there's plenty of challenges there in trying to make sure that these jobs get filled and we find the right people in the right place. We're also seeing falling self-employment as well. This is the one area where we’re still lagging behind where we were before the pandemic started. So the number of employees has reached its pre-pandemic level, but the number of self-employed is over three quarters of a million below where it was before COVID-19 struck. So that's again another challenge. Where have these people gone? Have they gone into inactivity or employment, or are they struggling to restart their business after the pandemic?

    MILES FLETCHER
    Is that perhaps because of the disruptive effects of the pandemic, when it was easier for a lot of people to take one of the many jobs that are available rather than to go back into self-employment with all the risks and uncertainty that then implies?

    DAVID FREEMAN

    Potentially we have seen lots of people moving into employment and leaving self-employment over the pandemic period. I mean particularly with a lot of jobs that are offering flexible hybrid working, people are finding it much more constant, a bit more reliable than perhaps they were in their self-employed jobs. And lots of jobs in self-employment would have been hit by the pandemic. There were lots of jobs in construction, in catering and in the service sector, which would have been hit by the pandemic.

    MILES FLETCHER
    We said the picture was complicated, but anytime we have record high employment and a record number of vacancies, there's good in this labour market too, isn't there?

    DAVID FREEMAN
    There is some good news. As I say, the number of employees is back above where it was pre-pandemic, so a lot of people in employment. What's holding it back is the self-employed. And the level of unemployment is one of the lowest we've seen since the mid-70s. It's down below 4%. So, there are very few people out of work actively seeking work. That again shows there's certainly scope for the labour market to expand with the number of unfilled vacancies that we're seeing.

    MILES FLETCHER
    On that largely positive note, it's back to the daily grind we go. Thanks to Nicola and to David for joining me, and thanks to you for listening. To comment on this podcast or ask us a question please follow us on Twitter at @ONSfocus. I'm Miles Fletcher and our producers at the ONS are Julia Short and Steve Milne.

  • In this episode Miles is joined by Dr James Tucker and Sarah Caul MBE to talk about how and why the Office for National Statistics count births and deaths, and what current fertility trends might mean for the future population. They look at the impact of popular culture on the most common baby names in England and Wales, and discuss the new significance of a dataset that was itself buried for 50 years.

    Transcript:

    MILES FLETCHER

    I’m Miles Fletcher and this episode of ‘Statistically Speaking’, the official ONS podcast, is literally a matter of life and death. Specifically, how and why we count births and deaths and what those numbers are telling us. We'll talk about the possible impacts of declining fertility rates in the UK and of children being born to older parents. And at the other end of life we'll look at the new significance of a dataset that was itself almost buried for 50 years.

    I'm joined here at ONS by two people who lead on all our data around births and deaths - Head of Analysis in our health and life events teams, James Tucker, and our very own Head of Mortality, Sarah Caul MBE, honoured for her work during the pandemic about which we will talk later.

    Starting with you then James, at the beginning as it were, with births - how does the ONS gather information about the number of children being born in England and Wales week in, week out?

    JAMES TUCKER
    So the registration of births is a service that's carried out by local registration services in partnership with the General Register Office in England and Wales, and the good thing about this, from the perspective of having a really nice complete dataset, is that birth registrations are actually a legal requirement, giving us a really comprehensive picture of births in the countries.

    MF
    So we gather the numbers, we add them up, what do we do with the information then?

    JT
    So there's a couple of ways that we look at the data. One is to simply look at the number of births per year. So for example, we're looking at about 600,000 births per year at the moment. But an alternative approach is to use what we call the ‘total fertility rate’, which is basically the average number of live children that women might expect to have during their childbearing lifespan. So it's a better measure than simply looking at the trends in the number of births because it accounts for changes in the size and age structure of the population.
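
    [Editor's illustration] One common way demographers calculate a total fertility rate is to add up age-specific fertility rates across the childbearing ages. The sketch below assumes that standard approach (the interview does not spell out the method), uses five-year age bands, and every rate in it is invented for illustration rather than taken from ONS data.

```python
# Invented age-specific fertility rates: live births per woman per year
# in each five-year age band. These are NOT real ONS figures.
AGE_SPECIFIC_RATES = {
    "15-19": 0.010,
    "20-24": 0.045,
    "25-29": 0.085,
    "30-34": 0.100,
    "35-39": 0.060,
    "40-44": 0.018,
    "45-49": 0.002,
}

BAND_WIDTH_YEARS = 5  # each band covers five years of a woman's life


def total_fertility_rate(rates, band_width=BAND_WIDTH_YEARS):
    """Sum the annual rates, weighting each band by the years it covers."""
    return band_width * sum(rates.values())


if __name__ == "__main__":
    tfr = total_fertility_rate(AGE_SPECIFIC_RATES)
    print(f"Total fertility rate: {tfr:.2f} children per woman")
```

    Because the inputs are rates per woman at each age rather than raw counts of births, the result does not move just because the population has more or fewer women of a given age, which is the advantage described above.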

    MF
    So it has a sort of multi-dimensional value then statistically that you can use to infer various things about the age at which people are likely to have children, and how many they're likely to have.

    JT
    That's exactly right. So we've seen some changes in the total fertility rate in recent years. So if you've heard the expression 2.4 children as describing the average number of children per family it's now considerably lower than that. In fact, it hit a record low in 2020 when the total fertility rate was 1.58.

    MF
    That's a sharp decline. In fact, though, you've got to go as far back as 1970, when the current series began, that's when it really was 2.4. What's really striking is if you look at that graph, the decline that happened between 1970 and about 1977 - very sharp decline there. Do we know what happened during that period? What were the factors driving that particularly?

    JT
    I think there can be all sorts of socio-economic factors affecting the fertility rate: improved access to contraception, reduction in mortality rates of children under five, which can result in women having fewer children. And also, more recently, as we've seen the average age of mothers going up, we might see some lower levels of fertility due to difficulties conceiving because of that postponement in childbearing.

    MF
    Sarah, I can see you want to come in on this.

    SARAH CAUL
    So my mother had three children by the time she was 30, and growing up I would just assume that that was the route I was going to take because it was what I've known. I am now 31 and I think if I was pregnant, that thought would scare me. I don't think I've grown up enough to have a child. I’m a dog mum, but those don't come into the statistics.

    MF
    So there was a bit of fanciful talk about people in lockdown finding - how should we put it delicately? - you know, things to do with their time, and that might lead to a boom in births. But that didn't really transpire?

    JT
    The increase in 2021 would actually coincide with conceptions across the second and third lockdowns. So yes, there was some speculation that people may have had enough of board games and were occupying their times in other ways, but I think it's actually more likely that it's a result of people delaying having children earlier on in the pandemic because of the uncertainty that was around at that point. And then towards the end of 2020 people had moved on from that and we saw a bit of an increase.

    MF
    Nonetheless though, historic data shows that there is a most common time of the year for conceptions to take place and that has something to do with the festive period, doesn't it?

    JT
    That's right. So the most common birthday is generally - almost always in fact - towards the end of September. So it doesn't take a statistician to work out that means the most popular time to conceive is over the Christmas and New Year periods. So that could be due to the Christmas festivities, but it might also be something a bit less romantic than that. Some people, for example, might consider that there's an advantage to children being older in their year in school.

    MF
    The ONS also publishes the list of most popular baby names every year, and it is apparently one of the most downloaded and most popular bits of content on the ONS website. James, a lot of people scoff at this as an exercise. Is there any value in this list of baby names? Or is it something the ONS just produces because people like it?

    JT
    As you say it is one of our most popular releases and I think people use it to inform their own choices of names, and it can also tell us some really interesting things about culture in the country at the time. The top of the league table hasn't been that interesting, to be honest. So Oliver and Olivia have been the most popular names for the last few years, but it's beneath that that there's some really interesting trends emerging. So there's always a lot of interesting names that are going extinct. For example, last year, it was picked up a lot in the press about the name Nigel, which joined the list of critically endangered names like Gordon, Carol and Cheryl, and we do also see some really interesting influences of popular culture. And also royal babies always have a big influence. Some of the interesting ones from the last few years - we've seen some more Maeves and Otises, which are characters from the TV series ‘Sex Education’, and even some Lucifers from the series of the same name. But generally you'd expect there to be positive associations with baby names so you do almost always see an influence of royal babies - we've already seen that with George but might be predicting a rise in Archies with Prince Harry’s son.

    MF
    And it’s quite interesting, seeing the cyclical thing with names that you might have associated with previous generations coming back into popularity, and Archie is a great example of that, isn't it? Sarah was one of the most popular girls' names for a long time, certainly in the 80s and the 90s. But Sarah, it's dropped out of the top 100 altogether.

    SC
    It has dropped down, but there's a Sarah in every single generation in my family. I think we're all named after each other. So my family is doing its best to keep it alive.

    JT
    Just a bit of a question for you. Where would you put the name Miles in the ranking?

    MF
    Well, it’s probably not in the top 100 James.

    JT
    Yeah, I'm afraid it's not quite top 100 material, but it is number 144. There were 390 Miles in 2020. And it's actually been on a bit of a roll recently. So that's the highest ranked it's been since 2002.

    MF
    Perhaps it’s the growing popularity of this podcast James, or maybe something else at work. Anyway... One thing worth noting about this before we move on, it should be pointed out that producing the baby names list is not an expensive exercise for the ONS.

    JT
    No, the data is very straightforward to collect. It's just a matter of compiling it into something that can be easily accessible and interesting for people to look at.

    MF
    And it's also one of the reasons that we don't compare the spelling of different names, because there's this long running thing, isn't there, about how if you added up the different spellings of the name Muhammad, then that would be the most popular boys' name in England. That's not something the ONS does because, quite simply, we're just seeing the spelling that people enter on the system.

    JT
    Yeah, that's exactly right. And I think increasingly that could become even more of a task to compile those, because we're seeing an increasing use of shortened versions of names or alternative spellings. And if we were to try to compile those into one then that would definitely increase the time that we spent on it.
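
    [Editor's illustration] The difference between counting names exactly as spelt and combining alternative spellings can be sketched as follows. The list of registrations and the table of variant spellings are both invented for the example; deciding which spellings belong together is exactly the extra compilation work described above.

```python
from collections import Counter

# Invented registrations; not real data.
registrations = [
    "Oliver", "Oliver", "Oliver",
    "Muhammad", "Muhammad",
    "Mohammed", "Mohammed",
    "Mohammad",
]

# Hypothetical hand-built table mapping variant spellings to one form.
VARIANT_GROUPS = {
    "Mohammed": "Muhammad",
    "Mohammad": "Muhammad",
}


def count_exact(names):
    """Count each name exactly as it was entered on the system."""
    return Counter(names)


def count_grouped(names, groups):
    """Count names after folding known variant spellings together."""
    return Counter(groups.get(name, name) for name in names)


if __name__ == "__main__":
    print(count_exact(registrations).most_common(1))
    print(count_grouped(registrations, VARIANT_GROUPS).most_common(1))
```

    In this made-up sample the top name changes depending on which counting rule is used, and the grouped version depends entirely on someone maintaining the table of variants.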

    MF
    Well, there you are, everything you need to know about baby names and - more seriously - the measurement of births and fertility. Plenty more information of course on the ONS website.

    With that, we must turn to the other end of life, and that is measuring deaths - a topic which has been very much in the news for the last couple of years since the outbreak of the pandemic. Right at the centre of that has been my colleague Sarah Caul, who's sat with us this afternoon. Sarah, you're recognised for your achievements during that period with an MBE, official honour, which you collected from Windsor Castle.

    SC
    It was definitely very surprising. I wasn't expecting it, but I'm very thankful for it. It's quite a proud moment in my life. If you ever see my mum, she'll just scream at you: “My daughter’s got an MBE”, so that's always nice.

    MF
    Recognised now then as an authority in this area - it's fair to say that the ONS was publishing this list of weekly deaths very quietly, almost unnoticed, for many years. And then of course, sadly, that changed at the start of the pandemic.

    SC
    With ‘weekly deaths’ it did have a small audience, to the point where they were considering actually not publishing it anymore. Pre-pandemic it wasn't a very large part of my job, because it was just something very quick and easy to do. My main analysis would be on annual data - we release annual data the summer after the end of the reference period. We would look at different causes of death and see where we could investigate further to help monitor the picture of what people are dying from, and if that can be prevented.

    MF
    That all changed of course March / April 2020 with the arrival of COVID-19.

    SC
    We started quite early thinking of what we could do with COVID and we added just one line into the spreadsheet, which was the number of deaths. It went from something like five to over 100 in one week and we were like “okay, we have to do a lot more of this now”. It just grew bigger and bigger because we were having more and more deaths and we needed to get out, as quick as possible, as much information as we could. We would be doing something that would usually take us months to do in a matter of days, every week. And we're actually still doing it to the same level now because we are still seeing COVID deaths - it hasn't completely gone away.

    MF
    Incredible demand for information from government, from everybody, of course - desperately concerned about what was happening. There was suddenly this incredible focus and attention, and huge pressure, on you to get those numbers out very quickly.

    SC
    Those first few months were quite a blur, because we were publishing weekly and monthly and were constantly adapting and constantly trying to figure out what people were interested in seeing. And getting that information out into the public domain is probably the most challenging time that I've had here. I don't think I've ever worked at that pace before. But we have got so many experts in the health analysis and life events area that we're in. We had expert coders, experts in different causes of death. It was great to see everybody come together and work really well together. Despite the enormous amount of pressure, we were having to deliver things that would normally take us months in days, and sometimes hours.

    MF
    Your team were actually among the first to see the full impact, because there wasn't so much testing going on among people who have been infected. And it was in those mortality figures that the real impact was first being revealed.

    SC
    It wasn't until our death certificate information came out, because testing was so limited in the early days, that you could kind of see the impact, and see how quickly it was increasing.

    MF
    How do we gather those numbers?

    SC
    So when somebody dies, the informant - or family member usually - will register the death, usually within five days, but depending on if it needs to go to a coroner, it could take months or even years to register that death. And we don't know about a death until it is registered. When that information gets put through, all of the causes of death listed on the death certificate come through to us at the same time with an assigned underlying cause of death, as well as contributory causes of death. So we have all of that information on each and every death registered in England and Wales.

    MF
    And it's very important to understand you can have more than one cause of death because this is very relevant to understanding how many people might actually have died because of COVID.

    SC
    The majority of deaths, regardless of cause, have more than one cause listed on the death certificate because you have complications, and one cause could lead to another cause. So the way we categorise it is deaths ‘due to’ COVID - where COVID was the underlying cause of death or any other condition - and then deaths ‘involving’ it - so where it was mentioned on the death certificate as the underlying cause or a contributory factor.
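
    The distinction between deaths ‘due to’ and deaths ‘involving’ a condition can be illustrated with a minimal sketch. The record layout, field names and cause labels below are hypothetical and are not the ONS coding system; the sketch only assumes that ‘due to’ means the condition is the underlying cause, and ‘involving’ means it appears anywhere on the certificate.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DeathRecord:
    """Hypothetical, simplified death certificate record."""
    underlying_cause: str
    contributory_causes: list[str] = field(default_factory=list)


def is_due_to(record: DeathRecord, condition: str) -> bool:
    """True when the condition is the underlying cause of death."""
    return record.underlying_cause == condition


def is_involving(record: DeathRecord, condition: str) -> bool:
    """True when the condition is mentioned anywhere on the certificate."""
    return is_due_to(record, condition) or condition in record.contributory_causes


# Invented example records, for illustration only.
records = [
    DeathRecord("COVID-19", ["pneumonia"]),
    DeathRecord("heart disease", ["COVID-19"]),
    DeathRecord("road traffic accident"),
]

due_to = sum(is_due_to(r, "COVID-19") for r in records)
involving = sum(is_involving(r, "COVID-19") for r in records)
print(due_to, involving)
```

    Under these assumptions every ‘due to’ death is also counted as ‘involving’, so the second count can never be smaller than the first.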

    MF
    Do you think a lot of people were actually confused by that?

    SC
    One of the things that people struggled to understand sometimes during the pandemic was that this is a different number to the public health measure. So somebody could test positive for COVID-19 but not have COVID-19 on the death certificate, because it didn't contribute to the death. So the example that gets told quite a lot is if somebody tests positive and then gets hit by a bus, it's very unlikely that COVID will be mentioned on the death certificate.

    MF
    And that's absolutely vital in understanding how many people have died ‘from’ COVID as opposed to a death ‘involving’ COVID.

    SC
    Yeah, so it's very important. The public health measure’s great because it's really fast, and it gives us a more instant knowledge of what's happening. Our statistics come out about 11 days later, but it's where COVID contributed to the death, and not just was present at time of death.

    MF
    That helps us to really understand what the mortality impact of COVID-19 has been so far.

    SC
    It is really important. So from the start of the pandemic to the week ending 13th of May, we know there's about 195,000 death certificates that had COVID on them, and that's the whole UK as we've worked with colleagues in Northern Ireland and Scotland to bring a UK figure together, as usually we only report on England and Wales. And then that enabled us to do further investigations about who was most at risk of dying from COVID. And we did a lot by age, place of death and any breakdowns we thought possible to try and help identify those most at risk.

    MF
    Another great strength you might say of the ONS numbers is the comprehensive nature of the way the information is gathered centrally and reported very quickly. And that was evident during the pandemic when you saw the UK numbers coming along and influencing policy decisions really quite rapidly, compared to similar countries around the world. Central to that is the whole concept of ‘excess deaths’. That's a good objective measure of impact, regardless of what doctors have written on the death certificate. Sarah, tell us how that works, particularly what is its statistical value, and what's it been saying?

    SC
    We use ‘excess deaths’, which is the number of deaths we see in a period compared to what we would expect - and to get the expected number we use an average of the previous five years. By doing this, it takes into account the direct and indirect impact of COVID, so we have a fuller measure. It's really useful as well for international comparisons, because we're not relying on everybody recording deaths in the same way. It's just a straightforward “how many deaths above what we would expect are we seeing?”
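
    As a rough sketch of the calculation described here, excess deaths are the observed count minus the average of the previous five years. The function name and the weekly figures below are invented for illustration and are not ONS data.

```python
from __future__ import annotations

from statistics import mean


def excess_deaths(observed: int, previous_years: list[int]) -> float:
    """Observed deaths minus the mean of the comparison years."""
    if not previous_years:
        raise ValueError("need at least one comparison year")
    return observed - mean(previous_years)


# Hypothetical counts for the same week in each of the previous five years.
baseline_weeks = [10_000, 10_200, 9_900, 10_100, 9_800]
observed_this_year = 12_000  # also hypothetical

print(excess_deaths(observed_this_year, baseline_weeks))
```

    A negative result would mean fewer deaths than the five-year average, which is the situation described for 2022 to date.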

    MF
    And what has it shown so far - what has been the impact on excess deaths?

    SC
    So we've seen quite a high number of excess deaths during the pandemic. In 2020, we saw over 75,000 more deaths than we were expecting originally. In 2021 that is lower - we saw around 54,000 deaths more than we'd expect. And currently to date for 2022 we are seeing the number of deaths slightly below what we'd expect looking at our five year average.

    MF
    Do we know yet - at the least the early indications - for what this might all mean for life expectancy?

    SC
    We have released some life expectancy statistics for 2018 to 2020 as we do three-year combined, and we do see a bit of a dip in the last year because of the high number of deaths in 2020, which was due to the pandemic. We're still seeing the numbers are significantly higher than at the start of our time period, which was 2001 to 2003. Somebody in England in 2018 to 2020 would live to about 79 years as a male, or 83 years as a female. Whereas in 2001 to 2003 it was more like 76 years old for males and 81 years old for females.

    MF
    So in recent history we've seen these really quite pronounced increases in life expectancy for men and women.

    SC
    People are living longer. It’s increased more for males than it has for females. It's reducing that inequality gap, because we do see that women do tend to live longer.

    MF
    Do we know why men are catching up with women in terms of life expectancy? Is it lifestyle, nature of work perhaps?

    SC
    There is a lot more of a decline in heart diseases, and especially in males, so I think that could indicate healthier choices, which would then increase somebody's life expectancy.

    MF
    Another important concept when understanding how the ONS looks at mortalities is the whole question of ‘avoidable deaths’. So how does that work and what has it been telling us?

    SC
    So ‘avoidable mortality’ is defined as a cause of death that is either preventable - so for example COVID and appendicitis are included in this – or treatable - so this would be different types of cancer. For those aged under 75 in 2020, 22.8% of all deaths in Great Britain were considered avoidable. This is around 153,000 deaths out of 672,000. The category where we've seen the biggest increase since the start of our time series was alcohol and drug related disorders, which is the only group of causes where the mortality rate is significantly higher in 2020 when compared to 2001. But the biggest driver of avoidable mortality would be the cancers.
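
    The percentage quoted here can be checked from the two rounded counts given in the same answer. This is only an arithmetic sketch; the function name is an assumption, and because the counts are rounded the result is approximate.

```python
def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


avoidable_deaths = 153_000  # rounded figure quoted in the transcript
all_deaths = 672_000        # rounded figure quoted in the transcript

share = share_percent(avoidable_deaths, all_deaths)
print(f"{share:.1f}%")
```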

    MF
    So those figures for avoidable death might suggest then that there is still considerable disparity in life expectancy between different groups.

    SC
    So we see through our data that those living in the most deprived areas have a substantially higher rate of death from avoidable causes - with deaths due to COVID-19, drugs and alcohol being notably higher in the most deprived areas. Avoidable deaths accounted for 40% of all male deaths in the most deprived areas of England, compared to 18% in the least deprived areas in 2020. And then we see the difference again for females. It was 27% of deaths in 2020 in the most deprived areas, and then 12% in the least deprived areas. So this gap in avoidable mortality between the most and least deprived areas - it's actually at its highest level since 2004 for males, and since the data began in 2001 for females.

    MF
    James what are the factors that are driving those disparities which are, on the face of it, pretty serious?

    JT
    The difference between the most and least deprived areas is one of the most striking statistics we produce actually. And I think it really shows the importance of looking beyond those top level figures. And that's the ability we have here to look at the minute detail of the data. I mean, there's all sorts of factors that can go into life expectancy. So there are things like access to health care, nutritional aspects - there's plenty of things that can drive that gap, but it's really, really striking and definitely needs looking into.

    MF
    What's the direction of our work in this area? Because for some areas are we not seeing actually a sustained reversal of life expectancy, not just shorter life expectancy, but one that's actually getting shorter.

    JT
    I think you mentioned earlier Miles about how this mortality data had kind of risen from obscurity I think. During the COVID pandemic the spotlight has been shone on deaths and Coronavirus itself, but really there's going to be a period where we're really going to have to make best use of that data to look at the indirect effects of Coronavirus as well. So, take for example just within the pandemic we saw a big increase in alcohol related deaths in 2020, and that tallies with other research that shows that patterns of drinking have changed during that time with heavy drinkers drinking more. So beyond the pandemic as well - we're looking at things like delays to treatment times for certain diseases. So there’s plenty of analysis still to do on the impacts of the pandemic.

    MF
    Deep in the recesses of the ONS data though, the causes of deaths that are recorded - some of them are, you have to say, they're unusual, they're quite remarkable. Sarah, can you give me some examples of some of the most unusual deaths that have been recorded?

    SC
    So I’ve got a few of the least common causes of death, and I don't want to scare anyone - the numbers that I've got here are over an eight year period, so they're very rare. So I don't know if you want to see if you can take a guess at how many people are ‘bitten or stung by non venomous insects and other non venomous arthropods’?

    MF
    People attacked by bees and wasps, that kind of thing.

    SC
    Not because they're venomous, but because of the incidents themselves.

    MF
    Well I'd like to think there was a very small number. I don't know - over an eight year period - hopefully less than 50 or so?

    SC
    It was less than 50. It was 12 - which is more than I was expecting. Another one we have is ‘fall involving ice skates, skis, roller skates, or skateboards’?

    MF
    Can be very dangerous. I don't know, 5?

    SC
    Three! Very good guess. I for some reason thought it would be more than being bitten or stung by an insect. We've got ‘victim of lightning’?

    MF
    Rare again. Highly unusual. I don't know... 10?

    SC
    Seven! You're quite good at the guesses. I’m very impressed.

    MF
    You know, you hang around the ONS long enough and you start to get a feel for these things. What do we think is the most unusual cause of death that we've recorded?

    SC
    We've got a lot that only have one death. One of the ones that springs to mind is ‘bitten by rat’. I did expect more people to die from that than some of the other ones we've got, like ‘contact with powered lawnmower’. But I guess that's quite a dangerous thing to do, especially if you're like me and start doing it in your flip flops. So yeah, dangerous.

    MF
    Definitely not recommended.

    So we've looked at births and we've looked at deaths. But what's the balance between the two at the moment, James, and what's the impact on our population of all this overall?

    JT
    Population change is driven by the number of live births and the number of deaths and the balance between those, but also the migration that takes place each year. So the difference between the number of births and the number of deaths is a component known as ‘natural change’. Over the last decade or so, although we've generally seen more births than deaths, we've actually seen a narrowing of the gap. So all else being equal, that means that the population growth will slow. Also, we did actually see a blip in 2020 when for the first time for a while the deaths exceeded births, but that's going to be due to the very high number of deaths that we sadly had from Coronavirus in that year.
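
    The components described here can be written as a small sketch: natural change is births minus deaths, and overall population change adds net migration. The function names and the figures are hypothetical and chosen only to show the arithmetic.

```python
def natural_change(births: int, deaths: int) -> int:
    """Births minus deaths; negative when deaths exceed births."""
    return births - deaths


def population_change(births: int, deaths: int, net_migration: int) -> int:
    """Natural change plus net migration."""
    return natural_change(births, deaths) + net_migration


# Invented figures for a single year, for illustration only.
print(natural_change(births=700_000, deaths=650_000))
print(population_change(births=700_000, deaths=650_000, net_migration=200_000))
```

    With these invented numbers a narrowing gap between births and deaths shrinks the first figure, which is why population growth slows when all else is equal.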

    MF
    And that was highly unusual - the first time in many years we've seen that.

    JT
    Yes, that's right. So the general trend has been more births than deaths and we've seen a return to that in 2021.

    MF
    Well, there we are, proof that the ONS really does cover us from the cradle to the grave.

    ‘Statistically Speaking’ comes to you from the Office for National Statistics. I’m Miles Fletcher, thank you very much for listening. Join us for the next episode, which you can hear by subscribing to this podcast on Spotify, Apple podcasts and all the other major podcast platforms.

    Our producers at the ONS are Julia Short, and Steve Milne.

    ENDS

  • Our topic this time, and it's a big one, is the economy. The science of measuring rapid change in a complex, globalised and now increasingly turbulent economic situation. In this episode Miles is joined by second permanent secretary Sam Beckett, and head of inflation, Mike Hardie, to look at how the ONS is keeping on top of rising prices, and how two of the biggest economic shocks in recent history have helped shape the Office for National Statistics' (ONS) current approach to collecting its key economic data.

    TRANSCRIPT:

    Miles Fletcher

    Sam, one angle I'd like to explore with you is the extent to which everything has changed recently, and the way that the ONS measures the UK economy, the extent to which that was informed by the experience of what happened 14 years ago now and the financial crisis. Could you talk us through what happened there in terms of the ability of the statistical system to actually spot what was going on? And what lessons were learned during that period?

    Sam Beckett
    Yes, certainly. That is going back quite a while now, isn't it? But I think one of the key things that you can really compare and contrast with where we are now compared to then, is about the timeliness of GDP. Back at the time of the global financial crisis the Office for National Statistics was very slow to spot the turning point. We were dealing with crucial data for the economy's output. And it was probably about six months before we were able to sort of scale the downturn in the economy and see the economy going into recession.

    MF
    Meanwhile, during that period, of course, people were being hit quite badly by that economic downturn. But the official statistics that were available had nothing to say about what was happening.

    SB
    No, that's right. So we would have been waiting to find out the extent of the downturn as people were seeing it hit their livelihoods, for something like six months back in 2008. If you fast forward then to the experience that we've had over the pandemic. You know, our monthly GDP statistics are out about six weeks after the period they refer to so you're getting a very timely indicator on what is happening to the real economy now. So you can really compare a sort of six months gap to a six weeks gap now. And if you think about the way the pandemic played out with, you know, the economy being closed down to try and limit transmission and then opened up again successively, and in the waves, if we'd been waiting three months or six months to find out what was happening, it really would have been a hopeless situation. But we got those very timely official statistics on GDP, but not only those but even more timely statistics from business surveys, and opinions and lifestyle surveys that we've done, where we can actually get a two week turnaround on what is happening to the economy and how people are responding.

    MF
    So it was really a question of learning from that experience and putting in place the kind of mechanisms that can help us as a country to actually find out what was going on closer to the point it was actually happening out there in the real world. Has the rest of the world learned that lesson as well, or is the UK among countries that have been quicker onto this do you think?

    SB
    We're certainly one of only a handful of countries that publish a monthly GDP figure. So I think in that big kind of headline and official statistic, we're still in a relatively select group that publish as frequently as monthly and as close to the time. We're also looking at financial card transactions data; we are looking a lot at admin data on the labour force, and trying to bring together a host of statistics that shine a light on what is going on, on the ground in the economy. And I think we count ourselves amongst a relatively small group of national statistical institutes that are cutting edge in their use of innovative data sources.

    MF
    So by the time the pandemic then comes along, two years ago now, the ONS is in a better state to actually find out what's happening, but nevertheless, was there a certain extent to which the organisation had prepared for another downturn like 2008, rather than what actually happened which nobody had foreseen, a widespread pandemic including a serious risk to life?

    SB
    Indeed, I mean, who would have thought that you know, we would have been hit by a pandemic of such a global scale and impact? I think one of the things that is a huge advantage for the government and the UK economy has been to have this objective handle on the level of infection out in the community. And that is something that the Office for National Statistics signed up to deliver really early on in the pandemic. So, our COVID infection survey, which has now swabbed millions of people on their doorstep, gave us a great handle on just how many people have had COVID, not just relying on the data of people who were turning up at doctors and hospitals, who had symptoms already. So you know, the COVID infection survey was a more random sample of the community and gave us that objective handle on how many people had COVID and indeed, some of them asymptomatic, you know, no symptoms of COVID but tested positive on the doorstep and that gave us a great insight over the pandemic and helped advise the government on what should be done to try and limit transmission.

    MF
    So meanwhile, as well as setting up that very important survey, there were a lot of other very quick changes that were put in place as well to measure the economic impact, the impact on individuals, on businesses as well. Can you talk us through some of the work that was done there to give that very quick turnaround, the fast indicators, that quick view of how items in the shops are being affected; how people in the workforce were being affected; and how the country and the effects of lockdown - to what extent they were actually hitting the economy in real time?

    SB
    I mean, starting with those quick turnaround surveys, there's two really that are really good companions to each other. The first is the business insights and conditions survey - and that surveys about 40,000 businesses and asks them questions around, you know, what is happening to their customer base, what is happening to their workforce. And there's about a two-week turnaround on that information. So, we could ask questions of businesses about how many of their staff, for example, they were intending to put on furlough and get that information just two weeks later to give us a handle on what a big uptake there would be on that scheme. The companion one is the opinions and lifestyle survey and through that we were able to ask people things like were they wearing a mask when they went to the shops? You know, were they staying at home as per the guidance and what were they leaving the home to do? And you know, were they washing their hands more and all those non pharmaceutical interventions that were so important in controlling the early stages of the pandemic. And again, between that sort of survey of households and individuals and businesses, you could track those two sides of how the pandemic and the government's measures to control it were impacting on people's lives and livelihoods.

    MF
    So in the old world of statistics, where paper forms would have been sent off, we'd have been able to produce an estimate in, ooh I don't know, a couple of months. But actually with the onset of the pandemic, this information was being fed into government, directly into government within a matter of a few days and informing that response, the actual action that was being taken on the ground.

    SB
    Absolutely. And I think also looking at some of our more traditional statistics, there had to be huge effort to keep the show on the road. Labour market statistics, I mean, incredibly important, over a period of economic turbulence, we had to go from what had been a face to face survey to a telephone based survey. And we reinforced that picture by getting information from payrolls from HMRC’s PAYE database, to understand what was happening to the labour market and keep that total picture, even though our standard survey had to move rapidly to a telephone based one. But I should add, you know, when people think about that admin data, I would like to emphasise that we're incredibly careful that none of that would identify anything about individuals. And we're extremely careful to ensure that we don't collect data that we don't need and that everything is de-identified.

    MF
    And that's a very important point now, because it's not just a question of people taking part in surveys is it? It's about the ONS having relationships with the credit card companies, for example, with mobile phone providers as well. And while these huge datasets give a fantastic up to the minute picture of what's going on - money being spent and how movement is being affected as well - people are going to be understandably concerned about government having access to that sort of data. So how do we ensure that that is working in the public interest, only producing information that's genuinely needed for the public good?

    SB
    Our reputation rides on treating people's data incredibly carefully, and by abiding by all the regulations that are appropriate to personal data and business data. So we're incredibly scrupulous and careful in this regard. We don't gather data that can identify people if it is not needed, and we have got very reliable methods to de-identify data before we use it for analysis or indeed publish it. So you know, that's incredibly important to maintaining public trust in our statistics.

    MF
    So what have we been doing to try and measure the individual impacts that some of the price rises we've seen recently have had on households with different incomes?

    SB
    We are facing a period of some time to come where I think this is going to be incredibly high profile in the public debate about the challenges of the economy and what people are facing and indeed of measurement for us as an office of statistics. What we've been doing is trying to think about ways in which you can dig under that very average national figure of inflation. Now that is going up and most forecasters, such as the Bank of England will expect it to go up further, but it does, as you say, fail to show how different people can be impacted. You know, if they drive a lot and the cost of fuel has gone up a lot, relatively poor households spend a high proportion of their money on energy bills and on food and we know that both of those categories have been affected. So we have published some statistics that seek to look at inflation cut by different income brackets of households.

    MF
    Given that there is now so much data from supermarket scanners, from credit cards, from an incredible range of digital sources. What are the limits of all this do you think?

    SB
    Data is a by-product of the productive economy these days, isn't it? You know, data is being produced in all the other activities that we undertake online in our lives. So along with that, computing power has got so much cheaper and you put those two things together, and you just have this enormous capacity to measure activity in so many different ways, and so much more up to date, I mean, compared to anything we could have done, say, 10 years ago or 20 years ago, and the cost of them has come down massively. And with that, the sort of potential to get insight from them has expanded.

    MF
    Now we’ve mentioned GDP several times of course – that’s Gross Domestic Product - the traditional very long-established way of measuring activity in the economy. And it's held by many still to be the single most important national economic statistic. But at the same time, there's a debate going on at the moment about the continuing usefulness and relevance of GDP, particularly as it takes no account of the environmental dimension as well. And of course, in this country and internationally, that environmental dimension and climate change has become evermore important. So what are we doing as an organisation to factor the environment into the economic picture?

    SB
    GDP is an important measure of the productive economy. I think it's here to stay. But even in terms of it measuring the productive economy we're continually trying to improve its quality and make it more timely as we've talked about, but also more granular, you know, get more of a sense of what is happening down at a more granular level of geography. What we're trying to do is develop further, all aspects of our kind of economic welfare measures and bring things into the kind of spotlight that GDP has that are really important to all our futures. And I think, you know, climate and net zero, and those environmental statistics are one area where we're working really hard to try and give them a due prominence. I mean, we are relatively far ahead of international averages in terms of our level of development here. We've been publishing natural capital accounts for some 10 years. So we're starting from a good base, but there's so much more we can do. So, we've got two strands of work here. First, we've got an approach which tries to extend that concept of GDP, the production and asset boundaries that it measures to natural capital in the environment, as you've mentioned, but also human capital, as well. You know, the extent to which the skills of the UK workforce are being enhanced, and other aspects of economic activity, which currently fall outside of GDP, like household production, like unpaid for household work, which also really ought to be in your concept of how productive you are as an economy. So, we're developing this suite of measures that sort of extends the national accounts into these harder to measure areas that we also know are really important to our sense of economic progress and prosperity as a nation. And so that's that sort of integrated set of extending the concept of GDP to these broader concepts. But also, alongside that, we are doing some things that are a little bit more tactical and fleet of foot. 
    They have a framework to them, like our Climate Statistics Portal, but that brings together all kinds of climate statistics from across government into a kind of one stop shop for users to explore things like climate and weather and emissions by different area, impacts and mitigations and provide insights from that. Now, not in a way that you can really aggregate with the GDP number, but in a way that would give you sort of broad insight as to progress towards net zero and what is happening to our climate and weather. So, this is a huge agenda. We call it the ‘Beyond GDP’ agenda, something where we are relatively leading internationally but so much more work that we can do. We've got some really interesting stuff coming out later this month that will look at some of these issues and you can obviously catch up with that on our website.

    MF
    So much more change still to come. Finally, Sam Beckett, a very wise economist once said - slightly tongue in cheek – that the chief function of economic forecasting is to make astrology seem respectable. Do you think the point will come at ONS when the data becomes so good and so rapid, that actually the ONS could get into the whole business of forecasting the economy with a great deal of accuracy?

    SB
    Well, I think we are increasingly getting up to the moment, if I can put it like that, in terms of our economic statistics. Yes, there's still some time lag between the observation and the publication of the data in most cases, but we're getting closer and closer. And we are using techniques where, even where some data might be missing, we can use sophisticated economic modelling techniques to bring it up to date. So, a good example there would be if we didn't have a full local breakdown of GDP data for last month, we could make up for that using what we know about the other areas, and how they changed in GDP, and also the past performance of the missing areas. So, we can put together this picture that brings things really up to date using some of those modern techniques. I think the world of measurement is different from the world of forecasting, quite fundamentally. And, you know, we leave that to colleagues at the Office for Budget Responsibility and the Bank of England, who do kind of look ahead and try and paint that future picture. But the two are interconnected. And I think you can only produce good forecasts if you've got really reliable readings on what is happening now and what past trends have been. So, they are hand in glove and I wouldn't want to say those were two distinct worlds, but we do have our own particular objective, which is about, you know, economic and societal measurement. We're not yet in that forecasting game. But we are bringing it as up to the minute as possible.

    MF
    So, while not actually trying to predict the future, at least we can measure the very, very recent past. Sam, thank you very much for speaking to me.

    Now, after decades of relatively low inflation, rising prices are back in the news. Tracking the impact of that on households is, of course, vitally important work and at the ONS, that's the responsibility of the head of inflation, Mike Hardie.

    Well, Mike, anyone who follows the news and particularly recently with concern about the rising cost of living will understand the importance of inflation. But there are lots of different measures of it. Can you talk us through the different ways in which ONS measures inflation, and why each of them is significant?

    Mike Hardie
    So we have a range of inflation measures. The first family of statistics are consumer price statistics. And so we have the consumer prices index, which most people will be familiar with, and the consumer prices index including owner occupiers' housing costs, and they are our macroeconomic measures of inflation that are based on economic principles. We also have a second group of statistics which are called the household cost indices, and they are specifically designed to measure the changing costs and prices faced by different household groups. And that completes our family of consumer price statistics. And then beyond those, we produce business prices. So those measure what we describe as output or ‘factory gate’ prices. So those are the prices of goods leaving the factory gate and we also produce input prices as well. So all of the component parts that are used in the production process to produce a final product, how the price of those has changed over time, too. And that completes our business statistics. And then beyond that, we also produce house prices as well, which is very topical at the moment given the buoyant housing market in the UK.

    MF
    And underlying all those different measures of inflation is a very large data gathering operation. Now, there's a lot of change going on in that area at the moment, but first of all, describe for us how this traditionally has been done.

    MH
    Traditionally, in order to produce our consumer price statistics, we have sent price collectors out across the UK. We have over 300 price collectors, they go to over 140 different locations in the UK, with mini clipboards, and they go into stores and they price a range of different items. So at the start of the year, we construct a large shopping basket, a virtual shopping basket, which is based on what UK consumers spend their money on. And there's a list of approximately 700 different items. And we send the price collectors out to collect information on those items. And we also have some collection within the ONS as well. So we have a couple of teams that go online and collect a wide range of prices too. We also have some admin data as well. So for example, we get admin data on how the price of insurance has changed. And then we aggregate all of that data together to construct our consumer price statistics.
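
    As a rough illustration of the fixed-basket approach described here, the sketch below combines price changes for a handful of items using spending weights set at the start of the year. It is a simplified Laspeyres-type calculation, not the ONS's actual method, and the function name, items, weights and prices are all made up for the example.

```python
def fixed_basket_index(weights, base_prices, current_prices):
    """Laspeyres-type index: weighted average of price relatives, base = 100.

    weights        -- spending weight per item, fixed at the start of the year
    base_prices    -- price of each item in the base period
    current_prices -- price of each item in the current month
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return 100.0 * sum(
        weights[item] / total_weight * current_prices[item] / base_prices[item]
        for item in weights
    )


# Hypothetical three-item basket (illustrative numbers only).
weights = {"bread": 2, "milk": 1, "insurance": 1}
base_prices = {"bread": 1.00, "milk": 0.80, "insurance": 50.0}
current_prices = {"bread": 1.10, "milk": 0.80, "insurance": 55.0}

print(round(fixed_basket_index(weights, base_prices, current_prices), 1))
```

    Because the weights stay fixed, the index isolates price change: it shows what the same basket would cost now relative to the base period.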

    MF
    Rail fares of course are always a big driver of inflation as well. Where does that come from at the moment?

    MH
    So that comes directly from the uplift that consumers face every year. So, when rail fares are increased on an annual basis, we capture that increase in our inflation measures. But one of the developments that we're actually undertaking at the moment is to move to using data from the Rail Delivery Group. So that's essentially a census of all rail journeys in the UK. So, it gives us a much more detailed picture of how rail prices are changing across the country.

    MF
    So, we have groups of people out with clipboards, moving up and down the aisles in the supermarket; people looking at the web; some companies like rail companies, obviously providing information about their fares. But was that sufficient to provide a really good accurate measure of inflation or was it felt that there was much more that could be done?

    MH
    So, it was sufficient to provide a high-level accurate measure of inflation. These are economy wide averages that we publish on a monthly basis. We're moving away from the manual collection that I described, where we send price collectors out into stores, and we are working with a number of leading retailers to get access to their electronic point of sale data. So, whenever you go to a supermarket for example, and spend money on your weekly shop, that information is captured by the retailer. We have a number of partnerships in place. Co-Op are one of the retailers that are happy to be named, where we get information directly from their supermarket tills to our systems at ONS, and we can use that data then instead of sending people into stores to compile our inflation estimates. And that data is extremely detailed. So, when we send people into store obviously there's a cost implication to that. And they collect prices of narrowly defined items. So, they may for example, go in to collect the price of a loaf of bread off the shelf - we try to price the most commonly available item. What the electronic point of sale data will give us is a census of all of the prices within that store, and more importantly, not just the prices, but how much of each product has been purchased by consumers. So that fixed basket approach that I mentioned, where we set the basket at the start of the year, will likely change for areas of the basket where we're using these new data sources, because it'll essentially be a dynamic basket that updates every month because we will have a summary of what consumers are spending their money on in real time, which is really exciting.

    MF
    That's a real step change in approach then. How does the UK compare - are other countries doing this, moving away from the traditional approach into this much more dynamic and data driven way of setting inflation?

    MH
    It’s the general direction of travel. So other national statistical institutes, such as those in the Netherlands and Australia, have been doing this. It's really difficult to do, because utilising those new data sources such as scanner data requires the development of new methods, and also new systems as well. So just to give you an idea of the size of some of these data sources: we currently use around 200,000 price quotes to compile our consumer price statistics every month. And it's likely we'll be moving to several hundreds of millions of prices every month. So, we need to change our systems in order to manage the sheer size of the data essentially.

    MF
    This really is big data in action.

    MH
    It is really exciting and gives you additional insights into changing consumer spending patterns and how prices are evolving across the UK economy.

    MF
    Does that mean the annual updating of the basket of goods - which is always quite a popular occasion as we look to see what's in and what’s out - is that going to go then?

    MH
    Not in the short term. So, there are specific areas of the basket that we're targeting with these new data sources. I've mentioned groceries, we've also touched on rail fares already and also used cars. But for the remainder of the basket, we will use traditionally collected data, so sending people out into stores and data that we've received directly and collect at ONS. So, we will still need to update that basket to reflect wider consumer spending patterns. Also, if you think about groceries, we have these new data sources for larger retailers. But in order to ensure that our statistics remain representative of price changes in the economy, we also need to capture prices from smaller retailers as well. Some of them won't have the facilities to provide us with data - so there will still be an element of manual collection.

    MF
    Now all this change - and very exciting change too - comes at a time of heightened concern about the rising cost of living and also the frequently expressed opinion that what appears to be the headline rate of inflation doesn't actually reflect people's own experience of rising prices that they face, particularly recently in the supermarket. How has the ONS been responding to that?

    MH
    So, the inflation measures that most people are familiar with, such as the consumer prices index, are averages, and when you dig into that average there will be some variation. So, everyone has their own personal rate of inflation depending on what you spend your money on. So, in terms of how we responded as an organisation, you can go on to the ONS website, and use our personal inflation calculator and outline what you were spending your money on every month. And based on that spending pattern we can work out your personal rate of inflation and how that compares to the headline. We're also undertaking some work on a set of measures called the household cost indices. And these are designed to measure the changing costs and prices faced by different household groups. So, you can break down those statistics into income decile, you can break them down to expenditure decile, households with or without pensioners, and with or without children. So, you can see how changing prices and costs are affecting different household groups. And another piece of work that we're doing at the moment that’s particularly interesting is we are aiming to publish over the next month a low cost index. So, this has been widely covered in the media, where some consumers who purchase value brands in supermarkets are being forced to move to more expensive brands because those value brands are no longer available. So, what we are looking at is the price of those lower priced products and, when people are forced to move to higher priced products, what that means for price changes and the implications for the household budget on a weekly basis. So that's another piece of work that we're doing to provide further insights into the recent rise in the cost of living and how that's impacting different groups of people.
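
    The idea of a personal rate of inflation can be sketched as a spending-weighted average of price changes by category. This is only an illustration of the principle; it is not the calculation behind the ONS personal inflation calculator, and the function name, categories, spending amounts and inflation rates below are hypothetical.

```python
def personal_inflation(monthly_spending, category_inflation):
    """Weight each category's annual price change (%) by its share of spending."""
    total = sum(monthly_spending.values())
    if total <= 0:
        raise ValueError("total spending must be positive")
    return sum(
        monthly_spending[category] / total * category_inflation[category]
        for category in monthly_spending
    )


# Hypothetical household: monthly spending in pounds and price change in % per category.
spending = {"food": 400, "transport": 100, "housing": 500}
inflation = {"food": 10.0, "transport": 5.0, "housing": 2.0}

print(round(personal_inflation(spending, inflation), 1))
```

    A household that spends a larger share on categories with faster-rising prices ends up with a personal rate above the headline average, and vice versa.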

    MF
    And that could shed important light on people's actual experience of shopping when they find out that the cheap packet of pasta they used to buy simply isn't there anymore.

    MH
    Yeah, so one of the fundamental principles of a price index is that we control for quantity and quality changes over time, because we want to isolate that price change. So, what you've just described there wouldn't necessarily be captured by a price index, but it obviously has implications for people in terms of the household budget. So, we're looking at producing, you know, a range of supplementary statistics to complement our headline measures of inflation, to provide insights into these types of changes, which are having an impact on people's household budget.

    MF
    Now one of the big debates, one of the big issues surrounding the measurement of inflation in recent years, has of course been the retail prices index. Tell us a little bit about that - the criticisms of the RPI as a statistic, as a measure of inflation, and how ONS has responded to that.

    MH
    So we currently produce the retail prices index as a legacy measure of inflation. Our position on this statistic has been clear for some time. We think it is a poor measure of inflation, that tends to over or underestimate inflation. And we don't think it has the potential to become a good measure either. And if you were to address all of the shortcomings of the retail prices index, you move close to our headline measure of inflation, which is the CPIH, which is the consumer prices index including owner occupiers' housing costs. So, we made a proposal to bring the data sources and methods from the CPIH into the RPI and that is due to take place in 2030. But we only produce it currently as a legacy measure as we acknowledge as an organisation that it is used for a wide variety of purposes across the economy.

    MF
    So, we've had the CPI measure of inflation for quite some time. It's very important of course, it's used by the Bank of England to target the reduction of inflation. It's also used very widely around Europe. But it doesn't include that measure of housing costs. Why is it so important to include housing costs as an element? What are the challenges of measuring that given that some people live in their own houses and other people rent them? That's the problem isn't it - trying to measure how those costs are changing for different people.

    MH
    It's a large part of people's expenditure every month. So, it's essential that it is reflected in our inflation measures. It's conceptually quite challenging to measure. So, we use an approach called rental equivalence and we use rental prices as a proxy for owning and maintaining and living in your own home. And we have very detailed information from the Valuation Office Agency, which we use to compile our measure of owner occupiers' housing costs.

    MF
    And that comes up essentially with a notional figure of what it would cost you to rent your own home.

    MH
    Essentially, and this is the direction of travel internationally as well. So other NSIs are moving towards using a measure including owner occupiers' housing costs. At the moment, the consumer prices index is the Bank of England's inflation target and is widely covered in the media every month, but our aim in the medium term is to move our stakeholders towards using the CPIH.

    MF
    Looking into the future then, a lot of exciting changes going on. And we continue to report inflation on a monthly basis. Can you see the time when perhaps there might be a more frequent reporting of inflation, perhaps even coming down to weekly or even daily?

    MH
    That is possible with the new data sources - we could produce more timely estimates. Producing our inflation statistics on a monthly basis is really challenging. It's quite a tight timetable, you know, to send price collectors out to bring in all admin data sources, and in future, the scanner data that we've discussed, as well. So, there's quite a tight turnaround. So, it's very likely that CPI and CPIH will continue to be produced on a monthly basis, but it is possible that we could produce supplementary statistics that are maybe more timely, but our focus at the moment is improving our headline measures of inflation.

    MF
    Inflation is in the news at the moment as it hasn't been for many years - you must be very conscious of the impact that your numbers have when they come out. Describe for us the importance of the work that you're doing.

    MH
    Well, inflation statistics impact pretty much every aspect of UK society. They’re used to uprate pensions, government gilts, student loans, various benefits, taxes. So, we have a very low risk appetite in terms of transforming our statistics because it is absolutely essential that we get them right because the implications are enormous if we do not. And that's been one of the challenges in bringing in these new data sources and developing new methods and systems. We've had to move carefully. We're very ambitious, but it needs to be measured ambition, because we need to ensure that while transforming our consumer price statistics we get them right, and produce robust statistics that are used across the UK economy.

    MF
    Because once reported, there's no going back - there are no revisions to inflation are there?

    MH
    No, so RPI is an un-revisable index, so we do not revise. And for CPIH and CPI, there is some scope to revise the indices, but it would have to be extreme circumstances for us to do that. And thankfully, to date, we haven't had any errors in CPIH or CPI so we haven’t had to cross that bridge just yet.

    MF
    Thank you for listening to Statistically Speaking and please join us for our next episode, which is quite literally a matter of life and death. To ask a question or suggest ideas for future podcasts, please do so via our Twitter feed @ONSfocus. I'm Miles Fletcher and our producers at the ONS are Steve Milne and Julia Short.

    ENDS

  • In the third episode of Statistically Speaking we talk to Professor Sir Ian Diamond, the UK’s National Statistician, and Dr Louisa Nolan, Chief Data Scientist at the ONS Data Science Campus about the past, present and future of stats. We explore how the pandemic has been transformative for the use and understanding of public data and how the data revolution and the fight against COVID are changing UK stats forever.

    Transcript:

    MILES FLETCHER
    Welcome to Statistically Speaking: the podcast where numbers talk and we talk to the people behind them. In this third episode, we meet Professor Sir Ian Diamond, UK National Statistician and Dr Louisa Nolan, Chief Data Scientist at the ONS Data Science Campus. We explore how the pandemic has been transformative for the use and understanding of public data and how the data revolution and the fight against COVID are together changing UK stats forever. But to begin I asked Sir Ian what led him to a life of stats.

    SIR IAN DIAMOND
    Okay, well, I'm going to be absolutely honest Miles: genetics. I have no idea why I was always interested in numbers and statistics but I always was. And so something in my genes said I like numbers. Something else in my genes said I like numbers but numbers which have an application and a practical application. And that led me to not only be interested in statistics, but to study statistics and then to work as a statistician in academia for some decades, but always interested in numbers and their application to policy and to improving the lives of people. And if you take that as a starting point, then it's what I've always done, and led me to at times work in partnership with different government departments. And that led me to partnerships with ONS, which has led me here.

    MILES FLETCHER
    A lot of people sort of regard statistics as numbers on a page, something that can seem quite abstract, but they exist of course to help people make important decisions. Can you think of an example in your pre-ONS career, your pre-National Statistician life, where you first used numbers and statistics to actually help solve a real-world problem?

    SIR IAN DIAMOND
    Well, yes, I mean, if I go back to the very early 1980s, at that time, the observation was made, that there had been a decline in the number of children born in the UK. That was going to be a decline of around 30% in the number of 18-year-olds, and it was suggested that therefore there would be a reduction in the demand for higher education. Working initially with Fred Smith and then subsequently on my own, I was able to project the future demand for higher education, on the basis of some assumptions that the number of women going into higher education would increase, that there would be social mobility in the country as a whole. And also, that there would be an increase in what we now call widening participation. When you bring all those things together, you get a very, very different number for the demand for higher education than from simply following the number of births. And that had an impact alongside work that other people did on influencing policy for higher education.

    MILES FLETCHER
    So a busy, very successful academic career is followed then by a stint as National Statistician. You're in the job, what, six months last March, just as the pandemic, as we came to know it, was starting to break. At what point did you realise that it was going to be as big as it turned out to be and that a very special response was going to be required from the statistical system, the UK statistical system, ONS, and all the statisticians in government departments, the system that you're responsible for?

    SIR IAN DIAMOND
    I mean, I think early in 2020, Miles. We identified, very sadly, the first death from COVID at the beginning of March 2020. We now think there might have been one earlier but, you know, I think very early on we at ONS recognised that this was something that the statistical community needed to really step up for, not least working with the wider international community to define a cause of death as being due to COVID. I'd say March 2020 is when we really became aware there was going to need to be some really fast and accurate estimates of all kinds of things around the pandemic, whether it was impacting on the economy, or indeed the pandemic itself, and that led to us in April putting together a survey which estimated both prevalence but also the level of antibodies, and subsequently now of course, issues around vaccination.

    MILES FLETCHER
    So it was a very important decision point where it was realised that the traditional, if you put it that way, the main data sources that ONS and others in government were producing were not going to be enough to measure a very, very important factor in this, that's actually how many people have got the virus at any one time. At what point did that arise and what happened next?

    SIR IAN DIAMOND
    We had a conversation early in April. We said ONS could use our ability to be able to design nationally representative surveys and to pivot some of those designs into collecting the biomedical data that are important in order to be able to identify both prevalence and antibodies, but we will only do so in partnership with other experts. And so we very, very quickly set up partnerships with the University of Oxford, the Wellcome Trust, and the Department of Health and the Office for Life Sciences. We were able to set up a team that, in one week, was able to move from a decision to go for it, to design, to ethics, to the first field workers collecting some data.

    MILES FLETCHER
    And it was mounting, what was by anybody's standards, a huge field operation, as you say, in very short order to get around households up and down the United Kingdom eventually, when the survey was running at full scale. To do that very, very quickly, a huge operation…

    SIR IAN DIAMOND
    Two stages, Miles: the first of which is we stood it up as a nationally representative sample, which would make estimates for England. And, you know, it takes a lot of things at pace. So getting from the field workers getting the swabs to the laboratories, getting the tests, getting them back, doing some really quite sophisticated statistical analysis to make estimates. Getting all that done requires a lot of logistics, and I think the team deserves an enormous pat on the back for so doing. And then that success led to the scaling up, so that we can make regional estimates, so that we can make age-specific estimates. And we were able to do that. But then that was a huge scale up in September of 2020 and I think again, the logistics of scaling that up was incredibly challenging, but successful. And at the same time working with our colleagues in Scotland, Wales and Northern Ireland, to be able to produce estimates for those administrations too was something that I'm very proud of.

    MILES FLETCHER
    And the record shows exactly what was achieved during those pressured early months of the pandemic. And of course, right at the start there were plenty of people around who doubted whether the statistical system, whether the ONS and others were really capable of doing that job. Was it satisfying to confound those critics?

    SIR IAN DIAMOND
    I didn't hear them, I just got on and did it, to be absolutely honest, Miles. I knew what we could achieve in terms of both the survey which was able to measure prevalence and antibodies, but also the social survey because you need to know how people are feeling about the restrictions. You need to know how people are feeling about the pandemic. Were they anxious or not? And then as people started to talk about, for example, face coverings. What were people's attitudes to those things, and were people adhering to the restrictions? So, there was a social survey, that was producing weekly estimates as well. That was incredibly important, and we were producing economic statistics, as well. So I have to say it wasn't a question of was the statistical system standing up and delivering a survey to estimate prevalence of the pandemic. But it was addressing a whole set of other questions, which required not only statistical collection, but in some cases, further analysis, and data linkage and a whole range of sophisticated statistical methods to be able to provide information for the government and for the population so that they understood exactly where we were at any time.

    MILES FLETCHER
    And what do you think that all that has done for the general trust the public have in the statistics that they see from us or from the media?

    SIR IAN DIAMOND
    ONS has always been a very trusted organisation. I mean, one of the important things that we have in the UK is the independence of the ONS and I think that’s incredibly important and the public in all the surveys that we have done over many years have demonstrated great trust in the statistics that we produce. And I think that the public has continued to show that trust over the pandemic. And I hope although at this stage I stress I'm hoping, that the public will feel that the ONS has delivered during the pandemic and therefore will be prepared to continue to trust the ONS in the future.

    MILES FLETCHER
    Talking about the public and involvement, coinciding with this pandemic has been the census of course in England and Wales and we asked every household once again to complete the census. Again, at the beginning, some said it couldn't be done because of the pandemic and others even more said it shouldn't be done because of the cost. How has it all gone? And will it tell us what we now urgently need to know about our population?

    SIR IAN DIAMOND
    We had a really very good and very strong response. We're now in the process of doing the analysis so that we can produce really accurate results and that's going to be incredibly important. Should we do a census? Well, I think a census is a statement of great confidence from a country that is prepared to say that on one day, this is a picture of what that country is and how many people there are and their characteristics. And that is so important for all kinds of reasons. So yes, it was incredibly important I think that we did. Yes, it was incredibly important that we did it at the time of the pandemic, because we needed to know where we were at that time. Of course, we will be working very hard to update our statistics over time to really understand the post pandemic world. I'd have to say also that, you know, the cost is high, no question. And we will be working very, very hard over the next 18 months or so, to produce a set of recommendations as to the future of population data collection. Do we need another census, or can we do things the administrative way? In 2014 we thought about this with regards to 2021 and a really good report done by the late Chris Skinner, together with John Hollis and Mike Murphy, recommended that this census that we've just done, a digital-first census, should go ahead, but we should aim to make a recommendation about the future. And that's what we're planning to do. It will require support from many other parts of government. I'm confident that we will get that support. And the one thing I can say, Miles, is that over the next 18 months or so we'll be working flat out to be able to make a recommendation that is extremely tight and extremely evidence based.

    MILES FLETCHER
    Now this whole question of whether there should be another census, actually it chimes with a reaction that we saw coming back from the public, and we did certainly get a good response rate. We reckon 97 percent of households did take part in the census and that's as good a response as there's ever been - perhaps there was a certain advantage to holding it during lockdown even - but some people asked why they have to fill in this census because surely the government should already have all this information to hand by now. How far are we down the road to being able to gather all the information from other sources already, as many countries do?

    SIR IAN DIAMOND
    Well, other countries do, and other countries, for example, particularly those in Scandinavia, require a population register where, if you leave the country or come back into the country, you have to register that you are there. And if you move you have to register. We don't do that. So we do not require you to register that, for example, you have moved house or register with the Office for National Statistics. You may register with the land registry but if you don't, if you just move, we don't require you to register that. Interestingly, there is no one source for occupation in this country other than the census. So, while you may think that data are held everywhere, Miles, they actually aren't. And so, while there are a lot of government data, there are no single sources which cover a lot of the things that a census does and also there are one or two questions that one has in the census which are attitudinal, for example. So, you ask about wellbeing. Well, the only way you can ask people about wellbeing is to ask them, so you actually need to collect those data on a census. So there's a whole set of things that we ask on the census that very simply we don't ask elsewhere. And therefore, it's important, I think, that we do get those data.

    MILES FLETCHER
    And of course data has to be fast to be effective now, or certainly faster. During the pandemic again we've seen advances in how new data sources have been used: anonymised credit card data, traffic camera data, mobile phone data, shipping data to provide these really fast readings of economic impact. Novel and brought in, in some cases, as a specific response to the urgencies of the pandemic. But will these last now?

    SIR IAN DIAMOND
    One hundred percent. I think one of the things we've seen over the last few years has been the increase in born digital data, and we need to recognise the potential benefits of those data for our understanding of society and the economy, and indeed the environment, and we need to be using them at pace in every way possible. And asking the question, do they replace things that we always have? Or are they in addition? And if they are in addition, are they really adding value? Very easy to get involved in what you might call a data deluge. Yeah, there's loads of data out there so we’d better have it. I think you have to be very, very focused on whether any particular data add value and insight to the subject under study. If they do, then I think that it's important for us to use them and to access them. If they're just simply adding some more data then we do not need to follow them up. So data for insight, not data for data's sake.

    MILES FLETCHER
    So we've had two years driven mainly, but not wholly by the pandemic, but two years of incredible progress in our statistical system. Looking to the next decade, what comes next, what do you think we're going to see in statistics and data, how it's going to be used and what sort of issues are we going to be addressing?

    SIR IAN DIAMOND
    We will be able to process ever bigger datasets and to do so ever faster. So all the kinds of things we have been talking about, about more digital data, analysis of texts, as well as numbers and data produced at speed and at pace will be the norm. But that doesn't stop us wanting to continue to collect some pretty important data, for example, GDP or inflation data and to do so, perhaps, in a new way. In the last year we've calculated GDP using some innovative data sources, but in a way which enables those long time series that we started talking about at the beginning of this conversation, Miles, to be maintained. I think it’s incredibly important that we do maintain time series while at the same time producing ever more exciting and new data sources. And I return finally to the point that we will still want attitudes. If you want attitudes, we'll need to continue to do surveys. So I think it’s an exciting time. One of the other areas where I think we will see real progress is improved data visualisation and improved interoperability with people. And I think that's important when it comes back to trust: if people are able to go on and manipulate the data themselves very, very easily, then again, the transparency and the openness and the use of data will be something that will remain at the heart of what we do.

    MILES FLETCHER
    That's Sir Ian Diamond, the National Statistician. Now if there was one single development that made the ONS and perhaps the whole of the UK statistical system ready to cope with the pandemic, it was arguably the ONS Data Science Campus. Established in 2017, its mission is to work at the frontier of Data Science and Artificial Intelligence, building skills and applying tools, methods and practices, it says, to create new understanding and improve decision making for the public good. So what does that all mean in practice, and what has the campus achieved in its first four years? Questions I put to Dr. Louisa Nolan, its chief data scientist. Louisa, to take it from the top as it were: tell us, what is the data science campus and what are you out to achieve?

    LOUISA NOLAN
    The data science campus was set up four and a half years ago, and our mission is to explore new types of data, new types of technology, new techniques in data science, to make sure that we're making the most out of the data that's available, the ever increasing types of data that are available to us. And we also build capability in data science not just in ONS but across government and the wider public sector as well. So data science is really about the analysis of that data, getting that data together. But we need to get hold of the data. We need the right tools and platforms to use that data, particularly big data. It's about testing those technologies and how we do that to build those insights as well.

    MILES FLETCHER
    And when does data that you harvest, when does it become statistics?

    LOUISA NOLAN
    That's a really interesting question. And different people probably would give different answers. Statistics, I would say, is a summary. So it's a summary: it might be the average, the mean, or it might be a trend. It's looking at the overall picture, whereas data might be your input, so the satellite picture or the information somebody's given on the census. And statistics really is turning it into something that we can then understand broadly: what's going on and why those things are going on.

    MILES FLETCHER
    And it's your job then, in essence, to find how best to use that, those mountainous volumes of data and transfer them into usable, useful statistics and insights.

    LOUISA NOLAN
    Absolutely, and there's the technical part of that, the techniques, but also understanding those new types of data, understanding their quality and their bias and how we can best use them so that we produce something that's useful for decision making and not misleading.

    MILES FLETCHER
    The data science campus has been around for just a couple of years really, but what have you achieved in the time since it's been running?

    LOUISA NOLAN
    We've achieved a lot. So on the capability side we've set up data analytics apprenticeships, the graduate data science programme, the data masterclass, which is about teaching senior leaders data literacy, we've delivered face to face training, we've trained more than 600 analysts across government to be data scientists in that time. We've built data science community activities, and then we've also delivered a vast range of projects, including things around faster indicators, counting cows from space, text analysis to help automate and understand big government consultations. So it's been a really wide range of stuff.

    MILES FLETCHER
    What have you been doing, for example, with economic statistics?

    LOUISA NOLAN
    So we've been doing some really interesting stuff with economic statistics. Back two years ago, seems like it was longer ago but I think it was only two years ago, we were asked to see if we could find faster indicators which would help to kind of test the health of the economy much earlier than our GDP and official outputs. And this isn't as a replacement for GDP, just to get some faster information a bit earlier. So we had a look at what was available. And we wanted to make sure that we had data that was high frequency and low latency, obviously, if we want to understand what's going on a bit quicker. But also to make sure that it had some kind of relationship to economic concepts. In the past people have looked at things like lipstick sales, or men's pants sales or…

    MILES FLETCHER
    Counted cranes?

    LOUISA NOLAN
    Counting cranes! Counting cranes is maybe slightly better, but not all of these are very robust, and actually they're terribly subjective. And if you look at them over the long term, they don't really work. So we wanted things that really related to economic concepts, even if they weren't the same as GDP. We're not trying to measure GDP. So we had a look at the various datasets that were available and the first set of faster indicators that we produced covered three different datasets, all of them really interesting in their own right. So the first one was creating a diffusion index from VAT returns. So a diffusion index just tells you the proportion of businesses whose turnover has gone up since they last reported, and obviously if that starts to drop off, that's a bit of a warning signal and you might want to go and have a bit more of a look and see what's going on or why that's happening. The other two were really different. We've used VAT data before, but the other two were really different for ONS. Firstly, road traffic data. So this comes from sensors in roads, particularly used for active traffic management, and it counts the number of vehicles passing those sensors and you can also tell how big the vehicles are, so you can separate out cars from HGVs. And we think this ought to be quite a good indicator of what's going on in the economy. Because the amount of stuff moving around the country, people travelling to and from work, is quite interesting, and you'd expect that to be related to economic health and the movement of people and goods. And then the last one was perhaps the most interesting dataset because it's the biggest. It’s a global dataset on shipping. Every ship has a tracker. When it's in motion, if it's above a certain size, it has to say where it is every second, and then when it's at rest it needs to say where it is every couple of minutes. So this is an amazing dataset that tracks all the big ships.
    So we had a look at ships coming into UK ports, the number of visits, the type of ships coming in and how long they stayed there for. We created, I think it was about 300 different time series from these and published them very quickly. The first time that ONS had done something like this, possibly the first time in the world that this kind of faster indicator had been published by a national statistics institute on a regular basis. Really interesting data. And I think that kind of set the scene. So we've gone on from those initial three datasets. Over COVID, huge appetite for faster information because things were happening so rapidly, lots of changes in the economy that were unpredicted two years ago. And so both the data science campus and ONS have built on that initial faster indicator output. There's now a suite of, I think, more than ten different faster indicators based on things like job vacancies, footfall, traffic camera information, all kinds of things that are feeding into that picture of what's going on very rapidly. High frequency, not much delay between the data and the reporting.
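
    The diffusion index described above can be sketched in a few lines of Python. This is a minimal illustration of the spoken definition only (the proportion of businesses whose turnover has gone up since they last reported), not the ONS's published method; the function name and the turnover figures are hypothetical.

```python
from typing import Sequence


def diffusion_index(previous: Sequence[float], current: Sequence[float]) -> float:
    """Return the share of businesses whose turnover rose between two reports.

    Assumption: previous[i] and current[i] refer to the same business.
    """
    if len(previous) != len(current):
        raise ValueError("previous and current must cover the same businesses")
    if not previous:
        raise ValueError("at least one business is required")
    # Count businesses reporting strictly higher turnover than last time.
    rises = sum(1 for before, after in zip(previous, current) if after > before)
    return rises / len(previous)


# Hypothetical turnover figures for five businesses (illustrative only).
previous_turnover = [100.0, 250.0, 80.0, 40.0, 300.0]
current_turnover = [110.0, 240.0, 95.0, 40.0, 310.0]

index = diffusion_index(previous_turnover, current_turnover)
print(f"Diffusion index: {index:.2f}")  # three of the five rose, so 0.60
```

    A fall in this value from one period to the next is the kind of warning signal mentioned above. Other definitions of a diffusion index exist, for example ones that also count falls or unchanged values, so this sketch should be read as one simple variant.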

    MILES FLETCHER
    To what extent has the pandemic then hastened the pace of progress in the data science campus, and to what extent have the indicators that you produced been corroborated or vindicated by the subsequent classical data that ONS produce?

    LOUISA NOLAN
    And so as COVID hit, obviously, there was a huge desire to know what was going on: how well people were complying with restrictions, whether people were really moving about or had complied and stopped moving about, and also understanding the impact of that on the economy. So the campus was well placed, because of our skills and the way we're set up, to rapidly pick up some new datasets and have a look at them. So we very quickly got some mobility datasets. So this is about how the bulk of the population is moving about, to look at how well people were, not individuals, but how well the population was complying with restrictions. And I should say here that we've never been interested in tracking individuals. It's all about the bulk movements, what goes on. So we very quickly got that and managed to quickly stand up some regular outputs. At one point we were reporting daily on what was happening because things were happening so quickly. And as time has gone on, I think it's fair to say that the narrative from some of those faster datasets has been broadly correct. But obviously as you get the more detailed information and more of the breakdowns, the information in, you can have a more robust, accurate measurement, not just the “well it looks like it's falling really rapidly”, or “it looks like it's coming back up again” kind of interpretation.

    MILES FLETCHER
    In terms of speed, the delay between data creation and data analysis is getting ever and ever shorter. How fast can this get? At what point will we be able to read daily readings of the economy, for example, daily readings of population shift?

    LOUISA NOLAN
    I think that it's becoming possible. I don't think you would have daily GDP because there are so many elements in GDP that you couldn't collect on a daily basis. The question is, particularly around the economy: How useful is having daily outputs on the economy? If you knew GDP daily, how would that help your decision making? But for population, if you know the population density and how that changes over a day, that might be really useful because that will tell you something about where there are high density areas, how people are travelling about, how people are not travelling about, over COVID. And that would help with things like your local planning, with managing big events and so on, and help us to spend money more effectively because we know where people are and we've got a better and quicker understanding of where populations might be both in the short term over the timescale of a day and in the longer term.

    MILES FLETCHER
    You mentioned observing cows from outer space as well. I've got to ask you what that involved?

    LOUISA NOLAN
    Oh, counting cows, we love this. We have a data science hub that's embedded with the Foreign, Commonwealth and Development Office in East Kilbride. They focus on supporting the UK’s mission to support developing countries around the world. And one of the projects that our team there is doing is counting cows. So in South Sudan, agriculture is a much bigger percentage of GDP, a huge part of GDP for them, than it is in the UK. And cattle is really important, but it's quite difficult to go out and count all the cows. It's a huge country. Not great roads. They've had various different issues with weather and conflict there as well. So the question was, can we get a good picture, a good census of the cattle in South Sudan using satellite data? And actually, it's quite promising. We have ever better quality of satellite data, higher resolution. You can see where the camps are and you can make some estimates around the number of cows there. Getting hold of your ground truth data to check whether your estimate from space is right is probably the hardest part of that, but it's quite exciting. And of course, if this works, what else can we do with satellite data that's helpful and means that you don't have to send individual real people out over these vast areas to count things?

    MILES FLETCHER
    That's operating on the global scale as well, but you've also been working on ways of minutely examining documents that are submitted to government in very large numbers and bypassing human intelligence to use artificial intelligence to interrogate those documents and draw conclusions from them.

    LOUISA NOLAN
    That's right. I mean, one thing government is good at is having lots of words and documents and turning those documents from data, if you like, into information and insights is a big part of what we do. So we use natural language processing to do text analysis, and we worked with the Department for International Trade on one of their big consultations, they had more than 400,000 responses. And we were able to automate that to identify themes and topics in the responses in a faster way than you can do by hand. They also covered this in the traditional ways so we were able to compare our results with the manual approach as well. Certainly the automation is faster. And I think sometimes when you've got that much information, you can get different insights, new insights from automating. But when we look at AI and approaches like that, you really want to take the human in the loop approach. So you run the things that are automated, for the bits where it makes sense, where you can find out things, you can make things go faster. But if there's something which is difficult for the AI to come to a conclusion on, that's when you bring your human in to go, oh what does that look like? Where should that sit? How should we interpret that? And it's that combination of automation, getting humans to do the bit humans are good at that's really powerful.

    MILES FLETCHER
    So the campus is a campus in both senses really. It's a campus in that it has projects and enterprise and things getting started up, but it's also a campus in the academic sense as well. And you're training people, some of whom have no background in these sorts of disciplines at all. Tell us about what's been achieved there.

    LOUISA NOLAN
    So our capability team were set a task to train 500 data scientists by March 2021. Well, we far exceeded that: we trained 680-something in that time through a range of different programmes that we run. These include the MDataGov, the master's in data science for government, which we run in partnership with four universities. The graduate programme, the apprenticeship programme, face to face learning and our accelerator mentoring programme, which is brilliant. So this is open to everybody across the public sector. Pitch a project. If your project is successful, then for 12 weeks you get a data science mentor for one day a week to do that project, and that project will be something that's important to your home department and also help the individual to build the skills as well. There's been a massive range of projects and departments who've taken part in this. I think we've had more than 250 people through the accelerator so far. It's great. So we're always looking for more mentors as well. So if this sounds interesting, always, always looking for people to help out with the mentoring.

    MILES FLETCHER
    And in the apprentices, you're getting people coming in from the local communities in many areas around where you're based in South Wales, and coming in cold in many cases with no background in working in these sorts of disciplines at all.

    LOUISA NOLAN
    That's right. For the apprentices it's about enthusiasm and potential rather than anything that's happened before. We've had a range of people from a huge range of different backgrounds, a huge range of different ages from straight out of school all the way to people who've had several careers beforehand who've wanted to retrain. It's a brilliant way to get diversity into data science, and I'm hugely supportive of this approach. It's great.

    MILES FLETCHER
    And how do you go about applying then for any of these opportunities?

    LOUISA NOLAN
    So we advertise them. The best place to go is to look at the data science campus website, where we advertise all of our learning and development programmes. And also we talk about our projects and the other things that we're doing so you can find out all kinds of information there. For jobs and recruitment, like the recent round of recruitment for the graduate data science programme, that will be on civil service jobs, but the first place to come is the data science campus website.

    MILES FLETCHER
    What are the challenges that immediately lie ahead for the campus then, what are you getting your teeth into now?

    LOUISA NOLAN
    So I think one of our challenges is a good challenge, which is that data and data science has never been a higher priority, I think, so we have a lot of asks on us. I think in four years things have changed. So four years ago, there weren't so many data science teams across government; there are more now. So we need to make sure that what we're offering is still at the right level as other departments mature as well. I think the desire for ever faster information is not going to go away at all. So more of that, and also thinking about how we can use data, novel data and data science to support the government's big programmes like net zero and levelling up and also continuing to support our response to COVID. And thinking about what we learn from that, how we can use what we learn from that for other aspects of health as well.

    MILES FLETCHER
    And will everybody be a data scientist in the future rather than just a statistician? Dare I ask?

    LOUISA NOLAN
    Oh, I don't know. That’s a very controversial question, that. I think data scientists aren't unicorns. There are aspects of data science that, if you imagine a Venn diagram, overlap with statistics, with operational research, with economics, a lot of economists are really interested in data science and big data. But also with the digital skills as well. So overlaps with data engineering and software engineering. So my hope, my dream, I don't have a dream data science person, it’s always a team that's made up of all of those different skills. And I hope that more people will have an opportunity to build at least some of those skills, even if they don't call themselves data scientists. One of the other programmes that I'm really proud for the campus to be leading, which we developed in partnership with the Number 10 delivery unit, is the senior leaders data masterclass. So this is a masterclass designed for public sector senior leaders, talking about data, why it's important, how you can use it for evidence, how you can use it for evaluation, not expecting people to come out coding in Python, but having a better understanding of what's possible and what the right questions to ask are. So we rolled it out to all permanent secretaries. We're hoping to roll it out across the senior civil service. Also the fast stream and some of the future leaders development programmes across government, and it's also open to senior leaders from the wider public sector as well. I'm really pleased about this because I think if we can build those skills at the top level, get people understanding what the opportunities are, then that helps us build that capability, increase the number of people who can do that coding, improve efficiency and help use data better to make better decisions.

    MILES FLETCHER
    That’s Dr. Louisa Nolan from the ONS Data Science Campus and before that National Statistician Sir Ian Diamond. In the next episode of Statistically Speaking we turn to the economy. With the rising cost of living on everybody’s minds, how does the ONS keep tabs on inflation? Is there more to national prosperity than mere GDP? And is economic forecasting really just a way of making astrology seem respectable? Join us then. You can subscribe to new episodes of this podcast on Spotify, Apple Podcasts and all the other major podcast platforms. You can also get more information by following the @ONSFocus Twitter feed. The producers of Statistically Speaking are Joe Ball, Elliot Cassley and Julia Short. I'm Miles Fletcher, goodbye.