<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Real World Data Science</title>
<link>https://realworlddatascience.net/the-pulse/</link>
<atom:link href="https://realworlddatascience.net/the-pulse/index.xml" rel="self" type="application/rss+xml"/>
<description></description>
<image>
<url>https://realworlddatascience.net/images/rwds-logo-150px.png</url>
<title>Real World Data Science</title>
<link>https://realworlddatascience.net/the-pulse/</link>
<height>83</height>
<width>144</width>
</image>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Wed, 15 Apr 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>Call for Submissions: is AI statistics?</title>
  <dc:creator>Editorial Board</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2026/04/15/AI-is-stats-CFS.html</link>
  <description><![CDATA[ 





<p>The Royal Statistical Society has recently set out a clear and compelling message: <a href="https://rss.org.uk/news-publication/news-publications/2026/general-news/ai-is-statistical-that-matters/">AI is Statistics</a>. This simple phrase captures a powerful truth about the foundations, practice, and future of artificial intelligence—and the central role of statistical thinking within it. It is also, of course, intentionally provocative and necessarily simplifies a highly complex and nuanced area.</p>
<p>This nuance and complexity are acknowledged and addressed in the paper itself, but no single publication can fully capture the breadth of perspectives on this topic, which is why we’ve launched a call for submissions to encourage a richer, more multidisciplinary dialogue. We are inviting writers, researchers, and practitioners across disciplines to respond to this theme with original pieces that inform, challenge, and inspire.</p>
<p>We are particularly interested in contributions that:</p>
<ul>
<li>Illuminate how statistical ideas underpin modern AI methods</li>
<li>Explore the relationship between data, uncertainty, and decision-making in AI systems</li>
<li>Offer case studies of statistics in real-world AI applications</li>
<li>Examine ethical, societal, or policy implications through a statistical lens</li>
<li>Challenge or expand the “AI is Statistics” framing in thoughtful ways</li>
<li>Communicate complex ideas accessibly to a broad audience</li>
</ul>
<p>You might want to watch <a href="https://realworlddatascience.net/the-pulse/posts/2026/03/10/rwds_big_questions_ai_statistics.html">our panel of data scientists’ recent discussion</a> for inspiration.</p>
<p>We welcome a range of formats, including opinion pieces, explainers, case studies, and thought leadership essays.</p>
<p>This is an opportunity to shape an important narrative, one that positions statistics not just as a supporting discipline, but as a driving force behind trustworthy, effective, and responsible AI.</p>
<p><strong>Help us tell the story: AI is Statistics.</strong></p>
<p>To make your submission, please review our <a href="https://realworlddatascience.net/contributor-docs/contributor-guidelines.html">contributor guidelines</a> and email us at rwds@rss.org.uk.</p>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2026 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>. Thumbnail photo by <a href="https://unsplash.com/@johnsonvr?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Virgina Johnson</a> on <a href="https://unsplash.com/photos/turned-on-red-open-neon-sigange-QmNnZj_Ok-M?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Unsplash</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Real World Data Science Editorial Board. 2026. “Call for Submissions: is AI statistics?” <em>Real World Data Science</em>, April 15, 2026. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2026/04/15/AI-is-stats-CFS.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>



 ]]></description>
  <category>Call for contributions</category>
  <category>Updates</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2026/04/15/AI-is-stats-CFS.html</guid>
  <pubDate>Wed, 15 Apr 2026 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/08/Images/open.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>RWDS Big Questions: How do we highlight the role of statistics in AI?</title>
  <dc:creator>Annie Flynn</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/posts/2026/03/10/rwds_big_questions_ai_statistics.html</link>
  <description><![CDATA[ 





<p>Artificial intelligence may be today’s headline act, but behind many of its most powerful systems lies something older, deeper, and quietly essential: statistics. This week, the RSS released <a href="https://rss.org.uk/RSS/media/File-library/Policy/2026/AI-is-Statistics-FINAL.pdf">a landmark position paper titled <em>AI is Statistics</em></a>. Introducing the paper, Donna Philips, Chair of the Society’s AI Task Force, which led its development, argues: <a href="https://rss.org.uk/news-publication/news-publications/2026/general-news/ai-is-statistical-that-matters/">“AI systems are built on statistical pattern recognition. They need to be developed, evaluated and governed with rigorous statistical precision.”</a></p>
<p>That this is not widely understood is problematic for many reasons. If AI is seen as magic rather than applied statistics, it becomes easier to believe it is objective, infallible, or autonomous—when in reality it is probabilistic and assumption-driven. Organisations may prioritise tools and branding over rigorous data collection, experimental design, and evaluation. Without a statistical lens, questions like “How certain are we?”, “Compared to what?”, and “Under what conditions?” are less likely to be asked. And, ultimately, the demand for “AI talent” may overlook the statistical expertise required to build reliable systems.</p>
<p>In this latest episode of Real World Data Science Big Questions, our expert panel tackles a deceptively simple question: How can we better highlight the role of statistics in AI? Watch below, and read on for some key takeaways and analysis.</p>
<section id="watch-the-discussion" class="level2">
<h2 class="anchored" data-anchor-id="watch-the-discussion">Watch the discussion</h2>
<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/nrpglKlimXA" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
</section>
<section id="takeaways-at-a-glance" class="level2">
<h2 class="anchored" data-anchor-id="takeaways-at-a-glance">Takeaways at a glance</h2>
<ul>
<li><strong>AI is built on statistical thinking</strong> – even when it’s not labelled that way.</li>
<li><strong>Job titles change; core skills don’t.</strong></li>
<li><strong>Statistics sometimes undersells itself</strong> by focusing on mechanics over impact.</li>
<li><strong>Communication and visualisation are central,</strong> not peripheral, to modern statistical work.</li>
<li><strong>Kindness, collaboration, and trust</strong> are professional assets.</li>
<li><strong>The future belongs to skill-based identities, not title-based ones.</strong></li>
</ul>
</section>
<section id="key-themes-and-analysis" class="level2">
<h2 class="anchored" data-anchor-id="key-themes-and-analysis">Key themes and analysis</h2>
<p><strong>The “rebranding” meme</strong></p>
<p>The panel opens with a familiar joke: take statistics, put a new frame around it, call it machine learning or AI, and suddenly everyone pays attention. It’s humorous—but revealing.</p>
<p>Many roles advertised today as “AI” or “data science” positions are deeply statistical at heart. They involve modelling uncertainty, validating assumptions, managing bias, evaluating performance, and interpreting results. In other words: core statistical competencies.</p>
<p>Rather than resisting this relabelling, the panel suggests recognising it as part of the natural evolution of fields. The key question becomes not “What should we call ourselves?” but “What value are we delivering?”</p>
<p><strong>Identity versus skills</strong></p>
<p>One of the strongest messages from the discussion is this: don’t over-identify with a job title.</p>
<p>“Statistician,” “data scientist,” “AI specialist” are all potentially transient labels, whereas the skills underpinning them remain the same:</p>
<ul>
<li>Framing problems carefully</li>
<li>Questioning assumptions (“Are you sure? Are you sure-sure-sure?”)</li>
<li>Quantifying uncertainty</li>
<li>Designing analyses that are robust and defensible</li>
</ul>
<p>The panel suggests that the healthiest professional stance is to focus less on identity and more on what you can do and what you care about.</p>
<p><strong>The communication gap: loving the sausage-making</strong></p>
<p>Statisticians, the panel observes, sometimes make things harder than they need to be—at least in how they explain their work.</p>
<p>“We’re too interested in the mechanics,” one panellist notes. “Nobody cares how you made the sausage.”</p>
<p>This doesn’t mean rigour is unimportant. It means that impact must lead the narrative. Instead of focusing first on models, methods, and diagnostics, statisticians might begin with:</p>
<ul>
<li>What problem was solved?</li>
<li>How did this make life easier, safer, or better?</li>
<li>What decision did this enable?</li>
</ul>
<p>AI has been marketed effectively because it is framed in terms of transformation and possibility. Statistics can claim that space too, without sacrificing integrity.</p>
<p><strong>Visualisation and bringing data to life</strong></p>
<p>Visualisation is a key bridge between statistical thinking and real-world impact. Good visualisation:</p>
<ul>
<li>Makes uncertainty legible</li>
<li>Builds trust</li>
<li>Enables decision-making</li>
<li>Tells stories grounded in evidence</li>
</ul>
<p>In a world flooded with dashboards and generative outputs, the ability to present data clearly and responsibly is not a soft skill. It is core infrastructure.</p>
<p><strong>Trust, collaboration, and professional culture</strong></p>
<p>People want to work with statisticians they trust, which flows not only from technical competence but from clarity, openness, and collaboration.</p>
<p>As AI systems become more powerful—and more controversial—the professionals who can explain, contextualise, and responsibly deploy them will be in high demand.</p>
</section>
<section id="from-background-discipline-to-visible-foundation" class="level2">
<h2 class="anchored" data-anchor-id="from-background-discipline-to-visible-foundation">From background discipline to visible foundation</h2>
<p>If AI continues to evolve—as it surely will—so too will the labels attached to those who work in it. But uncertainty, inference, modelling, and critical thinking aren’t going anywhere.</p>
<p>We would love to receive contributions to the site that tackle this issue.</p>
<p>Is statistics undervalued in the AI conversation, or quietly thriving?</p>
<p>Where, in your experience, does statistical thinking most visibly shape AI work?</p>
<p>And where is it least acknowledged?</p>
<p>Have you seen statistical work rebranded as AI in your organisation?</p>
<p>We are actively seeking submissions on these topics so, if you would like to be part of the conversation, <a href="mailto:rwds@rss.org.uk"><strong>get in touch</strong></a>.</p>
<div class="article-btn">
<p><a href="https://realworlddatascience.net/the-pulse/posts/2026/01/21/rwds-big-questions-challenges-today.html">Explore more videos in the series</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-12">
<dl>
<dt>About the author:</dt>
<dd>
<a href="https://www.linkedin.com/in/annieroseflynn/">Annie Flynn</a> is Head of Content at the <a href="https://rss.org.uk">Royal Statistical Society</a>.
</dd>
</dl>
<div class="g-col-12 g-col-md-6">
<p><strong>Copyright and licence</strong> : © 2026 Annie Flynn<br>
<a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"> <img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"> </a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<p><strong>How to cite</strong>:<br>
Flynn, Annie. 2026. “<strong>RWDS Big Questions: How do we highlight the role of statistics in AI?</strong>” <em>Real World Data Science</em>, March 10, 2026. <a href="https://realworlddatascience.net/the-pulse/posts/2026/03/10/rwds_big_questions_ai_statistics.html">URL</a></p>
</div>
</div>
</div>


</div>
</section>

 ]]></description>
  <category>AI</category>
  <category>Governance</category>
  <category>Policy</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2026/03/10/rwds_big_questions_ai_statistics.html</guid>
  <pubDate>Tue, 10 Mar 2026 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2026/03/10/images/hjnm.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Real World Data Science Featured on Practical Significance Podcast</title>
  <dc:creator>Annie Flynn</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/posts/2026/03/04/rwds_big_questions_ai_regulation.html</link>
  <description><![CDATA[ 





<p>We have some exciting news to share!</p>
<p>Some representatives of our editorial board were interviewed on this month’s episode of <a href="https://magazine.amstat.org/podcast-2/">Practical Significance</a>, the podcast from the <a href="https://www.amstat.org/">American Statistical Association</a>. The conversation was a pleasure, and we hope you’ll check it out.</p>
<p>Practical Significance is a lively, thought-provoking series that examines how statistics and data science shape real-world problems, careers, and decisions. In the featured episode, hosts Donna LaLonde and Ron Wasserstein take a deep dive into what we’re building at Real World Data Science — from our commitment to clear explanation and practical examples to the methodological depth that underpins our articles. Together, we explore why this combination is resonating with a global community of data practitioners, researchers, and decision-makers.</p>
<p>In the conversation, we reflect on the themes our editorial team is most excited about, our ambitions for the site, and how practising data scientists can turn their day-to-day experiences into compelling, publishable articles. Whether you’re a seasoned practitioner, an emerging data scientist, or simply someone who appreciates a well-told data story, the episode offers insight into the ideas driving our work.</p>
<p>Tune in to listen to the episode wherever you get your podcasts.</p>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-12">
<dl>
<dt>About the author:</dt>
<dd>
<a href="https://www.linkedin.com/in/annieroseflynn/">Annie Flynn</a> is Head of Content at the <a href="https://rss.org.uk">Royal Statistical Society</a>.
</dd>
</dl>
<div class="g-col-12 g-col-md-6">
<p><strong>Copyright and licence</strong> : © 2026 Annie Flynn<br>
<a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"> <img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"> </a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.</p>
</div>
</div>
</div>


</div>

 ]]></description>
  <guid>https://realworlddatascience.net/the-pulse/posts/2026/03/04/rwds_big_questions_ai_regulation.html</guid>
  <pubDate>Wed, 04 Mar 2026 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2026/03/04/images/podthu.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>RWDS Big Questions: How do we balance innovation and regulation in the world of AI?</title>
  <dc:creator>Annie Flynn</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/posts/2026/02/18/rwds_big_questions_ai_regulation.html</link>
  <description><![CDATA[ 





<p>AI development is accelerating, while regulation moves more deliberately. That tension creates a core challenge: how do we maintain momentum without breaking the things that matter? The aim isn’t to slow innovation unnecessarily, but to ensure progress happens at a pace that protects individuals and society. Responsible actors should not be disadvantaged — yet safeguards are essential to maintain trust.</p>
<p>For the latest video in our RWDS Big Questions series, our panel explores this delicate balance. From risk-based frameworks and transparency to global inequality in AI development, the conversation surfaces the tensions, trade-offs and practical realities facing policymakers, technologists and data scientists alike.</p>
<section id="watch-the-discussion" class="level2">
<h2 class="anchored" data-anchor-id="watch-the-discussion">Watch the discussion</h2>
<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/L69nxuy9caI" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
</section>
<section id="takeaways-at-a-glance" class="level2">
<h2 class="anchored" data-anchor-id="takeaways-at-a-glance">Takeaways at a glance</h2>
<ul>
<li><strong>Innovation and regulation are not opposites</strong> – both are essential, but difficult to balance.</li>
<li><strong>Responsible progress requires proportionality</strong> – not all AI applications carry the same level of risk.</li>
<li><strong>Transparency enables better governance</strong> – open dialogue between developers and regulators is key.</li>
<li><strong>Risk-based frameworks provide structure</strong> – distinguishing low-, high-, and unacceptable-risk uses helps focus oversight.</li>
<li><strong>Global disparities complicate regulation</strong> – some regions are regulating advanced AI systems, while others are still building foundational capacity.</li>
<li><strong>Innovation needs protected space</strong> – experimentation, iteration, and even failure are critical before formal standardisation.</li>
</ul>
</section>
<section id="key-themes-and-analysis" class="level2">
<h2 class="anchored" data-anchor-id="key-themes-and-analysis">Key themes and analysis</h2>
<p><strong>Proportional regulation through risk</strong></p>
<p>Not all AI systems pose the same level of harm. A risk-based approach — distinguishing low-, high-, and unacceptable-risk uses — offers a practical middle ground. It avoids blanket restrictions while ensuring stronger oversight where impact is greatest. The debate becomes less about whether to regulate, and more about how proportionate that regulation should be.</p>
<p><strong>Transparency as common ground</strong></p>
<p>Openness can bridge the gap between technologists and regulators. Clear communication about capabilities, limitations and risks enables more informed policy decisions. When innovation happens transparently and in dialogue with regulators, governance can evolve alongside technology rather than lagging behind it.</p>
<p><strong>The global unevenness of AI governance</strong></p>
<p>AI regulation is developing unevenly across regions. While parts of the West are formalising frameworks, many countries are still building foundational AI capacity. This raises difficult questions about sequencing: should regulation lead innovation, or follow it? A one-size-fits-all model may not reflect global realities.</p>
<p><strong>Protecting space to experiment</strong></p>
<p>Innovation requires room to test, iterate and occasionally fail. Early experimentation should not be overburdened with rigid controls — but successful, scalable systems must eventually transition into more standardised and regulated environments. The challenge is designing pathways that support both creativity and accountability.</p>
</section>
<section id="looking-ahead" class="level2">
<h2 class="anchored" data-anchor-id="looking-ahead">Looking ahead</h2>
<p>As AI continues to evolve, the balance between innovation and regulation will remain dynamic — and contested. This conversation opens up important questions, and we would love to hear our readers’ thoughts about how we move some of the principles mentioned in the video into practice.</p>
<ul>
<li>How do we facilitate transparent channels of communication between those developing AI and those designing the regulatory frameworks that will govern it?</li>
<li>What should determine whether an AI system is low, high, or unacceptable risk?</li>
<li>How do we define a “safe speed” for AI development — and who gets to decide?</li>
</ul>
<p>We are actively seeking submissions on these topics so, if you would like to be part of the conversation, <a href="mailto:rwds@rss.org.uk"><strong>get in touch</strong></a>.</p>
<div class="article-btn">
<p><a href="https://realworlddatascience.net/the-pulse/posts/2026/01/21/rwds-big-questions-challenges-today.html">Explore more videos in the series</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-12">
<dl>
<dt>About the author:</dt>
<dd>
<a href="https://www.linkedin.com/in/annieroseflynn/">Annie Flynn</a> is Head of Content at the <a href="https://rss.org.uk">Royal Statistical Society</a>.
</dd>
</dl>
<div class="g-col-12 g-col-md-6">
<p><strong>Copyright and licence</strong> : © 2026 Annie Flynn<br>
<a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"> <img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"> </a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<p><strong>How to cite</strong>:<br>
Flynn, Annie. 2026. “<strong>RWDS Big Questions: How do we balance innovation and regulation in the world of AI?</strong>” <em>Real World Data Science</em>, February 18, 2026. <a href="https://realworlddatascience.net/the-pulse/posts/2026/02/18/rwds_big_questions_ai_regulation.html">URL</a></p>
</div>
</div>
</div>


</div>
</section>

 ]]></description>
  <category>AI</category>
  <category>Governance</category>
  <category>Policy</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2026/02/18/rwds_big_questions_ai_regulation.html</guid>
  <pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2026/02/18/images/BQthumb.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Colorado’s AI Law Pause: What It Means for People Working in Data Science</title>
  <dc:creator>Dr. Stefani Langehennig, University of Denver Daniels College of Business</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/posts/2026/02/02/colorado-AI.html</link>
  <description><![CDATA[ 





<p>In 2024, Colorado became the first U.S. state to pass a <a href="https://leg.colorado.gov/bills/sb24-205">comprehensive law</a> aimed at regulating “high-risk” artificial intelligence systems: models used in areas such as hiring, housing, credit, and healthcare. The law adopted a risk-based approach, placing additional obligations on systems that shape consequential decisions, including requirements around documentation, monitoring, and human oversight. Less than a year later, lawmakers delayed its implementation and began reconsidering key provisions, citing uncertainty about feasibility, cost, and enforcement.</p>
<p>Colorado’s approach drew explicitly on international models, most notably the <a href="https://artificialintelligenceact.eu/">European Union’s AI Act</a>, which similarly classifies AI systems by risk and ties higher-risk uses to stronger accountability requirements. Colorado’s experience is not only a story about state politics. It serves as a useful case study for a more practical question: what happens when ambitious AI governance principles meet the realities of building and maintaining production data systems?</p>
<p>For data scientists, analysts, machine learning engineers, and others responsible for real-world data products, this moment signals that AI governance is no longer a peripheral policy concern. It is becoming an operational constraint.</p>
<section id="from-governance-principles-to-technical-work" class="level2">
<h2 class="anchored" data-anchor-id="from-governance-principles-to-technical-work">From Governance Principles to Technical Work</h2>
<p>Colorado’s law followed a pattern increasingly visible in global AI governance, particularly the European Union’s AI Act. These frameworks share a risk-based logic in that systems that influence consequential decisions face higher expectations for transparency, oversight, and accountability.</p>
<p>At a high level, these expectations (fairness, consumer protection, responsible use) sound abstract. In practice, they translate directly into technical work:</p>
<ul>
<li>Clear documentation of model purpose, training data, and limitations</li>
<li>Records showing where data comes from and how it changes over time</li>
<li>Reproducible experiments and versioned artifacts</li>
<li>Ongoing monitoring for performance drift and unintended impacts</li>
<li>Defined processes for human review and intervention</li>
</ul>
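<p>As one concrete illustration of documentation that travels with a system, here is a minimal sketch in Python. Everything in it, including the <code>ModelCard</code> class and the example model and its fields, is hypothetical, invented for illustration rather than drawn from Colorado’s law or any specific governance framework:</p>

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelCard:
    """A minimal documentation record intended to travel with a deployed model."""
    name: str
    purpose: str
    training_data: str
    limitations: list = field(default_factory=list)
    version: str = "0.1.0"

    def to_json(self) -> str:
        # Serialize so the card can be stored alongside versioned model artifacts
        return json.dumps(asdict(self), indent=2)

# Hypothetical example: a scoring model used to support, not replace, human review
card = ModelCard(
    name="loan-review-scorer",
    purpose="Rank applications for human review; not an automated decision system.",
    training_data="Internal applications, 2019-2023; excludes thin-file applicants.",
    limitations=[
        "Not validated for applicants under 21",
        "No drift monitoring before 2024",
    ],
)
print(card.to_json())
```

<p>Storing a record like this next to the model artifact itself is one lightweight way to keep purpose, data provenance, and known limitations attached to a system as it moves between teams.</p>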
<blockquote class="blockquote">
<p>None of this lives in legislation. It lives in scripts, workflows, dashboards, deployment systems, and operational infrastructure.</p>
</blockquote>
<p>Colorado’s stalled implementation of AI policy surfaced a familiar pattern: many organizations are well equipped to optimize model performance, but far less prepared to operationalize accountability at scale. The friction emerged not because governance goals were controversial, but because the supporting technical infrastructure was uneven.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2026/02/02/images/thumb.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</section>
<section id="why-uncertainty-becomes-a-design-risk" class="level2">
<h2 class="anchored" data-anchor-id="why-uncertainty-becomes-a-design-risk">Why Uncertainty Becomes a Design Risk</h2>
<p>One challenge Colorado encountered was definitional ambiguity. For example, what qualifies as “high risk”, what safeguards are sufficient, and how should harms be assessed? These questions are not merely legal; they are technical and context-dependent.</p>
<p>Different data sources, deployment approaches, and users lead to different answers. For teams building data systems today, that uncertainty creates risk. When teams cannot easily see how data moves through a system, how models change over time, or how decisions are produced, adapting later becomes costly and disruptive.</p>
<p>Recent federal signals add another layer of complexity. President Trump’s executive order discouraging state-level AI regulation aims to reduce fragmented policy on AI, but it does not replace state experimentation with a concrete national policy. Teams now operate in a moving landscape shaped by state initiatives, evolving federal priorities, and international regimes like the EU AI Act. In this environment, aiming for minimal compliance is risky. Teams are better served by designing systems that are flexible and easy to observe from the start.</p>
</section>
<section id="responsibility-does-not-end-at-deployment" class="level2">
<h2 class="anchored" data-anchor-id="responsibility-does-not-end-at-deployment">Responsibility Does Not End at Deployment</h2>
<p>A lesson emerging from both policy debates and practice is that accountability does not stop when a model goes live. Responsibility shifts across teams over time, from data scientists to engineers, product owners, operators, and decision-makers.</p>
<p>This challenge is the focus of the <a href="https://senseaboutscience.org/responsible-handover-of-ai/">Responsible Handover of AI framework</a> developed by <a href="https://senseaboutscience.org/">Sense about Science</a>, which emphasizes the need for clear transitions of responsibility as AI systems move from development into real-world use. Rather than treating deployment as a handoff to “the business”, the framework highlights the risks that arise when assumptions, limitations, and responsibilities are not carried forward with the system.</p>
<p>For practitioners, this framing maps governance concerns onto familiar operational questions, such as who monitors systems after deployment, which development assumptions still matter in production, how limitations are communicated to users, and what happens when systems are updated or handed over to new teams.</p>
<p>Without explicit handover practices, accountability gaps emerge because responsibility becomes diffuse as systems evolve. From this perspective, many regulatory requirements are not adding entirely new work, rather they formalize practices teams already rely on. This includes documentation that travels with systems, monitoring in production, and clear escalation paths when something goes wrong.</p>
</section>
<section id="practical-steps-teams-can-take-now" class="level2">
<h2 class="anchored" data-anchor-id="practical-steps-teams-can-take-now">Practical Steps Teams Can Take Now</h2>
<p>Regardless of how U.S. and international regulation ultimately settles, several investments pay off immediately while also reducing future risk:</p>
<ul>
<li><em>Standardize documentation</em>. Ensure model summaries and data descriptions travel with systems as they move between teams</li>
<li><em>Build end-to-end visibility</em>. Version datasets, features, models, and configurations so results can be reproduced</li>
<li><em>Instrument monitoring early</em>. Track input drift, unstable predictions, performance decay, and downstream impacts once systems are in production</li>
<li><em>Clarify governance workflows</em>. Define who approves releases, who monitors systems, and how responsibility shifts over time</li>
<li><em>Translate risk for leadership</em>. Gaps in documentation and visibility tend to come back later as messy, expensive fixes; addressing them early saves time and pain</li>
</ul>
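<p>As a sketch of what "instrument monitoring early" can mean in practice, the population stability index (PSI) is one common way to track input drift between a training baseline and production data. The threshold and the synthetic data below are illustrative only:</p>

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a production feature distribution against its
    training baseline. PSI > 0.2 is a common rule-of-thumb
    threshold for investigating input drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor empty buckets at a tiny share to avoid log(0).
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0, 1, 5000)     # feature at training time
drifted = rng.normal(0.5, 1.2, 5000)  # same feature in production
print(population_stability_index(baseline, drifted))
```

<p>Running a check like this on every scoring batch, and alerting when it crosses a threshold, is the kind of lightweight instrumentation that is cheap to add early and expensive to retrofit.</p>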
<blockquote class="blockquote">
<p>These practices are not limited to machine learning. Any system that informs decisions can create similar accountability challenges.</p>
</blockquote>
</section>
<section id="governance-lives-in-the-data-stack" class="level2">
<h2 class="anchored" data-anchor-id="governance-lives-in-the-data-stack">Governance Lives in the Data Stack</h2>
<p>There’s still no settled agreement on how AI should be governed. But for people building real-world data systems, the implications of that debate are already concrete. Accountability increasingly lives in the data stack: in how workflows are instrumented, how models are monitored, and how decisions can be examined after the fact.</p>
<p>This is not simply about regulatory compliance. It is about building systems that are transparent, resilient, and trustworthy at scale. Organizations that treat governance as a core technical problem (rather than an external policy constraint imposed later) will be best positioned to navigate whatever regulatory balance ultimately emerges.</p>
<div class="article-btn">
<p><a href="../../../../../the-pulse/index.html">Explore more data science ideas</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-12">
<dl>
<dt>About the author:</dt>
<dd>
<a href="https://www.linkedin.com/in/stefani-langehennig-phd-418820144/">Dr.&nbsp;Stefani Langehennig</a> is an Assistant Professor of the Practice in the Business Information &amp; Analytics Department at the University of Denver’s Daniels College of Business. She is also the lead director for the Center for Analytics and Innovation with Data (CAID). As a former data scientist, she has worked with both academic and industry partners in the U.S. and abroad, helping organizations evaluate and implement data analytics and AI solutions. Her research focuses on computational social science methods, the impact of data transparency on political behavior, and legislative policy capacity.
</dd>
</dl>
<div class="g-col-12 g-col-md-6">
<p><strong>Copyright and licence</strong> : © 2026 Stefani Langehennig<br>
<a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"> <img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"> </a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<p><strong>How to cite</strong> :<br>
Langehennig, Stefani. 2026. “<strong>Colorado’s AI Law Pause: What It Means for People Working in Data Science</strong>.” <em>Real World Data Science</em>, 2026. <a href="https://realworlddatascience.net/the-pulse/posts/2026/02/02/colorado-AI.html">URL</a></p>
</div>
</div>
</div>


</div>
</section>

 ]]></description>
  <category>AI governance</category>
  <category>Applied data science</category>
  <category>Data engineering and MLOps</category>
  <category>Technology policy</category>
  <category>Operational risk</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2026/02/02/colorado-AI.html</guid>
  <pubDate>Fri, 06 Feb 2026 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2026/02/02/images/thumb.png" medium="image" type="image/png" height="96" width="144"/>
</item>
<item>
  <title>RWDS Big Questions: What Are the Key Challenges Facing Data Scientists Today?</title>
  <dc:creator>Annie Flynn</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/posts/2026/01/21/rwds-big-questions-challenges-today.html</link>
  <description><![CDATA[ 





<p>Data science is operating in a moment of paradox. We have more data, more tools, and more computational power than ever before — yet many of the core challenges feel stubbornly human.</p>
<p>In this video, experienced practitioners from varied backgrounds reflect on what they see as the biggest obstacles facing the profession today.</p>
<p>This video is part of our thought-leadership series, RWDS Big Questions, where members of our community answer one key question in multiple ways, offering diverse perspectives from across the industry.</p>
<p>Watch the video below to hear insights that span technical, organisational, and personal dimensions. Together, they reveal a set of deeply connected themes and, importantly, opportunities for the field to mature. Scroll down for analysis and practical takeaways.</p>
<hr>
<section id="video-what-are-the-key-challenges-facing-data-scientists-today" class="level2">
<h2 class="anchored" data-anchor-id="video-what-are-the-key-challenges-facing-data-scientists-today">Video: What Are the Key Challenges Facing Data Scientists Today?</h2>
<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/4WgCvhkTCiQ" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2026/01/21/images/challengesinfo1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</section>
<section id="the-patterns-behind-the-problems" class="level2">
<h2 class="anchored" data-anchor-id="the-patterns-behind-the-problems">The Patterns Behind the Problems</h2>
<p>Although the challenges raised span technical, organisational, and personal domains, they are connected by a small number of deeper themes that shape modern data science.</p>
<section id="the-gap-between-capability-and-understanding" class="level3">
<h3 class="anchored" data-anchor-id="the-gap-between-capability-and-understanding">The gap between capability and understanding</h3>
<p>Across multiple perspectives, there is a recurring mismatch between what our tools can do and how well we understand their limitations. From AI systems trained on poor-quality data to models built on artificial or incomplete datasets, technical capability is often outpacing validation, interpretation, and critical scrutiny.</p>
<p>This gap widens further as advanced tools become more accessible to non-specialists, increasing the risk of confident but flawed outputs.</p>
</section>
<section id="speed-amplifies-existing-weaknesses" class="level3">
<h3 class="anchored" data-anchor-id="speed-amplifies-existing-weaknesses">Speed amplifies existing weaknesses</h3>
<p>Pressure to move quickly doesn’t create new problems so much as it magnifies existing ones. Poor data quality, weak validation, and organisational silos become far more consequential when decisions must be made rapidly.</p>
<p>The demand for instant answers leaves little room for reflection, experimentation, or uncertainty — despite these being essential to good data science.</p>
</section>
<section id="data-science-is-constrained-by-its-environment" class="level3">
<h3 class="anchored" data-anchor-id="data-science-is-constrained-by-its-environment">Data science is constrained by its environment</h3>
<p>Many of the challenges raised point away from algorithms and towards the environments in which they are deployed. Organisational readiness, digital infrastructure, and especially incentive structures strongly shape how data science is practised and whether it creates impact.</p>
<p>When teams are rewarded for control rather than collaboration, silos persist, data sharing becomes risky, and even the most robust models struggle to influence decisions.</p>
</section>
<section id="uncertainty-is-a-constant" class="level3">
<h3 class="anchored" data-anchor-id="uncertainty-is-a-constant">Uncertainty is a constant</h3>
<p>The personal experience of data scientists mirrors these structural challenges. In a field defined by rapid change, uncertainty about where to focus, what to learn, and how to stay relevant is common.</p>
<p>This is not just a skills issue, but a signal that data science is still evolving, without a single, stable definition of what “good” looks like.</p>
</section>
</section>
<section id="looking-ahead" class="level2">
<h2 class="anchored" data-anchor-id="looking-ahead">Looking Ahead</h2>
<p>Taken together, these themes suggest that the biggest challenges in data science are not isolated problems to be solved individually. They are interconnected tensions between speed and rigour, access and expertise, innovation and organisational inertia.</p>
<p>Addressing them requires interdisciplinary, systems-level thinking.</p>
<p>Which of these challenges resonates most with your own experience in data science? How can practitioners use these tensions as inflection points to actively shape the field, rather than simply react to it?</p>
<div class="article-btn">
<p><a href="../../../../../applied-insights/index.html">Explore more data science ideas</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-12">
<dl>
<dt>About the author:</dt>
<dd>
<a href="https://www.linkedin.com/in/annieroseflynn/">Annie Flynn</a> is Head of Content at the <a href="https://rss.org.uk">Royal Statistical Society</a>.
</dd>
</dl>
<div class="g-col-12 g-col-md-6">
<p><strong>Copyright and licence</strong> : © 2026 Annie Flynn<br>
<a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"> <img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"> </a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<p><strong>How to cite</strong> :<br>
Flynn, Annie. 2026. “<strong>RWDS Big Questions: What Are the Key Challenges Facing Data Scientists Today?</strong>” <em>Real World Data Science</em>, 2026. <a href="https://realworlddatascience.net/the-pulse/posts/2026/01/21/rwds-big-questions-challenges-today.html">URL</a></p>
</div>
</div>
</div>


</div>
</section>

 ]]></description>
  <category>Big Questions</category>
  <category>Data science</category>
  <category>Practice</category>
  <category>Careers</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2026/01/21/rwds-big-questions-challenges-today.html</guid>
  <pubDate>Wed, 21 Jan 2026 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2026/01/21/images/thumb.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Book Review: AI in Business: Towards the Autonomous Enterprise by Sarah Burnett</title>
  <link>https://realworlddatascience.net/the-pulse/posts/2025/08/04/AI-in-Bus-Review.html</link>
  <description><![CDATA[ 





<center>
As AI continues to reshape industries, business leaders are increasingly seeking guidance on how to harness its potential responsibly and effectively. Keeping abreast of these conversations is crucial for data practitioners seeking to guide their non-technical stakeholders and foster cross-functional collaboration. The recently released second edition of <strong><a href="https://shop.bcs.org/page/detail/ai-in-business/?SF1=work_exact&amp;ST1=AIINBUSINESS2">AI in Business: Towards the Autonomous Enterprise</a></strong> (BCS, 2024) aims to help decision-makers understand the strategic opportunities and challenges of AI in a business context. We asked Ed Rochead, Chair of the <a href="https://alliancefordatascienceprofessionals.com/">Alliance for Data Science Professionals</a>, to reflect on the themes, frameworks and case studies it offers.
</center>
<div id="fig-cde" class="quarto-float quarto-figure quarto-figure-center anchored" data-fig-align="center">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-cde-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://realworlddatascience.net/the-pulse/posts/2025/08/04/Images/bookcover.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-cde-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;1: AI in Business: Towards the Autonomous Enterprise
</figcaption>
</figure>
</div>
<p>The key words in the title of the book are ‘autonomous enterprise’: this is a book about using AI to make enterprises more effective through autonomy, rather than (as the cover picture of a robot might suggest) embedding autonomous robotic systems in a business.</p>
<p>The volume is in three main parts. The first introduces AI, with useful explanations of terms like “generative AI”, and offers some thoughts about how AI might be used for innovation and efficiency in a business. The second gives case studies on how real, recognisable organisations have used AI to automate their operations with some success. The third reflects upon the future, focusing on how an organisation might start the journey towards autonomy, as well as some thoughts on the impact of autonomy on society.</p>
<p>A chapter I found particularly helpful is ‘What You Need to Know About AI’. It explains the relevant terms at the level required by an industry or business leader, rather than giving an in-depth technical account of the concepts, and in this the author succeeds.</p>
<p>At the heart of the book are the case studies, involving organisations that include international companies, an NHS Trust, and a district council. The reader is likely to have some personal experience of these types of organisations, for instance as a patient or resident. This made the case studies even more engaging, at least to me: not only could I put myself in the shoes of a leader in the organisation concerned, I could also empathise with those affected by the system. This familiarity really brought the content to life, making the author’s choice of case studies an inspired one.</p>
<p>The closing section is intriguing. The chapter introducing the first steps an organisation might take towards autonomy is helpful, as it illustrates the stages of autonomy using the example of buying a car (spoiler alert – in the last stage it is manufactured and then drives itself to the consumer’s home!). The second chapter in this section, looking towards the future, gets the reader thinking more broadly about the impact of automation on society. This includes the thorny issues of ethics and impact on things like (un)employment; both areas are covered engagingly and thought-provokingly.</p>
<p>Although impressed with the content of the book, I found the typeface very cramped and small, and joked with friends that 200 pages of material is crammed into 160. This makes the contents a harder read than they might otherwise be.</p>
<p>In one sense, each of the three sections would make a good, if short, book but, read together in the sequence provided, they become more than the sum of their parts, with the first part informative, the second part engaging, and the third thought provoking. <em>AI in Business</em> is an excellent read for an organisational leader seeking inspiration to automate. It gives enough language and concept familiarity to enable such a reader to ask sensible questions of technical experts, and an idea of the art of the possible. Someone with an interest in how AI might change the world around us could also find this a fascinating and informative read.</p>
<div class="article-btn">
<p><a href="../../../../../the-pulse/index.html">Discover more The Pulse</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>About the author</dt>
<dd>
<a href="https://www.linkedin.com/in/prof-edward-r-17768847/">Professor Edward Rochead, M.Math (Hons), PGDip, CMath, FIMA</a> is a mathematician employed by the government, currently leading work on STEM Skills and Data. Ed is chair of the <a href="https://alliancefordatascienceprofessionals.com/">Alliance for Data Science Professionals</a>, a Visiting Professor at Loughborough University, an Honorary Professor at the University of Birmingham, Chartered Mathematician, and Fellow of the IMA and RSA.
</dd>
<dt>Copyright and licence</dt>
<dd>
© 2025 Royal Statistical Society
</dd>
<dd>
Thumbnail image by <a href="https://www.bcs.org/">BCS</a> <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.
</dd>
</dl>
</div>
<div class="g-col-12 g-col-md-6">

</div>
</div>
</div>



 ]]></description>
  <category>AI</category>
  <category>Data Science</category>
  <category>Machine learning</category>
  <category>Collaboration</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2025/08/04/AI-in-Bus-Review.html</guid>
  <pubDate>Mon, 04 Aug 2025 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2025/08/04/Images/bookcover2.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>From medical history to medical foresight: why the NHS needs its own foundation AI model for prevention</title>
  <link>https://realworlddatascience.net/the-pulse/posts/2025/07/28/NHS-foundation-AI.html</link>
  <description><![CDATA[ 





<p>The point of this article is to persuade you, in a mildly entertaining way, that developing a sovereign foundation AI model should be a priority for the NHS, professional bodies, and patients, but that we need to get the research right.</p>
<p><strong>Risk is personal</strong></p>
<p>How do we move from treating disease to preventing disease? The traditional approach has been to publicise well-evidenced public health interventions: don’t smoke, drink less, eat vegetables, exercise, vaccinate, wear sunscreen. This is all very good advice at the population level, but for the individual it’s hard to know what to worry about and what to prioritise. I, being a clumsy man with bad ankles and a lack of spatial awareness, am at risk of going to A&amp;E with (another) concussion. You will be different.</p>
<p><strong>A little bit of history</strong></p>
<p>Individualised risk models in healthcare are not new. Traditional statistical approaches have used tabular data to predict healthcare events and have done a good job. These models are converted into questionnaires that clinicians can use to make decisions based on your risk. If you have had the NHS health check, a clinician will have measured your blood pressure, cholesterol, height and weight, and asked a few questions about your medical history. They will then feed this into a model, and the output is your risk of having a heart attack or stroke over the next ten years (1). There are also automated approaches built into the systems your GP uses that help stratify the population based on individual risk of things like frailty (2).</p>
<p>These kinds of models are usually based on a snapshot of data and require bespoke data pipelines and engineering to massage the data into the right shape for the model: wonderful news for data scientists and statisticians, as it leads to a proliferation of finely tuned models which can keep us in gainful employment for many years. However, each one has significant costs to develop, test, validate, deploy and integrate into clinical practice.</p>
<p>Another issue with these traditional models is that they squash a medical history into a single row of data for each patient, losing the chronology of health. Intuitively, we would expect that the sequence of events matters in predicting healthcare outcomes and traditional approaches struggle to capture this.</p>
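<p>To make the contrast concrete, here is a toy sketch (all codes and dates are invented) of the same medical history represented as a flattened snapshot row versus an ordered event sequence:</p>

```python
from datetime import date

# A toy medical history as timestamped, coded events
# (codes and dates invented for illustration).
history = [
    (date(2015, 3, 1), "smoking_recorded"),
    (date(2018, 6, 12), "type2_diabetes_dx"),
    (date(2021, 9, 30), "statin_prescribed"),
]

# Traditional tabular modelling flattens this into one row,
# keeping aggregates but discarding the order of events.
snapshot = {
    "n_events": len(history),
    "has_diabetes": any(code == "type2_diabetes_dx" for _, code in history),
    "years_since_last_event": (date(2025, 1, 1) - history[-1][0]).days / 365.25,
}

# Sequence models instead consume the ordered codes directly,
# so "diabetes then statin" differs from "statin then diabetes".
sequence = [code for _, code in sorted(history)]
print(snapshot)
print(sequence)
```

<p>The snapshot answers "what has happened, in aggregate"; the sequence preserves "in what order", which is exactly the information a transformer-style model can exploit.</p>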
<p><strong>Using a sequence of events to predict a sequence of events</strong></p>
<p>Sequences of events are easier for data engineers too. It’s much simpler to join together all the data into a sequence than perform a series of complex aggregations and transformations for every model. The simpler the data engineering needed to create the inputs the easier it is to scale as you are making fewer assumptions about the data.</p>
<p>So, if they are easier to engineer, and they capture more information, why are they not the standard way of predicting health outcomes? Because modelling sequences is harder than modelling a row of data. As the model sees more of a sequence it has to hold that memory somewhere so that it can accumulate the appropriate information. Models that could do this started appearing in the machine learning literature in the early 1990s (3), but for a long time we had neither the data, the computing power, nor quite the right kind of algorithms to make them useful. Today they have become feasible in healthcare due to the rise of electronic healthcare records, standardised codes for classifying events, and the rise of the transformer model. Transformer models combine the ability to hold an internal “memory” of the sequence with the capacity to pay attention to different aspects of the sequence, which basically makes them magic.</p>
<p>These models have demonstrated state of the art accuracy in predicting future events using electronic patient histories. Examples for those interested in reading more include BEHRT (4), Med-Bert (5), TransformerEHR (6) and the more recent generative transformer model ETHOS (7). These can be used for a range of healthcare prediction tasks whilst delivering state of the art predictive accuracy, again, magic.</p>
<p>A recent preprint (8) from Microsoft has also demonstrated that these EHR models act in a similar way to the large language models like those backing ChatGPT: their performance scales predictably with processing power, data and the size of the model. This means that more data will probably lead to a better model, and we can optimise model performance for a given computational budget.</p>
<p><strong>So what?</strong></p>
<p>Why should you care about this? If we can take these architectures and train them on data at the scale of the NHS, then each individual patient could have a relatively accurate prediction of their most likely next healthcare events (9). It would be your medical history projected forward, providing a narrative that is easier to understand than a page of risk scores. It’s your potential medical future. This could help with changing behaviour to reduce future risk, something we all struggle with. I think of it like the medical version of the ghost of Christmas future, but using a chain of events rather than clinking ghost chains.</p>
<p>We are already seeing heavy usage of publicly available large language models for healthcare. 10% of a representative sample of Australians used ChatGPT for medical advice, rising to 26% of 25-34 year olds (10); I assume the UK is similar. It seems that the public is much more ready than the health system to use these models, and regulation is struggling to keep up, for good reason: they may not actually help.</p>
<p><strong>The underwhelming evidence</strong></p>
<p>As of August 2024 there were 950 AI models approved by the FDA, with a significant proportion of those for clinical decision support, but only 2.4% of these are supported by randomised controlled trials (11).</p>
<p>This is important, as what works on a machine learning researcher’s infrastructure may not work in a clinical setting. In 2018, a comprehensive health economic evaluation of a risk prediction model for identifying people at risk of hospital admission found that those in the treatment arm had a higher healthcare cost and there was no significant impact on the number of people being admitted to hospital, despite accurate predictions (12). Some prediction models even cause harmful self-fulfilling prophecies when used for decision making (the paper is well worth a read) (13).</p>
<p><strong>The prize</strong></p>
<p>The UK government is clear about the ambition to be an “AI maker” not an “AI taker”. Given the expected improvement in accuracy from scaling these EHR models, there is an opportunity for the UK to leverage what should be one of its greatest data assets (decades of longitudinal electronic healthcare records from cradle to grave) and create a sovereign foundational model that supports patient care. These are being developed now in the US and elsewhere. A meta-analysis in 2023 found over 80 foundational healthcare models; there are many more today, and there is concern that at some point it will be cheaper for the NHS to bring one in and pay for it than to train its own.</p>
<p><strong>Foresight</strong></p>
<p>Fortunately we have made some progress in the UK with NHS data. Foresight (14), a transformer model developed in London on data from 1.4 million patients, has demonstrated impressive results. This model has been taken forward for covid research, to see if the same approach can better predict disease/COVID-19 onset, hospitalisation and death, for all individuals, across all backgrounds and diseases, using national data made available during the pandemic for research specifically on covid. This is being done through the British Heart Foundation’s collaboration with NHS England’s secure data environment (15).</p>
<p>However, just because we can do this, it does not mean that we should. Researchers need to be careful to stay within the bounds of their project and make extraordinary efforts to engage with the public. We have to ensure that our data is not being exploited inappropriately for commercial gain. The Royal College of General Practitioners has raised concerns that this model goes beyond what they agreed to. Professor Kamila Hawthorne, Chair of the Royal College of GPs, said: “As data controllers, GPs take the management of their patients’ medical data very seriously, and we want to be sure data isn’t being used beyond its scope, in this case to train an AI programme.” The project has been paused for the time being, despite having been approved and specifically targeted at covid research.</p>
<p>The best model for predicting outcomes from covid or the risk factors involved in covid is likely to be a population scale generative transformer model. This research will determine whether that hypothesis is true and whether this kind of data could provide more accurate predictions for patients. The NHS data and the model are kept inside a secure data environment with personal identifiers stripped out. No patient details are passed to researchers and no data or code leaves that environment without explicit permission. This research seems like something we should do.</p>
<p>Despite the potential of AI-assisted clinicians for differential diagnosis (with recent evidence that they perform better than both clinicians alone and clinicians using search (16)), and the attractiveness of having your medical history and your medical future in your pocket, we are a way off this reality. The gap between research and demonstrating the cost-effectiveness of AI solutions in the real world is significant, but all the component parts needed to close it exist: the data, the models, the research capability and the political will.</p>
<p>We will get there. Foundational models in healthcare are no longer a theoretical possibility, but an imminent reality. The UK has a rare opportunity to lead, not follow, by building a sovereign AI model trained on NHS data to accelerate the transition from treating disease to preventing disease. To get there, we must confront hard questions about patient engagement and real-world benefit. But to stop research based solely on the sophistication of the method is to misunderstand the moment. I think patients expect us to do better.</p>
<div class="keyline">
<hr>
</div>
<p><strong>References</strong></p>
<ol type="1">
<li><p>Hippisley-Cox, J., Coupland, C.A.C., Bafadhel, M. et al.&nbsp;Development and validation of a new algorithm for improved cardiovascular risk prediction. Nat Med 30, 1440–1447 (2024). https://doi.org/10.1038/s41591-024-02905-y</p></li>
<li><p>Clegg A, Bates C, Young J, Ryan R, Nichols L, Ann Teale E, Mohammed MA, Parry J, Marshall T. Development and validation of an electronic frailty index using routine primary care electronic health record data. Age Ageing, May;45(3):353-60, (2016) https://doi.org/10.1093/ageing/afw039.</p></li>
<li><p>Jeffrey L. Elman. Finding structure in time. Cognitive Science, Volume 14, 179-211 (1990). https://doi.org/10.1016/0364-0213(90)90002-E</p></li>
<li><p>Li, Y., Rao, S., Solares, J.R.A. et al.&nbsp;BEHRT: Transformer for Electronic Health Records. Sci Rep 10, 7155 (2020). https://doi.org/10.1038/s41598-020-62922-y</p></li>
<li><p>Rasmy, L., Xiang, Y., Xie, Z. et al.&nbsp;Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. npj Digit. Med. 4, 86 (2021). https://doi.org/10.1038/s41746-021-00455-y</p></li>
<li><p>Yang, Z., Mitra, A., Liu, W. et al.&nbsp;TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records. Nat Commun 14, 7857 (2023). https://doi.org/10.1038/s41467-023-43715-z</p></li>
<li><p>Renc, P., Jia, Y., Samir, A.E. et al.&nbsp;Zero shot health trajectory prediction using transformer. npj Digit. Med. 7, 256 (2024). https://doi.org/10.1038/s41746-024-01235-0</p></li>
<li><p>Grout R, Gupta R, Bryant R, Elmahgoub MA, Li Y, Irfanullah K, Patel RF, Fawkes J, Inness C. Predicting disease onset from electronic health records for population health management: a scalable and explainable Deep Learning approach. Front Artif Intell. 2024 Jan 8;6:1287541. doi: 10.3389/frai.2023.1287541.</p></li>
<li><p>Sheng Zhang et al.&nbsp;Exploring Scaling Laws for EHR Foundation Models (2025) arXiv:2505.22964v1</p></li>
<li><p>Julie Ayre, Erin Cvejic and Kirsten J McCaffery. Use of ChatGPT to obtain health information in Australia, 2024: insights from a nationally representative survey Med J Aust (2025). doi: 10.5694/mja2.52598</p></li>
<li><p>Windecker D, Baj G, Shiri I, Kazaj PM, Kaesmacher J, Gräni C, Siontis GCM. Generalizability of FDA-Approved AI-Enabled Medical Devices for Clinical Use. JAMA Netw Open. 2025 Apr 1;8(4):e258052. doi: 10.1001</p></li>
<li><p>Snooks H et al.&nbsp;Predictive risk stratification model: a randomised stepped-wedge trial in primary care (PRISMATIC). Southampton (UK): NIHR Journals Library; 2018 Jan.&nbsp;PMID: 29356470.</p></li>
<li><p>van Amsterdam WAC, van Geloven N, Krijthe JH, Ranganath R, Cinà G. When accurate prediction models yield harmful self-fulfilling prophecies. Patterns (N Y). 2025 Apr 11;6(4):101229. doi: 10.1016/j.patter.2025.101229.</p></li>
<li><p>Kraljevic, Zeljko et al.&nbsp;Foresight—a generative pretrained transformer for modelling of patient timelines using electronic health records: a retrospective modelling study. The Lancet Digital Health, Volume 6, Issue 4, e281-e290.</p></li>
<li><p>CVD-COVID-UK/COVID-IMPACT: Projects CCU078: Foresight: a generative AI model of patient trajectories across the COVID-19 pandemic https://bhfdatasciencecentre.org/projects/ccu078/</p></li>
<li><p>McDuff, D., Schaekermann, M., Tu, T. et al.&nbsp;Towards accurate differential diagnosis with large language models. Nature 642, 451–457 (2025). https://doi.org/10.1038/s41586-025-08869-4</p></li>
</ol>
<div class="article-btn">
<p><a href="../../../../../the-pulse/index.html">Discover more from The Pulse</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>About the author</dt>
<dd>
<a href="https://www.linkedin.com/in/will-browne-391b1930/">Will Browne</a> is co-founder of healthcare technology company <a href="https://www.emrys.health/">Emrys Health</a>, where he works on the development of infrastructure for transformative, equitable and accessible healthcare. He is Events Secretary of the <a href="https://rss.org.uk/membership/rss-groups-and-committees/sections/data-science-section/">RSS Data Science and AI section</a> and a member of the <a href="https://rss.org.uk/policy-campaigns/policy-groups/ai-task-force/">RSS AI Taskforce</a>.
</dd>
<dt>Copyright and licence</dt>
<dd>
© 2025 Royal Statistical Society
</dd>
<dd>
Thumbnail image by <a href="https://unsplash.com/@tugcegungormezler?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Tugce Gungormezler</a> on Unsplash. <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.
</dd>
</dl>
</div>
<div class="g-col-12 g-col-md-6">

</div>
</div>
</div>



 ]]></description>
  <category>AI</category>
  <category>Data Science</category>
  <category>Machine learning</category>
  <category>Deep learning</category>
  <category>Econometrics</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2025/07/28/NHS-foundation-AI.html</guid>
  <pubDate>Mon, 28 Jul 2025 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2025/07/28/Images/NHS.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Call for Submissions</title>
  <dc:creator>Editorial Board</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/08/relaunch-CFS.html</link>
  <description><![CDATA[ 





<p>Get ready to engage with Real World Data Science as we unveil an exciting editorial refresh! We’re thrilled to announce that submissions are now open across our four dynamic sections: The Pulse, Applied Insights, Foundations &amp; Frontiers, and People &amp; Paths. Join us as we redefine the conversation in data science with fresh perspectives and insights. Real World Data Science is relaunching to meet the pace and complexity of today’s data-driven world in real time, with the RSS’s trademark steadying presence. We will be publishing high-quality case studies, tutorials and think-pieces that bridge the gap between rigorous analysis and real-time relevance, and that speak directly to the latest events and emerging trends.</p>
<p>All submissions will be peer-reviewed by members of the Real World Data Science <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2022/10/18/meet-the-team.html">Editorial Board</a>.</p>
<section id="our-audience" class="level2">
<h2 class="anchored" data-anchor-id="our-audience">Our Audience:</h2>
<p>People working in data science who are looking for practical insights, methodological rigour and thought leadership to inform their work and decision-making.</p>
</section>
<section id="our-voice" class="level2">
<h2 class="anchored" data-anchor-id="our-voice">Our Voice:</h2>
<p>Authoritative, Trustworthy, Cutting Edge</p>
</section>
<section id="our-editorial-sections" class="level2">
<h2 class="anchored" data-anchor-id="our-editorial-sections">Our Editorial Sections:</h2>
<p>Real World Data Science has four editorial sections. Please read through and consider where your piece would fit best. Each piece we publish needs to be tailored towards the focus of one of these sections.</p>
<p><a href="https://realworlddatascience.net/the-pulse/">THE PULSE</a></p>
<p>News, updates and real time commentary.</p>
<p>Purpose: To respond to current events, trends and debates in the data science world with rigour, insight and relevance.</p>
<p>Content Types: Articles that speak directly to current events/trends/launches</p>
<p>Example Call To Action: Invite readers to share your commentary with their networks as a trusted voice in the space. Invite engagement, discussion and debate over the topics.</p>
<p><a href="https://realworlddatascience.net/applied-insights/">APPLIED INSIGHTS</a><br>
How data science is used to solve real-world problems in business, public policy and beyond.</p>
<p>Purpose: To showcase real-world applications of data science, including hands-on tutorials, project walk-throughs, and case studies from industry, academia, or public service.</p>
<p>Content Types:</p>
<ul>
<li>High-quality step-by-step tutorials with code<br>
</li>
<li>Case studies detailing a problem, approach, and outcome<br>
</li>
<li>Lessons learned from real-world deployments</li>
</ul>
<p>Example Call To Action: Readers should walk away with something to try.</p>
<p><a href="https://realworlddatascience.net/foundation-frontiers/">FOUNDATIONS &amp; FRONTIERS</a><br>
The ideas behind the impact: the concepts, tools and methods that make data science possible.</p>
<p>Purpose: To deepen understanding of the theoretical and ethical foundations of data science, and to spotlight thought leadership and emerging ideas.</p>
<p>Content Types:</p>
<ul>
<li>Think-piece style articles with an engaging angle on methodology, ethics and standards<br>
</li>
<li>Interviews with thought-leaders<br>
</li>
<li><a href="https://realworlddatascience.net/foundation-frontiers/datasciencebites/">Data Science Bites</a> - our handy summaries/explainers of academic papers</li>
</ul>
<p>Example Call To Action: Invite discussion and engagement – pose questions and challenges to the reader.</p>
<p><a href="https://realworlddatascience.net/people-paths/">PEOPLE &amp; PATHS</a><br>
Strategic reflections on careers, leadership and professional evolution in data science.</p>
<p>Purpose: To explore the evolving nature of data science careers through the lens of experience, leadership, and long-term impact. This section highlights how professionals shape and are shaped by the field—through roles, decisions, and philosophies.</p>
<p>Content Types:</p>
<ul>
<li>Profiles of/interviews with senior professionals reflecting on career philosophy and leadership<br>
</li>
<li>Roundtables with experts on hiring, mentoring, or organisational design</li>
<li>Commentary on career-defining trends, such as the rise of AI governance or the shift toward interdisciplinary teams</li>
</ul>
<p>Example Call To Action: Encourage readers to share our strategic insights with their community.</p>
</section>
<section id="use-of-ai-in-submissions" class="level2">
<h2 class="anchored" data-anchor-id="use-of-ai-in-submissions">Use of AI in Submissions</h2>
<p>We recognise that LLMs and other generative AI tools are increasingly part of the data science workflow, from code generation and data cleaning to drafting documentation and shaping analysis. We welcome a transparent approach in submissions that have made use of these tools, and ask that authors include a declaration outlining where and how AI was used in the development of their submission. This helps us maintain transparency, uphold standards of reproducibility, and better understand the evolving role of AI in real-world data science practice.</p>
<p>To make your submission, please review our <a href="https://realworlddatascience.net/contributor-docs/contributor-guidelines.html">contributor guidelines</a> and email us at rwds@rss.org.uk.</p>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2025 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>. Thumbnail photo by <a href="https://unsplash.com/@johnsonvr?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Virgina Johnson</a> on <a href="https://unsplash.com/photos/turned-on-red-open-neon-sigange-QmNnZj_Ok-M?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Unsplash</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Real World Data Science Editorial Board. 2025. “Call for Submissions” Real World Data Science, July 7, 2025. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/07/relaunch-CFS.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>


</section>

 ]]></description>
  <category>Call for contributions</category>
  <category>Updates</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/08/relaunch-CFS.html</guid>
  <pubDate>Mon, 07 Jul 2025 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/08/Images/open.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>We’re Back: Real World Data Science Relaunches</title>
  <dc:creator>Editorial Board</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/07/editors-relaunch.html</link>
  <description><![CDATA[ 





<p>You may have noticed our brief hiatus. Since publishing our series on AI - which covered <a href="https://realworlddatascience.net/foundation-frontiers/posts/2024/04/29/gen-ai-human-intel.html">the quest for human-level intelligence</a>, <a href="https://realworlddatascience.net/foundation-frontiers/posts/2024/05/07/ai-series-3.html">data-set risks</a>, <a href="https://realworlddatascience.net/foundation-frontiers/posts/2024/05/14/ai-series-2.html">ethical considerations</a> and much more - the ongoing deluge of content and commentary on AI in the wider world has continued to accelerate. This year has seen a surge in developments that sit at the intersection of data science and AI: from the growing use of synthetic data to overcome privacy and bias challenges, to the rise of multi-modal models that demand increasingly sophisticated data engineering and integration techniques. The emergence of Agentic AI has sparked new conversations around data provenance, model interpretability, and the reproducibility crisis in machine learning. Meanwhile, the meteoric rise of open-source disruptor DeepSeek triggered stock-market ruptures and industry panic, before <a href="https://www.theguardian.com/technology/2025/jan/27/deepseek-cyberattack-ai">cyber-attacks</a>, <a href="https://www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak">data leaks</a> and a <a href="https://gizmodo.com/deepseek-gets-an-f-in-safety-from-researchers-2000558645?utm_source=pocket_shared">failed safety test</a> complicated its standing - a parable for the volatility of the space, where data governance failures and safety oversights can rapidly derail innovation. 
At the same time, governments worldwide are <a href="https://www.aa.com.tr/en/europe/macron-announces-112b-in-ai-investment-over-coming-years/3477218?utm_source=pocket_saves">investing heavily</a> in <a href="https://assets.publishing.service.gov.uk/media/67851771f0528401055d2329/ai_opportunities_action_plan.pdf?utm_source=substack&amp;utm_medium=email">national data infrastructure</a> and advanced analytics capabilities, while grappling with how best to regulate a field that is evolving faster than policy can keep up.</p>
<p>The world of data science has been a dizzying place over the last few months, so we took a moment to pause and take stock. In the face of rapid change and constant noise, it felt important to reflect with intention on the role Real World Data Science can and should play in this evolving landscape. Now we’re back - ready to rejoin the conversation with renewed clarity and purpose.</p>
<p>As a project from the <a href="https://rss.org.uk/">Royal Statistical Society</a>, in partnership with the <a href="https://www.amstat.org/">American Statistical Association</a>, we are backed by organisations with nearly two centuries of history in championing sound evidence, rigorous methodology and ethical data use. These values form the foundation of our next phase - distilled into the essential pillars: data, evidence and decision. With an esteemed editorial board representing the cutting-edge of industry and academia, and an international network of practitioners working at the coalface of modern data science, we are uniquely placed to navigate the pace and complexity of today’s data-driven world. Real World Data Science will meet that world in real time with the RSS’s trademark steadying presence, bridging the gap between rigorous analysis and real-time relevance.</p>
<p>We are now returning with a slightly refreshed site, encompassing four editorial sections:<br>
<a href="https://realworlddatascience.net/the-pulse/">The Pulse</a> - covering news, updates and real-time commentary<br>
<a href="https://realworlddatascience.net/applied-insights/">Applied Insights</a> - exploring how data science is used to solve real-world problems in business, public policy and beyond<br>
<a href="https://realworlddatascience.net/foundation-frontiers/">Foundations &amp; Frontiers</a> - unpicking the ideas behind the impact: the concepts, tools and methods that make data science possible<br>
<a href="https://realworlddatascience.net/people-paths/">People &amp; Paths</a> - offering strategic reflections on careers, leadership and professional evolution in data science.</p>
<p>You can find the full details of these sections, plus guidance around submitting to them, in our new <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/08/relaunch-CFS.html">Call for Submissions</a>.</p>
<p>Despite these updates, we remain committed to providing content that is useful and relevant for practising data scientists seeking to learn good practices in the field and new potential applications.</p>
<p>The choices we make now will shape how data and AI serve society for years to come. If you’re working on the front lines of these changes, whether through research, practice, or critical reflection, we invite you to share your insights and help us build a future for data science that is thoughtful, transparent and grounded in real world understanding.</p>
<p><a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2022/10/18/meet-the-team.html">Meet the Team</a></p>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2025 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Real World Data Science Editorial Board. 2025. “We’re Back: Real World Data Science Relaunches” Real World Data Science, July 7, 2025. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/07/editors-relaunch.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>



 ]]></description>
  <category>Call for contributions</category>
  <category>Updates</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/07/editors-relaunch.html</guid>
  <pubDate>Mon, 07 Jul 2025 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2025/07/07/images/team.png" medium="image" type="image/png" height="77" width="144"/>
</item>
<item>
  <title>New open access journal - RSS: Data Science and Artificial Intelligence</title>
  <link>https://realworlddatascience.net/the-pulse/posts/2024/08/01/RWDS-journal.html</link>
  <description><![CDATA[ 





<p><img src="https://realworlddatascience.net/the-pulse/posts/2024/08/01/images/RSS-DSAI-Logo-blue.png" class="img-fluid" style="width:80.0%" alt="RSS Data Science and AI logo"><br>
</p>
<p>The Royal Statistical Society (RSS) is launching a new fully open access journal, <em>RSS: Data Science and Artificial Intelligence</em>. Created in recognition of the growing importance of data science and artificial intelligence in science and society, the new journal’s remit spans the breadth of data science; you can <a href="https://academic.oup.com/rssdat/pages/general-instructions">submit articles</a> covering disciplines including statistics, machine learning, deep learning, econometrics, bioinformatics, engineering and computational social science.</p>
<p>As well as three primary paper types - method papers, applications papers and behind-the-scenes papers - RSS: Data Science and Artificial Intelligence will publish editorials, op-eds, interviews, and reviews/perspectives in line with <a href="https://academic.oup.com/rssdat/pages/about">its goal to become a primary destination for data scientists</a>.</p>
<p>Published by Oxford University Press, this new journal is the first addition to the RSS family of world-class statistics journals since 1952.</p>
<p><a href="https://academic.oup.com/rssdat/pages/why-publish">Learn more</a> about why <em>RSS: Data Science and Artificial Intelligence</em> is the ideal platform for showcasing your research.</p>
<div class="keyline">
<hr>
</div>
<section id="meet-the-journals-editors-in-chief-and-editorial-board" class="level3">
<h3 class="anchored" data-anchor-id="meet-the-journals-editors-in-chief-and-editorial-board">Meet the journal’s editors-in-chief and editorial board</h3>
<p>&nbsp;</p>
<div class="grid">
<div class="g-col-12 g-col-md-4">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2024/08/01/images/Mukherjee_Sach.jpg" class="img-fluid" alt="Photo of Sach Mukherjee, Director of Research in Machine Learning for Biomedicine at the MRC"></p>
<p><strong>Sach Mukherjee</strong> is Director of Research in Machine Learning for Biomedicine at the Medical Research Council (MRC) Biostatistics Unit, University of Cambridge, and Head of Statistics and Machine Learning at the German Center for Neurodegenerative Diseases.</p>
</div>
<div class="g-col-12 g-col-md-4">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2024/08/01/images/silvia-chiappa.jpeg" class="img-fluid" alt="Silvia Chiappa, Research Scientist at Google DeepMind"></p>
<p><strong>Silvia Chiappa</strong> is a Research Scientist at <a href="https://deepmind.com/">Google DeepMind</a> London, where she leads the Causal Intelligence team, and Honorary Professor at the <a href="https://www.ucl.ac.uk/computer-science/">Computer Science Department</a> of University College London.</p>
</div>
<div class="g-col-12 g-col-md-4">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2024/08/01/images/neil-lawrence.png" class="img-fluid" alt="Neil Lawrence, DeepMind Professor of Machine Learning at the University of Cambridge"></p>
<p><strong>Neil Lawrence</strong> is the inaugural DeepMind Professor of Machine Learning at the University of Cambridge. He has been working on machine learning models for over 20 years. He recently returned to academia after three years as Director of Machine Learning at Amazon.</p>
</div>
</div>
<p><br>
</p>
<p><strong>View the full editorial board here:</strong> <a href="https://academic.oup.com/rssdat/pages/editorial-board">Editorial Board | RSS Data Science | Oxford Academic (oup.com)</a></p>
<div class="article-btn">
<p><a href="../../../../../the-pulse/index.html">Discover more from The Pulse</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2024 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">

</div>
</div>
</div>


</section>

 ]]></description>
  <category>AI</category>
  <category>Data Science</category>
  <category>Machine learning</category>
  <category>Deep learning</category>
  <category>Econometrics</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2024/08/01/RWDS-journal.html</guid>
  <pubDate>Thu, 01 Aug 2024 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2024/08/01/images/RSS-DS-AI-cover.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Editor’s note: Not saying goodbye, just saying…</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/03/editors-note.html</link>
  <description><![CDATA[ 





<p>It’s not easy to leave a brilliant group of people you’ve worked with for almost a decade, but in a month’s time I’ll be moving on from the Royal Statistical Society (RSS).</p>
<p>When I joined RSS in June 2014 I was looking for new challenges. I wanted to find out more about the ways statistics and data are used to understand and solve problems and inform decisions in science, business and industry, public policy, health… I could go on! Working for the RSS certainly delivered on that front: as editor of <a href="https://significancemagazine.com/">Significance</a> for eight years and of Real World Data Science more recently, I have had many opportunities to learn.</p>
<p>Pretty much every day of my working life for the past nine years, eight months or so involved speaking with expert statisticians and data scientists or reading about their work. When there were things I didn’t understand, they were always happy to explain. When I shared my ideas for how to make their articles clearer or more readable, they took the time to listen. Together, we worked to create accessible, engaging stories about statistics and data. There have been hundreds of these collaborations over the years – too many to namecheck individually – but I have enjoyed them all, and I’ve learned something from each of them.</p>
<p>Before I head off to pursue a new set of challenges and learning opportunities, I want to say a big thank you to all the RSS staff and members, past and present, that I’ve been lucky to call my colleagues. Thank you also to the staff and members of the American Statistical Association who have been valued partners on Significance over the years and now RWDS too. It’s been a privilege to work with you all.</p>
<p>The chance to launch RWDS has been a particular highlight of my time at RSS, and I am grateful to have had the support and input of The Alan Turing Institute and many of its wonderful staff and researchers on this project. I’m excited to see the site continue to grow and develop into a valuable resource for the data science community, and I look forward to reading an upcoming series of articles that will explore the statistical and data science perspectives on AI – stay tuned for more on this soon.</p>
<p>Statistics and data will continue to be a big part of my life, so this isn’t “goodbye.” Instead, I’ll just say, let’s keep in touch – and thank you for reading!</p>
<div class="article-btn">
<p><a href="../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2024 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>. Thumbnail photo by <a href="https://unsplash.com/@peet818?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Pete Pedroza</a> on <a href="https://unsplash.com/photos/thank-you-text-VyC0YSFRDTU?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Unsplash</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2024. “Editor’s note: Not saying goodbye, just saying…” Real World Data Science, March 6, 2024. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/03/06/editors-note.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>



 ]]></description>
  <category>People</category>
  <category>Updates</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/03/editors-note.html</guid>
  <pubDate>Wed, 06 Mar 2024 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/03/images/thank-you.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>£10m for UK regulators to ‘jumpstart’ AI capabilities, as government commits to white paper approach</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/02/08/llms-whitepaper-response.html</link>
  <description><![CDATA[ 





<p>The UK government this week announced a £10 million investment to “jumpstart regulators’ AI capabilities” as part of its commitment to a “pro-innovation approach to AI regulation.” But will this be sufficient to answer criticisms that it has so far been “too slow” to give regulators the tools they need to police the growing usage of AI?</p>
<p>It was March last year when a Department for Science, Innovation and Technology (DSIT) white paper first set out the government’s <a href="https://www.gov.uk/government/publications/ai-regulation-a-pro-innovation-approach/white-paper">principles- and context-based approach to regulating artificial intelligence</a>. This proposed to focus regulatory attention on “the context in which AI is deployed” rather than target specific technologies. Under this model, existing regulators, including the Information Commissioner’s Office, Ofcom, and the Competition and Markets Authority, would be responsible for ensuring that technologies deployed within their domains adhered to established rules – e.g., data protection regulation – and a common set of principles:</p>
<ul>
<li>Safety, security and robustness.</li>
<li>Appropriate transparency and explainability.</li>
<li>Fairness.</li>
<li>Accountability and governance.</li>
<li>Contestability and redress.</li>
</ul>
<p>The approach was broadly well received, as was clear from a debate at techUK’s <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/08/digital-ethics-summit.html">Digital Ethics Summit</a> last December. However, concerns were expressed about whether regulators would be funded sufficiently to meet the expectations set out in the March white paper. Also, the Royal Statistical Society, in <a href="https://rss.org.uk/RSS/media/File-library/Policy/2023/RSS-AI-white-paper-response-v2-2.pdf">its response to the white paper</a>, worried that “splitting responsibilities for regulating the use of AI between existing regulators does not meet the scale of the challenge,” and that “central leadership is required to give a clear, coherent and easily communicable framework that can be applied to all sectors.”</p>
<p>While the DSIT white paper proposed that a range of “central functions” be created to support regulators, <a href="https://committees.parliament.uk/committee/170/communications-and-digital-committee/news/199728/uk-will-miss-ai-goldrush-unless-government-adopts-a-more-positive-vision/">evidence presented to a House of Lords inquiry</a> last November suggested that regulators “did not appear to know what was happening” with these mooted teams and were “keen to see progress” on this front.</p>
<p>In reporting the outcomes of its inquiry last week, the House of Lords Communications and Digital Committee concluded that government was being “too slow” to give regulators the tools required to meet the objectives set out in the white paper, and that “speedier resourcing of government‑led central support teams is needed.”</p>
<p>“Relying on existing regulators to ensure good outcomes from AI will only work if they are properly resourced and empowered,” the committee said.</p>
<p>The £10 million funding for regulators announced this week is therefore likely to be welcomed. Money is earmarked to “help regulators develop cutting-edge research and practical tools to monitor and address risks and opportunities in their sectors, from telecoms and healthcare to finance and education,” according to <a href="https://www.gov.uk/government/news/uk-signals-step-change-for-regulators-to-strengthen-ai-leadership">a DSIT press release</a>. Speaking on February 6 at <a href="https://parliamentlive.tv/event/index/68ecee17-2896-4002-8736-1608229db364?in=15:27:41">a hearing of the Lords Communications and Digital Committee</a>, Michelle Donelan, Secretary of State for Science, Innovation and Technology, said that the government would “stay on top” of what regulators need to be able to fulfil their responsibilities for regulating the use of AI in their sectors.</p>
<section id="consultation-response" class="level2">
<h2 class="anchored" data-anchor-id="consultation-response">Consultation response</h2>
<p>News of the funding for regulators came as part of <a href="https://www.gov.uk/government/consultations/ai-regulation-a-pro-innovation-approach-policy-proposals/outcome/a-pro-innovation-approach-to-ai-regulation-government-response">a long-awaited response by the government to the consultation on its AI regulation white paper</a>. The response essentially confirmed that the government was proceeding with its principles- and context-based approach to regulating AI, having received “strong support from stakeholders across society.”</p>
<p>This approach is right for today, the government said, “as it allows us to keep pace with rapid and uncertain advances in AI.” However, it acknowledged that “the challenges posed by AI technologies will ultimately require legislative action in every country once understanding of risk has matured.”</p>
<p>“Highly capable general-purpose AI systems” would, for example, present a particular challenge to the government’s current approach. It explained: “Even though some regulators can enforce existing laws against the developers of the most capable general-purpose systems within their current remits, the wide range of potential uses means that general-purpose systems do not currently fit neatly within the remit of any one regulator, potentially leaving risks without effective mitigations.”</p>
<p>As a next step in delivering on the white paper approach, the government is asking key regulators to publish an update on their strategic approach to AI by the end of April. This was welcomed by Royal Statistical Society (RSS) president Andrew Garrett, who said:</p>
<blockquote class="blockquote">
<p>“Urgency is certainly warranted, and the directive for key regulators to disclose their approach in the coming months is a positive development. Ensuring consistency and coherence not only among key regulators but also those who follow is crucial.”</p>
</blockquote>
<p>Garrett also <a href="https://realworlddatascience.net/foundation-frontiers/interviews/posts/2023/10/25/evaluating-ai.html">reiterated the need for government to engage with statisticians and data scientists</a>, particularly through its <a href="https://realworlddatascience.net/the-pulse/posts/2023/12/06/ai-fringe.html">new AI Safety Institute</a> (AISI). In the white paper consultation response, AISI is billed as being “fundamental to informing the UK’s regulatory framework”: it will “advance the world’s knowledge of AI safety by carefully examining, evaluating, and testing new frontier AI systems” and will also “research new techniques for understanding and mitigating AI risk.” Garrett said:</p>
<blockquote class="blockquote">
<p>“As always, fostering diversity of representation within government and regulatory bodies remains paramount; it cannot solely rely on input from major tech companies. It is especially important that the AI Safety Institute engages with a diverse array of voices, including statisticians and data scientists who play a pivotal role in both the development of AI systems and novel evaluation methodologies.”</p>
</blockquote>
</section>
<section id="risks-and-opportunities" class="level2">
<h2 class="anchored" data-anchor-id="risks-and-opportunities">Risks and opportunities</h2>
<p>Calls for a “diversity of representation within government and regulatory bodies” certainly chime with a warning bell sounded by the Lords Communications and Digital Committee last week, in the February 2 release of its <a href="https://committees.parliament.uk/committee/170/communications-and-digital-committee/news/199728/uk-will-miss-ai-goldrush-unless-government-adopts-a-more-positive-vision/">inquiry report into large language models and generative AI</a>. “Regulatory capture” by big commercial interests was highlighted as a danger to be avoided, amid concern that “the AI safety debate is being dominated by views narrowly focused on catastrophic risk, often coming from those who developed such models in the first place” and that “this distracts from more immediate issues like copyright infringement, bias and reliability.”<sup>1</sup></p>
<p>The committee called for enhanced governance and transparency measures in DSIT and AISI to guard against regulatory capture, and for a rebalancing away from a “narrow focus on high-stakes AI safety” toward a “more positive vision for the opportunities [of AI] and a more deliberate focus on near-term risks” including cyber security and disinformation.</p>
<p>It also wants to see greater action by the government in support of copyright. “Some tech firms are using copyrighted material without permission, reaping vast financial rewards,” reads the report. “The legalities of this are complex but the principles remain clear. The point of copyright is to reward creators for their efforts, prevent others from using works without permission, and incentivise innovation. The current legal framework is failing to ensure these outcomes occur and the Government has a duty to act. It cannot sit on its hands for the next decade and hope the courts will provide an answer.”</p>
<p>Again, here’s RSS president Andrew Garrett’s take on the Lords committee report:</p>
<center>
<iframe src="https://www.linkedin.com/embed/feed/update/urn:li:share:7159583475350585344" height="1091" width="504" frameborder="0" allowfullscreen="" title="Embedded post">
</iframe>
</center>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2024 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>. Thumbnail photo by <a href="https://unsplash.com/@yaopey?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Yaopey Yong</a> on <a href="https://unsplash.com/photos/white-concrete-building-near-body-of-water-during-night-time-flmPTUCjkto?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Unsplash</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2024. “£10m for UK regulators to ‘jumpstart’ AI capabilities, as government commits to white paper approach.” Real World Data Science, February 8, 2024. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/02/08/llms-whitepaper-response.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>See, for example, <a href="https://realworlddatascience.net/the-pulse/posts/2023/06/05/no-AI-probably-wont-kill-us.html">“No, AI probably won’t kill us all – and there’s more to this fear campaign than meets the eye.”</a>↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>AI</category>
  <category>Large language models</category>
  <category>Public policy</category>
  <category>Risk</category>
  <category>Regulation</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/02/08/llms-whitepaper-response.html</guid>
  <pubDate>Thu, 08 Feb 2024 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/02/08/images/parliament-and-thames.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>UK government sets out 10 principles for use of generative AI</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/22/gen-ai-framework.html</link>
  <description><![CDATA[ 





<p>The UK government has published <a href="https://www.gov.uk/government/publications/generative-ai-framework-for-hmg/generative-ai-framework-for-hmg-html">a framework for the use of generative AI</a>, setting out 10 principles for departments and staff to think about if using, or planning to use, this technology.</p>
<p>It covers the need to understand what generative AI is and what its limitations are, the lawful, ethical and secure use of the technology, and a requirement for “meaningful human control.”</p>
<p>The focus is on large language models (LLMs) as, according to the framework, these have “the greatest level of immediate application in government.”</p>
<p>It lists a number of promising use cases for LLMs, including the synthesis of complex data, software development, and the summarisation of text and audio. However, the document cautions against using generative AI for fully automated decision-making, or in contexts where data is limited or explainability of decision-making is required. For example, it warns that:</p>
<blockquote class="blockquote">
<p>“although LLMs can give the appearance of reasoning, they are simply predicting the next most plausible word in their output, and may produce inaccurate or poorly-reasoned conclusions.”</p>
</blockquote>
<p>And on the issue of explainability, it says that:</p>
<blockquote class="blockquote">
<p>“generative AI is based on neural networks, which are so-called ‘black boxes’. This makes it difficult or impossible to explain the inner workings of the model which has potential implications if in the future you are challenged to justify decisioning or guidance based on the model.”</p>
</blockquote>
<p>The framework goes on to discuss some of the practicalities of building generative AI solutions. It talks specifically about the value a multi-disciplinary team can bring to such projects, and emphasises the role of data scientists:</p>
<blockquote class="blockquote">
<p>“data scientists … understand the relevant data, how to use it effectively, and how to build/train and test models.”</p>
</blockquote>
<p>It also speaks to the need to “understand how to monitor and mitigate generative AI drift, bias and hallucinations” and to have “a robust testing and monitoring process in place to catch these problems.”</p>
<p>What do you make of the <a href="https://www.gov.uk/government/publications/generative-ai-framework-for-hmg/generative-ai-framework-for-hmg-html">Generative AI Framework for His Majesty’s Government</a>? What does it get right, and what needs more work?</p>
<div class="callout callout-style-simple callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>And in case you missed it…
</div>
</div>
<div class="callout-body-container callout-body">
<p>New York State issued a policy on the <a href="https://its.ny.gov/acceptable-use-artificial-intelligence-technologies">Acceptable Use of Artificial Intelligence Technologies</a> earlier this month. Similar to the UK government framework, it references the need for human oversight of AI models and rules out use of “automated final decision systems.” There is also discussion of fairness, equity and explainability, and AI risk assessment and management.</p>
</div>
</div>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2024 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>. Thumbnail photo by <a href="https://unsplash.com/@therawhunter?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Massimiliano Morosinotto</a> on <a href="https://unsplash.com/photos/brown-tower-clock-under-cloudy-sy-paINk01G8Xk?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Unsplash</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2024. “UK government sets out 10 principles for use of generative AI.” Real World Data Science, January 22, 2024. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/22/gen-ai-framework.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>



 ]]></description>
  <category>AI ethics</category>
  <category>Large language models</category>
  <category>Monitoring</category>
  <category>Public policy</category>
  <category>Risk</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/22/gen-ai-framework.html</guid>
  <pubDate>Mon, 22 Jan 2024 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/22/images/uk-parliament.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>When will the cherry trees bloom? Get ready to make and share your predictions!</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/18/cherry-blossom.html</link>
  <description><![CDATA[ 





<p>The <a href="https://competition.statistics.gmu.edu/">2024 International Cherry Blossom Prediction Competition</a> will open for entries on February 1, and Real World Data Science is once again proud to be a sponsor.</p>
<p>Contestants are invited to submit predictions for the date cherry trees will bloom in 2024 at five different locations – Kyoto, Japan; Liestal-Weideli, Switzerland; Vancouver, Canada; and Washington, DC and New York City, USA.</p>
<p>The competition organisers will provide all the publicly available data they can find for the bloom dates of cherry trees in these locations, and contestants will then be challenged to use this data “in combination with any other publicly available data (e.g., climate data) to provide reproducible predictions of the peak bloom date.”</p>
<p>“For this competition, we seek accurate, interpretable predictions that offer strong narratives about the factors that determine when cherry trees bloom and the broader consequences for local and global ecosystems,” say the organisers. “Your task is to predict the peak bloom date for 2024 and to estimate a prediction interval, a lower and upper endpoint of dates during which peak bloom is most probable.”</p>
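<p>The required output — a point prediction plus a prediction interval — can be sketched with a toy example. The snippet below is a minimal illustration in Python (which, like R, can be used in a Quarto document): it fits a linear trend to made-up bloom dates and reports a rough interval of ± 2 residual standard deviations. The data values are hypothetical, not from the competition:</p>

```python
import statistics

# Hypothetical historical peak-bloom dates as (year, day-of-year) pairs.
# The real competition supplies actual records for each location.
history = [(2014, 99), (2015, 97), (2016, 94), (2017, 96),
           (2018, 95), (2019, 92), (2020, 91), (2021, 88),
           (2022, 90), (2023, 87)]
years = [y for y, _ in history]
days = [d for _, d in history]

# Ordinary least-squares fit of bloom day against year.
ybar, dbar = statistics.mean(years), statistics.mean(days)
slope = sum((y - ybar) * (d - dbar) for y, d in history) / \
        sum((y - ybar) ** 2 for y in years)

# Point prediction for 2024, plus a crude interval of
# +/- 2 residual standard deviations around it.
pred = dbar + slope * (2024 - ybar)
resid = [d - (dbar + slope * (y - ybar)) for y, d in history]
spread = 2 * statistics.stdev(resid)
print(f"predicted day-of-year: {pred:.1f} "
      f"(interval: {pred - spread:.1f} to {pred + spread:.1f})")
```

<p>A competitive entry would of course draw on climate covariates and a properly calibrated interval; this only shows the shape of the deliverable: a point prediction with lower and upper endpoints.</p>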
<p>So that organisers can reproduce the predictions, entrants must submit all data and code in a <a href="https://quarto.org/">Quarto document</a>.</p>
<p>There’s cash and prizes on offer for the best entries, including having your work featured on Real World Data Science. <a href="https://competition.statistics.gmu.edu/">Head on over to the competition website for full details and rules</a>.</p>
<p>And, if you are looking for some inspiration, check out this <a href="https://realworlddatascience.net/applied-insights/tutorials/posts/2023/04/13/flowers.html">tutorial on the law of the flowering plants</a>, written by Jonathan Auerbach, a co-organiser of the prediction competition.</p>
<p>Good luck to all entrants!</p>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2024 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>. Photo by <a href="https://unsplash.com/@ajny?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">AJ</a> on <a href="https://unsplash.com/photos/pink-flowers-McsNra2VRQQ?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Unsplash</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2024. “When will the cherry trees bloom? Get ready to make and share your predictions!” Real World Data Science, January 18, 2024. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/18/cherry-blossom.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>



 ]]></description>
  <category>Coding</category>
  <category>Prediction</category>
  <category>Reproducible research</category>
  <category>Statistics</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/18/cherry-blossom.html</guid>
  <pubDate>Thu, 18 Jan 2024 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/18/images/cherry-blossom.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>Creating a web publication with Quarto: the Real World Data Science origin story</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/03/posit-conf-video.html</link>
  <description><![CDATA[ 





<p>When I attended posit::conf(2023) in Chicago last year, I gave a talk about creating Real World Data Science using Quarto, the open source publishing system developed by Posit. That talk is now online, along with all the other conference talks and keynotes.</p>
<p>My talk, “From Journalist to Coder: Creating a Web Publication with Quarto,” is embedded below. You can also find a selection of talks on <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/09/19/positconf-blog.html">our posit::conf highlights blog</a>. The <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oRFZslRGHwHuwea7SvAATHp">full conference playlist is on YouTube</a>.</p>
<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/ncDEqHxMWnE?si=A1GmLphRPlmspJCj" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2024 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2024. “Creating a web publication with Quarto: the Real World Data Science origin story.” Real World Data Science, January 03, 2024. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/03/posit-conf-video.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>



 ]]></description>
  <category>Coding</category>
  <category>Communication</category>
  <category>Events</category>
  <category>Communities</category>
  <category>Open source</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/03/posit-conf-video.html</guid>
  <pubDate>Wed, 03 Jan 2024 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2024/01/03/images/video-grab.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>A Christmas card in R for the Real World Data Science community</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/rwds-xmas-card.html</link>
  <description><![CDATA[ 





<p>A few weeks back, I managed to catch Nicola Rennie’s presentation to the <a href="https://www.meetup.com/en-AU/oxford-r-user-group/events/297417319/">Oxford R User Group on how to create Christmas cards in R</a>. It was a fun session, and thanks to Nicola’s clear and concise explanations, I felt emboldened to attempt my own design, using her code as a base.</p>
<p>If you missed the Meetup session, Nicola has kindly written <a href="../../../../../../applied-insights/tutorials/posts/2023/12/12/xmas-cards.html">a tutorial for Real World Data Science</a> that walks through all the necessary steps to create a snowman against a snowy night’s sky. You’ll want to read that tutorial first before returning to this blog.</p>
<p>My design uses the same basic setting as Nicola’s but updates the scene to reflect the Real World Data Science (RWDS) brand colours, and replaces the snowman with a Christmas tree adorned with coloured baubles.</p>
<section id="snowy-sky" class="level2">
<h2 class="anchored" data-anchor-id="snowy-sky">Snowy sky</h2>
<p>We begin by loading the following packages, adding a couple of extras to the ones Nicola uses:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggforce)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(sf)</span>
<span id="cb1-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(png)</span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(patchwork) </span></code></pre></div></div>
<p>Then we add the sky, now recoloured in RWDS purple using <code>fill</code> and <code>color</code>:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">s1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_void</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb2-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(</span>
<span id="cb2-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot.background =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">element_rect</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#939bc9"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#939bc9"</span>)</span>
<span id="cb2-5">  )</span>
<span id="cb2-6">s1</span></code></pre></div></div>
<p>We use the same code as Nicola to create the snowflakes, but we do this step first, before adding snow on the ground, as we’re using the RWDS site background colour, hex code <code>#f0eeeb</code>, to represent our settled snow:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># add snowflakes</span></span>
<span id="cb3-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20231225</span>)</span>
<span id="cb3-3">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span></span>
<span id="cb3-4">snowflakes <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(</span>
<span id="cb3-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(n),</span>
<span id="cb3-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(n)</span>
<span id="cb3-7">)</span>
<span id="cb3-8">s2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> s1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(</span>
<span id="cb3-10">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> snowflakes,</span>
<span id="cb3-11">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mapping =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(</span>
<span id="cb3-12">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> x,</span>
<span id="cb3-13">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> y</span>
<span id="cb3-14">    ),</span>
<span id="cb3-15">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"white"</span>,</span>
<span id="cb3-16">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pch =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span></span>
<span id="cb3-17">  )</span>
<span id="cb3-18">s2</span>
<span id="cb3-19"></span>
<span id="cb3-20"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># snow on ground</span></span>
<span id="cb3-21">s3 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> s2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-22">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(</span>
<span id="cb3-23">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">geom =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rect"</span>,</span>
<span id="cb3-24">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xmin =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xmax =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb3-25">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymin =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymax =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>,</span>
<span id="cb3-26">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#f0eeeb"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#f0eeeb"</span></span>
<span id="cb3-27">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-28">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">xlim</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-29">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ylim</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-30">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb3-31">s3</span></code></pre></div></div>
<div class="quarto-layout-panel" data-layout-ncol="2">
<div class="quarto-layout-row">
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: center;">
<p><img src="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/images/s2.png" class="img-fluid" alt="Purple square with white snowflakes."></p>
</div>
<div class="quarto-layout-cell" style="flex-basis: 50.0%;justify-content: center;">
<p><img src="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/images/s3.png" class="img-fluid" alt="Purple square with white snowflakes and off-white rectangle at the bottom."></p>
</div>
</div>
</div>
</section>
<section id="oh-christmas-tree" class="level2">
<h2 class="anchored" data-anchor-id="oh-christmas-tree">Oh, Christmas tree</h2>
<p>To build her snowman, Nicola created a series of circles that were stacked and overlaid. A simple Christmas tree, though, requires a series of triangles. So, taking Nicola’s snowman’s nose (also a triangle) as our starting point, we coded three sets of coordinates – <code>tree_pts1</code>, <code>tree_pts2</code>, and <code>tree_pts3</code> – for three triangles of decreasing size that would sit on top of one another.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># coordinates for tree base</span></span>
<span id="cb4-2">tree_pts1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(</span>
<span id="cb4-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(</span>
<span id="cb4-4">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,</span>
<span id="cb4-5">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.6</span>,</span>
<span id="cb4-6">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,</span>
<span id="cb4-7">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span></span>
<span id="cb4-8">  ),</span>
<span id="cb4-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,</span>
<span id="cb4-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">byrow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb4-11">)</span>
<span id="cb4-12"></span>
<span id="cb4-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># coordinates for tree middle</span></span>
<span id="cb4-14">tree_pts2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(</span>
<span id="cb4-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(</span>
<span id="cb4-16">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,</span>
<span id="cb4-17">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.7</span>,</span>
<span id="cb4-18">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.7</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,</span>
<span id="cb4-19">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb4-20">  ),</span>
<span id="cb4-21">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,</span>
<span id="cb4-22">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">byrow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb4-23">)</span>
<span id="cb4-24"></span>
<span id="cb4-25"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># coordinates for tree top</span></span>
<span id="cb4-26">tree_pts3 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(</span>
<span id="cb4-27">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(</span>
<span id="cb4-28">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.4</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.65</span>,</span>
<span id="cb4-29">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>,</span>
<span id="cb4-30">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.65</span>,</span>
<span id="cb4-31">    <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.4</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.65</span></span>
<span id="cb4-32">  ),</span>
<span id="cb4-33">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,</span>
<span id="cb4-34">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">byrow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb4-35">)</span>
<span id="cb4-36"></span>
<span id="cb4-37"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># put tree together</span></span>
<span id="cb4-38">tree <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_multipolygon</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(tree_pts1),</span>
<span id="cb4-39">                             <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(tree_pts2),</span>
<span id="cb4-40">                             <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(tree_pts3)))</span>
<span id="cb4-41">s4 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> s3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-42">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(</span>
<span id="cb4-43">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> tree,</span>
<span id="cb4-44">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"chartreuse4"</span>,</span>
<span id="cb4-45">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"chartreuse4"</span></span>
<span id="cb4-46">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-47">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb4-48">s4</span></code></pre></div></div>
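<p>One detail worth flagging about those matrices: each has four rows for a three-cornered shape because simple-features polygon rings must be closed, so the first vertex is repeated as the last. A minimal sketch (ours, not from the original post), assuming the <code>sf</code> package is loaded:</p>

```r
# Sketch: sf polygon rings must be closed, so a triangle takes
# four (x, y) rows -- the first vertex repeated as the last.
library(sf)

tri <- matrix(
  c(0.2, 0.3,   # bottom-left corner
    0.5, 0.6,   # apex
    0.8, 0.3,   # bottom-right corner
    0.2, 0.3),  # repeat the first point to close the ring
  ncol = 2, byrow = TRUE
)
poly <- st_polygon(list(tri))
st_area(poly)  # base 0.6 * height 0.3 / 2 = 0.09
```

<p>Dropping the repeated final row makes <code>st_polygon()</code> error with an unclosed-ring complaint, which is the quickest way to see why every coordinate matrix in the tree code ends where it began.</p>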
<p>A tree also requires a trunk, so we borrowed one of the rectangles from Nicola’s snowman’s hat for this purpose:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">s5 <span class="ot" style="color: #003B4F;
background-color: null;

font-style: inherit;">&lt;-</span> s4<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb5-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(</span>
<span id="cb5-3">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">geom =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rect"</span>,</span>
<span id="cb5-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xmin =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.45</span>,</span>
<span id="cb5-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xmax =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.55</span>,</span>
<span id="cb5-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymin =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>,</span>
<span id="cb5-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymax =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,</span>
<span id="cb5-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"brown"</span></span>
<span id="cb5-9">  )</span>
<span id="cb5-10">s5</span></code></pre></div></div>
<p>And, of course, no Christmas tree is complete without decorations. The “rocks” that formed the buttons and eyes on Nicola’s snowman were updated to become gold and red baubles for our tree:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># add gold baubles</span></span>
<span id="cb6-2">s6 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> s5 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"gold"</span>,</span>
<span id="cb6-4">             <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(</span>
<span id="cb6-5">               <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.4</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.57</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.62</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.45</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>),</span>
<span id="cb6-6">               <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.325</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.4</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.45</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.35</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.57</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.52</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.7</span>),</span>
<span id="cb6-7">               <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4.5</span>)</span>
<span id="cb6-8">             ),</span>
<span id="cb6-9">             <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mapping =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> y, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> size)</span>
<span id="cb6-10">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_size_identity</span>()</span>
<span id="cb6-12">s6</span>
<span id="cb6-13"></span>
<span id="cb6-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># add red baubles</span></span>
<span id="cb6-15">s7 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> s6 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-16">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red3"</span>,</span>
<span id="cb6-17">             <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(</span>
<span id="cb6-18">               <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.7</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.525</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.43</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.38</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.55</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>),</span>
<span id="cb6-19">               <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.375</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.4</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.55</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.65</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.43</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.48</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.375</span>),</span>
<span id="cb6-20">               <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4.5</span>)</span>
<span id="cb6-21">             ),</span>
<span id="cb6-22">             <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mapping =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> y, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> size)</span>
<span id="cb6-23">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-24">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_size_identity</span>()</span>
<span id="cb6-25">s7</span></code></pre></div></div>
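<p>A note on <code>scale_size_identity()</code>, since the code above relies on it twice: because <code>size</code> is mapped inside <code>aes()</code>, ggplot2 would normally rescale the values and draw a legend; the identity scale instead uses the data values verbatim as point sizes. A stripped-down sketch of the same pattern (our example, assuming only <code>ggplot2</code>):</p>

```r
library(ggplot2)

# Three baubles whose `size` column is used verbatim as the point size
baubles <- data.frame(x = c(0.4, 0.5, 0.6), y = 0.5, size = c(2, 3, 4.5))
p <- ggplot(baubles, aes(x = x, y = y, size = size)) +
  geom_point(colour = "gold") +
  scale_size_identity()  # no rescaling, no size legend
p
```

<p>Without the identity scale, the 2-to-4.5 range generated by <code>runif()</code> would be remapped onto ggplot2's default size range and a legend would appear on the card.</p>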
<div class="quarto-layout-panel" data-layout-ncol="3">
<div class="quarto-layout-row">
<div class="quarto-layout-cell" style="flex-basis: 33.3%;justify-content: center;">
<p><img src="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/images/s4.png" class="img-fluid" alt="Purple square with white snowflakes and green tree in foreground."></p>
</div>
<div class="quarto-layout-cell" style="flex-basis: 33.3%;justify-content: center;">
<p><img src="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/images/s5.png" class="img-fluid" alt="Purple square with white snowflakes and green tree in foreground, now with brown trunk at foot of tree."></p>
</div>
<div class="quarto-layout-cell" style="flex-basis: 33.3%;justify-content: center;">
<p><img src="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/images/s7.png" class="img-fluid" alt="Purple square with white snowflakes and green tree in foreground. Tree is decorated with red and gold baubles of various sizes."></p>
</div>
</div>
</div>
</section>
<section id="seasons-greetings" class="level2">
<h2 class="anchored" data-anchor-id="seasons-greetings">Season’s greetings</h2>
<p>The final step was to add text to the top of the image, wishing you all a Merry Christmas, and our logo to the bottom, so you know who the card is from:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># add text</span></span>
<span id="cb7-2">s8 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> s7 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(</span>
<span id="cb7-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">geom =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"text"</span>,</span>
<span id="cb7-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,</span>
<span id="cb7-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.875</span>,</span>
<span id="cb7-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">label =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Merry Christmas"</span>,</span>
<span id="cb7-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red3"</span>,</span>
<span id="cb7-9">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fontface =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bold"</span>,</span>
<span id="cb7-10">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span></span>
<span id="cb7-11">  )</span>
<span id="cb7-12">s8</span>
<span id="cb7-13"></span>
<span id="cb7-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># add logo </span></span>
<span id="cb7-15">path <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"images/rwds-logo-150px.png"</span></span>
<span id="cb7-16">img <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readPNG</span>(path, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">native =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) </span>
<span id="cb7-17">s9 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> s8 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>                   </span>
<span id="cb7-18">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inset_element</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">p =</span> img, </span>
<span id="cb7-19">                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">left =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3265</span>, </span>
<span id="cb7-20">                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bottom =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, </span>
<span id="cb7-21">                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">right =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.6735</span>, </span>
<span id="cb7-22">                <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">top =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span></span>
<span id="cb7-23">  ) </span>
<span id="cb7-24">s9</span></code></pre></div></div>
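<p>Two packages do the work in that last block and are, we assume, loaded earlier in the post: <code>png</code> supplies <code>readPNG()</code>, and patchwork supplies <code>inset_element()</code>, whose <code>left</code>/<code>bottom</code>/<code>right</code>/<code>top</code> arguments are fractions of the plot area. A small sketch of the placement logic using a plot in place of the logo image:</p>

```r
# Sketch of patchwork's inset_element(): it places one element inside
# another, with left/bottom/right/top given as fractions of the plot area.
library(ggplot2)
library(patchwork)

base  <- ggplot() + xlim(0, 1) + ylim(0, 1)
inset <- ggplot(data.frame(x = 1, y = 1), aes(x, y)) + geom_point()

# left/right of 0.3265/0.6735 centre the inset horizontally;
# bottom/top of 0/0.2 confine it to the lowest fifth of the card.
combined <- base +
  inset_element(inset, left = 0.3265, bottom = 0, right = 0.6735, top = 0.2)
```
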
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/images/rwds-christmas-card.png" class="img-fluid quarto-figure quarto-figure-center figure-img" alt="Purple square with white snowflakes and green tree in foreground. Tree is decorated with red and gold baubles of various sizes. Text over tree reads Merry Christmas. Under tree is a logo for the Real World Data Science website."></p>
</figure>
</div>
<p>I hope you like the Christmas card! From all of us at Real World Data Science, thank you for your support throughout 2023. Merry Christmas, happy holidays, and best wishes for 2024!</p>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2023 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2023. “A Christmas card in R for the Real World Data Science community.” Real World Data Science, December 12, 2023. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/rwds-xmas-card.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>


</section>

 ]]></description>
  <category>R</category>
  <category>Data visualisation</category>
  <category>Updates</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/rwds-xmas-card.html</guid>
  <pubDate>Tue, 12 Dec 2023 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/12/images/rwds-christmas-card-thumb.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>AI and digital ethics in 2023: a ‘remarkable, eventful year’</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/08/digital-ethics-summit.html</link>
  <description><![CDATA[ 





<p>What a difference a year makes! That was the general tone of the conversation coming out of techUK’s <a href="https://www.techuk.org/digital-ethics-summit-2023-seizing-the-moment.html">Digital Ethics Summit</a> this week. At last year’s event, ChatGPT was but a few days old. An exciting, enticing prospect, sure – but not yet the phenomenon it would soon become. My notes from last year include only two mentions of the AI chatbot: Andrew Strait of the Ada Lovelace Institute expressing concern about the way ChatGPT had been released straight to the public, and Jack Stilgoe of UCL warning of the threat such technology poses to the social contract – public data trains it, while private firms profit.</p>
<p>A lot has happened since last December, as many of the speakers at Wednesday’s summit pointed out. UNESCO’s Gabriela Ramos commented on how <a href="https://www.gov.uk/government/publications/ai-safety-summit-programme">the UK’s AI Safety Summit</a>, <a href="https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/">US President Joe Biden’s executive order on AI</a>, and other international initiatives had brought about “a change in the conversation” on AI risk, safety, and assurance. Simon Staffell of Microsoft spoke of “a huge amount of progress” being made, building from principles into voluntary actions that companies and countries can take.</p>
<p>Luciano Floridi of Yale University described 2023 as a “remarkable, eventful year which we didn’t quite expect,” with various international efforts helping to build consensus on what needs to be done, and what needs to be regulated, to ensure the benefits of AI can be realised while harms are minimised. Camille Ford of the Centre for European Policy Studies noted that while attempts at global governance of AI make for a “crowded space” – with more than 200 documents in circulation – there are at least principles in common across the various initiatives, focusing on aspects such as transparency, reliability and trustworthiness, safety, privacy, and accountability and liability.</p>
<p>However, in some respects, we’ve not come as far as we could or should have over the past 12 months. Ford, for instance, called for more conversation on AI safety, and a frank discussion of whose terms define AI safety. Not only are there the risks and harms of AI outputs to consider, but also environmental harms, exploitative labour practices, and more besides. <a href="https://realworlddatascience.net/the-pulse/posts/2023/12/06/ai-fringe.html">Echoing the Royal Statistical Society’s recent AI debate</a>, Ford said we need to focus on the risks we face now, rather than being consumed by discussions about the existential and catastrophic risks of AI – which, for many, are still firmly in the realm of science fiction.</p>
<p>There also remains “a big mismatch” between the AI knowledge and skills that reside within tech companies and that of other communities, said Zeynep Engin of Data for Policy. And many speakers were clear that the global south needs a more prominent voice in the AI debate.</p>
<section id="regulatory-approaches" class="level2">
<h2 class="anchored" data-anchor-id="regulatory-approaches">Regulatory approaches</h2>
<p>The UK government’s AI Safety Summit has been criticised for focusing too much on the hypothetical existential risks of AI. But, on regulation at least, there was broad agreement that the UK’s principles- and sector-based approach, outlined in <a href="https://www.gov.uk/government/publications/ai-regulation-a-pro-innovation-approach/white-paper">a March 2023 white paper</a>, is the right one. That’s not to say it’s perfect: discussions were had about whether regulatory bodies would be adequately funded to regulate the use of AI in their sectors, while Hetan Shah of the British Academy wondered “where was the golden thread” linking the AI white paper to the AI Safety Summit and its various pronouncements, including <a href="https://www.gov.uk/government/news/prime-minister-launches-new-ai-safety-institute">plans for an AI Safety Institute</a>. (On the Safety Institute in particular, Lord Tim Clement-Jones was sceptical of yet another body being drafted in to debate these issues – a point made by panellists at the RSS’s recent AI debate.)</p>
<p>Delegates also got to hear from the UK’s Information Commissioner directly. John Edwards delivered a keynote address in which he acknowledged the huge excitement surrounding the benefits AI promises to bring, while cautioning that deployment and use of AI must be done in accordance with existing rules on data protection and privacy. The technology may be new, he said, but the same old data rules apply: “Our legislation is founded on technology-neutral principles of general application. They are capable of adapting to numerous new technologies, as they have over the last 30 years and will continue to do.”</p>
<p>He warned that noncompliance with data protection rules and regulations “will not be profitable,” and that persistent misuse of AI and personal data for competitive advantage would be punished. Edwards concluded by saying that AI is built on the data of human individuals and should therefore be used to improve their lives, and not put them or their personal data at risk.</p>
</section>
<section id="elections-in-an-era-of-generative-ai" class="level2">
<h2 class="anchored" data-anchor-id="elections-in-an-era-of-generative-ai">Elections in an era of generative AI</h2>
<p>One major looming risk is the use of generative AI to create mis- and disinformation during election campaigns. Hans-Petter Dalen of IBM suggested that next year is perhaps the biggest year for elections in the history of mankind, with votes due in the UK, US, and India, to name but a few. Generative AI represents not a new threat, he said, but an “amplified” one – a point further developed by Henry Parker of Logically.ai. Parker spoke of the risk of large-scale breakdown in trust due to mis- or disinformation campaigns. Thanks to AI tools, he said, we are now seeing the “democratisation of disinformation.” What once might have cost millions of dollars and required a team of hundreds of people can now be done much more cheaply and with fewer human resources. As the Royal Society’s Areeq Chowdhury said, the challenge of disinformation has only become harder.</p>
<p>Asked how to counter this, Dalen said that if he were a politician, “I would certainly get my own blockchain and all my content would have been digitally watermarked from source – that’s what the blockchain does.” But digital watermarking is only part of the answer, added Parker. Identifying mis- and disinformation is both a question of provenance and of dissemination. Logically.ai is using AI as a tool to analyse behaviours around the circulation of mis- and disinformation, Parker said – positioning AI as but one solution to a problem it has helped exacerbate.</p>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2023 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>. Thumbnail photo by <a href="https://unsplash.com/@kajtek?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Kajetan Sumila</a> on <a href="https://unsplash.com/photos/a-screenshot-of-a-computer-bxaqUeVIGHU?utm_content=creditCopyText&amp;utm_medium=referral&amp;utm_source=unsplash">Unsplash</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2023. “AI and digital ethics in 2023: a ‘remarkable, eventful year.’” Real World Data Science, December 8, 2023. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/08/digital-ethics-summit.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>


</section>

 ]]></description>
  <category>AI</category>
  <category>Large language models</category>
  <category>Ethics</category>
  <category>Regulation</category>
  <category>Risk</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/08/digital-ethics-summit.html</guid>
  <pubDate>Fri, 08 Dec 2023 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/12/08/images/kajetan-sumila-bxaqUeVIGHU-unsplash.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>Evaluating artificial intelligence: How data science and statistics can make sense of AI models</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/posts/2023/12/06/ai-fringe.html</link>
  <description><![CDATA[ 





<p>A little over a month ago, governments, technology firms, multilateral organisations, and academic and civil society groups came together at Bletchley Park – home of Britain’s World War II code breakers – to discuss the safety and risks of artificial intelligence.</p>
<p>One output from that event was <a href="https://www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration/the-bletchley-declaration-by-countries-attending-the-ai-safety-summit-1-2-november-2023">a declaration</a>, signed by countries in attendance, of their resolve to “work together in an inclusive manner to ensure human-centric, trustworthy and responsible AI that is safe, and supports the good of all.”</p>
<p>We also heard from UK prime minister Rishi Sunak of <a href="https://www.gov.uk/government/news/prime-minister-launches-new-ai-safety-institute">plans for an AI Safety Institute</a>, to be based in the UK, which will “carefully test new types of frontier AI before and after they are released to address the potentially harmful capabilities of AI models, including exploring all the risks, from social harms like bias and misinformation, to the most unlikely but extreme risk, such as humanity losing control of AI completely.”</p>
<p>But at a panel debate at the Royal Statistical Society (RSS) the day before the Bletchley Park gathering, data scientists, statisticians, and machine learning experts questioned whether such an institute would be sufficient to meet the challenges posed by AI; whether data inputs – compared to AI model outputs – are getting the attention they deserve; and whether the summit was overly focused on <a href="https://realworlddatascience.net/the-pulse/posts/2023/06/05/no-AI-probably-wont-kill-us.html">AI doomerism</a> and neglecting more immediate risks and harms. There were also calls for AI developers to be more driven to solve real-world problems, rather than just pursuing AI for AI’s sake.</p>
<p>The RSS event was chaired by Andrew Garrett, the Society’s president, and formed part of the national <a href="https://aifringe.org/">AI Fringe programme of activities</a>. The panel featured:</p>
<ul>
<li>Mihaela van der Schaar, John Humphrey Plummer professor of machine learning, artificial intelligence and medicine at the University of Cambridge and a fellow at The Alan Turing Institute.</li>
<li>Detlef Nauck, head of AI and data science research at BT, and a member of the <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2022/10/18/meet-the-team.html">Real World Data Science editorial board</a>.</li>
<li>Mark Levene, principal scientist in the Department of Data Science at the National Physical Laboratory.</li>
<li>Martin Goodson, chief executive of Evolution AI, and former chair of the RSS Data Science and AI Section.</li>
</ul>
<p>What follows are some edited highlights and key takeaways from the discussion.</p>
<div class="keyline">
<hr>
</div>
<section id="ai-safety-and-ai-risks" class="level2">
<h2 class="anchored" data-anchor-id="ai-safety-and-ai-risks">AI safety, and AI risks</h2>
<p><strong>Andrew Garrett:</strong> For those who were listening to the commentary last week, the PM [prime minister] made a very interesting speech. Rishi Sunak announced the creation of the world’s first AI Safety Institute in the UK, to examine, evaluate and test new types of AI. He also stated that he pushed hard to agree the first-ever international statement about the risks of AI because, in his view, there wasn’t a shared understanding of the risks that we face. He cited the example of the IPCC, the Intergovernmental Panel on Climate Change, as a model for establishing a truly global panel to publish a “state of AI science” report. And he also announced an investment in raw computing power, so around a billion pounds in a supercomputer, and £2.5 billion in quantum computers, making them available for researchers and businesses as well as government.</p>
<p>The RSS provided two responses this year to prominent [AI policy] reviews. The first was in June <a href="https://rss.org.uk/RSS/media/File-library/Policy/2023/RSS-AI-white-paper-response-v2-2.pdf">on the AI white paper</a>, and the second was on <a href="https://rss.org.uk/RSS/media/File-library/Policy/RSS_Evidence_Communications_and_Digital_Lords_Select_Committee_Inquiry_Large_Language_Models_September_2023.pdf">the House of Lords Select Committee inquiry into large language models</a> back in September. How do they relate to what the PM said? There’s some good news here, and maybe not quite so good news.</p>
<p>First, the RSS had requested investments in AI evaluation and a risk-based approach. And you could argue, by stating that there will be a safety institute, that that certainly ticks one of the boxes. We also recommended investment in open source, in computing power, and in data access. In terms of computing power, that was certainly in the [PM’s] speech. We spoke about strengthening leadership, and in particular including practitioners in the [AI safety] debate. A lot of academics and maybe a lot of the big tech companies have been involved in the debate, but we want to get practitioners – those close to the coalface – involved in the debate. I’m not sure we’ve seen too much of that. We recommended that strategic direction was provided, because it’s such a fast-moving area, and the fact that the Bletchley Park Summit is happening tomorrow, I think, is good for that. And we also recommended that data science capability was built amongst the regulators. I don’t think there was any mention of that.</p>
<p>That’s the context [for the RSS event today]. What I’m going to do now is ask each of the panellists to give an introductory statement around the AI summit, focusing on the safety aspects. What do they see as the biggest risk? And how would they mitigate or manage this risk?</p>
<p><strong>Detlef Nauck:</strong> I work at BT and run the AI and data science research programme. We’ve been looking at the safety, reliability, and responsibility of AI for quite a number of years already. Five years ago, we put in place a responsible AI framework in the company, and this is now very much tied into our data governance and risk management frameworks.</p>
<p>Looking at the AI summit, they’re focusing on what they call “frontier models,” and they’re missing a trick here because I don’t think we need to worry about all-powerful AI; we need to worry about inadequate AI that is being used in the wrong context. For me, AI is programming with data, and that means I need to know what sort of data has been used to build the model, and I need AI vendors to be upfront about it and to tell me: what data they have used to build it, how they have built it, and whether they’ve tested for bias. And there are no protocols around this. So, therefore, I’m very much in favour of AI evaluation. But I don’t want to wait for an institute for AI evaluation. I want the academic research that needs to be done around this, which hasn’t been done. I want everybody who builds AI systems to take this responsibility and document properly what they’re doing.</p>
<div class="pullquote-container">
<div class="grid">
<div class="g-col-12 g-col-lg-4">
<div class="quarto-figure quarto-figure-left">
<figure class="figure">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2023/12/06/images/llm-3d-shapes-crop.png" class="img-fluid quarto-figure quarto-figure-left figure-img"></p>
</figure>
</div>
</div>
<div class="g-col-12 g-col-lg-8 pullquote-grid pullquote">
<p><img src="https://realworlddatascience.net/images/pullquote-purple.png" class="img-fluid" width="50"></p>
<p>I hear more and more a lot of companies talking about AI general intelligence, and how AI is going to take over the world, and I’m tremendously concerned about this. There is an opportunity to build AI that is human empowering, that keeps us strong, able, capable, intelligent, and can support us in all our human capabilities.</p>
</div>
</div>
</div>
<p><strong>Mihaela van der Schaar:</strong> I am an AI researcher building AI and machine learning technology. Before talking about the risks, I also would like to say that I see tremendous potential for good. Many of these machine learning AI models can transform areas that I find extremely important – healthcare and education – for the better. That being said, there are substantial risks, and we need to be very careful about that. First, if not designed well, AI can be both unsafe and biased, and that could lead to tremendous harm, especially in medicine and education. I completely agree with all the points that the Royal Statistical Society has made not only about open source but also about data access. This AI technology cannot be built unless you have access to high-quality data, and what I see happening a lot, especially in industry, is people have data sources that they’ll keep private, build second-rate or third-rate technology on them, and then turn that into commercialised products that are sold to us for a lot of money. If data is made widely available, the best as well as the safest AI can be produced, rather than monopolised.</p>
<p>Another area of risk that I’m especially worried about is human marginalisation. I hear more and more a lot of companies talking about AI general intelligence, and how AI is going to take over the world, and I’m tremendously concerned as an AI researcher about this. There is an opportunity to build AI that is human empowering, that keeps us strong, able, capable, intelligent, and can support us in all our human capabilities.</p>
<p><strong>Martin Goodson:</strong> The AI Safety Summit is starting tomorrow. But, unfortunately, I think the government are focusing on the wrong risks. There are lots of risks to do with AI, and if you look at the scoping document for the summit, it says that what they’re interested in is misuse risk and the risk of loss of control. Misuse risk is that bad actors will gain access to information that they shouldn’t have and build chemical weapons and things like that. And the loss of control risk is that we will have this superintelligence which is going to take over, and we could see – as the document actually mentions – the risk of the extinction of the human race, which I think is a bit overblown.</p>
<p>Both of these risks – the misuse risk and the loss of control risk – are potential risks. But we don’t really know how likely they are. We don’t even know whether they’re possible. But there are lots of risks that we do know are possible, like loss of jobs, and reductions in salary, particularly of white-collar jobs – that seems inevitable. There’s another risk, which is really important, which is the risk of monopolistic control by the small number of very powerful AI companies. These are the risks which are not just likely but are actually happening now – people are losing their jobs right now because of AI – and in terms of monopolistic control, OpenAI is the only company that has anything like a large language model as powerful as GPT-4. Even the mighty Google can’t really compete. This is a huge risk, I think, because we have no control over pricing: they could raise the prices if they wanted to; they could constrain access; they could only give access to certain people that they want to give access to. We don’t have any control over these systems.</p>
<p><strong>Mark Levene:</strong> I work in NPL as a principal scientist in the data science department. I’m also emeritus professor in Birkbeck, University of London. I have a long-standing expertise in machine learning and focus in NPL on trustworthy AI and uncertainty quantification. I believe that measurement is a key component in locking-in AI safety. Trustworthy AI and safe AI both have similar goals but different emphases. We strive to demonstrate the trustworthiness of an AI system so that we can have confidence in the technology making what we perceive as responsible decisions. Safe AI puts the emphasis on the prevention of harmful consequences. The risk [of AI] is significant, and it could potentially be catastrophic if we think of nuclear power plants, or weapons, and so on. I think one of the problems here is, who is actually going to take responsibility? This is a big issue, and not necessarily an issue for the scientist to decide. Also, who is accountable? For instance, the developers of large language models: are they the ones that are accountable? Or is it the people who deploy the large language models and are fine-tuning them for their use cases?</p>
<p>The other thing I want to emphasise is the socio-technical characteristics [of the AI problem]. We need to get an interdisciplinary team of people to actually try and tackle these issues.</p>
</section>
<section id="do-we-need-an-ai-safety-institute" class="level2">
<h2 class="anchored" data-anchor-id="do-we-need-an-ai-safety-institute">Do we need an AI Safety Institute?</h2>
<p><strong>Andrew Garrett:</strong> Do we need to have an AI Safety Institute, as Rishi Sunak has said? And if we don’t need one, why not?</p>
<p><strong>Detlef Nauck:</strong> I’m more in favour of encouraging academic research in the field and funding the kind of research projects that can look into how to build AI safely, [and] how to evaluate what it does. One of the key features of this technology is it has not come out of academic research; it has been built by large tech companies. And so, I think we have to do a bit of catch-up in scientific research, and in understanding how we build these models, what they can do, and how we control them.</p>
<p><strong>Mihaela van der Schaar:</strong> This technology has a life of its own now, and we are using it for all sorts of things that maybe were not initially intended. So, shall we create an AI [safety] institute? We can, but we need to realise first that testing AI and showing that it’s safe in all sorts of ways is complicated. I would dare say that doing that well is a big research challenge by itself. I don’t think just one institute will solve it. And I feel the industry needs to bear some of the responsibility. I was very impressed by Professor [Geoffrey] Hinton, who came to Cambridge and said, “I think that some of these companies should invest as much money in making safe AI as developing AI.” That resonated quite a lot with me.</p>
<p>Also, let’s not forget, many academic researchers wear two hats nowadays: they are professors, and they are working for big tech [companies] for a lot of money. So, if we take this academic and put them in this AI tech safety institute, we have the potential for corruption. I’m not saying that this will happen. But one needs to be very aware, and there needs to be a very big separation between who develops [AI technology] and who tests it. And finally, we need to realise that we may require an enormous amount of computation to be able to validate and test correctly, and very few academic or governmental organisations may have [that].</p>
<div class="pullquote-container">
<div class="grid">
<div class="g-col-12 g-col-lg-8 pullquote-grid pullquote">
<p><img src="https://realworlddatascience.net/images/pullquote-purple.png" class="img-fluid" width="50"></p>
<p>I think it’s an insult to the UK’s scientific legacy that we’re reduced to testing software that has been made by US companies. We have huge talents in this country. Why aren’t we using that talent to actually build something instead of testing something that someone else has made?</p>
</div>
<div class="g-col-12 g-col-lg-4">
<div class="quarto-figure quarto-figure-right">
<figure class="figure">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2023/12/06/images/llm-3d-shapes-crop.png" class="img-fluid quarto-figure quarto-figure-right figure-img"></p>
</figure>
</div>
</div>
</div>
</div>
<p><strong>Martin Goodson:</strong> Can I disagree with this idea of an evaluation institute? I think it’s a really, really bad idea, for two reasons. The first is an argument about fairness. If you look at drug regulation, who pays for clinical trials? It’s not the government. It’s the pharmaceutical companies. They spend billions on clinical trials. So, why do we want to do this testing for free for the big tech companies? We’re just doing product development for them. It’s insane! They should be paying to show that their products are safe.</p>
<p>The other reason is, I think it’s an insult to the UK’s scientific legacy that we’re reduced to testing software that has been made by US companies. I think it’s pathetic. We were one of the main leaders of the Human Genome Project, and we really pushed it – the Wellcome Trust and scientists in the UK pushed the Human Genome Project because we didn’t want companies to have monopolistic control over the human genome. People were idealistic, there was a moral purpose. But now, we’re so reduced that all we can do is test some APIs that have been produced by Silicon Valley companies. We have huge talents in this country. Why aren’t we using that talent to actually build something instead of testing something that someone else has made?</p>
<p><strong>Mark Levene:</strong> Personally, I don’t see any problem in having an AI institute for safety or any other AI institutes. I think what’s important in terms of taxpayers’ money is that whatever institute or forum is invested in, it’s inclusive. One thing that the government should do is, we should have a panel of experts, and this panel should be interdisciplinary. And what this panel can do is it can advise government of the state of play in AI, and advise the regulators. And this panel doesn’t have to be static, it doesn’t have to be the same people all the time.</p>
<p><strong>Andrew Garrett:</strong> To evaluate something, whichever way you choose to do it, you need to have an inventory of those systems. So, with the current proposal, how would this AI Safety Institute have an inventory of what anyone was doing? How would it even work in practice?</p>
<p><strong>Martin Goodson:</strong> Unless we voluntarily go to them and say, “Can you test out our stuff?” then they wouldn’t. That’s the third reason why it’s a terrible idea. You’d need a licensing regime, like for drugs. You’d need to license AI systems. But teenagers in their bedrooms are creating AI systems, so that’s impossible.</p>
</section>
<section id="lets-do-reality-centric-ai" class="level2">
<h2 class="anchored" data-anchor-id="lets-do-reality-centric-ai">Let’s do reality-centric AI!</h2>
<p><strong>Andrew Garrett:</strong> What are your thoughts about Rishi Sunak wanting the UK to be an AI powerhouse?</p>
<p><strong>Martin Goodson:</strong> It’s not going to be a powerhouse. This stuff about us being world-leading in AI, it’s just a fiction. It’s a fairy tale. There are no real supercomputers in the UK. There are moves to build something, like you mentioned in your introduction, Andrew. But what are they going to do with it? If they’re just going to build a supercomputer and carry on doing the same kinds of stuff that they’ve been doing for years, they’re not going to get anywhere. There needs to be a big project with an aim. You can build as many computers as you want. But if you haven’t got a plan for what to do with them, what’s the point?</p>
<p><strong>Mihaela van der Schaar:</strong> I really would agree with that. What about solving some real problem: trying to solve cancer; trying to solve our crisis in healthcare, where we don’t have enough infrastructure and doctors to take care of us? What about solving the climate change problem, or even traffic control, or preventing the next financial crisis? I wrote a little bit about that, and I call it “let’s do reality-centric AI.” Let’s have some goal that’s human empowering, take a problem that we have – energy, climate, cancer, Alzheimer’s, better education for children, and more diverse education for children – and let us solve these big challenges, and in the process we will build AI that’s hopefully more human empowering, rather than just saying, “Oh, we are going to solve everything if we have general AI.” Right now, I hear too much about AI for the sake of AI. I’m not sure, despite all the technology we build, that we have advanced in solving some real-world problems that are important for humanity – and imminently important.</p>
<p><strong>Martin Goodson:</strong> So, healthcare – I tried to make an appointment with my GP last week, and they couldn’t get me an appointment for four weeks. In the US you have this United States Medical Licensing Examination, and in order to practise medicine you need to pass all three components, with a score of about 60%. They are really hard tests. GPT-4 gets over 80% in all three of those. So, it’s perfectly plausible, I think, that an AI could do at least some of the role of the GP. But, you’re right, there is no mission to do that, there is no ambition to do that.</p>
<p><strong>Mihaela van der Schaar:</strong> Forget about replacing the doctors with ChatGPT, which I’m less sure is such a good idea. But, building AI to do the planning of healthcare, to say, “[Patient A], based on what we have found out about you, you’re not as high risk, maybe you can come in four weeks. But [patient B], you need to come tomorrow, because something is worrisome.”</p>
<p><strong>Martin Goodson:</strong> We can get into the details, but I think we are agreeing that a big mission to solve real problems would be a step forward, rather than worrying about these risks of superintelligences taking over everything, which is what the government is doing right now.</p>
</section>
<section id="managing-misinformation" class="level2">
<h2 class="anchored" data-anchor-id="managing-misinformation">Managing misinformation</h2>
<p><strong>Andrew Garrett:</strong> We have some important elections coming up in 2024 and 2025. We haven’t talked much about misinformation and disinformation. So, I’m interested to hear your views here. How much of a problem is that?</p>
<p><strong>Detlef Nauck:</strong> There’s a problem in figuring out when it happens, and that’s something we need to get our heads around. One thing that we’re looking at is, how do we make communication safe from bad actors? How do you know that you’re talking to the person you see on the camera and it’s not a deep fake? Detection mechanisms don’t really work, and they can be circumvented. So, it seems like what we need is new standards for communication systems, like watermarks and encryption built into devices. A camera should be able to say, “I’ve produced this picture, and I have watermarked it and it’s encrypted to a certain level,” and if you don’t see that, you can’t trust that what you see comes from a genuine camera, and it’s not artificially created. It’s more difficult around text and language – you can’t really watermark text.</p>
<p><strong>Mark Levene:</strong> Misinformation is not just a derivative of AI. It’s a derivative of social networks and lots of other things.</p>
<p><strong>Mihaela van der Schaar:</strong> I would agree that this is not only a problem with AI. We need to emphasise the role of education, and lifelong education. This is key to being able to comprehend, to judge for ourselves, to be trained to judge for ourselves. And maybe we need to teach different methods – from young kids to adults who are already working – to really exercise our own judgement. And that brings me to this AI for human empowerment. Can we build AI that is training us to become smarter, to become more able, more capable, more thoughtful, in addition to providing sources of information that are reliable and trustworthy?</p>
<p><strong>Andrew Garrett:</strong> So, empower people to be able to evaluate AI themselves?</p>
<p><strong>Mihaela van der Schaar:</strong> Yes, but not only AI – all information that is given to us.</p>
<p><strong>Martin Goodson:</strong> On misinformation, I think this is really an important topic, because large language models are extremely persuasive. I asked ChatGPT a puzzle question, and it calculated all of this stuff and gave me paragraphs of explanations, and the answer was [wrong]. But it was so convincing I was almost convinced that it was right. The problem is, these things have been trained on the internet and the internet is full of marketing – it’s trillions of words of extremely persuasive writing. So, these things are really persuasive, and when you put that into a political debate or an election campaign, that’s when it becomes really, really dangerous. And that is extremely worrying and needs to be regulated.</p>
<div class="pullquote-container">
<div class="grid">
<div class="g-col-12 g-col-lg-4">
<div class="quarto-figure quarto-figure-left">
<figure class="figure">
<p><img src="https://realworlddatascience.net/the-pulse/posts/2023/12/06/images/llm-3d-shapes-crop.png" class="img-fluid quarto-figure quarto-figure-left figure-img"></p>
</figure>
</div>
</div>
<div class="g-col-12 g-col-lg-8 pullquote-grid pullquote">
<p><img src="https://realworlddatascience.net/images/pullquote-purple.png" class="img-fluid" width="50"></p>
<p>At the moment, if you type something into ChatGPT and you ask for references, half of them will be made up. We know that, and also OpenAI knows that. But it could be that, if there’s regulation that things are traceable, you should be able to ask, ‘How did this information come about? Where did it come from?’</p>
</div>
</div>
</div>
<p><strong>Mark Levene:</strong> You need ways to detect it. Even that is a big challenge. I don’t know if it’s impossible, because, if there’s regulation, for example, there should be traceability of data. So, at the moment, if you type something into ChatGPT and you ask for references, half of them will be made up. We know that, and also OpenAI knows that. But it could be that, if there’s regulation that things are traceable, you should be able to ask, “How did this information come about? Where did it come from?” But I agree that if you just look at an image or some text, and you don’t know where it came from, it’s easy to believe. Humans are easily fooled, because we’re just the product of what we know and what we’re used to, and if we see something that we recognise, we don’t question it.</p>
</section>
<section id="audience-qa" class="level2">
<h2 class="anchored" data-anchor-id="audience-qa">Audience Q&amp;A</h2>
<section id="how-can-we-help-organisations-to-deploy-ai-in-a-responsible-way" class="level3">
<h3 class="anchored" data-anchor-id="how-can-we-help-organisations-to-deploy-ai-in-a-responsible-way">How can we help organisations to deploy AI in a responsible way?</h3>
<p><strong>Detlef Nauck:</strong> Help for the industry to deploy AI reliably and responsibly is something that’s missing, and for that, trust in AI is one of the things that needs to be built up. And you can only build up trust in AI if you know what these things are doing and they’re properly documented and tested. So that’s the kind of infrastructure, if you like, that’s missing. It’s not all big foundation models. It’s about, how do you actually use this stuff in practice? And 90% of that will be small, purpose-built AI models. That’s an area where the government can help. How do you empower smaller companies that don’t have the background of how AI works and how it can be used, how can they be supported in knowing what they can buy and what they can use and how they can use it?</p>
<p><strong>Mark Levene:</strong> One example from healthcare which comes to mind: when you do a test, let’s say, a blood test, you don’t just get one number, you should get an interval, because there’s uncertainty. What current [AI] models do is they give you one answer, right? In fact, there’s a lot of uncertainty in the answer. One thing that can build trust is to make transparent the uncertainty that the AI outputs.</p>
</section>
<section id="how-can-data-scientists-and-statisticians-help-us-understand-how-to-use-ai-properly" class="level3">
<h3 class="anchored" data-anchor-id="how-can-data-scientists-and-statisticians-help-us-understand-how-to-use-ai-properly">How can data scientists and statisticians help us understand how to use AI properly?</h3>
<p><strong>Martin Goodson:</strong> One big thing, I think, is in culture. In machine learning – academic research and in industry – there isn’t a very scientific culture. There isn’t really an emphasis on observation and experimentation. We hire loads of people coming out of an MSc or a PhD in machine learning, and they don’t know anything, really, about doing an experiment or selection bias or how data can trip you up. All they think about is, you get a benchmark set of data and you measure the accuracy of your algorithm on that. And so there isn’t this culture of scientific experimentation and observation, which is what statistics is all about, really.</p>
<p><strong>Mihaela van der Schaar:</strong> I agree with you, this is where we are now. But we are trying to change it. As a matter of fact, at the next big AI conference, NeurIPS, we plan to do a tutorial to teach people exactly this and bring some of these problems to the forefront, because trying really to understand errors in data, biases, confounders, misrepresentation – this is the biggest problem AI has today. We shouldn’t just build yet another, let’s say, classifier. We should spend time to improve the ability of these machine learning models to deal with all sorts of data.</p>
</section>
<section id="do-we-honestly-believe-yet-another-institute-and-yet-more-regulation-is-the-answer-to-what-were-grappling-with-here" class="level3">
<h3 class="anchored" data-anchor-id="do-we-honestly-believe-yet-another-institute-and-yet-more-regulation-is-the-answer-to-what-were-grappling-with-here">Do we honestly believe yet another institute, and yet more regulation, is the answer to what we’re grappling with here?</h3>
<p><strong>Detlef Nauck:</strong> I think we all agree, another institute is not going to cut it. One of the main problems is regulators are not trained on AI, so it’s the wrong people looking into it. This is where some serious upskilling is required.</p>
</section>
<section id="are-we-wrong-to-downplay-the-existential-or-catastrophic-risks-of-ai" class="level3">
<h3 class="anchored" data-anchor-id="are-we-wrong-to-downplay-the-existential-or-catastrophic-risks-of-ai">Are we wrong to downplay the existential or catastrophic risks of AI?</h3>
<p><strong>Martin Goodson:</strong> If I was an AI, a superintelligent AI, the easiest path for me to cause the extinction of the human race would be to spread misinformation about climate change, right? So, let’s focus on misinformation, because that’s an immediate danger to our way of life. Why are we focusing on science fiction? Let’s focus on reality.</p>
</section>
<section id="ai-tech-has-advanced-but-evaluation-metrics-havent-moved-forward.-why" class="level3">
<h3 class="anchored" data-anchor-id="ai-tech-has-advanced-but-evaluation-metrics-havent-moved-forward.-why">AI tech has advanced, but evaluation metrics haven’t moved forward. Why?</h3>
<p><strong>Mihaela van der Schaar:</strong> First, the AI community that I’m part of innovates at a very fast pace, and they don’t reward metrics. I am a big fan of metrics, and I can tell you, I can publish a method in these top conferences much faster than I can publish a metric. Number two, we often have in AI very stupid benchmarks, where we test everything on one dataset, and these datasets may be very wrong. On a more positive note, this is an enormous opportunity for machine learners and statisticians to work together and advance this very important field of metrics, of test sets, of data generating processes.</p>
<p><strong>Martin Goodson:</strong> The big problem with metrics right now is contamination, because most of the academic metrics and benchmark sets that we’re talking about, they’re published on the internet, and these systems are trained on the internet. I’ve already said that I don’t think this [evaluation] institute should exist. But if it did exist, there’s one thing that they could do, which is important, and that would be to create benchmark datasets that they do not publish. But obviously, you may decide, also, that the traditional idea of having a training set and a test set just doesn’t make any sense anymore. And there are loads of issues with data contamination, and data leakage between the training sets and the test sets.</p>
</section>
</section>
<section id="closing-thoughts-what-would-you-say-to-the-ai-safety-summit" class="level2">
<h2 class="anchored" data-anchor-id="closing-thoughts-what-would-you-say-to-the-ai-safety-summit">Closing thoughts: What would you say to the AI Safety Summit?</h2>
<p><strong>Andrew Garrett:</strong> If you were at the AI Safety Summit and you could make one point very succinctly, what would it be?</p>
<p><strong>Martin Goodson:</strong> You’re focusing on the wrong things.</p>
<p><strong>Mark Levene:</strong> What’s important is to have an interdisciplinary team that will advise the government, rather than to build these institutes, and that this team should be independent and a team which will change over time, and it needs to be inclusive.</p>
<p><strong>Mihaela van der Schaar:</strong> AI safety is complex, and we need to realise that people need to have the right expertise to be able to really understand the risks. And there is risk, as I mentioned before, of potential collusion, where people are both building the AI and saying it’s safe, and we need to separate these two worlds.</p>
<p><strong>Detlef Nauck:</strong> Focus on the data, not the models. That’s what’s important to build AI.</p>
<div class="article-btn">
<p><a href="../../../../../the-pulse/index.html">Discover more The Pulse</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2023 Royal Statistical Society
</dd>
</dl>
<p>Images by <a href="https://cream3d.com/">Wes Cockx</a> &amp; <a href="https://deepmind.google/discover/visualising-ai/">Google DeepMind</a> / <a href="https://www.betterimagesofai.org">Better Images of AI</a> / AI large language models / <a href="https://creativecommons.org/licenses/by/4.0/">Licenced by CC-BY 4.0</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2023. “Evaluating artificial intelligence: How data science and statistics can make sense of AI models.” Real World Data Science, December 6, 2023. <a href="https://realworlddatascience.net/the-pulse/posts/2023/12/06/ai-fringe.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>


</section>

 ]]></description>
  <category>AI</category>
  <category>Large language models</category>
  <category>Accountability</category>
  <category>Regulation</category>
  <category>Metrics</category>
  <category>Events</category>
  <guid>https://realworlddatascience.net/the-pulse/posts/2023/12/06/ai-fringe.html</guid>
  <pubDate>Wed, 06 Dec 2023 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/posts/2023/12/06/images/llm-3d-shapes.png" medium="image" type="image/png" height="105" width="144"/>
</item>
<item>
  <title>How data science and statistics can shape the UK’s AI strategy</title>
  <dc:creator>Brian Tarran</dc:creator>
  <link>https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/10/30/ai-conf-panel.html</link>
  <description><![CDATA[ 





<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/7aZrkQIComM?si=7efQPy5m3ZCxe4sg" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
<section id="about-the-panelists" class="level2">
<h2 class="anchored" data-anchor-id="about-the-panelists">About the panelists</h2>
<p><strong>Andrew Garrett</strong> (chair) is president of the Royal Statistical Society. He is executive vice president of scientific operations at the clinical research organisation ICON plc, where he is responsible for the strategic direction and operational delivery of a range of clinical trial services. Having worked extensively in the area of rare diseases, he has held various biostatistics managerial positions in the pharmaceutical industry, including vice president of biostatistics, medical writing and regulatory affairs at Quintiles (now IQVIA).</p>
<p><strong>Peter Wells</strong> is a technologist, who accidentally started a second career in public policy. He has both worked on AI policy and helped design AI-enabled services. After 20 years in the telecoms industry, he found himself spending 2014 developing digital government policy for the Labour Party. Since then he has worked with multiple governments and organisations including the Open Data Institute, Projects by IF, Google, Meta and the Government Digital Service.</p>
<p><strong>Maxine Setiawan</strong> is a data scientist specialising in AI and data risk and trusted AI in EY UK&amp;I. She works to help clients from various industries assess and manage risks from analytics and AI systems, and implement AI governance to ensure AI systems are built on principles of fairness, accountability, and trustworthiness. She combines her socio-technical background with an MSc in Social Data Science from the University of Oxford, and her experience working in data science within consulting firms.</p>
<p><strong>Sophie Carr</strong> is chair of the <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2022/10/18/meet-the-team.html">Real World Data Science editorial board</a> and is the founder and owner of Bays Consulting, a data science company. Having trained as an aeronautical engineer, Sophie completed her PhD in Bayesian analysis part time whilst she worked and, following redundancy, founded her own company. She is the VP for education and statistical literacy at the RSS and sits on the executive committees of the Academy for Mathematical Sciences and the International Centre for Mathematical Sciences. She is also currently <a href="https://ima.org.uk/12382/worlds-most-interesting-mathematician-2019-dr-sophie-carr/">the world’s most interesting mathematician</a>.</p>
<p><strong>Chris Nemeth</strong> is a professor of statistics at Lancaster University. His primary research area is in probabilistic machine learning and computational statistics. He holds an EPSRC-funded Turing AI fellowship on Probabilistic Algorithms for Scalable and Computable Approaches to Learning (PASCAL), and through his fellowship he works closely with partners including Shell, Tesco, Elsevier, Microsoft Research and The Alan Turing Institute. He is chair of the <a href="https://rss.org.uk/membership/rss-groups-and-committees/sections/statistical-computing/">Royal Statistical Society Section on Computational Statistics and Machine Learning</a>.</p>
<p><strong>Karen Tingay</strong> is a principal statistical methodologist at the Office for National Statistics where she specialises in natural language processing and in managing complex survey imputation. She established and heads up the Text Data Subcommunity, a large network of public sector analysts to build capability and best practice guidance in managing and analysing unstructured text data, on behalf of the Government Data Science Community. She sits on several cross-government and international working groups on responsible use of generative AI.</p>
<div class="article-btn">
<p><a href="../../../../../../the-pulse/editors-blog/index.html">Back to Editors’ blog</a></p>
</div>
<div class="further-info">
<div class="grid">
<div class="g-col-12 g-col-md-6">
<dl>
<dt>Copyright and licence</dt>
<dd>
© 2023 Royal Statistical Society
</dd>
</dl>
<p><a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> <img style="height:22px!important;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"></a> This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) <a href="http://creativecommons.org/licenses/by/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"> International licence</a>.</p>
</div>
<div class="g-col-12 g-col-md-6">
<dl>
<dt>How to cite</dt>
<dd>
Tarran, Brian. 2023. “How data science and statistics can shape the UK’s AI strategy.” Real World Data Science, October 30, 2023. <a href="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/10/30/ai-conf-panel.html">URL</a>
</dd>
</dl>
</div>
</div>
</div>


</section>

 ]]></description>
  <category>AI</category>
  <category>Large language models</category>
  <category>Events</category>
  <guid>https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/10/30/ai-conf-panel.html</guid>
  <pubDate>Mon, 30 Oct 2023 00:00:00 GMT</pubDate>
  <media:content url="https://realworlddatascience.net/the-pulse/editors-blog/posts/2023/10/30/images/panel.png" medium="image" type="image/png" height="105" width="144"/>
</item>
</channel>
</rss>
