Case studies are a core feature of the Real World Data Science platform. Our case studies are designed to show how data science is used to solve real-world problems in business, public policy and beyond.
A good case study will be a source of information, insight and inspiration for each of our target audiences:
- Practitioners will learn from their peers – whether by seeing new techniques applied to common problems, or familiar techniques adapted to unique challenges.
- Leaders will see how different data science teams work, the mix of skills and experience in play, and how the components of the data science process fit together.
- Students will enrich their understanding of how data science is applied, how data scientists operate, and what skills they need to hone to succeed in the workplace.
Structure
Case studies should follow the structure below. It is not necessary to use the section headings we have provided – creativity and variety are encouraged. However, the areas outlined under each section heading should be covered in all submissions.
- The problem/challenge
  - Summarise the project and its relevance to your organisation’s needs, aims and ambitions.
- Goals
  - Specify what exactly you sought to achieve with this project.
- Background
  - Explain more about your organisation and your team’s work leading up to this project. Introduce audiences more generally to the type of problem/challenge you faced, particularly if it may also be experienced by organisations working in different sectors and industries.
- Approach
  - Describe how you turned the organisational problem/challenge into a task that could be addressed by data science. Explain how you proposed to tackle the problem, including an introduction, explanation and (possibly) a demonstration of the method, model or algorithm used. (NB: If you have a particular interest and expertise in the method, model or algorithm employed, including the history and development of the approach, please consider writing an Explainer article for us.) Discuss the pros and cons, strengths and limitations of the approach.
- Implementation
  - Walk audiences through the implementation process. Discuss any challenges you faced, the ethical questions you needed to ask and answer, and how you tested the approach to ensure that outcomes would be robust, unbiased, good quality, and aligned with the goals you set out to achieve.
- Impact
  - How successful was the project? Did you achieve your goals? How has the project benefited your organisation? How has the project benefited your team? Does it inform or pave the way for future projects?
- Learnings
  - What are your key takeaways from the project? Are there lessons that you can apply to future projects, or are there learnings for other data scientists working on similar problems/challenges?
Advice and recommendations
You do not need to divulge the detailed inner workings of your organisation. Audiences are mostly interested in understanding the general use case and the problem-solving process you went through, to see how they might apply the same approach within their own organisations.
Goals can be defined quite broadly. There’s no expectation that you set out your organisation’s short- or long-term targets. Instead, audiences need to know enough about what you want to do so they can understand what motivates your choice of approach.
Use toy examples and synthetic data to good effect. We understand that – whether for commercial, legal or ethical reasons – it can be difficult or impossible to share real data in your case studies, or to describe the actual outputs of your work. However, there are many ways to share learnings and insights without divulging sensitive information. For example, a blog post from Lyft uses hypotheticals, mathematical notation and synthetic data to explain the company’s approach to causal forecasting without revealing actual KPIs or data.
People like to experiment, so encourage them to do so. Our platform allows you to embed code and to link that code to interactive coding environments like Google Colab. So if, for example, you want to explain a technique like bootstrapping, why not provide a code block so that audiences can run a bootstrapping simulation themselves?
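To give a sense of what such an embedded block might look like, here is a minimal sketch of a percentile bootstrap for the mean, written in Python using only the standard library. The function name `bootstrap_ci`, its parameters and the synthetic sample are assumptions made for this illustration; they are not part of the platform or of any particular case study.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic of `sample`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so readers get repeatable results
    n = len(sample)
    # Resample with replacement (same size as the original sample) and
    # recompute the statistic on each resample.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Read off the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic data: 200 draws from a normal distribution (illustrative only).
data_rng = random.Random(42)
sample = [data_rng.gauss(50, 10) for _ in range(200)]

low, high = bootstrap_ci(sample)
print(f"sample mean: {statistics.mean(sample):.2f}")
print(f"95% bootstrap CI: ({low:.2f}, {high:.2f})")
```

Readers can change `n_resamples`, swap `statistics.mean` for `statistics.median`, or replace the synthetic sample with their own numbers to see how the interval responds.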
Leverage links. You can’t be expected to explain or cover every detail in one case study, so feel free to point audiences to other sources of information that can enrich their understanding: blogs, videos, journal articles, conference papers, etc.