Course Catalogue
Data Visualisation with ggplot2
Want to learn how to effectively visualise your data in R using the elegant {ggplot2} package? With {ggplot2} it’s easy to customise everything from plot layouts and themes to scales, colours, and more! This course will comprehensively take you through basic plot types such as bar and line charts as well as cover more advanced topics such as interactive graphics with {plotly}.
- Programming Level: Intermediate
- Type: Analytics
Data Wrangling in the Tidyverse
If you work with data, you probably spend a lot of time cleaning it and wrangling it into the correct shape. This course will show you how you can use R to efficiently clean and wrangle your data into a format that’s ready for analysis. You will learn about the Tidyverse, what tidy data really is, and how to practically achieve it with packages such as {dplyr}, {tidyr}, {lubridate} and {forcats}.
- Programming Level: Foundation
- Type: Programming
From Nothing to Gold… Productionising with Databricks using the Medallion Architecture
This session is a morning workshop at our upcoming AI in Production 2026 conference, taking place on 4–5 June 2026 in Newcastle upon Tyne, UK. Join us at The Catalyst for a hands-on, practical learning experience focused on applying AI and machine-learning techniques in real-world production settings. Visit the conference website for full details and registration. TICKETS CAN NOW BE BOOKED ON EVENTBRITE!
- Programming Level: Intermediate
- Type: Programming
Improving your workflow with Positron and Claude
This session is a morning workshop at our upcoming AI in Production 2026 conference, taking place on 4–5 June 2026 in Newcastle upon Tyne, UK. Join us at The Catalyst for a hands-on, practical learning experience focused on applying AI and machine-learning techniques in real-world production settings. Visit the conference website for full details and registration. TICKETS CAN NOW BE BOOKED ON EVENTBRITE!
- Programming Level: Intermediate
- Type: Programming
Introduction to Machine Learning Operations
In this course we will train an example Machine Learning model using the scikit-learn library, then use open source tools to explore the steps involved in taking the model to production. Along the way we will cover model versioning using pins, model deployment using FastAPI, and good practices for monitoring model performance as the data evolves over time. We will finish with some considerations for automated reporting of model outputs and standardising your Machine Learning workflows using cloud platforms like AWS and Databricks.
- Programming Level: Intermediate
- Type: Stats/ML
Introduction to Python
Python is a general-purpose programming language popular among data scientists and statisticians. In this one-day introductory course, participants will learn to import, summarise and visualise their data. At each step, we avoid using “magic code”, and stress the importance of understanding what Python is doing.
- Programming Level: Foundation
- Type: Programming
Introduction to R
In this course, you’ll explore the versatility of R, a powerful language for statistical computing and graphics. Discover the benefits of using R and get started with the basics. Gain confidence with the user-friendly RStudio interface and learn fundamental R concepts. You’ll also dive into the Tidyverse, a collection of packages for data storage, visualisation, and manipulation. This course offers a solid foundation to kickstart your journey with R!
- Programming Level: Foundation
- Type: Programming
Introduction to Shiny
Do you want to provide interactive visualisation and data exploration features for users who do not have R and data science skills? Discover how easy it can be to use R and {shiny} to create your own apps and dashboards for exploring data without relying on web development or external BI tools. We will show you various examples of input widgets and outputs to display tables and visualisations.
- Programming Level: Intermediate
- Type: Reporting
LLM-Driven Applications with R and Python
Learn how to work with large language models (LLMs) using R and Python. This course will start with basic concepts like sending user prompts and receiving a structured output, before moving onto more advanced topics like building LLM-powered web applications and configuring a knowledge store for retrieval-augmented generation (RAG). Throughout, we will emphasise important considerations for security, safety and responsible use of AI.
- Programming Level: Intermediate
- Type: Programming, Stats/ML
Programming with R
The benefit of using a programming language such as R is that we can automate repetitive tasks. This course covers the fundamental techniques such as functions, for loops and conditional expressions. By the end of this course, you will understand what these techniques are and when to use them. This is a one-day intensive course on R.
- Programming Level: Intermediate
- Type: Programming
Prompt Craft & AI Integration: Building LLM-Driven Workflows in R and Python
This session is a morning workshop at our upcoming AI in Production 2026 conference, taking place on 4–5 June 2026 in Newcastle upon Tyne, UK. Join us at The Catalyst for a hands-on, practical learning experience focused on applying AI and machine-learning techniques in real-world production settings. Visit the conference website for full details and registration. TICKETS CAN NOW BE BOOKED ON EVENTBRITE!
- Programming Level: Intermediate
- Type: Programming
Python Best Practices
So you can write code? Great. But can you write code which is easy to read, simple to maintain, and reproducible? Under the pressure of deadlines even the best of us can fall victim to bad practices. In this course we motivate the importance of good practices, and show how we can make best practices second nature by incorporating them into our normal workflow.
- Programming Level: Intermediate
- Type: Programming
R Best Practices
So you can write code? Great. But can you write code which is easy to read, simple to maintain, and reproducible? Under the pressure of deadlines even the best of us can fall victim to bad practices. In this course we motivate the importance of good practices, and show how we can make best practices second nature by incorporating them into our normal workflow.
- Programming Level: Intermediate
- Type: Programming
Reporting with R Markdown
Do you want to dynamically create static or interactive documents? Do you want your reports to automatically update when the data changes? Then this session is for you! R Markdown is easy to use and allows for dynamic report generation. Whether you are hoping to generate HTML, PDF or Microsoft Word documents, or even slides for a presentation, R Markdown can be tailored to your needs.
- Programming Level: Intermediate
- Type: Reporting
Self-hosted LLMs: Running Your Own Inference Infrastructure
This session is a morning workshop at our upcoming AI in Production 2026 conference, taking place on 4–5 June 2026 in Newcastle upon Tyne, UK. Join us at The Catalyst for a hands-on, practical learning experience focused on applying AI and machine-learning techniques in real-world production settings. Visit the conference website for full details and registration. TICKETS CAN NOW BE BOOKED ON EVENTBRITE!
- Programming Level: Intermediate
- Type: Programming
Shiny Meets LLMs: Smarter App Experiences
This session is a morning workshop at our upcoming AI in Production 2026 conference, taking place on 4–5 June 2026 in Newcastle upon Tyne, UK. Join us at The Catalyst for a hands-on, practical learning experience focused on applying AI and machine-learning techniques in real-world production settings. Visit the conference website for full details and registration. TICKETS CAN NOW BE BOOKED ON EVENTBRITE!
- Programming Level: Intermediate
- Type: Programming
The Power of Databricks Genie Rooms… Data Discovery and Questions with Minimal Effort
This session is a morning workshop at our upcoming AI in Production 2026 conference, taking place on 4–5 June 2026 in Newcastle upon Tyne, UK. Join us at The Catalyst for a hands-on, practical learning experience focused on applying AI and machine-learning techniques in real-world production settings. Visit the conference website for full details and registration. TICKETS CAN NOW BE BOOKED ON EVENTBRITE!
- Programming Level: Intermediate
- Type: Programming
Why Use R?
This is a half-day session that gives an overview of where and how R is used. Using a combination of lecture-based case studies and hands-on practicals, we’ll cover some of the latest developments in the R world. This course is intended to be interactive and is aimed at organisations that are considering whether or not to move to R.
- Programming Level: Foundation
- Type: Management
Advanced Concepts in Shiny
Take your interactive {shiny} skills to the next level by creating more robust, responsive and maintainable applications. In this course, we’ll visit more advanced topics that can be used to improve the experience for both those producing the apps and those using them. Subjects will cover: additional ways to react to and validate user inputs; restructuring your app with modules; and an introduction to testing your {shiny} apps.
- Programming Level: Advanced
- Type: Reporting
Efficient Data Science in Python
In recent years Python has exploded onto the data-science scene, and with it has come a great swathe of data-oriented packages. However, as easy as these packages make analysis, using these tools efficiently requires much more know-how. By the end of this course participants will be able to locate and address bottlenecks in their data-science workflows, using a number of different techniques and tools.
- Programming Level: Intermediate
- Type: Programming
Managing Packages with Posit Package Manager
Package management is made simple with Posit Package Manager (PPM): manage your entire organisation’s packages from a single interface. PPM enables offline access to CRAN, PyPI, and Bioconductor via binaries, making installation of packages way faster for users, and consistent across your organisation. PPM also allows users to time-travel to previous versions of the package repository when needed. Allow us to introduce your data scientists to the reliability and flexibility of PPM.
- Programming Level: Intermediate
- Type: Management
Programming with Python
The benefit of using a programming language such as Python is that we can automate repetitive tasks. This course covers the fundamental techniques such as functions, for loops and conditional expressions. By the end of this course, you will understand what these techniques are and how they can be applied to solve real-world data wrangling tasks.
- Programming Level: Intermediate
- Type: Programming
Reporting with Quarto
Do you create interactive documents that always need to be updated when the data changes? Then this course is for you. In this course you will learn how to use Quarto to create high quality, dynamic, fully reproducible documents. Quarto is a multi-language open source publishing tool that allows for the creation of dynamic content with Python, R, Julia and Observable.
- Programming Level: Intermediate
- Type: Reporting
Shiny for Python
Do you want to provide interactive visualisation and data exploration features for users who do not have Python and data science skills? Discover how easy it can be to use Python and Shiny to create your own apps and dashboards for exploring data without relying on web development or external BI tools. We will show you various examples of input widgets and outputs to display tables and visualisations.
- Programming Level: Intermediate
- Type: Reporting
Machine Learning with Tidymodels
Machine learning is the process of applying statistical techniques to gain systematic information about a quantity of interest. We will be specifically focusing on how we can use the {tidymodels} suite of packages to implement these techniques. We cover key reasons for model fitting, such as prediction and inference, on quantitative and qualitative responses.
- Programming Level: Intermediate
- Type: Stats/ML
Git for Organisations
Git is perfect for working collaboratively. However, when you’re working together in an organisation, it’s important to have rules and processes so you all know how to work together. We’ll share what works for us at Jumping Rivers, and then guide you through the process of coming up with your own rules and processes for git, such as choosing a branching strategy and formalising your code review.
- Programming Level: Intermediate
- Type: Version Control
Introduction to Git
When working on data analysis projects, version control is essential for tracking project progress and assisting project collaboration. During this course we will show you multiple ways to integrate version control into your project with git. You will gain an understanding of how to use online code sharing websites such as GitHub / GitLab, along with the best practices while doing so.
- Programming Level: Foundation
- Type: Version Control
Introduction to SQL
The Structured Query Language (SQL) defines a standard for communicating with a relational database. In this half-day introductory course, participants will learn the basic SQL syntax for data extraction, filtering and insertion. We will then discuss some considerations for working with databases on the cloud, and finish by learning basic techniques for joining tables.
The course can be taken either independently or as a precursor to our Intro to SQL with R and Intro to SQL with Python courses.
- Programming Level: Foundation
- Type: Programming
Object-Oriented Programming in Python
Object-oriented programming is the dominant programming paradigm in Python and can be used to improve the structure of your data science code. Here, we will learn how to model real-world entities using classes, how to create class instances, and how to attach data and behaviour to these objects. The main ideas of object-oriented design (inheritance, polymorphism, encapsulation, abstraction) are covered, and you will learn how to extend existing classes from well-known data science packages.
- Programming Level: Advanced
- Type: Programming
Big Data Analytics with PySpark
Tools such as pandas offer a powerful way to manipulate and analyse data in Python. However, if you need to process a large dataset, a single machine might not cut it. Apache Spark is an analytics engine for processing large volumes of data on a computer cluster. It comes with a Python interface, PySpark, enabling those familiar with Python to easily get started with Spark for big data. This course will introduce data science at scale with the PySpark DataFrames API and Spark MLlib.
- Programming Level: Intermediate
- Type: Programming
Data Exploration with Tableau
Tableau is more than just a simple data visualisation tool. It also gives people the capability to manipulate multiple data sources, create custom charts, build predictive models, and turn their plots into interactive dashboards and presentations. Designed for people with some experience of Tableau, this course will showcase what Tableau can do beyond basic data visualisation.
- Programming Level: Intermediate
- Type: Reporting
Functional Programming with {purrr}
This is a one-day course on the {tidyverse} package, {purrr}. {purrr} is a very powerful package that gives great flexibility to analysts, by enhancing R’s functional programming toolkit. We will demonstrate how to use functions such as map(), map2() and pmap(), to iteratively map functions over multi-element objects like vectors and lists. Emphasis will also be placed on how we can manipulate list outputs and how this can be applied to our data.
- Programming Level: Foundation
- Type: Programming
Introduction to Tableau
Faster and more capable of handling larger datasets than Excel, Tableau is quickly becoming a valuable tool for individuals and organisations who want to leverage their data. It’s more user-friendly and simpler to learn than programming languages, but still allows a high level of customisation. This course is designed for people with no prior experience of Tableau, who want to get to grips with the basics of summarising and interactively visualising their data.
- Programming Level: Foundation
- Type: Reporting
Managing Requirements
Communicating is difficult, especially when combining technical and non-technical teams.
In this workshop we introduce methods for users to effectively communicate what they want from an application, while allowing for developers to specify how they provide it. We discuss how to prioritise and estimate features collaboratively in order to deliver the most value the soonest. By providing a means to understand each team’s needs, we highlight the role that each person intuitively plays.
- Programming Level: Foundation
- Type: Management
Responsive Web Design in Shiny
Shiny makes it easy to view data by creating an interactive webpage. But the way we access web content has evolved ever since smartphone and tablet devices became affordable. Now responsive web design is used across the internet to ensure that webpages will dynamically arrange their content to best suit the device’s dimensions. This course discusses how to design responsive Shiny apps that work effectively on displays of all shapes and sizes.
- Programming Level: Advanced
- Type: Reporting
Text Mining in R
Want to learn how to get the most out of text data? Today, a lot of data produced contains unstructured text, which can be difficult to transform and analyse without the correct knowledge and tools. In this course you will learn the basics of manipulating and transforming text data as well as how to extract meaning and sentiment in R, using packages such as {stringr} and {tidytext}.
- Programming Level: Intermediate
- Type: Programming
Web Accessibility in Shiny
A Shiny app won’t meet Web Content Accessibility Guidelines (WCAG) standards right out of the box. There are a few things you’ll need to consider before your Shiny app is accessible to all. In this course, we’ll demonstrate some common accessibility requirements, the assistive technologies that may be used, and the design adjustments we can make to accommodate those needs.
- Programming Level: Advanced
- Type: Reporting
Introduction to Bayesian Inference using RStan
Despite the promise of big data, inferences are often limited by its systematic structure. Only by carefully modelling this structure can we take full advantage of the data. Stan is a platform for facilitating this modelling, providing an expressive modelling language to implement state-of-the-art algorithms, to draw subsequent Bayesian inferences. This course will teach participants how to interface with Stan through R!
- Programming Level: Intermediate
- Type: Stats/ML, Programming
Introduction to Bayesian Inference using PyStan
Despite the promise of big data, inferences are often limited by its systematic structure. Only by carefully modelling this structure can we take full advantage of the data. Stan is a platform for facilitating this modelling, providing an expressive modelling language to implement state-of-the-art algorithms, to draw subsequent Bayesian inferences.
The course will teach participants how to interface with Stan through Python!
- Programming Level: Intermediate
- Type: Stats/ML, Programming
Rust Programming
Explore the power and efficiency of Rust, a modern language designed for speed, security, and low-level memory access. This course is ideal for experienced developers looking to enhance their skills and productivity by leveraging Rust’s unique capabilities.
- Programming Level: Intermediate
- Type: Programming
Data Visualisation with Python
Python has a number of packages for the effective creation of graphics to communicate your data insights. This course will examine two popular libraries for creating static 2D plots: Matplotlib and Seaborn. During the training session, we’ll cover plotting basics and customisation of figures with Matplotlib, before moving onto complex statistical visualisations with Seaborn.
- Programming Level: Intermediate
- Type: Analytics
Statistical Modelling with R
From the very beginning, R was designed for statistical modelling. Out of the box, R makes standard statistical techniques easy. This course covers the fundamental modelling techniques. We begin the day by revising hypothesis tests, before moving onto ANOVA tables and regression analysis. The class ends by looking at more sophisticated methods such as clustering and principal components analysis (PCA).
- Programming Level: Intermediate
- Type: Stats/ML
Introduction to Posit Workbench
Posit Workbench takes all of the features you love about the RStudio IDE and puts them in the cloud. Workbench also enables real-time collaboration, support for both R and Python development environments and secure, concurrent sessions. Our trainers are excited to introduce your organisation to a whole new world with Posit Workbench.
- Programming Level: Foundation
- Type: Management
Spatial Data Analysis with R
As spatial data sets get larger, more sophisticated software needs to be harnessed for their analysis. R is now a widely used open source software platform for working with spatial data thanks to its powerful analysis and visualisation packages. The focus of this course is providing participants with the understanding needed to apply R’s powerful suite of geographical tools to their own problems.
- Programming Level: Advanced
- Type: Analytics
Advanced Machine Learning with Tidymodels
A course that builds on the material covered in our Machine Learning with Tidymodels course. We take a look at how we can fit linear discriminant analysis (LDA) models using {discrim}, assessing model reliability using V-fold cross validation, pre-processing, tree-based models & more. If you wish to explore the abundance of model fitting techniques {tidymodels} has to offer, then this course is certainly for you!
- Programming Level: Advanced
- Type: Stats/ML
An Introduction to SQL with R
Using databases is a fundamental part of a data scientist’s role. The main focus of this training course is to introduce SQL databases, write your first SQL queries, and show how R can be used to retrieve and manipulate data stored in a relational database. The course uses both the {DBI} and {dbplyr} packages.
We use the PostgreSQL database as an example for public courses. For in-house training, we are happy to adapt the course to match your database requirements.
- Programming Level: Intermediate
- Type: Programming
Building an R Package
This is a one-day intensive course on building a package in R. The focus will be on getting a working R package ready for distribution. This includes automating package setup and consistent package structure with {usethis}. You will be able to use the {testthat} workflow to create tests for packages.
- Programming Level: Advanced
- Type: Programming
Efficient R Programming
This course is for anyone who wants to make their R code faster to type, faster to run and more scalable. During the course, we’ll cover the main R sins (and how to avoid them), dabble with hardware, look at running in parallel and think about efficient R data structures. This course should be useful to people with a range of skill levels.
- Programming Level: Advanced
- Type: Programming
Introduction to Docker
This is a one-day Docker course aimed at R users. Docker is a popular platform for packaging, deploying, and running applications. These applications run in containers. Crucially, this container can be used on any system: a developer’s laptop, systems on premises, or in the cloud. Applications are packaged as images that contain everything needed to run them: code, libraries, and configuration.
- Programming Level: Intermediate
- Type: Programming
Introduction to Posit Connect
Posit Connect (formerly RStudio Connect) is an enterprise-grade publishing platform which gives you, the user, the ability to easily share code, documents and applications with collaborators, colleagues and clients. By the end of this course participants will be able to deploy their content to Posit Connect, manage its access and settings, and tune how this content scales with usage.
- Programming Level: Intermediate
- Type: Management
Introduction to SQL with Python
Using databases is a fundamental part of a data scientist’s role. This training course introduces SQL databases and the SQL command syntax, and shows how Python can be used to retrieve and manipulate data held in a relational database. The course also discusses how SQLAlchemy can be used to define and interact with databases using object-oriented Python code.
We use a PostgreSQL database as an example, and communicate with this using a psycopg2 connection.
- Programming Level: Intermediate
- Type: Programming
Machine Learning with Python
Python (along with R) has become the dominant language in machine learning and data science. This course will equip you with the knowledge and tools to undertake a variety of tasks in a standard machine learning pipeline. We stress the importance of data preparation, both in terms of data standardisation and feature selection, before tackling model building.
We run a separate course on using TensorFlow with Python.
- Programming Level: Intermediate
- Type: Stats/ML
Object Oriented Programming in R
The training course will cover R object-oriented programming techniques. We’ll discuss what OOP is and the different varieties within R. Beginning with the popular S3 and S4 OOP frameworks, we’ll finish with the new {R6} package that is used extensively in Shiny applications. By the end of the course, participants will be able to use OOP within their own code.
- Programming Level: Advanced
- Type: Programming
Python and Tensorflow
Deep learning is a cutting-edge machine learning technique for classification and regression. In the past few years, it has produced state-of-the-art results in fields such as image classification, natural language processing, bioinformatics and robotics. This course will cover the main ideas of deep learning, and how to implement it in practice with TensorFlow: a software framework for efficient and scalable deep learning.
- Programming Level: Intermediate
- Type: Programming
PyTorch with Python
Python (along with R) has become the dominant language in machine learning and data science. PyTorch is an open-source machine learning library for Python, based on Torch, used for applications such as natural language processing. It is primarily developed by Facebook’s artificial-intelligence research group, and Uber’s “Pyro” software for probabilistic programming is built on it.
- Programming Level: Intermediate
- Type: Stats/ML
Tidy Evaluation in R
This is a one-day course covering methods for tidy evaluation in R. We introduce the {rlang} package as a way of passing variables from a data set into a function. Furthermore, we cover environments and function evaluation in R, to help you understand how the tools in {rlang} work under the hood.
- Programming Level: Advanced
- Type: Programming
Time Series Analysis with R
Predicting the future is a tough problem. Time series analysis makes it possible to assess whether or not predictions are possible and, if they are, build a model which can generate informed predictions for the future with realistic estimates of uncertainty. This training course will introduce participants to the packages in the Tidyverts.
The best qualification of a prophet is to have a good memory – George Savile
- Programming Level: Intermediate
- Type: Stats/ML, Analytics