Course Catalogue

Data Visualisation with ggplot2

Want to learn how to effectively visualise your data in R using the elegant {ggplot2} package? With {ggplot2} it’s easy to customise everything from plot layouts and themes to scales, colours, and more! This course will comprehensively take you through basic plot types such as bar and line charts as well as cover more advanced topics such as interactive graphics with {plotly}.

Programming Level: Intermediate
Type: Analytics

Data Wrangling in the Tidyverse

If you work with data, you probably spend a lot of time cleaning it and wrangling it into the correct shape. This course will show you how you can use R to efficiently clean and wrangle your data into a format that’s ready for analysis. You will learn about the Tidyverse, what tidy data really is, and how to practically achieve it with packages such as {dplyr}, {tidyr}, {lubridate} and {forcats}.

Programming Level: Foundation
Type: Programming

From Nothing to Gold… Productionising with Databricks using the Medallion Architecture

This session is a morning workshop at our upcoming AI in Production 2026 conference, taking place on 4–5 June 2026 in Newcastle upon Tyne, UK. Join us at The Catalyst for a hands-on, practical learning experience focused on applying AI and machine-learning techniques in real-world production settings. Visit the conference website for full details and registration. Tickets can now be booked on Eventbrite.

Programming Level: Intermediate
Type: Programming

Improving your workflow with Positron and Claude

Programming Level: Intermediate
Type: Programming

Introduction to Machine Learning Operations

In this course we will train an example Machine Learning model using the scikit-learn library, then use open source tools to explore the steps involved in taking the model to production. Along the way we will cover model versioning using pins, model deployment using FastAPI, and good practices for monitoring model performance as the data evolves over time. We will finish with some considerations for automated reporting of model outputs and standardising your Machine Learning workflows using cloud platforms like AWS and Databricks.

Programming Level: Intermediate
Type: Stats/ML

Introduction to Python

Python is a general-purpose programming language popular among data scientists and statisticians. In this one-day introductory course, participants will learn to import, summarise and visualise their data. At each step, we avoid using “magic code”, and stress the importance of understanding what Python is doing.

Programming Level: Foundation
Type: Programming

Introduction to R

In this course, you’ll explore the versatility of R, a powerful language for statistical computing and graphics. Discover the benefits of using R and get started with the basics. Gain confidence with the user-friendly RStudio interface and learn fundamental R concepts. You’ll also dive into the Tidyverse, a collection of packages for data storage, visualisation, and manipulation. This course offers a solid foundation to kickstart your journey with R!

Programming Level: Foundation
Type: Programming

Introduction to Shiny

Do you want to provide interactive visualisation and data exploration features for users who do not have R and data science skills? Discover how easy it can be to use R and {shiny} to create your own apps and dashboards for exploring data without relying on web development or external BI tools. We will show you various examples of input widgets and outputs to display tables and visualisations.

Programming Level: Intermediate
Type: Reporting

LLM-Driven Applications with R and Python

Learn how to work with large language models (LLMs) using R and Python. This course will start with basic concepts like sending user prompts and receiving a structured output, before moving onto more advanced topics like building LLM-powered web applications and configuring a knowledge store for retrieval-augmented generation (RAG). Throughout, we will emphasise important considerations for security, safety and responsible use of AI.

Programming Level: Intermediate
Type: Programming, Stats/ML

Programming with R

The benefit of using a programming language such as R is that we can automate repetitive tasks. This course covers the fundamental techniques such as functions, for loops and conditional expressions. By the end of this course, you will understand what these techniques are and when to use them. This is a one-day intensive course on R.

Programming Level: Intermediate
Type: Programming

Prompt Craft & AI Integration: Building LLM-Driven Workflows in R and Python

Programming Level: Intermediate
Type: Programming

Python Best Practices

So you can write code? Great. But can you write code that is easy to read, simple to maintain, and reproducible? Under the pressure of deadlines, even the best of us can fall victim to bad practices. In this course we motivate the importance of good practices, and show how to make them second nature by incorporating them into our normal workflow.

Programming Level: Intermediate
Type: Programming

R Best Practices

Programming Level: Intermediate
Type: Programming

Reporting with R Markdown

Do you want to dynamically create static or interactive documents? Do you want your reports to automatically update when the data changes? Then this session is for you! R Markdown is easy to use and allows for dynamic report generation. Whether you are hoping to generate HTML, PDF or Microsoft Word-like documents, or even slides for a presentation, R Markdown can be tailored to your needs.

Programming Level: Intermediate
Type: Reporting

Self-hosted LLMs: Running Your Own Inference Infrastructure

Programming Level: Intermediate
Type: Programming

Shiny Meets LLMs: Smarter App Experiences

Programming Level: Intermediate
Type: Programming

The Power of Databricks Genie Rooms… Data Discovery and Questions with Minimal Effort

Programming Level: Intermediate
Type: Programming

Why Use R?

This is a half-day session that gives an overview of where and how R is used. Using a combination of lecture-based case studies and hands-on practicals, we’ll cover some of the latest developments in the R world. This course is intended to be interactive and is aimed at organisations that are considering why (or why not) to move to R.

Programming Level: Foundation
Type: Management

Advanced Concepts in Shiny

Take your interactive {shiny} skills to the next level by creating more robust, responsive and maintainable applications. In this course, we’ll visit more advanced topics that can be used to improve the experience for both those producing the apps and those using them. Subjects will cover: additional ways to react to and validate user inputs; restructuring your app with modules; and an introduction to testing your {shiny} apps.

Programming Level: Advanced
Type: Reporting

Efficient Data Science in Python

In recent years Python has exploded onto the data-science scene, and with it has come a great swathe of data-oriented packages. However, as easy as these packages make analysis, using these tools efficiently requires much more know-how. By the end of this course participants will be able to locate and address bottlenecks in their data-science workflows, using a number of different techniques and tools.

Programming Level: Intermediate
Type: Programming

Managing Packages with Posit Package Manager

Package management is made simple with Posit Package Manager (PPM): manage your entire organisation’s packages from a single interface. PPM enables offline access to CRAN, PyPI, and Bioconductor via binaries, making installation of packages way faster for users, and consistent across your organisation. PPM also allows users to time-travel to previous versions of the package repository when needed. Allow us to introduce your data scientists to the reliability and flexibility of PPM.

Programming Level: Intermediate
Type: Management

Programming with Python

The benefit of using a programming language such as Python is that we can automate repetitive tasks. This course covers the fundamental techniques such as functions, for loops and conditional expressions. By the end of this course, you will understand what these techniques are and how they can be applied to solve real-world data wrangling tasks.

Programming Level: Intermediate
Type: Programming

Reporting with Quarto

Do you create interactive documents that always need to be updated when the data changes? Then this course is for you. In this course you will learn how to use Quarto to create high quality, dynamic, fully reproducible documents. Quarto is a multi-language open source publishing tool that allows for the creation of dynamic content with Python, R, Julia and Observable.

Programming Level: Intermediate
Type: Reporting

Shiny for Python

Do you want to provide interactive visualisation and data exploration features for users who do not have Python and data science skills? Discover how easy it can be to use Python and Shiny to create your own apps and dashboards for exploring data without relying on web development or external BI tools. We will show you various examples of input widgets and outputs to display tables and visualisations.

Programming Level: Intermediate
Type: Reporting

Machine Learning with Tidymodels

Machine learning is the process of applying statistical techniques to gain systematic information about a quantity of interest. We will be specifically focusing on how we can use the {tidymodels} suite of packages to implement these techniques. We cover key reasons for model fitting, such as prediction and inference, on quantitative and qualitative responses.

Programming Level: Intermediate
Type: Stats/ML

Git for Organisations

Git is perfect for working collaboratively. However, when you’re working together in an organisation, it’s important to have rules and processes so you all know how to work together. We’ll share what works for us at Jumping Rivers, and then guide you through the process of coming up with your own rules and processes for Git, such as choosing a branching strategy and formalising your code review.

Programming Level: Intermediate
Type: Version Control

Introduction to Git

When working on data analysis projects, version control is essential for tracking project progress and assisting collaboration. During this course we will show you multiple ways to integrate version control into your project with Git. You will gain an understanding of how to use online code-sharing websites such as GitHub and GitLab, along with the best practices for doing so.

Programming Level: Foundation
Type: Version Control

Introduction to SQL

The Structured Query Language (SQL) defines a standard for communicating with a relational database. In this half-day introductory course, participants will learn the basic SQL syntax for data extraction, filtering and insertion. We will then discuss some considerations for working with databases on the cloud, and finish by learning basic techniques for joining tables.

The course can be taken either independently or as a precursor to our Intro to SQL with R and Intro to SQL with Python courses.

Programming Level: Foundation
Type: Programming

Object-Oriented Programming in Python

Object-oriented programming is the dominant programming paradigm in Python and can be used to improve the structure of your data science code. Here, we will learn how to model real-world entities using classes, how to create class instances, and how to attach data and behaviour to these objects. The main ideas of object-oriented design (inheritance, polymorphism, encapsulation, abstraction) are covered, and you will learn how to extend existing classes from well-known data science packages.

Programming Level: Advanced
Type: Programming

Big Data Analytics with PySpark

Tools such as pandas offer a powerful way to manipulate and analyse data in Python. However, if you need to process a large dataset, a single machine might not cut it. Apache Spark is an analytics engine for processing large volumes of data on a computer cluster. It comes with a Python interface, PySpark, enabling those familiar with Python to easily get started with Spark for big data. This course will introduce data science at scale with the PySpark DataFrames API and Spark MLlib.

Programming Level: Intermediate
Type: Programming

Data Exploration with Tableau

Tableau is more than just a simple data visualisation tool. It also gives people the capability to manipulate multiple data sources, create custom charts, build predictive models, and turn their plots into interactive dashboards and presentations. Designed for people with some experience of Tableau, this course will showcase what Tableau can do beyond basic data visualisation.

Programming Level: Intermediate
Type: Reporting

Functional Programming with {purrr}

This is a one-day course on the {tidyverse} package {purrr}. {purrr} is a very powerful package that gives great flexibility to analysts by enhancing R’s functional programming toolkit. We will demonstrate how to use functions such as map(), map2() and pmap() to iteratively map functions over multi-element objects like vectors and lists. Emphasis will also be placed on how we can manipulate list outputs and how this can be applied to our data.

Programming Level: Foundation
Type: Programming

Introduction to Tableau

Faster and more capable of handling larger datasets than Excel, Tableau is quickly becoming a valuable tool for individuals and organisations who want to leverage their data. It’s more user-friendly and simpler to learn than programming languages, but still allows a high level of customisation. This course is designed for people with no prior experience of Tableau, who want to get to grips with the basics of summarising and interactively visualising their data.

Programming Level: Foundation
Type: Reporting

Managing Requirements

Communicating is difficult, especially when combining technical and non-technical teams.

In this workshop we introduce methods for users to effectively communicate what they want from an application, while allowing developers to specify how they provide it. We discuss how to prioritise and estimate features collaboratively in order to deliver the most value as early as possible. By providing a means to understand each team’s needs, we highlight the role that each person intuitively plays.

Programming Level: Foundation
Type: Management

Responsive Web Design in Shiny

Shiny makes it easy to view data by creating an interactive webpage. But the way we access web content has evolved ever since smartphone and tablet devices became affordable. Now responsive web design is used across the internet to ensure that webpages will dynamically arrange their content to best suit the device’s dimensions. This course discusses how to design responsive Shiny apps that work effectively on displays of all shapes and sizes.

Programming Level: Advanced
Type: Reporting

Text Mining in R

Want to learn how to get the most out of text data? Today, a lot of data produced contains unstructured text, which can be difficult to transform and analyse without the correct knowledge and tools. In this course you will learn the basics of manipulating and transforming text data as well as how to extract meaning and sentiment in R, using packages such as {stringr} and {tidytext}.

Programming Level: Intermediate
Type: Programming

Web Accessibility in Shiny

A Shiny app won’t meet Web Content Accessibility Guidelines (WCAG) standards right out of the box. There are a few things you’ll need to consider before your Shiny app is accessible to all. In this course, we’ll demonstrate some common accessibility requirements, the assistive technologies that may be used, and the design adjustments we can make to accommodate those needs.

Programming Level: Advanced
Type: Reporting

Introduction to Bayesian Inference using RStan

Despite the promise of big data, inferences are often limited by its systematic structure. Only by carefully modelling this structure can we take full advantage of the data. Stan is a platform for facilitating this modelling, providing an expressive modelling language to implement state-of-the-art algorithms, to draw subsequent Bayesian inferences. This course will teach participants how to interface with Stan through R!

Programming Level: Intermediate
Type: Stats/ML, Programming

Introduction to Bayesian Inference using PyStan

The course will teach participants how to interface with Stan through Python!

Programming Level: Intermediate
Type: Stats/ML, Programming

Rust Programming

Explore the power and efficiency of Rust, a modern language designed for speed, security, and low-level memory access. This course is ideal for experienced developers looking to enhance their skills and productivity by leveraging Rust’s unique capabilities.

Programming Level: Intermediate
Type: Programming

Data Visualisation with Python

Python has a number of packages for the effective creation of graphics to communicate your data insights. This course will examine two popular libraries for creating static 2D plots: Matplotlib and Seaborn. During the training session, we’ll cover plotting basics and customisation of figures with Matplotlib, before moving onto complex statistical visualisations with Seaborn.

Programming Level: Intermediate
Type: Analytics

Statistical Modelling with R

From the very beginning, R was designed for statistical modelling. Out of the box, R makes standard statistical techniques easy. This course covers the fundamental modelling techniques. We begin the day by revising hypothesis tests, before moving onto ANOVA tables and regression analysis. The class ends by looking at more sophisticated methods such as clustering and principal components analysis (PCA).

Programming Level: Intermediate
Type: Stats/ML

Introduction to Posit Workbench

Posit Workbench takes all of the features you love about the RStudio IDE and puts them in the cloud. Workbench also enables real-time collaboration, support for both R and Python development environments and secure, concurrent sessions. Our trainers are excited to introduce your organisation to a whole new world with Posit Workbench.

Programming Level: Foundation
Type: Management

Spatial Data Analysis with R

As spatial data sets get larger, more sophisticated software needs to be harnessed for their analysis. R is now a widely used open source software platform for working with spatial data thanks to its powerful analysis and visualisation packages. The focus of this course is providing participants with the understanding needed to apply R’s powerful suite of geographical tools to their own problems.

Programming Level: Advanced
Type: Analytics

Advanced Machine Learning with Tidymodels

This course builds on the material covered in our Machine Learning with Tidymodels course. We look at fitting linear discriminant analysis (LDA) models using {discrim}, assessing model reliability using V-fold cross-validation, pre-processing, tree-based models and more. If you wish to explore the abundance of model fitting techniques {tidymodels} has to offer, then this course is certainly for you!

Programming Level: Advanced
Type: Stats/ML

An Introduction to SQL with R

Using databases is a fundamental part of a data scientist’s role. The main focus of this training course is to introduce SQL databases, help you write your first SQL queries, and show how R can be used to retrieve and manipulate data stored in a relational database. The course uses both the {DBI} and {dbplyr} packages.

We use the PostgreSQL database as an example for public courses. For in-house training, we are happy to adapt the course to match your database requirements.

Programming Level: Intermediate
Type: Programming

Building an R Package

This is a one-day intensive course on building a package in R. The focus will be on getting a working R package ready for distribution. This includes automating package setup and consistent package structure with {usethis}. You will be able to use the {testthat} workflow to create tests for packages.

Programming Level: Advanced
Type: Programming

Efficient R Programming

This course is for anyone who wants to make their R code faster to type, faster to run and more scalable. During the course, we’ll cover the main R sins (and how to avoid them), dabble with hardware, look at running in parallel and think about efficient R data structures. This course should be useful to people with a range of skill levels.

Programming Level: Advanced
Type: Programming

Introduction to Docker

This is a one-day Docker course aimed at R users. Docker is a popular platform for packaging, deploying, and running applications. These applications run in containers. Crucially, a container can be run on any system: a developer’s laptop, systems on premises, or in the cloud. Applications are packaged as images that contain everything needed to run them: code, libraries, and configuration.

Programming Level: Intermediate
Type: Programming

Introduction to Posit Connect

Posit Connect (formerly RStudio Connect) is an enterprise-grade publishing platform which gives you, the user, the ability to easily share code, documents and applications with collaborators, colleagues and clients. By the end of this course participants will be able to deploy their content to Posit Connect, manage its access and settings, and tune how this content scales with usage.

Programming Level: Intermediate
Type: Management

Introduction to SQL with Python

Using databases is a fundamental part of a data scientist’s role. This training course introduces SQL databases and the SQL command syntax, and shows how Python can be used to retrieve and manipulate data held in a relational database. The course also discusses how SQLAlchemy can be used to define and interact with databases using object-oriented Python code.

We use a PostgreSQL database as an example, and communicate with this using a psycopg2 connection.

Programming Level: Intermediate
Type: Programming

Machine Learning with Python

Python (along with R) has become the dominant language in machine learning and data science. This course will equip you with the knowledge and tools to undertake a variety of tasks in a standard machine learning pipeline. We stress the importance of data preparation, both in terms of data standardisation and feature selection, before tackling model building.

We run a separate course on using TensorFlow with Python.

Programming Level: Intermediate
Type: Stats/ML

Object Oriented Programming in R

The training course will cover R object-oriented programming techniques. We’ll discuss what OOP is and the different varieties within R. Beginning with the popular S3 and S4 OOP frameworks, we’ll finish with the new {R6} package that is used extensively in Shiny applications. By the end of the course, participants will be able to use OOP within their own code.

Programming Level: Advanced
Type: Programming

Python and Tensorflow

Deep learning is a cutting-edge machine learning technique for classification and regression. In the past few years, it has produced state-of-the-art results in fields such as image classification, natural language processing, bioinformatics and robotics. This course will cover the main ideas of deep learning, and how to implement it in practice with TensorFlow: a software framework for efficient and scalable deep learning.

Programming Level: Intermediate
Type: Programming

PyTorch with Python

Python (along with R) has become the dominant language in machine learning and data science. PyTorch is an open-source machine learning library for Python, based on Torch, used for applications such as natural language processing. It is primarily developed by Facebook’s artificial-intelligence research group, and Uber’s “Pyro” software for probabilistic programming is built on it.

Programming Level: Intermediate
Type: Stats/ML

Tidy Evaluation in R

This is a one-day course covering methods for tidy evaluation in R. We introduce the {rlang} package as a way of passing variables from a data set into a function. Furthermore, we cover environments and function evaluation in R, to help you understand how the tools in {rlang} work under the hood.

Programming Level: Advanced
Type: Programming

Time Series Analysis with R

Predicting the future is a tough problem. Time series analysis makes it possible to assess whether or not predictions are possible and, if they are, build a model which can generate informed predictions for the future with realistic estimates of uncertainty. This training course will introduce participants to the packages in the Tidyverts.

The best qualification of a prophet is to have a good memory – George Savile

Programming Level: Intermediate
Type: Stats/ML, Analytics
