The tidyverse is essential for any data scientist who deals with data on a day-to-day basis. By focusing on small key tasks, the tidyverse suite of packages removes the pain of data manipulation. The tidyverse allows you to import, tidy, transform, manipulate and visualise data. This course covers key tidyverse areas, such as {dplyr}, {lubridate}, {tidyr} and tibbles.

- Programming Level: Foundation
- Type: Analytics

The tidyverse is essential for any statistician or data scientist who deals with data on a day-to-day basis. By focusing on small key tasks, the tidyverse suite of packages removes the pain of data manipulation. This course takes the next steps in using the tidyverse and examines how and where to use packages such as {purrr}, {stringr}, {forcats} and {tidytext} in an analysis.

- Programming Level: Intermediate
- Type: Programming

Python is a powerful, general-purpose programming language that plays well with others, runs everywhere, and is friendly and easy to learn. This two-day intensive course will introduce you to the language and equip you with the tools to manipulate, visualise and summarise your data.

- Programming Level: Foundation
- Type: Programming

Version control is essential when working on data analysis projects, both for tracking project progress and for aiding collaboration. Fortunately, it is now easier than ever to integrate version control into your project, using RStudio’s interface to the version control software Git and online code-sharing websites such as GitHub and GitLab.

- Programming Level: Foundation
- Type: Programming

An important aspect of managing workflow in data science is being able to work in tandem with your colleagues! This course outlines how effective Git is as a tool for version control in collaborative projects. We will be making use of the RStudio Git interface and remote project hosting platforms, such as GitHub and GitLab.

- Programming Level: Foundation
- Type: Programming

This is a one-day intensive course on R and assumes no prior knowledge. By the end of the course, participants will be able to import, summarise and plot their data. At each step, we avoid using “magic code”, and stress the importance of understanding what R is doing.

- Programming Level: Foundation
- Type: Programming

This is a one-day intensive course on advanced graphics with R. The standard plotting commands in R are known as the base graphics, but are starting to show their age. In this course, we cover more advanced graphics packages - in particular, {ggplot2}. The {ggplot2} package can create advanced and informative graphics.

- Programming Level: Intermediate
- Type: Graphics

The benefit of using a programming language such as R is that we can automate repetitive tasks. This course covers the fundamental techniques such as functions, for loops and conditional expressions. By the end of this course, you will understand what these techniques are and when to use them. This is a one-day intensive course on R.

- Programming Level: Intermediate
- Type: Programming

Despite the promise of big data, inferences are often limited by its systematic structure. Only by carefully modelling this structure can we take full advantage of the data. Stan is a platform for facilitating this modelling, providing an expressive modelling language to implement state-of-the-art algorithms, to draw subsequent Bayesian inferences. The course will teach participants how to interface with Stan through R!

- Programming Level: Intermediate
- Type: Programming

Despite the promise of big data, inferences are often limited by its systematic structure. Only by carefully modelling this structure can we take full advantage of the data. Stan is a platform for facilitating this modelling, providing an expressive modelling language to implement state-of-the-art algorithms, to draw subsequent Bayesian inferences. The course will teach participants how to interface with Stan through Python!

- Programming Level: Intermediate
- Type: Programming

This is a one-day intensive course on the R package {shiny}. Shiny allows you to create cutting-edge interactive web-graphics. From the Shiny documentation: “Shiny makes it incredibly easy to build interactive web applications with R. Automatic ‘reactive’ binding between inputs and outputs and extensive pre-built widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.”

- Programming Level: Intermediate
- Type: Graphics

Do you want to dynamically create static or interactive documents? Do you want your reports to automatically update when the data changes? Then this session is for you! R Markdown is easy to use and allows for dynamic report generation. Whether you are hoping to generate HTML, PDF or Microsoft Word documents, or even slides for a presentation, R Markdown caters to your needs.

- Programming Level: Intermediate
- Type: Graphics

This is a one-day intensive course on Python and assumes no prior knowledge. By the end of the course, participants will be able to import, summarise and plot their data. At each step, we avoid using “magic code”, and stress the importance of understanding what Python is doing.

- Programming Level: Foundation
- Type: Programming

The benefit of using a programming language such as R is that we can automate repetitive tasks. This course covers the fundamental techniques such as functions, for loops and conditional expressions. By the end of this course, you will understand what these techniques are and when to use them.

- Programming Level: Intermediate
- Type: Programming
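
As a rough illustration of the three techniques named above (functions, for loops and conditional expressions), here is a minimal sketch. It is written in Python rather than R, and the function name `classify_scores`, the pass mark and the sample data are all invented for illustration; they are not taken from the course material.

```python
def classify_scores(scores, pass_mark=50):
    """Label each score as 'pass' or 'fail' (hypothetical helper)."""
    labels = []
    for score in scores:          # for loop: repeat the same step for every value
        if score >= pass_mark:    # conditional: choose a branch per value
            labels.append("pass")
        else:
            labels.append("fail")
    return labels


# Wrapping the logic in a function means the same task can be
# repeated on new data without rewriting the loop.
results = classify_scores([35, 50, 72])
print(results)
```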

Python has a number of packages for the effective creation of graphics to communicate your data insights. This one-day course will examine a range of packages for building impactful visualisations. During the training session, we’ll cover the main Python plotting libraries: plotly, matplotlib and seaborn. Additionally, we discuss how to effectively use faceting and layers in a graphic.

- Programming Level: Intermediate
- Type: Graphics

From the very beginning, R was designed for statistical modelling. Out of the box, R makes standard statistical techniques easy. This course covers the fundamental modelling techniques. We begin the day by revising hypothesis tests, before moving on to ANOVA tables and regression analysis. The class ends by looking at more sophisticated methods such as clustering and principal components analysis (PCA).

- Programming Level: Intermediate
- Type: Analytics

As spatial data sets get larger, more sophisticated software needs to be harnessed for their analysis. R is now a widely used open source software platform for working with spatial data thanks to its powerful analysis and visualisation packages. The focus of this course is providing participants with the understanding needed to apply R’s powerful suite of geographical tools to their own problems.

- Programming Level: Advanced
- Type: Analytics

This is a two-day intensive course on advanced R programming. The training course will not only cover advanced R programming techniques, such as S3/S4 objects, reference classes and function closures; we will also spend significant time discussing why and where these methods are used. By the end of the course, participants will be able to use OOP within their own code.

- Programming Level: Advanced
- Type: Programming

Using databases is a fundamental part of a data scientist’s role. The main focus of this training course is to introduce SQL databases and how R can be used to retrieve and manipulate data stored in a relational database. We use the PostgreSQL database as an example for public courses. For in-house training, we are happy to adapt the course to match your database requirements.

- Programming Level: Intermediate
- Type: Programming

This is a one-day intensive course on building a package in R. This course will be a mixture of lectures and computer practicals. The main focus will be getting a working R package ready for distribution.

- Programming Level: Advanced
- Type: Programming

Docker is a popular platform for packaging, deploying, and running applications. These applications run in containers. Crucially, a container can be used on any system: a developer’s laptop, systems on premises, or in the cloud. Applications are packaged as images that contain everything needed to run them: code, libraries, and configuration.

- Programming Level: Intermediate
- Type: Programming

In recent years Python has exploded onto the data-science scene, and with it has come a great swathe of data-oriented packages. However, as easy as these packages make analysis, using these tools efficiently requires much more know-how. By the end of this course participants will be able to locate and address bottlenecks in their data-science workflows, using a number of different techniques and tools.

- Programming Level: Intermediate
- Type: Programming
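
To give a flavour of what locating a bottleneck can look like, here is a minimal sketch using Python’s standard-library profiler, `cProfile`. The functions `count_hits_list`, `count_hits_set` and `workflow`, and the data sizes, are hypothetical and chosen only for illustration; the course itself may use different tools.

```python
import cProfile
import io
import pstats


def count_hits_list(items, queries):
    # Membership test on a list scans element by element
    return sum(1 for q in queries if q in items)


def count_hits_set(items, queries):
    # Converting to a set first makes each membership test a hash lookup
    lookup = set(items)
    return sum(1 for q in queries if q in lookup)


def workflow():
    items = list(range(2_000))
    queries = list(range(0, 4_000, 2))
    slow = count_hits_list(items, queries)
    fast = count_hits_set(items, queries)
    return slow, fast


# Profile the whole workflow, then report the most expensive calls
profiler = cProfile.Profile()
profiler.enable()
slow_hits, fast_hits = workflow()
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Both functions return the same count; the profiler report shows where the time was spent, which is the starting point for deciding what to optimise.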

This course is for anyone who wants to make their R code faster to type, faster to run and more scalable. During the course, we’ll cover the main R sins (and how to avoid them), dabble with hardware, look at running in parallel and think about efficient R data structures. This course should be useful to people with a range of skill levels.

- Programming Level: Advanced
- Type: Programming

We are very happy to announce that following “Jumping Rivers: Bayesian Inference using Stan”, Michael Betancourt, a core developer of Stan, is running a series of 5 modules for principled statistical modelling with Stan. This module introduces Gaussian processes as a statistical modelling technique, motivating principled prior models that avoid pathological behaviour. For full event information and booking details, please visit the event page.

- Programming Level: Advanced
- Type: Analytics

We are very happy to announce that following “Jumping Rivers: Bayesian Inference using Stan”, Michael Betancourt, a core developer of Stan, is running a series of 5 modules for principled statistical modelling with Stan. This module introduces exchangeability and hierarchical models with a strong focus on the inherent identifiability issues and their computational consequences, as well as strategies for moderating these issues. Completion of the Regression Modelling module is recommended.

- Programming Level: Advanced
- Type: Analytics

The capturing and quantification of uncertainty is a very important aspect of model-fitting and parameter inference. Bayesian inference represents a fully-probabilistic approach to parameter inference, allowing a practitioner to quantify their uncertainties through probability densities. However, fitting models in a Bayesian framework can be an involved and complicated affair, often necessitating the use of Markov chain Monte Carlo (MCMC) algorithms and their programmatic implementation.

- Programming Level: Foundation
- Type: Programming
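
For readers unfamiliar with MCMC, the following is a minimal sketch of one of the simplest such algorithms, a random-walk Metropolis sampler, written with only the Python standard library. The target (a standard normal), the step size, the seed and all function names are assumptions made for illustration; this is not the implementation used in the course.

```python
import math
import random


def log_target(x):
    # Unnormalised log-density of a standard normal (an assumed toy target)
    return -0.5 * x * x


def metropolis(n_samples, step=1.0, start=0.0, seed=1):
    """Random-walk Metropolis sampler for the toy target above."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current state
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


samples = metropolis(20_000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f}, variance={variance:.3f}")
```

The collected samples approximate the target distribution, so summaries such as the mean and variance quantify uncertainty directly. Real models usually need more careful tuning and diagnostics than this sketch shows.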

RStudio Connect is an enterprise-grade publishing platform which gives you, the user, the ability to easily share code, documents and applications with collaborators, colleagues and clients. By the end of this course participants will be able to deploy their content to RStudio Connect, manage its access and settings, and tune how this content scales with usage.

- Programming Level: Intermediate
- Type: Programming

Python (along with R) has become the dominant language in machine learning and data science. This two-day intensive course will equip you with the knowledge and tools to undertake a variety of tasks in a standard machine learning analytics pipeline. We stress the importance of data preparation, both in terms of data standardisation and feature selection, before tackling model building. We run a separate course on using TensorFlow and Keras with Python.

- Programming Level: Intermediate
- Type: Analytics

This two-day course is aimed at not only teaching an understanding of some of the most common machine learning techniques, but also the approach to implementing machine learning. During this course, attendees will learn how to define a problem and prepare data, the range of techniques available for solving common problems and the approaches to take to evaluate models and achieve the best results possible.

- Programming Level: Intermediate
- Type: Analytics

We are very happy to announce that following “Jumping Rivers: Bayesian Inference using Stan”, Michael Betancourt, a core developer of Stan, is running a series of 5 modules for principled statistical modelling with Stan. This module introduces conditional exchangeability, marginal exchangeability, and multifactor modelling (also known as multilevel or random effects modelling) with a focus on efficient implementations. Completion of the Regression Modelling and Hierarchical Modelling modules is highly recommended.

- Programming Level: Advanced
- Type: Analytics

We are very happy to announce that following “Jumping Rivers: Bayesian Inference using Stan”, Michael Betancourt, a core developer of Stan, is running a series of 5 modules for principled statistical modelling with Stan. In this module we review a principled Bayesian workflow that guides the development of statistical models suited to the particular details of a given application. For full event information and booking details, please visit the event page.

- Programming Level: Advanced
- Type: Analytics

Deep learning is a cutting-edge machine learning technique for classification and regression. In the past few years, it has produced state-of-the-art results in fields such as image classification, natural language processing, bioinformatics and robotics. This course will cover the main ideas of deep learning, and how to implement it in practice with TensorFlow: a software framework for efficient and scalable deep learning.

- Programming Level: Intermediate
- Type: Programming

Python (along with R) has become the dominant language in machine learning and data science. PyTorch is an open-source machine learning library for Python, based on Torch, used for applications such as natural language processing. It is primarily developed by Facebook’s artificial-intelligence research group, and Uber’s “Pyro” software for probabilistic programming is built on it.

- Programming Level: Intermediate
- Type: Analytics, Programming

Dealing with big data sets in R can be painful. One small mistake, and a seemingly trivial calculation makes our computer grind to a halt. This training course is a one-day intensive practical introduction to dealing with big data. Unfortunately, there are no easy answers. So we’ll take you through the different possible strategies you might employ, clearly highlighting the positives and negatives of each.

- Programming Level: Advanced
- Type: Programming

Jane produces weekly progress reports as well as monthly, quarterly and annual overviews for management and the board. She uses a variety of licensed software tools because each one has limitations. This course aims to take each individual through the fundamental approach to using R programming in their current role. By the end of the course, the individual will be working towards automating all of their reports.

- Programming Level: Foundation, Intermediate
- Type: Analytics, Graphics

We are very happy to announce that following “Jumping Rivers: Bayesian Inference using Stan”, Michael Betancourt, a core developer of Stan, is running a series of 5 modules for principled statistical modelling with Stan. This module presents linear and general linear regression techniques from a modelling perspective, using that context to motivate robust implementations. We will especially emphasize principled prior modelling strategies for linear, log, and logistic regression models. For full event information and booking details, please visit the event page.

- Programming Level: Advanced
- Type: Analytics

This course is aimed at statisticians and data scientists already familiar with a dynamic programming language (such as R, Python or Octave). Scala is a free, modern, powerful, strongly-typed, functional programming language. In particular, it is fast and efficient, runs on the Java virtual machine (JVM), and is designed to easily exploit modern multi-core and distributed computing architectures.

- Programming Level: Advanced
- Type: Programming

This course is a practical introduction to some of the everyday and more sophisticated tools used for the analysis of survival data.

- Programming Level: Intermediate
- Type: Analytics

Predicting the future is a tough problem. Time series analysis makes it possible to assess whether or not predictions are possible and, if they are, build a model which can generate informed predictions for the future with realistic estimates of uncertainty. This training course will introduce participants to the packages in the Tidyverts. “The best qualification of a prophet is to have a good memory” – George Savile

- Programming Level: Intermediate
- Type: Analytics

This is a half-day session that gives an overview of where and how R is used. Using a combination of lecture-based case studies and hands-on practicals, we’ll cover some of the latest developments in the R world. This course is intended to be interactive and is aimed at an organisation that is considering why (or why not) to move to R.

- Programming Level: Foundation
- Type: Programming

Moving you from data storage to data insights with our expert training courses.

Contact our support team if you have any questions about a specific course or if you need a course tailored to your needs.