Your first D3 visualisation with {r2d3} and Scooby-Doo

Get the code for this blog on GitHub. What is this tutorial and who is it for? This tutorial is aimed mainly at R users who want to learn a bit of D3, and specifically those who are interested in how you can incorporate D3 into your existing workflows in RStudio. It will gloss over a lot of the fundamentals of D3 and related topics (JavaScript, CSS, and HTML) to fast-forward the process of creating your first D3.

Understanding the Parquet file format

Apache Parquet is a popular columnar storage file format used by Hadoop systems, such as Pig, Spark, and Hive. The file format is language independent and has a binary representation. Parquet is used to efficiently store large data sets and has the extension .parquet. This blog post aims to explain how Parquet works and the tricks it uses to efficiently store data. Key features of Parquet are: it’s cross-platform; it’s a recognised file format used by many systems; it stores data in a column layout; and it stores metadata. The latter two points allow for efficient storage and querying of data.
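To make the last two points concrete, here is a minimal sketch in plain Python. It is not the Parquet format itself and does not use any Parquet library; the sample data and the helper names `to_columns` and `column_stats` are hypothetical, chosen only to illustrate why a column layout plus per-column metadata can help with storage and querying.

```python
# Illustrative only: a toy model of the idea, not the real Parquet layout.

# Row-oriented data: each record holds every field together.
rows = [
    {"id": 1, "city": "Leeds", "temp": 11.5},
    {"id": 2, "city": "Leeds", "temp": 12.0},
    {"id": 3, "city": "York", "temp": 9.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    return {name: [record[name] for record in records] for name in records[0]}


def column_stats(columns):
    """Record simple per-column metadata: minimum, maximum, and count."""
    return {
        name: {"min": min(values), "max": max(values), "count": len(values)}
        for name, values in columns.items()
    }


columns = to_columns(rows)
stats = column_stats(columns)

# Reading a single column only touches that column's values.
temps = columns["temp"]

# Metadata can answer some questions without scanning the values:
# a query for temp >= 20 can skip this chunk because its maximum is lower.
can_skip = stats["temp"]["max"] < 20.0

print(temps)
print(stats["temp"])
print(can_skip)
```

Values of the same type sitting next to each other also tend to compress well, which is one reason a column layout can be more compact than a row layout. Real Parquet files organise and encode data in a more elaborate way than this sketch suggests.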

Webinars: Practical Advice for R in Production

Many organisations have a robust infrastructure that allows their data science teams to provide fast and reliable insights. But many groups are just starting down this path. We, Jumping Rivers, have partnered with RStudio to launch a two-part webinar series which examines and explores the usage of R in production environments. The first webinar will discuss the big picture of using open source languages and tools in enterprise environments.

Cleaning up forked GitHub repositories with {gh}

One great thing about using GitHub is the ability to view and contribute to others’ code. Even the code underlying many of our favourite packages is available for us to examine and play around with. Forking a repository is a great way to create an exact replica of someone else’s project in our own user space. We can then freely make changes to this copy without affecting the original project. If you end up especially proud of your changes, you can then submit a Pull Request to offer them up to the owner of the original repository.

Job vacancies at Jumping Rivers!

In line with the continuous growth at Jumping Rivers, we are looking to expand our team of dedicated professionals. If you are enthusiastic and keen to develop your skills in cutting-edge data science or infrastructure, please read on! Who are we and what do we do? Jumping Rivers is an analytics company whose passion is data and machine learning. We help our clients move from data storage to data insights.

Jumping Rivers 2021 Online Training Schedule

Good news! In tandem with the loosening of lockdown restrictions, Jumping Rivers has released the updated 2021 public, online training course schedule. We are offering courses across multiple programming languages, including R, Python, Stan, Scala and git. In the past year, we have converted all of our courses to be online-friendly and have received great feedback in relation to interactivity, course structure and overall attendee satisfaction. Some examples of feedback we have received can be seen below:

New features in R 4.1.0

R-4.1.0 is released! Rejoice! A new R release (v 4.1.0) is due on 18th May 2021. Typically most major R releases don’t contain that many new features, but this release does contain some interesting and important changes. This post summarises some of the notable changes introduced. More detail on the changes can be found in the R changelog. Declining support for 32-bit Windows: The 4.1.x series will be the last to support 32-bit Windows systems.

Tips & tricks when moving to Hugo

Over Christmas we moved our main site from WordPress to Hugo & Netlify. The main benefits for us moving to Hugo were: Security. We were always getting emails about various WordPress plugins. As our site was essentially static, this was an additional maintenance task. Site speed. Although WordPress has lots of clever plugins for optimising site speed (which then leads to the situation above), WordPress is just “big”. Raw cost. By this I mean website fees.

Default knitr options and hooks

This is part four of our four-part series. Part 1: Specifying the correct figure dimension in {knitr}. Part 2: What image format should you use for graphics. Part 3: Including external graphics in your document. Part 4: Setting default {knitr} options (this post). As with many aspects of programming, when you are working by yourself you can be (slightly) more lax with styles and set-up. However, as you start working in a team, different styles can quickly become a hindrance and lead to errors.

Job: Shiny Developer

We are currently developing a SaaS Shiny application. We have a prototype that is functional, but not ready for release. Your role will be to refactor the application and push it towards public release. The core requirements for this role are Shiny experience, plus CSS and JavaScript. If you have experience in deployment that’s great, but it isn’t required. This Shiny application will be your main role, but not your only one.