Public Health Scotland: Support for scientific SPSS functions transitioning to R

The Challenge

Public Health Scotland (PHS) are discontinuing their use of SPSS and moving strategically to open-source tools such as R and Python, both to save on licensing costs and to gain access to the more advanced capabilities that R and Python provide. As part of this changeover, PHS identified a skills gap in their ability to translate existing code from one programming language to another.

PHS purchased 40 days of consultancy work from Jumping Rivers to cover five SPSS-to-R code conversion projects. The migration had to be completed within a four-month timeframe to meet an important deadline: the annual renewal date for the SPSS licences.

The Project

Approximately 10,000 lines of SPSS syntax, spread across around 20 syntax files, needed to be rewritten in R to produce the same output as before. Much of the syntax covered “data wrangling”, i.e. bringing data together from numerous raw data sources, but there was also some modelling, visualisation and database query syntax to convert.

Jumping Rivers allocated two R consultants to work on the individual projects, spending approximately five to ten person-days on each. First, the consultants needed to familiarise themselves with the “intention” of the application involved, so that the R code could be written to produce the same output. Then the data sources had to be accessed, and the R code written and tested, before handover to each project team.
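As an illustration of the "same output" requirement, the sketch below shows one way a migrated routine's output could be checked against the legacy output. It is not the procedure used on this project. It assumes both versions can export a well-formed CSV file with a shared key column. The function names, column names and sample values are all hypothetical, and the example uses Python only to stay self-contained.

```python
import csv
import io
import math


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def values_match(a, b, tol=1e-9):
    """Compare two cells numerically when both parse as numbers, else as text."""
    try:
        return math.isclose(float(a), float(b), rel_tol=tol, abs_tol=tol)
    except ValueError:
        return a.strip() == b.strip()


def compare_outputs(legacy_csv, migrated_csv, key):
    """Return a list of readable differences between two CSV exports.

    Assumes every row is complete and that `key` uniquely identifies a row.
    """
    legacy = {row[key]: row for row in read_rows(legacy_csv)}
    migrated = {row[key]: row for row in read_rows(migrated_csv)}
    problems = []
    for k in sorted(legacy.keys() - migrated.keys()):
        problems.append(f"row {k!r} missing from migrated output")
    for k in sorted(migrated.keys() - legacy.keys()):
        problems.append(f"row {k!r} only in migrated output")
    for k in sorted(legacy.keys() & migrated.keys()):
        for col, old in legacy[k].items():
            new = migrated[k].get(col)
            if new is None:
                problems.append(f"row {k!r}: column {col!r} missing")
            elif not values_match(old, new):
                problems.append(f"row {k!r}, column {col!r}: {old!r} != {new!r}")
    return problems


# Made-up sample data: the formatting of "rate" differs, one "total" differs.
LEGACY = "id,total,rate\n1,120,4.50\n2,95,3.10\n"
MIGRATED = "id,total,rate\n1,120,4.5\n2,96,3.1\n"

differences = compare_outputs(LEGACY, MIGRATED, key="id")
for line in differences:
    print(line)
```

Comparing numeric cells with a tolerance rather than as raw text means that harmless formatting differences between tools (such as 4.50 versus 4.5) are not reported, while genuine discrepancies in the values still are.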

Our Results

The project was delivered on time and on budget, allowing PHS to cancel their SPSS licence commitment. Beyond the cost savings, PHS also saw significant performance improvements in some areas. For example, an SPSS routine that previously took two days of effort each month to produce now runs in 15 minutes.