Tuesday 25 June 2013

rClr: low level access to .NET from R

rClr is an R package to access arbitrary .NET code seamlessly. The "CLR" part of the package name stands for Common Language Runtime. C# and R being languages I use regularly, I have felt the need for better interoperability between them for a few years. What started as a weekend investigation out of curiosity grew into rClr. There have already been a few rounds of beta releases, and it is quite functional running on Windows with the Microsoft .NET Framework, hence this post. I have used it regularly for my work over the past nine months. Running on other operating systems with the Mono CLR is also supported and is almost at feature parity. After a bit more testing a tarball will be available.

A new beta version of the Windows binary package is currently available at rClr on Codeplex, alongside the source code under LGPL 2.1. While it is likely to work as is on many Windows boxes, you may need to install the latest Microsoft Visual C++ runtime. Instructions on how to do this are on the web site.

Here is a quick tour with some sample code, starting with the customary "Hello world", with a bit of GUI for good measure.
The following sample shows how some of the package functions help you discover the contents of loaded assemblies (i.e. .NET dynamic libraries), reducing the need to go back to the source code.
A "complex" .NET object is essentially an external pointer (a structure similar to that used in rJava).
The package is designed to allow access to existing .NET code without modifying that code (well, for code designed with access in mind anyway). rClr is also designed to be as intuitive as possible for users accustomed to R programming idioms. A corollary of that design is that data types are converted to their natural representation in each runtime whenever this is possible without ambiguity. The following table lists the conversions for the most commonly used one-dimensional vectors, first for vectors of length three and then for vectors of length one. It is not an exhaustive list of supported conversions.


mode       type       class      length  clrType
character  character  character  3       System.String[]
numeric    integer    integer    3       System.Int32[]
numeric    double     numeric    3       System.Double[]
logical    logical    logical    3       System.Boolean[]
numeric    double     Date       3       System.DateTime[]
numeric    double     POSIXct    3       System.DateTime[]
character  character  character  1       System.String
numeric    integer    integer    1       System.Int32
numeric    double     numeric    1       System.Double
logical    logical    logical    1       System.Boolean
numeric    double     Date       1       System.DateTime
numeric    double     POSIXct    1       System.DateTime
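
The R-side columns of the table above (mode, type, class and length) can be reproduced with base R alone. The sketch below is only an illustration: the sample values and the helper name `describe` are made up for the purpose, and the clrType column is deliberately left out because computing it requires rClr and a running CLR.

```r
# Hypothetical sample vectors of length 3, one per row group of the table.
samples <- list(
  character = c("a", "b", "c"),
  integer   = 1:3,
  numeric   = c(1.5, 2.5, 3.5),
  logical   = c(TRUE, FALSE, TRUE),
  Date      = as.Date("2013-06-25") + 0:2,
  POSIXct   = as.POSIXct("2013-06-25", tz = "UTC") + 0:2
)

# Hypothetical helper: report the R-side columns of the table for one vector.
describe <- function(x) {
  data.frame(
    mode   = mode(x),
    type   = typeof(x),
    class  = class(x)[1],  # POSIXct objects carry two classes; keep the first
    length = length(x),
    stringsAsFactors = FALSE
  )
}

# Length-3 vectors first, then the length-1 case using the first element of each.
vectors <- do.call(rbind, lapply(unname(samples), describe))
scalars <- do.call(rbind, lapply(unname(samples), function(x) describe(x[1])))
print(rbind(vectors, scalars), row.names = FALSE)
```

Dates and date-times are both stored as doubles on the R side, which is why only the class column distinguishes them from plain numeric vectors.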

I've used rClr to access environmental time-stepping models written in C#, combining them with the statistical and visualisation strengths of R. One of the tutorials on the web site is a self-contained, simplified use case.




Roadmap

I am presenting at the useR conference in a couple of weeks. It is my first attendance, and I am really looking forward to meeting a new crowd.
A few wrinkles need ironing out before a first stable release of course, notably for running on *nix and MacOS (I "only" develop and test on a Debian box). Trailblazing testers and contributors are very welcome. The build process is inherently more complicated than for your average package, but this is alleviated by configure scripts. You can post questions and start discussions through the web site.
Submission to CRAN is probably the next big item on the list, in preference to more features. While Codeplex is fine for my codebase management needs, it is not a typical go-to place for R users.

Acknowledgements

I gratefully acknowledge Kosei Abe for the nicely crafted R.NET library, which is reused in places in the rClr package. R.NET is primarily designed for .NET developers to access the R engine, but I envisage a growing role for it in rClr.
The package rJava by Simon Urbanek and other contributors was also a natural source of insight in my early investigations into how to tackle in-process interop between R and .NET.
A few years ago Simon Knapp presented a neat way to mix in-process R code with .NET via Python for .NET, and this led to the idea of the rClr package.