The OpenCPU System

Towards a universal interface for scientific computing

Jeroen Ooms
UCLA Statistics

What is OpenCPU

The OpenCPU system exposes an HTTP API for scientific computing to build scalable analysis and visualization modules for use in systems, pipelines, and web applications.

Hello World! Basic JSON RPC

curl https://public.opencpu.org/ocpu/library/stats/R/rnorm/json \
-H "Content-Type: application/json" -d '{"n":3, "mean": 10, "sd":10}'

[4.9829, 6.3104, 11.411]

On the server, this request maps to the following R code:

library(jsonlite)
args <- fromJSON('{"n":3, "mean": 10, "sd":10}')
output <- do.call(stats::rnorm, args)
toJSON(output)

This is equivalent to the function call:

rnorm(n=3, mean=10, sd=10)
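
The same call can be issued from any HTTP client. A minimal JavaScript sketch, assuming a runtime with a global `fetch` (modern browsers or Node 18+); the helper names `buildRpcRequest` and `callR` are illustrative and not part of any OpenCPU client library:

```javascript
// Base URL of the public demo server used in the curl example above.
const OCPU_BASE = "https://public.opencpu.org/ocpu";

// Build the URL and fetch options for a JSON RPC call to an R function.
// Hypothetical helper: not part of any official OpenCPU client.
function buildRpcRequest(pkg, fn, args) {
  return {
    url: `${OCPU_BASE}/library/${pkg}/R/${fn}/json`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(args),
    },
  };
}

// Perform the call and decode the JSON result. A non-2xx status
// (for example 400 when R raises an error) becomes a rejected promise.
async function callR(pkg, fn, args) {
  const { url, options } = buildRpcRequest(pkg, fn, args);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`${res.status}: ${await res.text()}`);
  return res.json();
}

// Usage (requires network access):
// callR("stats", "rnorm", { n: 3, mean: 10, sd: 10 }).then(console.log);
```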


What OpenCPU does:

  • Interoperable HTTP for data analysis
  • RPC and object management
  • I/O: JSON, Protocol Buffers, CSV
  • Support for parallel/async requests
  • Highly configurable security policies
  • Native reproducibility
  • Client libraries: JavaScript, Ruby, ...

What OpenCPU does not do:

  • Provide predefined widgets
  • Impose special programming paradigms
  • Require you to manage processes, users, code evaluation, etc. Yet you still get state and privacy!

Beyond widgets...

[Image: stockplot]

Keys and objects

curl -v https://demo.ocpu.io/stocks/R/smoothplot -d 'ticker="GOOG"&from="2013-01-01"'

> POST /stocks/R/smoothplot HTTP/1.1
> User-Agent: curl/7.30.0
> Content-Type: application/x-www-form-urlencoded

< HTTP/1.1 201 Created
< Location: https://tmp.ocpu.io/x081cca8c23/
< Cache-Control: max-age=300, public
< Access-Control-Allow-Origin: *
< X-ocpu-session: x081cca8c23
< X-ocpu-r: R version 3.1.0 (2014-04-10)
< X-ocpu-locale: en_US.UTF-8
< X-ocpu-time: 2014-06-26 17:29:32 PDT
< X-ocpu-version: 1.4.3
< x-ocpu-cache: MISS
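
The response carries the session key twice: in the `X-ocpu-session` header and as the last path segment of `Location`. A minimal sketch of reading it on the client, assuming the headers are available as a plain object; `sessionKeyFromHeaders` is a hypothetical helper:

```javascript
// Extract the session key from the response headers of a POST.
// Hypothetical helper; expects headers as a plain { name: value } object.
function sessionKeyFromHeaders(headers) {
  // HTTP header names are case-insensitive, so normalize them first.
  const h = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = value;
  }
  if (h["x-ocpu-session"]) return h["x-ocpu-session"];
  if (h["location"]) {
    // Fall back to the last non-empty path segment of the Location URL.
    const segments = new URL(h["location"]).pathname.split("/").filter(Boolean);
    return segments.length ? segments[segments.length - 1] : null;
  }
  return null;
}
```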

State in OpenCPU


  • Each request is stateless (HTTP)
  • No single, permanent R process

Instead: "functional state"

  • Each RPC stores its output object and returns a key. No side effects.
  • Use the key to retrieve or re-use the stored object
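
Re-using a stored object amounts to passing its key as an argument of a later call. A minimal sketch of encoding such a call as a form body, assuming (as in the curl example earlier) that form values are written as R expressions, so strings are quoted while a session key is passed bare; the `sessionRef` wrapper and `encodeArgs` helper are hypothetical:

```javascript
// Mark a value as a reference to an object stored under a session key.
// Hypothetical wrapper, used only by encodeArgs below.
function sessionRef(key) {
  return { ocpuKey: key };
}

// Encode arguments as application/x-www-form-urlencoded.
// Strings and numbers are serialized with JSON.stringify (so strings are
// quoted); session references are written bare so that, by assumption,
// the server resolves the key to the stored object.
function encodeArgs(args) {
  return Object.entries(args)
    .map(([name, value]) => {
      const isRef = value !== null && typeof value === "object" && "ocpuKey" in value;
      const literal = isRef ? value.ocpuKey : JSON.stringify(value);
      return `${encodeURIComponent(name)}=${encodeURIComponent(literal)}`;
    })
    .join("&");
}

// Example: pass the object from session x081cca8c23 as argument x, with n = 5.
const body = encodeArgs({ x: sessionRef("x081cca8c23"), n: 5 });
console.log(body); // x=x081cca8c23&n=5
```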

Privacy?

  • No users! Each key is secret
  • Keys initially only known to creator
  • But: you are free to share/publish keys
  • Basis of "social" analysis

OpenCPU apps: JavaScript Client

  • App is simply a package with web pages
  • Web pages call R functions via Ajax

// JavaScript client code
var ticker = $("#ticker").val();
var req = $("#plotdiv").rplot("smoothplot", {
    ticker : ticker,
    from : "2013-01-01"
});

Results in:

smoothplot(ticker=ticker, from="2013-01-01")

This is the basis of the stocks app and this jsfiddle.

OpenCPU and RStudio Server

[Image: rstudio]

API testing page

[Image: testttool]

Motivation: Compare to language bridges

Bridges to R are available for most popular languages and environments:

  • RInside (C++)
  • rpy2 (Python)
  • JRI (Java)
  • RinRuby
  • rApache
  • littler
  • Rserve (socket)
  • RDCOM (Windows, Excel)

So why would you want to use OpenCPU?

Motivation: Difficulties with language bridges

A hello-world example from the Rserve manual:

RConnection c = new RConnection();
double d[] = c.eval("rnorm(10)").asDoubles();

  • Client needs to generate R syntax
  • Client needs to read/manipulate internal R data types
  • Client needs to manage R processes
  • Limited exception handling
  • No concurrency
  • Result: high coupling
  • Need cross-language expert to get this to work
  • Fragmentation of efforts by language/environment

Towards an API

OpenCPU layers on a standardized application protocol (HTTP) to provide an API for statistical computing and visualization (with R, or something else...).

Benefits of HTTP

  • Mature, very flexible application protocol
  • Interoperable (both client and server!)
  • Distributed (using simple URLs)
  • Native exception handling (status codes)
  • Many features are built in by design (caching, encryption, authentication, etc.)
  • Clients widely available
  • Implemented in all browsers

Separation of concerns

  • API describes logic of data analysis
  • Independent of client/application
  • Independent of computational language
  • Same API could be implemented in Julia, Python, Matlab

Important API concepts:

  1. Objects
  2. Graphics
  3. Data
  4. Manuals
  5. Namespaces
  6. Function calls

The OpenCPU API

HTTP Methods

The current API uses the GET and POST methods. GET is for retrieving objects, POST is for RPC.

Method | Target | Action | Arguments | Example
GET | object | read object | control output format | GET /ocpu/cran/MASS/data/cats/json
POST | object | call function | function arguments | POST /ocpu/library/stats/R/rnorm
GET | file | read file | - | GET /ocpu/cran/MASS/NEWS; GET /ocpu/cran/MASS/scripts/
POST | file | run script | control interpreter | POST /ocpu/cran/MASS/scripts/ch01.R; POST /ocpu/cran/knitr/examples/minimal.Rmd

Try it!

HTTP Status Codes

HTTP Code | When | Returns
200 OK | On successful GET request | Resource content
201 Created | On successful POST request | Output location
302 Found | Redirect | Redirect location
400 Bad Request | R raised an error | Error message in text/plain
502 Bad Gateway | Nginx (opencpu-cache) can't connect to the OpenCPU server | (admin needs to look in error logs)
503 Service Unavailable | Serious problem with the server | (admin needs to look in error logs)
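
A client can branch on these codes directly. A minimal sketch, assuming only the status code and the response body are at hand; `interpretResponse` and the shape of its return value are illustrative:

```javascript
// Classify an OpenCPU response by HTTP status code.
// Hypothetical helper; the "kind" labels are made up for this example.
function interpretResponse(status, body) {
  switch (status) {
    case 200: // successful GET: body is the resource content
      return { ok: true, kind: "content", value: body };
    case 201: // successful POST: outputs were stored at a new location
      return { ok: true, kind: "created", value: body };
    case 302: // redirect: the client should follow the Location header
      return { ok: true, kind: "redirect", value: body };
    case 400: // R raised an error: body is the text/plain error message
      return { ok: false, kind: "r-error", value: body };
    case 502:
    case 503: // server-side problem: an admin has to check the error logs
      return { ok: false, kind: "server-error", value: body };
    default:
      return { ok: false, kind: "unexpected", value: body };
  }
}
```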

Package resources

Path | What | Examples
../{pkgname}/ | Package information | /ocpu/cran/MASS/
../{pkgname}/R/ | Exported R objects | /ocpu/cran/MASS/R/rlm
../{pkgname}/data/ | Data included with this package | /ocpu/cran/MASS/data/cats
../{pkgname}/man/ | Manuals (help pages) included in this package | /ocpu/cran/MASS/man/rlm
../{pkgname}/* | Files in package installation directory | /ocpu/cran/MASS/NEWS; /ocpu/cran/MASS/scripts/

Session resources

Path | What | Example
../{key}/ | List available output for this session | /ocpu/tmp/x08384729/
../{key}/R/ | R objects stored in this session | /ocpu/tmp/x08384729/R/.val/json
../{key}/graphics/ | Graphics generated in this session | /ocpu/tmp/x08384729/graphics/1/png
../{key}/source | Input source code for this session | /ocpu/tmp/x08384729/source
../{key}/stdout | Text printed to STDOUT in this session | /ocpu/tmp/x08384729/stdout
../{key}/console | Console I/O (combines source and stdout) | /ocpu/tmp/x08384729/console
../{key}/zip | Download session as a zip archive | /ocpu/tmp/x08384729/zip
../{key}/tar | Download session as a gzipped tarball | /ocpu/tmp/x08384729/tar
../{key}/files/* | Files in the working directory | /ocpu/tmp/x08384729/files/mydata.csv
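
Given a session key, every output of that session has a predictable address. A minimal sketch of a URL builder that follows the paths listed above; `sessionUrls` is a hypothetical helper, and the `base` argument (for example "/ocpu") is an assumption about where the server is mounted:

```javascript
// Build URLs for the outputs of one session.
// base: server root such as "/ocpu"; key: session key such as "x08384729".
function sessionUrls(base, key) {
  const root = `${base}/tmp/${key}`;
  return {
    root: `${root}/`,                                         // list of outputs
    value: (format) => `${root}/R/.val/${format}`,            // returned object
    graphic: (n, format) => `${root}/graphics/${n}/${format}`, // n-th plot
    source: `${root}/source`,
    stdout: `${root}/stdout`,
    console: `${root}/console`,
    file: (name) => `${root}/files/${name}`,                  // working directory
  };
}

const session = sessionUrls("/ocpu", "x08384729");
console.log(session.graphic(1, "png")); // /ocpu/tmp/x08384729/graphics/1/png
```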

Content-types

Format | Content-type | Encoder (+args) | Example
print | text/plain | base::print | /ocpu/cran/MASS/R/rlm/print
json | application/json | jsonlite::toJSON | /ocpu/cran/MASS/data/cats/json
csv | text/csv | utils::write.csv | /ocpu/cran/MASS/data/cats/csv
tab | text/plain | utils::write.table | /ocpu/cran/MASS/data/cats/tab
rda | application/octet-stream | base::save | /ocpu/cran/MASS/data/cats/rda
rds | application/octet-stream | base::saveRDS | /ocpu/cran/MASS/data/cats/rds
pb | application/x-protobuf | RProtoBuf::serialize_pb | /ocpu/cran/MASS/data/cats/pb
png | image/png | grDevices::png | /ocpu/tmp/{key}/graphics/1/png
pdf | application/pdf | grDevices::pdf | /ocpu/tmp/{key}/graphics/1/pdf
svg | image/svg+xml | grDevices::svg | /ocpu/tmp/{key}/graphics/1/svg
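
The output format is selected by appending its name to the object URL. A minimal sketch that pairs each format with the content type a client should expect, using the table above as its only source; `withFormat` is a hypothetical helper:

```javascript
// Output formats and the content types listed in the table above.
const CONTENT_TYPES = new Map([
  ["print", "text/plain"],
  ["json", "application/json"],
  ["csv", "text/csv"],
  ["tab", "text/plain"],
  ["rda", "application/octet-stream"],
  ["rds", "application/octet-stream"],
  ["pb", "application/x-protobuf"],
  ["png", "image/png"],
  ["pdf", "application/pdf"],
  ["svg", "image/svg+xml"],
]);

// Append an output format to an object URL and report the expected type.
function withFormat(objectUrl, format) {
  if (!CONTENT_TYPES.has(format)) {
    throw new Error(`Unknown output format: ${format}`);
  }
  // Drop trailing slashes so the result never contains "//".
  const url = `${objectUrl.replace(/\/+$/, "")}/${format}`;
  return { url, contentType: CONTENT_TYPES.get(format) };
}

console.log(withFormat("/ocpu/cran/MASS/data/cats", "csv").url);
// /ocpu/cran/MASS/data/cats/csv
```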

Executing Scripts

File extension | Type | Interpreter | Arguments
file.r | R script | evaluate::evaluate | -
file.tex | latex | tools::texi2pdf | -
file.rnw | knitr/sweave | knitr::knit + tools::texi2pdf | -
file.md | markdown | knitr::pandoc | format (see ?pandoc)
file.rmd | knitr/markdown | knitr::knit + knitr::pandoc | format (see ?pandoc)
file.brew | brew | brew::brew | output (see ?brew)
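
The interpreter is chosen from the file extension alone. A minimal sketch of that lookup as a client might mirror it, assuming extensions are matched case-insensitively (the examples elsewhere in this deck use ch01.R and minimal.Rmd); `interpreterFor` is a hypothetical helper and not the server's implementation:

```javascript
// Script extensions and the interpreters listed in the table above.
const INTERPRETERS = new Map([
  ["r", "evaluate::evaluate"],
  ["tex", "tools::texi2pdf"],
  ["rnw", "knitr::knit + tools::texi2pdf"],
  ["md", "knitr::pandoc"],
  ["rmd", "knitr::knit + knitr::pandoc"],
  ["brew", "brew::brew"],
]);

// Return the interpreter for a script path, or null if the extension
// is missing or not listed.
function interpreterFor(path) {
  const dot = path.lastIndexOf(".");
  if (dot < 0) return null;
  const ext = path.slice(dot + 1).toLowerCase();
  return INTERPRETERS.get(ext) ?? null;
}

console.log(interpreterFor("/ocpu/cran/MASS/scripts/ch01.R")); // evaluate::evaluate
```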

Libraries

Path | What
/ocpu/library/{pkgname}/ | R packages installed in one of the global libraries on the server.
/ocpu/user/{username}/library/{pkgname}/ | R packages installed in the home library of Linux user {username}.
/ocpu/cran/{pkgname}/ | Interfaces to the R package {pkgname} that is current on CRAN.
/ocpu/bioc/{pkgname}/ | Interfaces to the R package {pkgname} that is current on Bioconductor.
/ocpu/github/{gituser}/{pkgname}/ | R package {pkgname} in the master branch of the identically named repository from GitHub user {gituser}.
/ocpu/tmp/{key}/ | Temporary sessions, which hold outputs from a function/script RPC.
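
Because each library is just a URL prefix, the same package-level paths work regardless of where the package comes from. A minimal sketch of resolving the prefix from a source description; `packageRoot` and the shape of its argument are hypothetical:

```javascript
// Resolve the URL prefix of a package from a description of its source.
// source.type is one of "library", "user", "cran", "bioc", "github".
function packageRoot(source) {
  switch (source.type) {
    case "library":
      return `/ocpu/library/${source.pkg}/`;
    case "user":
      return `/ocpu/user/${source.user}/library/${source.pkg}/`;
    case "cran":
      return `/ocpu/cran/${source.pkg}/`;
    case "bioc":
      return `/ocpu/bioc/${source.pkg}/`;
    case "github":
      return `/ocpu/github/${source.user}/${source.pkg}/`;
    default:
      throw new Error(`Unknown package source: ${source.type}`);
  }
}

// The R function rlm from MASS, served from CRAN:
console.log(packageRoot({ type: "cran", pkg: "MASS" }) + "R/rlm");
// /ocpu/cran/MASS/R/rlm
```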

Trying OpenCPU

Free public demo server:

Single-user development server:

install.packages("opencpu")
library(opencpu)

Ubuntu Linux cloud server:

# requires Ubuntu 14.04
sudo add-apt-repository ppa:opencpu/opencpu-1.4
sudo apt-get update
sudo apt-get install opencpu

Publish your packages/apps on ocpu.io

Learn more!