otpr

Project status: Active. The project has reached a stable, usable state and is being actively developed.

Overview

otpr is an R package that provides a wrapper for the OpenTripPlanner (OTP) API. To use otpr you will need a running instance of OTP. The purpose of the package is to submit a query to the relevant OTP API resource, parse the OTP response and return useful R objects.

The package is aimed at both new and expert users of OTP. The key parameters needed to query each supported API resource are provided as (or via) otpr function arguments. The argument values submitted by the user are comprehensively checked prior to submission to OTP to ensure that they are valid and make sense in combination, with feedback provided as appropriate. This makes the package ideal for new users of OTP (especially when used with the accompanying tutorial). Advanced users can provide any additional API parameters they wish via the extra.params argument. These parameters are passed directly to the OTP API without checks.

otpr currently supports the following OTP API resources:

This package will be useful to researchers and transport planners who want to use OTP to generate trip data for accessibility analysis or to derive variables for use in transportation models.

Version support

otpr fully supports OTP versions 1 and 2. Please note the following:

Installation

# Install from CRAN
# install.packages("otpr")

Development version

To get a bug fix, or use a feature from the development version, you can install otpr from GitHub. See NEWS for changes since last release.

# install.packages("devtools")
# devtools::install_github("marcusyoung/otpr")

Getting started

library(otpr)

Defining an OTP connection

The first step is to call otp_connect(), which defines the parameters needed to connect to a router on a running OTP instance. The function can also confirm that the router is reachable.

# For a basic instance of OTP running on localhost with standard ports and a 'default' router
# this is all that's needed
otpcon <- otp_connect()
#> http://localhost:8080/otp is running OTPv1
#> Router http://localhost:8080/otp/routers/default exists

Handling Time Zones

If the time zone of an OTP graph differs from the time zone of the local system running otpr, then by default the returned trip start and end times are expressed in the local system’s time zone and not in the time zone of the graph. This is because the OTP API returns times as epoch values and the conversion to date and time format occurs on the local system. A ‘timeZone’ column is included in returned dataframes that contain start and end times to make this explicit. If you wish to have start and end times expressed in the time zone of the graph, specify the tz argument when calling the otp_connect() function. This must be a valid time zone (checked against the vector returned by OlsonNames()); for example: “Europe/Berlin”.
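
The effect described above can be reproduced with base R alone. The sketch below is illustrative only: the epoch value is invented, and it assumes the value is expressed in milliseconds. It shows the same instant rendered in two time zones, and how a candidate tz value can be checked against OlsonNames() before it is passed to otp_connect().

```r
# Hypothetical epoch value in milliseconds (an assumption for illustration);
# it corresponds to 2021-01-19 07:16:03 UTC.
epoch_ms <- 1611040563000

# POSIXct stores seconds since 1970-01-01 UTC, so divide by 1000 first
trip_start <- as.POSIXct(epoch_ms / 1000, origin = "1970-01-01", tz = "UTC")

# The same instant rendered in two different time zones
start_london <- format(trip_start, "%Y-%m-%d %H:%M:%S", tz = "Europe/London")
start_berlin <- format(trip_start, "%Y-%m-%d %H:%M:%S", tz = "Europe/Berlin")

# A tz value intended for otp_connect() should appear in OlsonNames()
tz_is_valid <- "Europe/Berlin" %in% OlsonNames()
```

In January the two renderings differ by one hour, which is the kind of shift to expect when the graph and the local system are in different time zones.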

Querying the OTP API

Function behaviour

The functions that query the OTP API return a list of three or more elements. The first element is an errorId - with the value “OK” or the error code returned by OTP. If errorId is “OK”, the second element will contain the query response; otherwise it will contain the OTP error message. There may be further list elements forming the query response. The last element will be the query URL that was submitted to the OTP API (for information and useful for troubleshooting).
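
Because every query function returns the same kind of list, a small wrapper can check errorId before the response is used. The following is a minimal sketch that runs without an OTP server: the helper otp_result_or_null() is hypothetical (it is not part of otpr), and the error list is a mock whose element name, error code and message are invented for illustration.

```r
# Mock of a successful result, using the values from the distance example
ok_result <- list(
  errorId = "OK",
  distance = 29050.21,
  query = "http://localhost:8080/otp/routers/default/plan?fromPlace=53.48805,-2.24258&toPlace=53.36484,-2.27108&mode=CAR"
)

# Mock of a failed result; the element name, code and message are invented
error_result <- list(
  errorId = "ERR",
  response = "hypothetical OTP error message",
  query = "http://localhost:8080/otp/routers/default/plan?fromPlace=0,0&toPlace=0,0&mode=CAR"
)

# Hypothetical helper, not part of otpr: return the second element when the
# query succeeded; otherwise report the error and the query URL, return NULL.
otp_result_or_null <- function(res) {
  if (identical(res$errorId, "OK")) {
    return(res[[2]])
  }
  message("OTP error ", res$errorId, ": ", res[[2]])
  message("Query was: ", res[[length(res)]])
  NULL
}

distance_m <- otp_result_or_null(ok_result)
failed <- otp_result_or_null(error_result)
```

Accessing the second and last elements by position follows the description above, so the helper does not depend on what those elements are called.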

Distance between two points

To get the trip distance in metres between an origin and destination on the street and/or path network use otp_get_distance(). You can specify the required mode: CAR (default), BICYCLE or WALK are valid. The trip information will relate to the first itinerary that was returned by the OTP server.

# Distance between Manchester city centre and Manchester airport by CAR
otp_get_distance(
  otpcon,
  fromPlace = c(53.48805,-2.24258),
  toPlace = c(53.36484,-2.27108)
)
#> $errorId
#> [1] "OK"
#> 
#> $distance
#> [1] 29050.21
#> 
#> $query
#> [1] "http://localhost:8080/otp/routers/default/plan?fromPlace=53.48805,-2.24258&toPlace=53.36484,-2.27108&mode=CAR"

# Now for BICYCLE
otp_get_distance(
  otpcon,
  fromPlace = c(53.48805,-2.24258),
  toPlace = c(53.36484,-2.27108),
  mode = "BICYCLE"
)
#> $errorId
#> [1] "OK"
#> 
#> $distance
#> [1] 16028.57
#> 
#> $query
#> [1] "http://localhost:8080/otp/routers/default/plan?fromPlace=53.48805,-2.24258&toPlace=53.36484,-2.27108&mode=BICYCLE"

Time between two points

To get the trip duration in minutes between an origin and destination use otp_get_times(). You can specify the required mode: TRANSIT (all available transit modes), BUS, RAIL, SUBWAY, TRAM, CAR, BICYCLE, and WALK are valid. All the public transit modes automatically allow WALK. There is also the option to combine TRANSIT with BICYCLE.

# Time between Manchester city centre and Manchester airport by BICYCLE
otp_get_times(
  otpcon,
  fromPlace = c(53.48805,-2.24258),
  toPlace = c(53.36484,-2.27108),
  mode = "BICYCLE"
)
#> $errorId
#> [1] "OK"
#> 
#> $duration
#> [1] 59.75
#> 
#> $query
#> [1] "http://localhost:8080/otp/routers/default/plan?fromPlace=53.48805,-2.24258&toPlace=53.36484,-2.27108&mode=BICYCLE&date=01-06-2022&time=19:07:48&walkReluctance=2&waitReluctance=1&arriveBy=FALSE&transferPenalty=0&minTransferTime=0"


# By default the date and time of travel is taken as the current system date and
# time. This can be changed using the 'date' and 'time' arguments
otp_get_times(
  otpcon,
  fromPlace = c(53.48805,-2.24258),
  toPlace = c(53.36484,-2.27108),
  mode = "TRANSIT",
  date = "01-19-2021",
  time = "07:15:00"
)
#> $errorId
#> [1] "OK"
#> 
#> $duration
#> [1] 39.65
#> 
#> $query
#> [1] "http://localhost:8080/otp/routers/default/plan?fromPlace=53.48805,-2.24258&toPlace=53.36484,-2.27108&mode=TRANSIT,WALK&date=01-19-2021&time=07:15:00&walkReluctance=2&waitReluctance=1&arriveBy=FALSE&transferPenalty=0&minTransferTime=0"
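
The examples above pass the date as month-day-year ("01-19-2021" is 19 January 2021) and the time as 24-hour hours, minutes and seconds. If the departure time is already held as an R date-time object, it can be formatted into those two strings with base R. This is a small sketch; the variable names are illustrative.

```r
# A departure date-time held as an R object
departure <- as.POSIXct("2021-01-19 07:15:00", tz = "Europe/London")

# Format it the way the examples above pass it to otp_get_times()
otp_date <- format(departure, "%m-%d-%Y")
otp_time <- format(departure, "%H:%M:%S")

# These strings could then be supplied as the 'date' and 'time' arguments,
# e.g. otp_get_times(otpcon, fromPlace = c(53.48805, -2.24258),
#                    toPlace = c(53.36484, -2.27108), mode = "TRANSIT",
#                    date = otp_date, time = otp_time)
```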

Breakdown of time by mode, waiting time and transfers

To get more information about the trip when using transit modes, otp_get_times() can be called with the detail argument set to TRUE. The trip duration (minutes) is then further broken down by time on transit, walking time (from/to and between stops), waiting time (when changing transit vehicle or mode), and number of transfers (when changing transit vehicle or mode). By default the function returns trip information for the first itinerary suggested by the OTP server. However, additional itineraries can be requested by specifying the maxItineraries argument. The function will return every available itinerary suggested by the OTP server, in order, up to the value of maxItineraries (the default is 1).

# Time between Manchester city centre and Manchester airport by TRANSIT with detail
otp_get_times(
  otpcon,
  fromPlace = c(53.48805,-2.24258),
  toPlace = c(53.36484,-2.27108),
  mode = "TRANSIT",
  date = "01-19-2021",
  time = "07:15:00",
  detail = TRUE,
  maxItineraries = 1
)
#> $errorId
#> [1] "OK"
#> 
#> $itineraries
#>                 start                 end      timeZone duration walkTime
#> 1 2021-01-19 07:16:03 2021-01-19 07:55:42 Europe/London    39.65        8
#>   transitTime waitingTime transfers
#> 1          31        0.65         1
#> 
#> $query
#> [1] "http://localhost:8080/otp/routers/default/plan?fromPlace=53.48805,-2.24258&toPlace=53.36484,-2.27108&mode=TRANSIT,WALK&date=01-19-2021&time=07:15:00&walkReluctance=2&waitReluctance=1&arriveBy=FALSE&transferPenalty=0&minTransferTime=0"
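
As a quick sanity check on the breakdown, the three time components in the itinerary printed above add up to the reported duration. The sketch below rebuilds that single row by hand (values copied as displayed, so they are rounded) rather than querying OTP; whether the components always sum exactly to the duration is not something this example establishes.

```r
# Values copied from the printed itinerary above (rounded as displayed)
itinerary <- data.frame(
  duration = 39.65,
  walkTime = 8,
  transitTime = 31,
  waitingTime = 0.65,
  transfers = 1
)

# Sum of the three components, in minutes
components <- with(itinerary, walkTime + transitTime + waitingTime)

# Compare with a tolerance, since these are floating-point values
components_match <- isTRUE(all.equal(components, itinerary$duration))

# Share of the trip spent on transit vehicles
transit_share <- itinerary$transitTime / itinerary$duration
```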

Details of each leg for transit-based trips

To get information about each leg of transit-based trips, otp_get_times() can be called with both the detail and includeLegs arguments set to TRUE. A column called ‘legs’ will then be included in the itineraries dataframe. This column contains a nested ‘legs’ dataframe for each itinerary. The ‘legs’ dataframe contains a row for each leg of the trip. The information provided for each leg includes start and end times, duration, distance, mode, route details, agency details, and stop names. There is also a column called ‘departureWait’ which is the length of time in minutes required to wait before the start of a leg. The sum of ‘departureWait’ will equal the total waiting time for the itinerary.

# Time between Manchester city centre and Manchester airport by TRANSIT with detail and legs
trip <- otp_get_times(
  otpcon,
  fromPlace = c(53.48805,-2.24258),
  toPlace = c(53.36484,-2.27108),
  mode = "TRANSIT",
  date = "01-19-2021",
  time = "07:15:00",
  detail = TRUE,
  includeLegs = TRUE,
  maxItineraries = 2
)

# Legs for the first itinerary returned by OTP (first 9 columns)
trip$itineraries$legs[[1]][1:9]
#>             startTime             endTime      timeZone mode departureWait
#> 1 2021-01-19 07:16:03 2021-01-19 07:18:59 Europe/London WALK          0.00
#> 2 2021-01-19 07:19:00 2021-01-19 07:28:00 Europe/London TRAM          0.02
#> 3 2021-01-19 07:28:00 2021-01-19 07:29:23 Europe/London WALK          0.00
#> 4 2021-01-19 07:30:00 2021-01-19 07:52:00 Europe/London RAIL          0.62
#> 5 2021-01-19 07:52:01 2021-01-19 07:55:42 Europe/London WALK          0.02
#>   duration  distance routeType                 routeId
#> 1     2.93   197.370        NA                    <NA>
#> 2     9.00  1424.641         0 2:METLYELL:I:2021-01-02
#> 3     1.38   106.629        NA                    <NA>
#> 4    22.00 14921.306         2                  1:7285
#> 5     3.68   246.086        NA                    <NA>
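
The nested ‘legs’ dataframe can be summarised with ordinary base R tools. The sketch below does not call OTP: it rebuilds a few columns of the first itinerary's legs from the printed output above and nests them in a list column to mimic the returned structure. Because the printed values are rounded, the summed departureWait comes to 0.66 here rather than exactly matching the 0.65 shown for the itinerary.

```r
# A few columns of the legs printed above, typed in by hand
legs <- data.frame(
  mode = c("WALK", "TRAM", "WALK", "RAIL", "WALK"),
  departureWait = c(0.00, 0.02, 0.00, 0.62, 0.02),
  duration = c(2.93, 9.00, 1.38, 22.00, 3.68),
  distance = c(197.370, 1424.641, 106.629, 14921.306, 246.086)
)

# Mimic the returned structure: one itinerary row with a nested 'legs' column
itineraries <- data.frame(duration = 39.65)
itineraries$legs <- list(legs)

# Pull out the legs of the first itinerary
first_legs <- itineraries$legs[[1]]

# Total waiting time (minutes) across all legs
total_wait <- sum(first_legs$departureWait)

# Minutes spent travelling by each mode
time_by_mode <- tapply(first_legs$duration, first_legs$mode, sum)

# Distance covered on transit vehicles, in kilometres
transit_km <- sum(first_legs$distance[first_legs$mode != "WALK"]) / 1000
```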

Travel time isochrones (OTPv1 only)

The otp_get_isochrone() function can be used to get one or more travel time isochrones in either GeoJSON or SF format. These are only available for transit or walking modes (OTP limitation). They can be generated either from (default) or to the specified location.

GeoJSON example

# 900, 1800 and 2700 second isochrones for travel *to* Manchester Airport by any transit mode
my_isochrone <- otp_get_isochrone(
  otpcon,
  location = c(53.36484, -2.27108),
  fromLocation = FALSE,
  cutoffs = c(900, 1800, 2700),
  mode = "TRANSIT",
  date = "01-19-2021",
  time = "07:15:00"
)

# function returns a list of three elements
names(my_isochrone)
#> [1] "errorId"  "response" "query"

# now write the GeoJSON (in the "response" element) to a file so it can be opened in QGIS (for example)
write(my_isochrone$response, file = "my_isochrone.geojson")

SF example

# request format as "SF"
my_isochrone <- otp_get_isochrone(
  otpcon,
  location = c(53.36484, -2.27108),
  format = "SF",
  fromLocation = FALSE,
  cutoffs = c(900, 1800, 2700, 3600, 4500, 5400),
  mode = "TRANSIT",
  date = "01-19-2021",
  time = "07:15:00",
  maxWalkDistance = 1600,
  walkReluctance = 5,
  minTransferTime = 600
)

# plot using tmap package

library(tmap)
library(tmaptools)

# set bounding box
bbox <- bb(my_isochrone$response)
# get OSM tiles
osm_man <- read_osm(bbox, ext = 1.1)
# plot isochrones
tm_shape(osm_man) +
  tm_rgb() +
  tm_shape(my_isochrone$response) +
  tm_fill(
    col = "time",
    alpha = 0.8,
    palette = "-plasma",
    n = 6,
    style = "cat",
    title = "Time (seconds)"
  ) + tm_layout(legend.position = c("left", "top"), legend.bg.color = "white", 
                main.title = "15-90 minute isochrone to Manchester Airport", 
                main.title.size = 0.8)
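
The cutoffs argument in the examples above is given in seconds. If it is easier to think in minutes, the vector can be generated rather than typed out. This is a small convenience sketch; the variable names are illustrative and the data frame merely stands in for the "time" column used in the plot.

```r
# Isochrone cutoffs every 15 minutes, from 15 to 90 minutes
cutoff_minutes <- seq(15, 90, by = 15)

# Convert to seconds, as used for the 'cutoffs' argument in the examples
cutoffs <- cutoff_minutes * 60

# Stand-in for the 'time' column (seconds) of a returned isochrone layer,
# with a minutes column added for more readable legend labels
iso_times <- data.frame(time = cutoffs)
iso_times$minutes <- iso_times$time / 60
```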

One-to-many Travel Time Analysis (OTPv1 only)

If you wish to calculate the travel time from one or more origins to many destinations, querying the OTP journey planning API (using otp_get_times()) may not be the best option as it requires multiple requests to the API which can be very inefficient. An alternative is to use the OTP surface analysis feature, which enables you to calculate the travel time from an origin to thousands of destinations in about the same time it takes to perform a single origin:destination lookup. This is achieved by generating a surface for an origin which contains the travel time to every geographic coordinate that can be reached from that origin by the specified transport mode (though note that there is a hard-coded surface cutoff of 120 minutes set in OTP).

Once the surface has been generated, it can be evaluated to rapidly retrieve the travel times from the origin to each ‘destination’ point provided in a supplied CSV file. This file, known as a pointset, can also contain the quantities of one or more ‘opportunities’ that are associated with each point. During evaluation, OTP will sum the opportunities available at each additional minute of travel time, and otpr generates a cumulative sum of the opportunities. For example, you might have a pointset of workplace zones and a column with the number of jobs within each zone. The output will be a cumulative sum of jobs reachable for each minute of travel time by the mode specified when the surface was generated.
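
The cumulative sum is a plain running total of the per-minute sums. As an illustration, the sketch below takes the first six per-minute job totals from the evaluation output shown later in this README and reproduces the corresponding cumulative values with base R; the 5,000-job threshold is an arbitrary example.

```r
# Jobs that become reachable in each of the first six minutes
# (the 'sums' column from the evaluation output later in this README)
sums <- c(219, 441, 637, 1656, 2905, 4848)

# Running total of jobs reachable within each minute
cumsums <- cumsum(sums)

# First minute by which at least 5000 jobs are reachable (arbitrary threshold)
first_minute <- which(cumsums >= 5000)[1]
```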

Before a surface analysis can be performed, OTP must be started with the --analyst switch. To evaluate one or more pointsets against a surface, the location of the pointset CSV file(s) must be specified using the --pointSets switch followed by the file path. For more information about the required file format for a pointset CSV file and the switches to start OTP with, see: http://docs.opentripplanner.org/en/dev-1.x/Surface/.
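
A pointset file can be written from R. The sketch below is a guess at a minimal layout, not a statement of the required format: it assumes the file needs lat and lon columns plus one numeric column per ‘opportunity’, which should be checked against the OTP documentation linked above. The coordinates, job counts and directory are invented for illustration.

```r
# Hypothetical destinations with an invented number of jobs at each
# (column layout is an assumption; check the OTP documentation)
pointset <- data.frame(
  lat = c(53.4808, 53.4668, 53.3648),
  lon = c(-2.2426, -2.2339, -2.2711),
  jobs = c(1200L, 350L, 5400L)
)

# Write the file into a directory that could be passed to --pointSets
pointset_dir <- file.path(tempdir(), "pointsets")
dir.create(pointset_dir, showWarnings = FALSE)
pointset_file <- file.path(pointset_dir, "gm-jobs.csv")
write.csv(pointset, pointset_file, row.names = FALSE)

# The pointset name is the file name without its extension
pointset_name <- tools::file_path_sans_ext(basename(pointset_file))
```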

Create a surface

Once an OTP instance has been started in analysis mode, a surface can be generated by calling the otp_create_surface() function. The arguments that can be passed to this function are very similar to otp_get_times(). The main differences are the exclusion of the toPlace argument and two new function-specific arguments - getRaster and rasterPath. These are used to request a raster image (a geoTIFF file) of the generated surface which is saved to the local file system. If the surface is successfully generated, the function will return the OTP ID number of the surface - this will be needed for any subsequent evaluation against the surface.

There are a few things to note regarding the raster image that OTP creates:

# create surface with origin as Manchester city centre
otp_create_surface(
  otpcon, fromPlace = c(53.479167,-2.244167), date = "01-19-2021",
  time = "08:00:00", mode = "TRANSIT", maxWalkDistance = 1600, getRaster = TRUE,
  rasterPath = "C:/temp"
)
#> $errorId
#> [1] "OK"
#> 
#> $surfaceId
#> [1] 0
#> 
#> $surfaceRecord
#> [1] "{\"id\":0,\"params\":{\"mode\":\"TRANSIT,WALK\",\"date\":\"01-19-2021\",\"walkReluctance\":\"2\",\"arriveBy\":\"FALSE\",\"minTransferTime\":\"0\",\"fromPlace\":\"53.479167,-2.244167\",\"batch\":\"TRUE\",\"transferPenalty\":\"0\",\"time\":\"08:00:00\",\"maxWalkDistance\":\"1600\",\"waitReluctance\":\"1\"}}"
#> 
#> $rasterDownload
#> [1] "C:/temp/surface_0.tiff"
#> 
#> $query
#> [1] "http://localhost:8080/otp/surfaces?fromPlace=53.479167,-2.244167&mode=TRANSIT,WALK&date=01-19-2021&time=08:00:00&maxWalkDistance=1600&walkReluctance=2&waitReluctance=1&transferPenalty=0&minTransferTime=0&arriveBy=FALSE&batch=TRUE"

[Figure: example of surface raster visualised in QGIS]

Evaluate a surface

Once a surface has been generated, it can be evaluated using the otp_evaluate_surface() function. The ID of the surface (returned by the otp_create_surface() function) and the name of a pointset (which is the pointset file name excluding the extension) must be provided as arguments. The function will return a dataframe for each of the ‘opportunity’ columns in the pointset CSV file. Each of these dataframes contains four columns: minutes, counts, sums and cumsums.

As noted above, there is a cutoff of 120 minutes for the surface and only data for minutes up to 120 are returned by OTP. This limit is hard-coded in OTP.

If the optional detail argument is set to TRUE, then a dataframe called ‘times’ containing the time taken (in seconds) to reach each point in the pointset file will also be returned. If a point is not reachable the time will be recorded as NA. This could mean that the point is genuinely unreachable by the mode (e.g. if it is CAR mode and the location is only accessible by walking or cycling) or that the point falls outside of the surface (which might be due to the extent of the network or the 120 minute limit to the surface extent). The ‘times’ table can be joined with the data from the original pointset file to get the travel time from the origin to each destination.

response <- otp_evaluate_surface(otpcon, surfaceId = 0, pointset = "gm-jobs", detail = TRUE)
# Look at first few rows of the job opportunity data
head(response$jobs)
#>   minutes counts sums cumsums
#> 1       1      0  219     219
#> 2       2      0  441     660
#> 3       3      1  637    1297
#> 4       4      3 1656    2953
#> 5       5      4 2905    5858
#> 6       6      9 4848   10706
# And the last few rows
tail(response$jobs)
#>     minutes counts sums cumsums
#> 101     101      1  401 1225495
#> 102     102      1  329 1225824
#> 103     103      1  560 1226384
#> 104     104      2  817 1227201
#> 105     105      0  458 1227659
#> 106     106      1  198 1227857
# Number of job opportunities accessible within 60 minutes from origin by TRANSIT
response$jobs$cumsums[60]
#> [1] 927699
# And a peek at the times dataframe
head(response$times)
#>   point time
#> 1     1 3641
#> 2     2 4351
#> 3     3 4835
#> 4     4 5058
#> 5     5 4653
#> 6     6 4244
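
Joining the ‘times’ table back to the pointset can be done with merge(). The sketch below runs without OTP: the times are the six values printed above, the pointset rows are invented, and it assumes that the point column corresponds to the row order of the pointset CSV file, which should be verified for your own data. The 75-minute threshold is an arbitrary example.

```r
# Invented pointset with six destinations (stand-in for the original CSV)
pointset <- data.frame(
  lat = c(53.41, 53.42, 53.43, 53.44, 53.45, 53.46),
  lon = c(-2.21, -2.22, -2.23, -2.24, -2.25, -2.26),
  jobs = c(120, 80, 45, 300, 15, 60)
)

# The 'times' dataframe printed above (time in seconds)
times <- data.frame(
  point = 1:6,
  time = c(3641, 4351, 4835, 5058, 4653, 4244)
)

# Assumption: 'point' is the row number of each destination in the pointset
pointset$point <- seq_len(nrow(pointset))

# Keep every destination, including any that were unreachable (time is NA)
joined <- merge(pointset, times, by = "point", all.x = TRUE)
joined$minutes <- joined$time / 60

# Destinations reachable within 75 minutes, and the jobs they hold
within_75 <- joined[!is.na(joined$minutes) & joined$minutes <= 75, ]
jobs_within_75 <- sum(within_75$jobs)
```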

Learning more

The example function calls shown above can be extended by passing additional parameters to the OTP API. This includes the advanced option to pass any parameter that is not an otpr argument directly to the OTP API via the extra.params argument. Further information is available in the documentation for each function:

# get help on using the otp_get_times() function
?otp_get_times

If you are new to OTP, then the best place to start is to work through the tutorial, OpenTripPlanner Tutorial - creating and querying your own multi-modal route planner. This includes everything you need, including example data, to get started with OTPv1. The tutorial also has examples of using otpr functions, and helps you get the most from the package, for example using it to populate an origin-destination matrix.

For more guidance on how otpr, in conjunction with OTP, can be used to generate data for input into models, you may be interested in: An automated framework to derive model variables from open transport data using R, PostgreSQL and OpenTripPlanner.

Getting help

How to cite

Please cite otpr if you use it. Citation information is available by calling citation(package = 'otpr'):

citation(package = 'otpr')
#> 
#> To cite the otpr package in publications, please use the following. You
#> can obtain the DOI for a specific version from:
#> https://zenodo.org/record/4065250
#> 
#>   Marcus Young (2020). otpr: An API wrapper for OpenTripPlanner. R
#>   package version 0.5.0. https://doi.org/10.5281/zenodo.4065250
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     author = {{Marcus Young}},
#>     title = {{otpr: An API wrapper for OpenTripPlanner}},
#>     year = {2020},
#>     note = {{R package version 0.5.0}},
#>     doi = {10.5281/zenodo.4065250},
#>   }

Want to say thanks?