GitHub - imangR/bysykkel: Functions to get Norwegian city bike data
bysykkel
Overview
bysykkel is an R package that provides functions that simplify the task of gathering Norwegian city bike data for data analysis. Its functions either read city bike data directly into R or download it to your R session’s working directory.
- read_trips_data() reads bike trip records to R as a data frame.
- fread_trips_data() fast reads bike trip records to R as a data frame by utilizing fread() from data.table.
- dl_trips_data() downloads bike trip records to your working directory.
- get_api_data() gets real-time data from the specified city bike API service.
bysykkel lets you, the user, focus on data exploration, visualization, statistical analysis, and building machine learning models on Norwegian city bike data, by simplifying the task of getting the data. Indeed, the purpose of bysykkel is to reduce time spent on getting Norwegian city bike data, and lower barriers to start analyzing it.
The package name, bysykkel, is the Norwegian word for “city bike”, where by means “city” and sykkel means “bike” (or “bicycle”).
Installation
You can install the released version of bysykkel from CRAN with:
install.packages("bysykkel")
Development version
Alternatively, you can use devtools to install the development version of bysykkel from GitHub to get bug fixes or new features before the next package version is released on CRAN.
#! install.packages("devtools")
devtools::install_github("imangR/bysykkel")
Bike data
bysykkel currently retrieves data from three[1] city bike services in Norway that make bike data publicly available[2]:
- Oslo City Bike
- Bergen City Bike
- Trondheim City Bike
Each city bike service provides two data-related services:
- Historical trip data
- Real-time data
The historical trip data is available as both CSV and JSON files that contain monthly, anonymized bike trip records. Real-time data is available in the GBFS format and must be accessed through each city bike service’s API, which provides information about
- bike and dock availability;
- stations (including geolocation, addresses and station descriptions); and
- machine readable basic information about the city bike service.
The data is made available under the Norwegian License for Open Data 2.0, abbreviated as NLOD 2.0, which you can read about here.
Examples
Read bike trips data to R
library(bysykkel)
# Get bike trip data for April, 2019 for Oslo as a data frame
oslo_trips <- read_trips_data(year = 2019, month = 04, city = "Oslo")
# Get winter bike data for January, 2019 for Oslo as a data frame
oslo_trips <- read_trips_data(2019, 1, "Oslo")
Fast read bike data from June to August in 2018 for Bergen with lapply(), and rbind() the resulting list with do.call() to get a data frame
#! install.packages("data.table")
bergen_trips <- lapply(06:08, fread_trips_data, year = 2018, city = "Bergen")
bergen_trips <- do.call(rbind, bergen_trips)
Alternatively, use map_dfr() from purrr instead of lapply(), rbind(), and do.call() to get the same result: a data frame
#! install.packages("purrr")
library(purrr)
bergen_trips <- map_dfr(6:8, fread_trips_data, year = 2018, city = "Bergen")
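To see what the lapply(), rbind(), and do.call() pattern does without downloading anything, here is a minimal sketch that uses a stand-in function in place of fread_trips_data(). The function fake_trips() and its columns are hypothetical and are not part of bysykkel; the real trip data has its own columns.

```r
# Stand-in for fread_trips_data(): returns a tiny data frame for one month.
# fake_trips() and its columns are hypothetical, for illustration only.
fake_trips <- function(year, month, city) {
  data.frame(
    city     = city,
    year     = year,
    month    = month,
    duration = c(300, 600),  # two made-up trips per month
    stringsAsFactors = FALSE
  )
}

# lapply() calls the function once per month and returns a list of data frames.
# year and city are matched by name, so each month fills the month argument.
monthly <- lapply(6:8, fake_trips, year = 2018, city = "Bergen")

# do.call(rbind, ...) stacks the list row-wise into a single data frame
all_trips <- do.call(rbind, monthly)

nrow(all_trips)
unique(all_trips$month)
```

The same shape of call should apply to the real reader functions, assuming they take year, month, and city as in the examples above.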
NB! I recommend that you use fread_trips_data() to fast read city bike data, especially if you want to read bike data for several months.
NB! data.table is not automatically installed with bysykkel, and must be installed separately with install.packages("data.table") if you want to use fread_trips_data().
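Because data.table is optional, a script can check for it before choosing a reader. The sketch below is one way to do that with base R's requireNamespace(). The variable name reader is hypothetical, and the final call is left commented out because it needs bysykkel and a network connection.

```r
# Pick the fast reader only when data.table is installed;
# otherwise fall back to read_trips_data().
reader <- if (requireNamespace("data.table", quietly = TRUE)) {
  "fread_trips_data"
} else {
  "read_trips_data"
}

message("Using ", reader, "() to read trip data")

# With bysykkel attached, the chosen function could then be called by name:
# library(bysykkel)
# oslo_trips <- do.call(reader, list(year = 2019, month = 4, city = "Oslo"))
```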
Download bike trips data
library(bysykkel)
# Download bike trip data for April 2019 for Trondheim
dl_trips_data(2019, 04, "Trondheim", filetype = "JSON")
#> The JSON-file is downloaded to your R session's current working directory
# Download bike trip data for summer 2018 for Oslo
lapply(06:08, dl_trips_data, year = 2018, city = "Oslo", filetype = "CSV")
#> The CSV-file for each month is downloaded to your R session's working directory
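lapply() with a fixed year works when all months fall in the same year. For a period that crosses a year boundary, one option is base R's Map(), which walks over years and months in parallel. The sketch below uses a hypothetical stand-in, fake_download(), so that it runs without network access. The file name pattern it prints is made up, and swapping in dl_trips_data() is an assumption that relies on the year, month, city argument order shown in the examples above.

```r
# Hypothetical stand-in for dl_trips_data(); it only reports what it would fetch.
# The file name pattern is invented for illustration.
fake_download <- function(year, month, city, filetype = "CSV") {
  sprintf("%s-%d-%02d.%s", city, year, month, tolower(filetype))
}

# December 2018 through February 2019: years and months are paired element-wise
years  <- c(2018, 2019, 2019)
months <- c(12, 1, 2)

# city and filetype have length one, so they are reused for every pair
files <- Map(fake_download, years, months, city = "Oslo", filetype = "CSV")
unlist(files)
```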
Get real-time data from the API service
NB! Please read each City Bike’s guide on how to correctly use their API service before using get_api_data(). See Oslo City Bike’s guide as an example.
The return_df argument in get_api_data() specifies whether you want to return the result as a data frame. If return_df = FALSE (default), then the function returns a list that contains a data frame, and a number that represents the datetime (in POSIX format) of when you made the API request.
library(bysykkel)
# Get API data on bike stations as a data frame
oslo_stations <- get_api_data(client_id = "myname-myapp", data = "stations", city = "Oslo", return_df = TRUE)
# Get API data for bike availability as a list that contains a data frame, and
# a number that represents the (POSIX) time of when you made the API request
bergen_availability <- get_api_data(client_id = "mycompany-myservice", data = "availability", city = "Bergen", return_df = FALSE)
# Get API data on bike system information
trondheim_system <- get_api_data("Ola Nordmann-bike dashboard", "system", "Trondheim", return_df = FALSE)
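When return_df = FALSE, the result bundles a data frame with the request time as a number. The sketch below builds a mock of such a list to show how that number can be turned into a readable datetime with base R's as.POSIXct(). The element names (data, last_updated), the columns, and the values are assumptions for illustration, as is the reading of the number as seconds since 1970-01-01 UTC. Inspect the real object with str() to see how get_api_data() actually names its elements.

```r
# Mock of a list like the one described above; names and values are assumed.
mock_result <- list(
  data = data.frame(
    station_id      = c("1", "2"),
    bikes_available = c(4, 0),
    stringsAsFactors = FALSE
  ),
  last_updated = 1556668800  # assumed: seconds since 1970-01-01 00:00:00 UTC
)

# Convert the POSIX number to a datetime (UTC here; pick your own time zone)
requested_at <- as.POSIXct(mock_result$last_updated,
                           origin = "1970-01-01", tz = "UTC")
format(requested_at, "%Y-%m-%d %H:%M:%S")

# The data frame part is used like any other data frame
stations <- mock_result$data
with_bikes <- stations[stations$bikes_available > 0, ]
with_bikes
```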
File an issue or suggest an improvement
If you want to report a bug, raise some other issue, or suggest an improvement to bysykkel, then please file an issue on GitHub. For bugs, please include a minimal reproducible example.
Known issues
No issues have been identified at this time in version 0.3.1.
[1] It used to be four services, but the Oslo Winter Bike service has been shut down, and its data is now unavailable. In version 0.3.1, all bysykkel functions were updated to remove any interface to the Oslo Winter Bike service.
[2] Bike data for Bærum City Bike