This function works around the API's per-query observation limit by splitting a request into smaller sequential sub-queries and appending their results. Its main use case is backfilling hourly series.

eia_backfill(start, end, offset, api_key, api_path, facets)

Arguments

start

Defines the start time of the series. Use a POSIXt object for hourly series or a Date object for non-hourly series (daily, monthly, etc.)

end

Defines the end time of the series. Use a POSIXt object for hourly series or a Date object for non-hourly series (daily, monthly, etc.)

offset

An integer, the maximum number of observations per query. The API returns at most 5000 observations per query, so the offset argument cannot exceed 5000.

api_key

A string, the EIA API key. See https://www.eia.gov/opendata/ to register for the API service

api_path

A string, the API path that follows the API endpoint https://api.eia.gov/v2/. The path can be found on the EIA API dashboard; for more details see https://www.eia.gov/opendata/browser/

facets

A list, optional. Sets the filtering arguments (defined as 'facets' in the API header), following the structure list(facet_name_1 = value_1, facet_name_2 = value_2)
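
As a minimal sketch of the argument types described above, the following base R snippet builds hourly and non-hourly boundaries and checks them before any call is made. The variable names (start_hourly, start_daily, and so on) and the checks are illustrative assumptions, not part of the function itself.

```r
# Hourly series: POSIXt boundaries (POSIXct here) with an explicit time zone
start_hourly <- as.POSIXct("2018-06-19 00:00:00", tz = "UTC")
end_hourly   <- as.POSIXct("2018-06-26 23:00:00", tz = "UTC")

# Non-hourly series (daily, monthly, etc.): Date boundaries
start_daily <- as.Date("2018-06-19")
end_daily   <- as.Date("2018-12-31")

# offset must stay within the API limit of 5000 observations per query
offset <- 2000L

# Optional filters, one named element per facet
facets <- list(parent = "NYIS", subba = "ZONA")

# Illustrative sanity checks (assumed here, not performed by this snippet's caller)
stopifnot(
  inherits(start_hourly, "POSIXt"), inherits(end_hourly, "POSIXt"),
  inherits(start_daily, "Date"), inherits(end_daily, "Date"),
  end_hourly >= start_hourly, end_daily >= start_daily,
  offset >= 1, offset <= 5000,
  is.list(facets), !is.null(names(facets))
)
```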

Value

A time series object. The example below reads its time and value columns.

Details

The function uses the start, end, and offset arguments to define a sequence of queries.
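
One way to picture such a sequence, for an hourly series, is sketched below in base R. The helper split_hourly_range() is hypothetical and is not the package's internal implementation; it assumes that both boundaries are inclusive and that each sub-query covers at most offset consecutive hours.

```r
# Hypothetical helper (not part of the package): split an hourly range into
# consecutive windows of at most `offset` observations each.
split_hourly_range <- function(start, end, offset) {
  stopifnot(inherits(start, "POSIXct"), inherits(end, "POSIXct"),
            end >= start, offset >= 1, offset <= 5000)
  step <- offset * 3600                      # window width in seconds
  window_starts <- seq(from = start, to = end, by = step)
  window_ends <- window_starts + step - 3600 # last hour inside each window
  window_ends[window_ends > end] <- end      # clip the final window
  data.frame(start = window_starts, end = window_ends)
}

windows <- split_hourly_range(
  start  = as.POSIXct("2018-06-19 00:00:00", tz = "UTC"),
  end    = as.POSIXct("2019-06-18 23:00:00", tz = "UTC"),
  offset = 2000
)

# Observations covered by each window (both boundaries inclusive)
windows$n_obs <- as.numeric(difftime(windows$end, windows$start,
                                     units = "hours")) + 1
windows
```

Under these assumptions, one year of hourly data (8760 observations) with offset = 2000 is covered by five windows, the last of which is shorter than the rest. Each window would then be sent as its own query and the results appended in order.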

Examples

if (FALSE) {
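 # Hourly series, so the boundaries are POSIXt objects in UTC
 start <- as.POSIXlt("2018-06-19T00", tz = "UTC")
 end <- lubridate::floor_date(Sys.time() - lubridate::days(2), unit = "day")
 attr(end, "tzone") <- "UTC"
 # At most 2000 observations per sub-query (the API caps this at 5000)
 offset <- 2000
 # The API key is read from the eia_key environment variable
 api_key <- Sys.getenv("eia_key")
 api_path <- "electricity/rto/region-sub-ba-data/data/"

 # Filter to the NYIS parent and the ZONA sub-region
 facets <- list(parent = "NYIS",
                subba = "ZONA")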

 df <- eia_backfill(start = start,
                end = end,
                offset = offset,
                api_key = api_key,
                api_path = api_path,
                facets = facets)

 at_y <- pretty(df$value)[c(2, 4, 6)]
 at_x <- seq.POSIXt(from = start,
                  to = end,
                  by = "2 years")
 plot(df$time, df$value,
      col = "#1f77b4",
      type = "l",
      frame.plot = FALSE,
      axes = FALSE,
      panel.first = abline(h = at_y, col = "grey80"),
      main = "NY Independent System Operator (West) - Hourly Generation of Electricity",
      xlab = "Source: https://www.eia.gov/",
      ylab = "Megawatt-hours")

 mtext(side = 1, text = format(at_x, format = "%Y"), at = at_x,
       col = "grey20", line = 1, cex = 0.8)

 mtext(side = 2, text = format(at_y, scientific = FALSE), at = at_y,
       col = "grey20", line = 1, cex = 0.8)
}