UKgrid is an R dataset package containing the historical national demand of the UK electricity transmission system, along with other related variables. The dataset is a half-hourly time series with observations starting in April 2005. The data were sourced from the National Grid UK website.
Install the package from CRAN:
install.packages("UKgrid")
Or install the development version from GitHub:
# install.packages("remotes")
remotes::install_github("RamiKrispin/UKgrid")
library(UKgrid)
data("UKgrid")
str(UKgrid)
#> 'data.frame': 254592 obs. of 9 variables:
#> $ TIMESTAMP : POSIXct, format: "2005-04-01 00:00:00" "2005-04-01 00:30:00" ...
#> $ ND : int 32926 32154 33633 34574 34720 34452 33818 32951 32448 32212 ...
#> $ I014_ND : int 32774 32032 33495 34460 34641 34347 33854 33020 32525 32287 ...
#> $ TSD : int 34286 33885 35082 36028 36179 35921 35362 34607 34347 34103 ...
#> $ ENGLAND_WALES_DEMAND : int 29566 28871 30340 31253 31325 31094 30442 29664 29167 28920 ...
#> $ EMBEDDED_WIND_GENERATION : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ EMBEDDED_WIND_CAPACITY : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ EMBEDDED_SOLAR_GENERATION: int 0 0 0 0 0 0 0 0 0 0 ...
#> $ EMBEDDED_SOLAR_CAPACITY : int 0 0 0 0 0 0 0 0 0 0 ...
A variable dictionary is available in the dataset documentation.
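As a quick consistency check on the str() output above, the observation count can be translated into a time span. The sketch below uses base R only and assumes the series is complete, with exactly 48 half-hourly observations per day and no gaps or duplicated timestamps. That assumption is not verified here, so the implied end date is indicative only. The variable names are illustrative.

```r
# Values taken from the str(UKgrid) output above
n_obs     <- 254592                 # number of half-hourly observations
first_day <- as.Date("2005-04-01")  # date of the first observation

# Assumption: 48 observations per day, no missing or duplicated timestamps
obs_per_day <- 48
n_days      <- n_obs / obs_per_day

# Last calendar day covered under that assumption
last_day <- first_day + n_days - 1

n_days
last_day
```

If the result does not match the last timestamp in your installed copy of the data, the series either has gaps or the package has been updated with more observations.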
The extract_grid function

The extract_grid function extracts the UKgrid series in different formats (tsibble, xts, zoo, ts, data.frame, data.table and tbl) and frequencies (half-hourly, hourly, daily, weekly, monthly and quarterly), and can subset the series by time frame.
For example, you can select the national demand variable (ND) in tsibble format:
nd_half_hourly <- extract_grid(type = "tsibble",  # default
                               columns = "ND",    # default
                               aggregate = NULL)  # default
library(tsibble)
head(nd_half_hourly)
#> # A tsibble: 6 x 2 [30m] <UTC>
#> TIMESTAMP ND
#> <dttm> <int>
#> 1 2005-04-01 00:00:00 32926
#> 2 2005-04-01 00:30:00 32154
#> 3 2005-04-01 01:00:00 33633
#> 4 2005-04-01 01:30:00 34574
#> 5 2005-04-01 02:00:00 34720
#> 6 2005-04-01 02:30:00 34452
class(nd_half_hourly)
#> [1] "tbl_ts" "tbl_df" "tbl" "data.frame"
index(nd_half_hourly)
#> TIMESTAMP
interval(nd_half_hourly)
#> <interval[1]>
#> [1] 30m
library(TSstudio)
ts_plot(ts.obj = nd_half_hourly,
        title = "UK National Demand - Half-Hourly")
Alternatively, you can aggregate the series to an hourly frequency with the aggregate argument:
nd_hourly <- extract_grid(type = "tsibble",
                          columns = "ND",
                          aggregate = "hourly")
interval(nd_hourly)
#> <interval[1]>
#> [1] 1h
ts_plot(ts.obj = nd_hourly,
        title = "UK National Demand - Hourly")
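To make the hourly aggregation concrete, here is a minimal base R sketch that rolls half-hourly readings up to hourly values. It assumes that aggregation sums the two half-hourly readings within each hour, which is consistent with the magnitude of the daily totals shown later but is not confirmed here; extract_grid itself may be implemented differently. The input is the first ten ND values from the str(UKgrid) output, and the names half_hourly, HOUR and hourly are illustrative.

```r
# First ten half-hourly ND values, copied from the str(UKgrid) output
half_hourly <- data.frame(
  TIMESTAMP = seq(as.POSIXct("2005-04-01 00:00:00", tz = "UTC"),
                  by = "30 min", length.out = 10),
  ND = c(32926, 32154, 33633, 34574, 34720,
         34452, 33818, 32951, 32448, 32212)
)

# Truncate each timestamp to the start of its hour
half_hourly$HOUR <- as.POSIXct(trunc(half_hourly$TIMESTAMP, units = "hours"))

# Assumption: hourly demand is the sum of the two half-hourly readings
hourly <- aggregate(ND ~ HOUR, data = half_hourly, FUN = sum)
hourly
```

Replacing sum with mean in the last step gives an hourly average instead, which keeps the values on the same scale as the original readings.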
Columns are selected with the columns argument. The full list of columns is available in the dataset documentation (?UKgrid). For instance, the following selects the ND and TSD columns at a daily frequency:
df <- extract_grid(type = "xts",
                   columns = c("ND", "TSD"),
                   aggregate = "daily")
head(df)
#> ND TSD
#> 2005-04-01 1920069 1965115
#> 2005-04-02 1674699 1717958
#> 2005-04-03 1631352 1675112
#> 2005-04-04 1916693 1955599
#> 2005-04-05 1952082 1994242
#> 2005-04-06 1964584 2009831
ts_plot(ts.obj = df,
        title = "UK National and Transmission System Demand - Daily")
Note: by default, when one of the data frame family of structures is used, the output includes the timestamp of the data, even if it was not selected in the columns argument.
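The daily ND values printed above are far larger than the half-hourly readings (around 1.9 million versus roughly 33,000), which suggests that daily aggregation sums the 48 half-hourly observations rather than averaging them. Under that assumption, the sketch below converts the daily totals back to an average half-hourly level so they can be compared with the raw series. The values are copied from the head(df) output, and the variable names are illustrative.

```r
# Daily ND totals copied from the head(df) output above
daily_nd <- c("2005-04-01" = 1920069,
              "2005-04-02" = 1674699,
              "2005-04-03" = 1631352)

# Assumption: each daily value is the sum of 48 half-hourly readings
avg_half_hourly <- daily_nd / 48

# Average half-hourly demand per day
avg_half_hourly
```

The results land in the same range as the raw half-hourly ND values, which supports reading the daily figures as sums.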
Last but not least, you can subset the series by time range with the start and end arguments:
df1 <- extract_grid(type = "zoo",
                    columns = "ND",
                    aggregate = "daily",
                    start = 2015,
                    end = 2017)
head(df1)
#> 2015-01-01 2015-01-02 2015-01-03 2015-01-04 2015-01-05 2015-01-06
#> 1442371 1564118 1653984 1684045 1882487 1866439
ts_plot(ts.obj = df1,
        title = "UK National Demand - Daily between 2015 and 2017")
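The head() output shows where the subset starts, but not whether the year given in end is included in full. One way to check is to inspect the first and last dates actually returned. The sketch below repeats the call from this section using only the arguments shown there, and runs only when the UKgrid and zoo packages are installed. The names has_pkgs and nd_window are illustrative, and no output is shown because the result depends on the installed package version.

```r
# Only run when the packages used in this section are installed
has_pkgs <- requireNamespace("UKgrid", quietly = TRUE) &&
  requireNamespace("zoo", quietly = TRUE)

if (has_pkgs) {
  library(UKgrid)

  # Same call as in the example above: daily ND between 2015 and 2017
  nd_window <- extract_grid(type = "zoo",
                            columns = "ND",
                            aggregate = "daily",
                            start = 2015,
                            end = 2017)

  # First and last dates actually present in the subset
  print(range(zoo::index(nd_window)))
}
```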