The UK National Electricity Transmission System Dataset

Intro

UKgrid is an R data package containing the historical national demand of the UK electricity transmission system, along with other related variables. The dataset is a half-hourly time series with observations starting in April 2005. It was sourced from the National Grid UK website.

Installation

Install the package from CRAN:

install.packages("UKgrid")

Or install the development version from GitHub:

# install.packages("remotes")
remotes::install_github("RamiKrispin/UKgrid")

Usage

library(UKgrid)

data("UKgrid")

str(UKgrid)
#> 'data.frame':    254592 obs. of  9 variables:
#>  $ TIMESTAMP                : POSIXct, format: "2005-04-01 00:00:00" "2005-04-01 00:30:00" ...
#>  $ ND                       : int  32926 32154 33633 34574 34720 34452 33818 32951 32448 32212 ...
#>  $ I014_ND                  : int  32774 32032 33495 34460 34641 34347 33854 33020 32525 32287 ...
#>  $ TSD                      : int  34286 33885 35082 36028 36179 35921 35362 34607 34347 34103 ...
#>  $ ENGLAND_WALES_DEMAND     : int  29566 28871 30340 31253 31325 31094 30442 29664 29167 28920 ...
#>  $ EMBEDDED_WIND_GENERATION : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ EMBEDDED_WIND_CAPACITY   : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ EMBEDDED_SOLAR_GENERATION: int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ EMBEDDED_SOLAR_CAPACITY  : int  0 0 0 0 0 0 0 0 0 0 ...

A variable dictionary is available in the dataset documentation.
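
As a quick sanity check on the structure shown above, the sketch below inspects the time span, spacing, and missing values of the raw data frame. It uses only base R functions on the UKgrid object; no output is shown because the exact values depend on the installed package version.

```r
library(UKgrid)

data("UKgrid")

# First and last timestamps in the series
range(UKgrid$TIMESTAMP)

# Gaps between consecutive observations, in minutes;
# a gap-free half-hourly series would show a single value of 30
table(as.numeric(diff(UKgrid$TIMESTAMP), units = "mins"))

# Count of missing values per column
colSums(is.na(UKgrid))
```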

The extract_grid function

The extract_grid function extracts the UKgrid series in different formats (tsibble, xts, zoo, ts, data.frame, data.table and tbl) and frequencies (half-hourly, hourly, daily, weekly, monthly and quarterly), and can subset the series by time frame.

For example, you can select the national demand variable (ND) in tsibble format:


nd_half_hourly <- extract_grid(type = "tsibble", # default
                          columns = "ND", # default
                          aggregate = NULL # default
                          )



library(tsibble)

head(nd_half_hourly)
#> # A tsibble: 6 x 2 [30m] <UTC>
#>   TIMESTAMP              ND
#>   <dttm>              <int>
#> 1 2005-04-01 00:00:00 32926
#> 2 2005-04-01 00:30:00 32154
#> 3 2005-04-01 01:00:00 33633
#> 4 2005-04-01 01:30:00 34574
#> 5 2005-04-01 02:00:00 34720
#> 6 2005-04-01 02:30:00 34452

class(nd_half_hourly)
#> [1] "tbl_ts"     "tbl_df"     "tbl"        "data.frame"

index(nd_half_hourly)
#> TIMESTAMP

interval(nd_half_hourly)
#> <interval[1]>
#> [1] 30m

library(TSstudio)

ts_plot(ts.obj = nd_half_hourly,
        title = "UK National Demand - Half-Hourly")

Alternatively, you can aggregate the series to an hourly frequency with the aggregate argument:


nd_hourly <- extract_grid(type = "tsibble", 
                          columns = "ND", 
                          aggregate = "hourly" 
                          )


interval(nd_hourly)
#> <interval[1]>
#> [1] 1h

ts_plot(ts.obj = nd_hourly, 
        title = "UK National Demand - Hourly")

Columns are selected with the columns argument. The full list of columns is available in the dataset documentation (?UKgrid). For instance, let’s select the “ND” and “TSD” columns at a daily frequency:


df <- extract_grid(type = "xts", 
                          columns = c("ND","TSD"), 
                          aggregate = "daily" 
                          )

head(df)
#>                 ND     TSD
#> 2005-04-01 1920069 1965115
#> 2005-04-02 1674699 1717958
#> 2005-04-03 1631352 1675112
#> 2005-04-04 1916693 1955599
#> 2005-04-05 1952082 1994242
#> 2005-04-06 1964584 2009831

ts_plot(ts.obj = df, 
        title = "UK National and Transmission System Demand - Daily")

Note: by default, when any of the data frame family structures is used, the output includes the timestamp of the data, even if it was not selected in the columns argument.
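
The note above can be checked with a short sketch. It requests only the ND column as a plain data.frame and then lists the columns that come back; the variable name nd_df is chosen here for illustration.

```r
library(UKgrid)

# Request only the ND column as a plain data.frame
nd_df <- extract_grid(type = "data.frame",
                      columns = "ND")

# Per the note above, a timestamp column should appear alongside ND
names(nd_df)
str(nd_df)
```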

Last but not least, you can subset the series by time range with the start and end arguments:


df1 <- extract_grid(type = "zoo", 
                          columns = "ND", 
                          aggregate = "daily", 
                          start = 2015,
                          end = 2017)

head(df1)
#> 2015-01-01 2015-01-02 2015-01-03 2015-01-04 2015-01-05 2015-01-06 
#>    1442371    1564118    1653984    1684045    1882487    1866439

ts_plot(ts.obj = df1, 
        title = "UK National Demand - Daily between 2015 and 2017")
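
The other formats and frequencies listed in the extract_grid description can be combined in the same way. A minimal sketch, assuming the "ts" type and "monthly" aggregate values behave like the options demonstrated above; the variable name nd_monthly is illustrative, and no output is shown since it was not run for this README.

```r
library(UKgrid)

# Monthly national demand as a base R ts object
nd_monthly <- extract_grid(type = "ts",
                           columns = "ND",
                           aggregate = "monthly")

# Inspect the class, the number of observations per year, and the first values
class(nd_monthly)
frequency(nd_monthly)
head(nd_monthly)
```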