Tuesday, 14 November 2017

Energy data analysis with Pandas

Leo Smith of Gridwatch has done a cracking job of making electricity data available on an online resource. The data available through his website includes total demand, total generation from different generation types, interconnector flows and frequency with a temporal resolution of approximately 5 minutes.

This post is about using a Python package called pandas to perform data analysis on the data downloaded from the gridwatch website. Pandas is a powerful data analysis library for Python.

A small Python script follows that reads gridwatch data into a Pandas data frame, performs data cleaning for erroneous data and plot.

import pandas as pd
import matplotlib.pyplot as plt

The first thing we want to do is to import pandas and matplotlib packages in our Python script. This is achieved by lines 1 and 2.

filename = 'gridwatch.csv'
data = pd.read_csv(filename)

'gridwatch.csv' is a csv file that I have downloaded from the gridwatch website. To keep things simple, I have only downloaded the demand data in this file. However, the same script would work for any other data or a combination of data types. The resulting file is a big csv file (~3.8MB, 104860 lines).

Line 4 reads our csv file as a pandas data frame.

data =  data.rename(columns=lambda x:x.strip())
for key in data:
    data[key] = data[key].map(lambda x: str(x).strip())

The downloaded csv file (and hence the pandas dataframe) contains leading white space. We need to fix this by stripping the white space, as this would cause problems later. This is achieved by lines 5-7.

data['timestamp'] = pd.to_datetime(data['timestamp'],
                                   format="%Y-%m-%d %H:%M:%S")
data['demand'] = pd.to_numeric(data['demand'])
data.set_index('timestamp', inplace=True)
ax = data['demand'].plot()

Line 8 converts the datatype of  our dataframe column 'timestamp' to take a datetime data type. This is useful as I would like to plot the demand against time and for that need pandas to recognise the datetime data type. Line 10 converts the demand to numeric values. Line 11 sets 'timestamp' to index of the data frame. Line 12 and 13 plot the demand against time, and we have the following plot.

The above figure shows some spikes and dips. These spikes and dips corresponds to the erroneous data in the downloaded csv file. The total demand of Great British electricity system can not be higher than 60GW and less that 15GW. Based on this knowledge, we need to clean our data. This can be achieved by using following command.

data = data[(data.demand>15000) & (data.demand<60000)]

See how easy is plotting and making sense of data using Pandas and Python. 

The full code of reading gridwatch data is available on this link.

Friday, 10 November 2017

Correlation between different generation types

There are four electricity interconnectors linking Great Britain (GB) to Ireland and continental Europe with a transmission capacity of 4 GW. These links represents approximately 5% of the GB's electricity generation capacity.

The four interconnectors link GB to other electricity markets and provide a transmission capacity for the electricity to flow from high price region to a low price region.

Recently I pulled electricity generation data for the year 2016 from the Elexon portal. I wanted to see the correlation between different generation types, especially how each generation influences the interconnector flows.

Fig. 1: Correlation matrix of electricity generation in GB for the year 2016.

Figure 1 shows the correlation matrix of different generation types for the year 2016. The highest positive correlation exists between generation from the onshore and offshore wind farms-no surprise there. There is a strong positive correlation between generation from the two different hydro generation types. 

In Figure 1, 'INTIRL' represents the 500MW interconnector between South Ayrshire (Scotland) to Ballycronan (Northern Ireland), where as 'INTEW' represents a 500MW interconnector between County Dubline (Republic of Ireland) and Prestatyn (Wales). A positive correlation exists between the interconnector flows to Ireland.This is expected as the Republic of Ireland and Northern Ireland operates as a single electricity market.