Why data lakes are key in the renewables industry

1st March 2022

The shift to an energy system powered by renewables is going to demand a technological evolution that places data at its core, requiring a complete overhaul of IT architecture. Paul Grimshaw, chief technology officer at Sennen, explains why market leaders are moving to cloud native technology and data lakes. 

In every sector, business leaders are realising that embracing cloud native technology is critical to their ongoing success. Renewables is a data-heavy industry on a dramatic growth curve. Real digitisation means collecting vast amounts of information from sensors, day and night.

Harvesting and processing this data in the cloud can cut costs and greatly increase the value derived from it, enhancing asset value and helping operators succeed in a highly competitive market. The usual way to store it is in a ‘data lake’: a repository that can hold huge amounts of raw data for indefinite periods of time.

What is a data lake?

A data lake is a cloud-based storage system, owned by you, which allows you to collect all data in a semi-structured form into one very large, low-cost location. When data is required for a particular application, it is extracted and transferred into a high-performance database. It’s not particularly new technology, but companies across renewables are increasingly starting to see its advantages.
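The two-step pattern described above (land everything cheaply, then extract what one application needs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a local temporary folder stands in for cloud object storage, an in-memory SQLite database stands in for the high-performance database, and the field names (turbine_id, power_kw and so on) and the date-partitioned folder layout are hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path
from typing import Tuple


def land_raw(lake_root: Path, source: str, record: dict) -> Path:
    """Append one raw record, unmodified, to a date-partitioned JSON Lines file."""
    day = record["timestamp"][:10]  # e.g. "2022-03-01"
    partition = lake_root / source / f"date={day}"
    partition.mkdir(parents=True, exist_ok=True)
    path = partition / "part-0000.jsonl"
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return path


def extract_power(lake_root: Path, source: str, db: sqlite3.Connection) -> int:
    """Copy only the fields one application needs into a queryable table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS power (turbine_id TEXT, ts TEXT, power_kw REAL)"
    )
    rows = []
    for path in sorted((lake_root / source).glob("date=*/*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                # Records are semi-structured: not every one carries a power reading.
                if "power_kw" in rec:
                    rows.append((rec["turbine_id"], rec["timestamp"], rec["power_kw"]))
    db.executemany("INSERT INTO power VALUES (?, ?, ?)", rows)
    db.commit()
    return len(rows)


def demo() -> Tuple[int, float]:
    """Land three hypothetical SCADA records, then extract and query the power data."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)  # stand-in for a cloud storage bucket (assumption)
        land_raw(lake, "scada", {"timestamp": "2022-03-01T00:00:00Z",
                                 "turbine_id": "T01", "power_kw": 1500.0,
                                 "wind_speed_ms": 9.1})
        land_raw(lake, "scada", {"timestamp": "2022-03-01T00:10:00Z",
                                 "turbine_id": "T01", "power_kw": 1700.0})
        land_raw(lake, "scada", {"timestamp": "2022-03-02T00:00:00Z",
                                 "turbine_id": "T02", "status": "curtailed"})
        db = sqlite3.connect(":memory:")  # stand-in for the application database
        loaded = extract_power(lake, "scada", db)
        avg = db.execute(
            "SELECT AVG(power_kw) FROM power WHERE turbine_id = 'T01'"
        ).fetchone()[0]
        db.close()
        return loaded, avg
```

Note that the raw files keep every field, including ones no application uses yet, while the extracted table holds only what one application asked for. That split is what lets the lake stay cheap and complete while the database stays small and fast.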

Many SaaS (Software as a Service) companies serve the sector by collecting SCADA data, hosting it in their own software and providing analytics tools. The drawback of relying on a third party is that you get stuck with a limited set of features and have no database of your own. The insights provided are a black box: it is impossible to know whether the analytics are good or bad. This model also becomes too costly for large operating portfolios (500 MW+).

Large companies working at scale don’t want to be limited by the old model. They want all their IoT data flowing into a data lake and to deploy software services natively.

The key players

Big cloud providers like Amazon Web Services, Azure and Google are moving aggressively into the market to be the go-to providers. They are creating very powerful IoT solutions and promoting the use of remote ‘edge’ devices, which can be located on site to collect and stream back data. 

An even simpler solution is also on the horizon: newer turbine and inverter models increasingly have standardised interfaces that can connect directly to AWS and Azure.

The advantages of a data lake

Firstly, cloud storage is cost-competitive. A data lake is a cheap option at scale and is also great at collecting everything. It will hoover up all available data, rather than small subsets, so that it can be amassed to enable more accurate reporting and looked back on over time to answer questions. It’s also more secure. With a data lake, companies have more control because their own IT team manages the structure. This way, software providers work within the data lake rather than being the sole handler of the data.

What are the challenges?

There is a learning curve, and mistakes are possible along the way. Often, companies are confused about what a data lake actually is. It’s different from a database, which holds a smaller quantity of one type of data, such as the real-time signals from a wind turbine.

The other challenge is getting the architecture right so that everything flows in correctly and triggers accurate onward processes. It requires a good cloud architect who understands best practice.

How does it work?

Take AWS as an example. AWS offers a cloud environment, and you, as the client, are a tenant within it. You open an account directly with AWS and are then able to pull in data, deploy servers, run services and jobs, and view dashboards. This suits software companies that are able to work this way: they build an application layer that runs inside your AWS environment and offers a variety of functionality. The data stays within a defined perimeter and never leaves AWS.

Sennen operates this way already – what is known as native hosting. We can either build this model for clients or work with the system they already have.

Once in the data lake, we can extract the key pieces of data we need. For example, our offshore product is all about integrating lots of different data sources, including business applications for managing shifts and personnel, weather forecasts, real-time readings from the site and vessel positions. We pull all the data together to build specific workflows and systems for managing processes.
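As a rough illustration of what pulling several sources together can mean in practice, the sketch below lines up records from different feeds against a single point in time. It is not Sennen’s implementation. The source names, field names and sample values are invented for the example, and it assumes every timestamp uses the same ISO 8601 UTC format so that text comparison matches time order.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]

# Hypothetical sample feeds; all names and values are made up for illustration.
SOURCES: Dict[str, List[Record]] = {
    "weather": [
        {"timestamp": "2022-03-01T06:00:00Z", "wave_height_m": 1.8},
        {"timestamp": "2022-03-01T08:00:00Z", "wave_height_m": 1.2},
        {"timestamp": "2022-03-01T10:00:00Z", "wave_height_m": 2.1},
    ],
    "vessels": [
        {"timestamp": "2022-03-01T08:25:00Z", "vessel": "CTV-1", "lat": 53.9, "lon": 1.7},
    ],
    "shifts": [
        {"timestamp": "2022-03-01T09:00:00Z", "crew": "B"},
    ],
}


def latest_at(records: List[Record], ts: str) -> Optional[Record]:
    """Return the most recent record at or before ts, or None if there is none.

    Expects records sorted by timestamp, with timestamps in one ISO 8601 format.
    """
    keys = [r["timestamp"] for r in records]
    i = bisect_right(keys, ts)
    return records[i - 1] if i else None


def snapshot(sources: Dict[str, List[Record]], ts: str) -> Dict[str, Optional[Record]]:
    """Build one combined view: the latest known record from each source at ts."""
    return {
        name: latest_at(sorted(recs, key=lambda r: r["timestamp"]), ts)
        for name, recs in sources.items()
    }


view = snapshot(SOURCES, "2022-03-01T08:30:00Z")
```

Here view holds the 08:00 weather record, the 08:25 vessel position and no shift record, because the only shift in the sample data starts later. A workflow built on top would read from a combined view like this rather than querying each feed separately.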

The bottom line

Data lakes are a critical part of the digital transformation of the energy system. Owners and operators need high-value applications that are compatible with the data lake model and are also user-friendly, automated and problem-solving. Better data is the stepping stone to delivering efficiency and value as the renewables sector continues to grow at pace.

To learn more about how Sennen can help you effectively manage your clean energy assets, contact us today.
