Why data lakes are key in the renewables industry

1st March 2022

The shift to an energy system powered by renewables is going to demand a technological evolution that places data at its core, requiring a complete overhaul of IT architecture. Paul Grimshaw, chief technology officer at Sennen, explains why market leaders are moving to cloud native technology and data lakes.

In every sector, business leaders are realising that embracing cloud native technology is critical to their ongoing success. Renewables is a data-heavy industry on a dramatic growth curve. Real digitisation means collecting vast amounts of information from sensors, day and night.

Harvesting and processing this information in the cloud offers the opportunity to cut costs and to extract far more value from the data, which in turn enhances asset value and helps operators succeed in a highly competitive market. Storing data this way is known as a ‘data lake’ – a repository that can hold huge amounts of raw data for indefinite periods of time.

What is a data lake?

A data lake is a cloud-based storage system, owned by you, which allows you to collect all data in a semi-structured form into one very large, low-cost location. When data is required for a particular application, it is extracted and transferred into a high-performance database. It’s not particularly new technology, but as a general trend in renewables, companies are starting to see its advantages.
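As a rough illustration of that pattern, the sketch below lands raw, semi-structured records in a cheap store and then extracts only what one application needs into a database. It is a minimal sketch under stated assumptions: a local temporary folder stands in for cloud object storage, SQLite stands in for the high-performance database, and the names land_raw, extract_power, turbine_id and power_kw are hypothetical rather than taken from any particular product.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def land_raw(lake_root: Path, record: dict) -> Path:
    """Append one raw record, unchanged, to a date-partitioned JSON Lines file."""
    day = record["timestamp"][:10]  # e.g. "2022-03-01"
    partition = lake_root / record["source"] / f"date={day}"
    partition.mkdir(parents=True, exist_ok=True)
    path = partition / "part-0000.jsonl"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path


def extract_power(lake_root: Path, db: sqlite3.Connection) -> int:
    """Copy only the power readings from the lake into a query-ready table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS power (turbine_id TEXT, ts TEXT, power_kw REAL)"
    )
    rows = []
    for path in sorted(lake_root.glob("scada/date=*/*.jsonl")):
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                rec = json.loads(line)
                # Semi-structured data: not every record carries every field.
                if "power_kw" in rec:
                    rows.append((rec["turbine_id"], rec["timestamp"], rec["power_kw"]))
    db.executemany("INSERT INTO power VALUES (?, ?, ?)", rows)
    db.commit()
    return len(rows)


def demo():
    """Land three raw records, then load the power readings and average them."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)  # hypothetical stand-in for a cloud storage bucket
        land_raw(lake, {"source": "scada", "turbine_id": "T01",
                        "timestamp": "2022-03-01T00:00:00Z", "power_kw": 1500.0})
        land_raw(lake, {"source": "scada", "turbine_id": "T01",
                        "timestamp": "2022-03-01T00:10:00Z", "power_kw": 1700.0})
        land_raw(lake, {"source": "scada", "turbine_id": "T01",
                        "timestamp": "2022-03-02T00:00:00Z", "status": "curtailed"})
        db = sqlite3.connect(":memory:")
        loaded = extract_power(lake, db)
        (average_kw,) = db.execute("SELECT AVG(power_kw) FROM power").fetchone()
        db.close()
        return loaded, average_kw
```

The raw files are never modified by the extraction step, so a different application can later read the same records, including the status record that this example ignores.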

Many SaaS (Software as a Service) companies serve the sector by collecting SCADA data, hosting it in their own software and providing analytics tools. The drawback with a third party is that you get stuck with a limited set of features and no database of your own. The insights provided are a black box: it is impossible to know whether the analytics are good or bad. This approach also becomes too costly on large operating portfolios (500 MW+).

Large companies working at scale don’t want to be limited by the old model. They want all their IoT data flowing into a data lake and to deploy software services natively.

The key players

Big cloud providers like Amazon Web Services, Azure and Google are moving aggressively into the market to be the go-to providers. They are creating very powerful IoT solutions and promoting the use of remote ‘edge’ devices, which can be located on site to collect and stream back data.
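To make the idea of an edge device more concrete, here is a minimal sketch of the collect-and-stream-back step: readings are buffered on site and forwarded in batches. The EdgeBuffer class, the batch size and the field names are assumptions made for illustration, and the send callable is a placeholder for whatever upload mechanism a given cloud provider's IoT service actually offers.

```python
import json
from typing import Callable, List


class EdgeBuffer:
    """Collects readings on site and forwards them to the cloud in batches."""

    def __init__(self, send: Callable[[str], None], batch_size: int = 3) -> None:
        self._send = send              # hypothetical uploader, injected by the caller
        self._batch_size = batch_size
        self._pending: List[dict] = []

    def add(self, reading: dict) -> None:
        """Store one reading and flush once a full batch has accumulated."""
        self._pending.append(reading)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send everything that is pending as one JSON payload."""
        if not self._pending:
            return
        payload = json.dumps(self._pending)
        self._send(payload)
        # Cleared only after send returns, so a failed upload keeps the readings.
        self._pending = []


def demo() -> List[int]:
    """Feed seven readings through the buffer and report each batch size."""
    sent: List[str] = []
    buffer = EdgeBuffer(send=sent.append, batch_size=3)
    for minute in range(7):
        buffer.add({"inverter_id": "INV-1", "minute": minute, "ac_kw": 40.0 + minute})
    buffer.flush()  # push the final partial batch
    return [len(json.loads(payload)) for payload in sent]
```

Batching is one possible design choice here: it reduces the number of uploads over a site's network link, at the cost of a short delay before data reaches the lake.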

An even simpler solution is also on the horizon: newer turbine and inverter models increasingly have standardised interfaces that can connect directly to AWS and Azure.

The advantages of a data lake

Firstly, cloud storage is competitively priced. A data lake is a cheap option at scale and is also great at collecting everything. It will hoover up all available data, rather than small subsets of it, so that the data can be amassed to enable more accurate reporting and looked back on over time to answer questions. It’s also more secure. With a data lake, companies have more control as their IT team manages the structure. This way, software providers work within the data lake rather than being the sole handler.

What are the challenges?

There is a learning curve, during which time there’s the possibility of making mistakes. Often, companies are confused about what a data lake actually is. It’s different to a database, which is a smaller quantity of one type of data such as the real-time signals from a wind turbine.

The other challenge is getting the architecture right so that everything flows in correctly, triggering accurate onward processes. It requires a good cloud architect who understands best practice.

How does it work?

AWS offers a cloud environment. You, as the client, are a tenant within it. You open an account directly with AWS and are then able to pull in data, deploy servers, run services and jobs, and view dashboards. This suits software companies that are able to work in this way and then build an application layer that runs in AWS, offering a variety of functionality. The data then stays within a perimeter and never leaves AWS.

Sennen operates this way already – what is known as native hosting. We can either build this model for clients or work with the system they already have.

Once in the data lake, we can extract the key pieces of data we need. For example, our offshore product is all about integrating lots of different data sources, including business applications for managing shifts and personnel, weather forecasts, real-time readings from the site and vessel positions. We pull all the data together to build specific workflows and systems for managing processes.
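The kind of integration described above can be sketched very roughly as follows. This is an illustrative assumption, not a description of Sennen's product: three hypothetical extracts (a weather forecast, vessel positions and a shift roster) are keyed by hour and merged into one combined record per hour, which a workflow could then act on. All names and values are invented for the example.

```python
from typing import Dict, List


def build_hourly_view(
    forecast: Dict[str, float],
    vessel_positions: Dict[str, str],
    roster: Dict[str, List[str]],
) -> List[dict]:
    """Merge three hypothetical sources into one record per hour."""
    # Use every hour that appears in any source, in chronological order.
    hours = sorted(set(forecast) | set(vessel_positions) | set(roster))
    return [
        {
            "hour": hour,
            "wave_height_m": forecast.get(hour),      # None if no forecast exists
            "vessel_at": vessel_positions.get(hour),  # None if position is unknown
            "technicians": roster.get(hour, []),      # empty if nobody is on shift
        }
        for hour in hours
    ]


def demo() -> List[dict]:
    """Combine small made-up extracts covering two hours."""
    forecast = {"2022-03-01T06": 1.2, "2022-03-01T07": 1.8}
    vessel_positions = {"2022-03-01T06": "port", "2022-03-01T07": "site"}
    roster = {"2022-03-01T07": ["tech-a", "tech-b"]}
    return build_hourly_view(forecast, vessel_positions, roster)
```

Missing values are kept visible as None or an empty list rather than dropped, so a downstream workflow can tell "no data" apart from "nothing scheduled".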

The bottom line

Data lakes are a critical part of the digital transformation of the energy system. Owners and operators need high-value applications that are compatible with the data lake model and are also user-friendly, automated and problem-solving. Better data is the stepping stone to delivering efficiency and value as the renewables sector continues to grow at pace.

To learn more about how Sennen can help you effectively manage your clean energy assets, contact us today.
