Short term energy consumption forecasting using neural basis expansion analysis for interpretable time series

Smart grids and smart homes are attracting attention in the modern era of smart cities. Advances in smart technologies and smart grids have created challenges in matching energy efficiency and production to the future demand of clients. Machine learning, specifically neural network-based methods, has been successful in energy consumption prediction, but gaps remain because of uncertainty in the data and limitations of the algorithms. Research published in the literature has used small datasets and profiles of primarily single users; as a result, models have difficulties when applied to large datasets with profiles of different customers. A smart grid environment therefore requires a model that handles consumption data from thousands of customers. The proposed model enhances the recently introduced method of Neural Basis Expansion Analysis for interpretable Time Series (N-BEATS) using a large dataset of the energy consumption of 169 customers. Further, to validate the results of the proposed model, a performance comparison has been carried out with the Long Short Term Memory (LSTM), Blocked LSTM, Gated Recurrent Units (GRU), Blocked GRU and Temporal Convolutional Network (TCN). The proposed interpretable model improves the prediction accuracy on the large dataset containing energy consumption profiles of multiple customers. Incorporating covariates into the model improved accuracy by learning past and future energy consumption patterns. On this large dataset, the proposed model performed better for daily, weekly, and monthly energy consumption predictions. The forecasting accuracy of the N-BEATS interpretable model for 1-day-ahead energy consumption with “day as covariates” remained better than in the 1, 2, 3, and 4-week scenarios.

Introduction

The concept of smart technologies is gaining popularity in vibrant communities. Smart grids and smart homes are among the facilities provided by modern smart cities. Smart grids serve as energy production units that provide uninterrupted energy to smart homes 1 . The energy demand of smart homes creates the need for a consumption prediction mechanism in smart grids so that production units can produce the amount of energy that residents require. This approach saves production resources and reduces energy wastage 2 . Researchers are now focusing on making smart grids intelligent enough to predict the energy consumption of the connected houses and to produce energy with less human involvement in the production process.

The current business models of the grids focus on energy production without considering future demand or the customers who will be connected to the grids as new buildings are rapidly constructed 3,4 . The advancements in smart homes have increased the burden on smart grids; hence energy consumption has also increased 5,6 . Current smart city facilities emphasize automation and security; companies are now focused on making smart homes, smart grids, and smart cities more energy-efficient. Research in smart homes focuses on designing energy-efficient appliances and optimizing the energy use of devices according to external weather conditions 7,8 . Many aspects of smart homes require automation, including lighting, security, heating, and air conditioning 9 . Besides smart grids, it is also important to improve building energy efficiency. In fact, 70% of businesses focus on making buildings more energy-efficient to reduce energy consumption 10,11 . Various protocols are used in home and grid networks, including Zigbee, KNX, and Z-Wave 12,13 . Devices can also be connected via Wi-Fi or wired networks such as Ethernet, and Bluetooth can be used in smart homes for short-range communication 11 . Security and energy efficiency are two critical concerns regarding the adaptability of smart homes and smart grids 14,15 , and both are areas of active research. This paper aims to improve the energy efficiency of the smart grid production unit in accordance with future consumer demands while taking into account energy production costs and wastage.

In the traditional operating environment of smart grids, companies predict the demand for energy for the next day, month, and year. The energy production process can be improved in smart grids, and energy savings can be achieved through prediction methodologies. Prediction allows customers and production companies to anticipate how much energy will be consumed in the future 16,17,18 . Researchers have divided load forecasts into three categories, depending on the forecasting horizon. Forecasting horizons of up to one week are defined as short-term load forecast (STLF). A medium-term load forecast (MTLF) predicts load profiles from 1 week to 1 year in advance, and a long-term load forecast (LTLF) predicts load profiles more than one year into the future 19 . The proposed study focuses on STLF using a dataset acquired from the residential sector. Residential buildings consume more energy than commercial buildings; therefore, to meet increasing energy demand in the future, a system that maximizes energy savings is needed 20 . Further savings can be achieved by monitoring energy consumption in smart grids in real time 21 . Over the last few decades, researchers have used machine learning algorithms and statistical methods to predict energy consumption 22 . Until very recently, prediction models have faced significant challenges due to the nature of the data and noise. Deep learning models perform better when pre-processing is performed. However, most methods struggle to predict energy consumption accurately because of noise and irregular behavior in the energy consumption data 23 . The N-BEATS method has recently been introduced for time series data and has performed well on benchmark datasets. The proposed model uses N-BEATS, which is suitable for learning the complex patterns of energy consumption data. Its interpretability makes the model distinctive 24 . Because time series data are complex and months in the calendar differ in length, traditional methods have more difficulty handling such data than N-BEATS.

The focus of the proposed model is to make smart grids intelligent enough to behave according to the future energy consumption requirements of multiple customers 25 . Energy production control and monitoring are the most critical topics related to energy efficiency in smart grids. The model also focuses on the optimal use of smart grid resources. The other main focus of the model is to learn the energy consumption of current customers and use that information to predict the potential energy consumption of new customers.

Several researchers have developed energy-efficient models to deal with energy consumption optimization and prediction in smart grids. Because these models were built with little data from multiple customers, they cannot predict energy consumption accurately when applied to multi-customer data 26 . Previous models considered only future energy consumption without considering customer behavior. Limited adaptability and ease of use are the main disadvantages of traditional methods. This paper aims to improve the prediction accuracy of deep learning algorithms by pre-processing the data and including covariates to learn the patterns of past and future energy consumption. The proposed model enhances and fine-tunes the recently introduced method of Neural Basis Expansion Analysis for Interpretable Time Series (N-BEATS) with an extensive dataset of the energy consumption of 169 customers. Further, to validate the results of the proposed model, a performance comparison has been carried out with the Long Short Term Memory (LSTM), Blocked LSTM, Gated Recurrent Units (GRU), Blocked GRU and Temporal Convolutional Network (TCN) models available in the Darts Python library. The model performed better on the daily, weekly, and monthly energy consumption predictions for the 43 customers in the test data, demonstrating its efficiency on a large dataset. The contributions of this paper are summarized as follows:

  (a) An N-BEATS-based model that considers the behavior of customers (customer-based) is developed for forecasting days, weeks, and months in advance for demand-side management.
  (b) Unlike traditional methods, the model uses data covering the consumption behavior of multiple customers, which makes it distinctive and reliable for the smart grid.
  (c) The N-BEATS model performs a time-series analysis of the input and maintains the time-series behavior as part of the training process.
  (d) A high-dimensional data processing model is developed to simulate the behavior pattern of load consumption over a specific period, which eliminates the over-fitting caused by changes in the data pattern over time.
  (e) A variety of state-of-the-art deep learning algorithms, including LSTMs, Blocked LSTMs, GRUs, Blocked GRUs and TCNs, are used to evaluate the proposed N-BEATS model.

The rest of the paper is organized as follows. The “Related work” section presents the literature review. The “Research design and methods” section discusses the research design and methods. The “Results and discussion” section presents the experimental results and a detailed discussion. Finally, the “Conclusion” section presents the study’s conclusion.

Related work

Energy consumption forecasting remains a hot topic for researchers; hence many studies have been published in the literature. The focus areas of studies vary from pricing schemes to energy prediction techniques in different domains. Researchers have evaluated how response time and non-linearity impact system identification accuracy in energy forecasting models for buildings. Another technique classifies buildings into high-power and low-power consumption buildings using the multi-layer perceptron and random forest 27 . It helps to identify the buildings that consume too much energy and to provide them with energy for their needs. In addition to optimizing energy consumption, the classification methods notify customers to change their energy consumption behavior 28 . Initially, the classification methods do not help reduce consumption but only notify the authorities. Optimization frameworks are also helpful for proper energy distribution. Hui et al. 29 proposed a real-time local electricity market (LEM) framework to maximize the regulation potential of inverter-based HVACs with multiple DERs, and developed a distribution network optimization framework. Users can use it to evaluate transactive capacity in LEMs to determine regulatory capacities. The LEM also avoids real-time iterations, easing participation difficulties for smaller users. Combinations of prediction and optimization algorithms have been used in the smart grid environment for various purposes, including energy management 30,31,32,33 . These methods focus on integrating demand, storage and energy production. The adaptive elements and forecasting techniques manage grid resources optimally. Ullah et al. 34 proposed a hybrid deep learning model to detect electricity thieves in smart grids. Under-sampling with the near-miss technique solves the class imbalance problem.
With AlexNet, the curse of dimensionality has been handled, while adaptive boosting (AdaBoost) classified normal consumers and energy thieves. The tuning of hyper-parameters remains critical to achieving better prediction accuracy; hence a bee colony optimization algorithm has been used to tune AdaBoost and AlexNet 35 . Compared with its counterparts, the proposed hybrid model achieves the highest classification accuracy. Han et al. 36 proposed a novel approach to modeling smart buildings that assesses energy consumption based on the concept of physical-data fusion modeling (PFM). Ye et al. 37 proposed a theoretical benchmark for optimizing the coordination of local electricity markets (LEM) using a system-centric model. The approach serves as a model-free coordination method for consumer-centric LEM. The authors used a multi-agent deep reinforcement learning method that integrates the multi-actor attention-critic and prioritized experience replay approaches. The proposed LEM design successfully compresses flexibility services (FS) provision functions and local energy trading functions, remaining more effective than previous methods. The most prominent studies focused on pricing schemes in the smart grid environment. Aurangzeb et al. 38 developed a fair pricing strategy (FPS) based on power demand predictions using an extreme learning machine (ELM) to save up to 11% of the cost of electricity. Mansouri et al. 39 proposed a novel approach for microgrid scheduling and distribution feeder reconfiguration (DFR) considering load demand, power production and market price. The simulation findings reveal that when the distribution system operator (DSO) can alter the system, the divergence from ideal microgrid scheduling is significantly lower than in cases where the system design is fixed. Wu et al. 40 presented an interpretable prediction model that combines multiple factors with optimization algorithms.
The model applies variational mode decomposition to a wind speed sequence, and several parameters of the temporal fusion transformer (TFT) are optimized using adaptive differential evolution. Liu and Wu 41 used an adjacent nonhomogeneous gray model to predict the consumption of renewable energy in Europe by weighting the latest values relative to the historical data, based on the principle of adjacent accumulation. Another prominent forecasting model, by Wu et al. 42 , uses social media information to forecast the US oil market.

Forecasting has been carried out in two different areas: smart homes and smart grids. Energy prediction in the smart grid environment is critical because the grid is responsible for the power supply and for communication with the production units. It is therefore necessary to understand and critically evaluate the models of both smart homes and grids. Forecasting has been divided into three categories based on the forecasting horizon: STLF, MTLF and LTLF 19 . The studies covering the three forecast horizons have been critically evaluated to identify limitations and research gaps.
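The three horizon categories can be written as a small lookup. This is a minimal sketch rather than a definition taken from Ref. 19: the function name `classify_horizon`, the inclusive boundaries, and the use of 365 days for one year are all assumptions made for illustration.

```python
from datetime import timedelta


def classify_horizon(horizon: timedelta) -> str:
    """Map a forecasting horizon to STLF, MTLF, or LTLF.

    Assumed boundaries (not fixed by the text): up to and including
    one week is STLF, up to and including 365 days is MTLF, and
    anything longer is LTLF.
    """
    if horizon <= timedelta(0):
        raise ValueError("horizon must be positive")
    if horizon <= timedelta(weeks=1):
        return "STLF"
    if horizon <= timedelta(days=365):
        return "MTLF"
    return "LTLF"


# Example horizons: one day, four weeks, two years.
labels = [
    classify_horizon(timedelta(days=1)),
    classify_horizon(timedelta(weeks=4)),
    classify_horizon(timedelta(days=730)),
]
print(labels)
```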

Short-term load forecast (STLF)

Due to the higher production cost of electrical energy, production companies, scientists, and researchers are trying to optimize energy usage and production to avoid wastage and excess energy production. Models that forecast energy consumption up to one week ahead are categorized as STLF. Most studies have examined hourly, daily, and weekly energy consumption predictions. Half-hourly energy consumption prediction has been very rare 43,44 . Considering the complexity and cost of the calculation, most of the research concentrates on hourly and daily predictions. Various hybrid approaches have shown better accuracy; for example, Zeng et al. 45 combined an extreme learning machine with switching delayed particle swarm optimization (SDPSO) for short-term load forecasting, with horizons mainly ranging from 1 h to 1 week. With the enhanced capabilities of the SDPSO, a global search can be performed to reach the optimal solution. The SDPSO has been used to optimize the hidden node parameters of the extreme learning machine. Although hybrid models improve accuracy, they also increase the complexity of the system 46 . Hence, the model has higher complexity and more calculation time than a single algorithm, which makes it unsuitable for the smart grid environment 47 . A comprehensive study of short-term energy prediction methods has been published 47 , covering the methodological perspectives of the different models. The adaptive method of short-term load forecasting using self-organizing maps and SVM by Fan et al. 48 also contributed to the field of energy efficiency.

Ramos et al. 49 focused on predicting the energy consumption of a building using sensor data and recorded device consumption. They analyzed two prediction methods: k-Nearest Neighbor and artificial neural network (ANN). Within a reinforcement learning framework, a multi-armed bandit algorithm is used in the decision-making process to select the most suitable algorithm in each five-minute interval, thus enhancing prediction accuracy. Several exploration alternatives have been tested, including upper confidence bound and greedy algorithms 49 .

Torres et al. 50 used a long short-term memory (LSTM) network to forecast short-term energy consumption because of its capability of dealing with sequences of time series data. The best values for various hyper-parameters were obtained with the coronavirus optimization algorithm (CVOA) 51 , which models how the SARS-CoV-2 virus spreads. With the optimized LSTM, the electricity demand has been predicted with a 4-h forecast horizon. For comparison, recent deep neural networks, including temporal fusion transformers and deep feed-forward neural networks, have been optimized with grid search techniques.

Karijadi and Chou 52 proposed a hybrid approach using long short-term memory (LSTM) and random forests (RF) to estimate building energy consumption. They decomposed the energy consumption data into multiple components and predicted the highest-frequency component using RF and the remaining components using LSTM. Jogunola et al. 53 developed a hybrid deep learning architecture to predict commercial and residential building energy usage accurately. The architecture combines convolutional neural networks (CNNs) and auto-encoders (AEs) with bidirectional long short-term memory (BLSTM) 54 . The AE-BLSTM and LSTM layers make predictions, while the CNN layer extracts features from the dataset. The approach improved calculation time and mean squared error compared to a vanilla LSTM and a CNN-BLSTM-based framework (EECP-CBL). Fu et al. 55 found that model performance often improves with increased computation time when deep reinforcement learning (DRL) is used for energy usage estimation. The deep-forest-based DQN (DF-DQN) proved more accurate than the deep deterministic policy gradient (DDPG).

Bilgili et al. 56 used a long short-term memory (LSTM) neural network, an adaptive neuro-fuzzy inference system (ANFIS) with subtractive clustering, ANFIS with fuzzy c-means, and ANFIS with grid partition for short-term, one-day-ahead energy consumption prediction. The LSTM model outperformed all of the ANFIS models. Peng et al. 57 used wavelet transform and LSTM to predict energy consumption accurately. Somu et al. 58 used LSTM and kCNN for energy consumption forecasting because of the spatiotemporal dependencies in the energy consumption data.

Medium-term load forecast (MTLF)

MTLF models cover forecasting horizons from one week to one year. Because large datasets are difficult to find, previous studies have mainly looked at weekly rather than one-year forecasting 59,60 . Deep learning methods require larger datasets for proper training, and only some researchers have succeeded in improving accuracy with small datasets. The 1-week to 1-month MTLF method by Fayaz and Kim 61 used a deep extreme learning machine model to predict energy consumption in smart homes and compared it with the adaptive neuro-fuzzy inference system (ANFIS) and an artificial neural network (ANN). The deep extreme learning machine outperformed the other two algorithms, with activation functions and hidden layers selected by trial and error. The disadvantage of the trial and error method is the extra computation needed to find the optimal solution 62 . The problem with small time series datasets is that the limited amount of data reduces the model’s performance 63 . Time series prediction requires sufficient data so that deep learning algorithms can learn the data patterns. Other prominent deep learning-based MTLF techniques are 64,65,66 , although they fall under STLF as well because most of the authors have considered both STLF and MTLF in their studies 67,68 . Quantile regression and statistical methods have also performed well for MTLF 69,70,71 . Wahid et al. 72 used the multi-layer perceptron, logistic regression and random forest techniques to predict daily energy consumption; among the three classifiers, logistic regression performed best 72 . However, the study is limited by its small dataset. Because the statistical methods are simple, they perform poorly when data from multiple customers is incorporated 73 .
Deep learning methods have also been applied to distribution feeders for load forecasting 74 . Jogunola et al. 75 assessed energy usage in commercial buildings in a post-COVID-19 environment, using actual power consumption data, while investigating the influence of digitization to uncover potential new opportunities. The primary goal was to determine how energy demand varies with occupancy rate. The findings show that the reduction in energy demand does not match the reduction in occupancy, resulting in high energy costs. Because inefficient energy use increases consumption, energy efficiency techniques such as time of use and scheduled energy use can help conserve energy.

Long-term load forecast (LTLF)

Long-term load forecasting techniques have been presented using machine learning and statistical methods. Particle swarm optimization performed well for the LTLF model of the Kuwait energy demand network 76 . The problem with LTLF is that large datasets are required to train the models, although statistical methods can perform better than deep learning models on small datasets 77,78,79 . As an alternative to large datasets, monthly energy prediction for 1 year has been considered instead of yearly energy consumption datasets 80,81 . Backpropagation-based methods have performed well for LTLF, as an adjustment factor enhanced the performance of the traditional backpropagation algorithm (BPA) 82 . LTLF has been carried out using different optimization algorithms for the electricity load of the Sivas province of Turkey 83 . The model helps to meet the energy demand of the province. The literature review has revealed that only a few authors have considered LTLF, owing to the unavailability and complexity of the data. The studies mainly considered MTLF over several months and extended it up to one year, so LTLF and MTLF remain interlinked. The other significant problem with LTLF models is that they consider data from very few customers; hence these models need to be tested on larger datasets before they can be implemented in the smart grid environment. Because the data cover short durations, many authors have not treated the data as time series, which is very important when predicting the energy consumption load.

A detailed review of these methods has revealed that deep learning models are successful in predicting short- and long-term energy consumption, while statistical models perform better on smaller datasets. The main difficulty for these models is handling the complex sequences of time series data, which creates the need for a robust model that can predict both energy consumption and the consumption behavior of new customers who will join the smart grid in the future. The majority of the methods focus on complete datasets without considering the individual energy consumption behavior of each customer. Hence, a model is needed that handles time series data better and captures the complex energy consumption behavior of multiple customers.

Research design and methods

The proposed methodology for future energy consumption prediction aims to enhance the accuracy of the N-BEATS interpretable algorithm 84 . The methodology starts with pre-processing of the data and removal of the outliers. The second step is smoothing the data. The data are then passed to an N-BEATS interpretable model built from n stacks, each containing n blocks of fully connected (FC) layers with ReLU as the activation function and with backcast and forecast outputs. The proposed model can be seen in Fig. 3.

A trial and error method has been adopted to determine the optimal structure of the N-BEATS interpretable model. The final model has an input chunk size of 30, an output chunk size of 15, 10 blocks, 20 hidden layers with a layer width of 512, a learning rate of 1e−3, 100–200 epochs with an epoch validation period of 1, and a batch size of 1000–1500. The model uses ReLU as the activation function for the hidden layers. The parameter settings of the other algorithms can be seen in Table 1. The methods used in this study were carried out in accordance with relevant guidelines and regulations.
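An input chunk size of 30 and an output chunk size of 15 mean that each training sample pairs 30 past observations with the 15 observations that follow them. The sketch below illustrates that windowing on a synthetic series. It assumes one observation per day and a stride of one between windows, neither of which is stated above, and the names `make_windows` and `daily_kwh` are hypothetical.

```python
def make_windows(series, input_len=30, output_len=15):
    """Split a 1-D sequence into (input, target) training pairs.

    Each pair holds `input_len` consecutive values followed by the
    next `output_len` values; windows advance one step at a time.
    """
    windows = []
    last_start = len(series) - input_len - output_len
    for start in range(last_start + 1):
        split = start + input_len
        windows.append((series[start:split], series[split:split + output_len]))
    return windows


# Synthetic 60-day series with a weekly pattern (illustrative values only).
daily_kwh = [float(day % 7) for day in range(60)]
pairs = make_windows(daily_kwh)

first_input, first_target = pairs[0]
print(len(pairs), len(first_input), len(first_target))
```

With 60 observations, the first target starts at index 30 and the last window starts at index 15, so the series yields 16 overlapping samples.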

figure 1

The interquartile range (IQR) is the difference between a dataset’s 75th and 25th percentiles (the upper and lower quartiles). The interquartile range therefore has a breakdown point of 25%. In box plots, IQRs are used to identify outliers, expressed as deviations from the quartiles. An outlier is an observation that falls below Q1 − 1.5 IQR or above Q3 + 1.5 IQR. In the proposed model, outlier identification and removal have been done in Python using NumPy. The pre-processed data can be seen in Fig. 2. The IQR can be calculated by Eq. (3).

$$IQR = Q3 - Q1, \tag{3}$$

where Q3 denotes the upper quartile and Q1 the lower quartile.
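The IQR rule can be illustrated with a short filter that keeps only observations inside the fences Q1 − 1.5 IQR and Q3 + 1.5 IQR. The text reports a NumPy implementation; the sketch below instead uses the standard-library `statistics.quantiles` so that it runs without dependencies. The function name `remove_iqr_outliers`, the sample readings, and the quartile interpolation rule (`method="inclusive"`) are assumptions, since the text does not state which rule was used.

```python
from statistics import quantiles


def remove_iqr_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR].

    Returns the kept values in their original order together with
    the (lower, upper) fences that were applied.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if lower <= v <= upper]
    return kept, (lower, upper)


# Hypothetical daily readings with one obvious spike.
readings = [10.0, 12.0, 11.0, 13.0, 12.0, 95.0, 11.0, 10.0]
clean, bounds = remove_iqr_outliers(readings)
print(clean, bounds)
```

For these readings the quartiles are 10.75 and 12.25, giving fences of 8.5 and 14.5, so only the spike of 95.0 is removed.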

figure 2

N-BEATS: neural basis expansion analysis for interpretable time series

The proposed N-BEATS interpretable model can be seen in Fig. 3. It must be highlighted that the functioning, details, idea of the model diagram, functional components and equations of N-BEATS have been taken from Ref. 84 , to which the reader may refer for further details of the N-BEATS interpretable algorithm. The major building unit of N-BEATS is the block; the proposed N-BEATS interpretable model contains 10 blocks. For simplicity, Fig. 3 depicts only two blocks. Stacks hold the blocks; hence Fig. 3 shows 1 stack with 2 blocks. The basic function of the \(i^{th}\) block is to take an input, say \(a_{i}\), and produce the outputs \(bx_{i}\) and \(by_{i}\) . The first block of the N-BEATS interpretable model takes the input \(x_{i}\) over the look-back window, whose end point is the last measured observation. The blocks of the proposed method consist of multiple fully connected (FC) layers with the ReLU activation function. There are 20 hidden layers in total, each with a width of 512, forming a deep network. The layers predict the expansion coefficients for the forecast \(\theta ^{f}\) and backcast \(\theta ^{b}\) of energy consumption 84 .
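The expansion coefficients become a forecast only after they are projected onto a basis. As a hedged illustration, the sketch below uses the polynomial trend basis and the Fourier seasonality basis of the interpretable configuration described in the original N-BEATS paper (Ref. 84). Whether the proposed model uses exactly these bases and degrees is an assumption, and the function names and coefficient values are hypothetical.

```python
import math


def trend_basis(theta, horizon):
    """Polynomial trend: y[h] = sum_p theta[p] * t**p, with t = h / horizon."""
    return [
        sum(coef * (h / horizon) ** power for power, coef in enumerate(theta))
        for h in range(horizon)
    ]


def seasonality_basis(theta_cos, theta_sin, horizon):
    """Fourier seasonality: sum over harmonics i = 0, 1, ... of
    theta_cos[i] * cos(2*pi*i*t) + theta_sin[i] * sin(2*pi*i*t)."""
    forecast = []
    for h in range(horizon):
        t = h / horizon
        forecast.append(sum(
            a * math.cos(2 * math.pi * i * t) + b * math.sin(2 * math.pi * i * t)
            for i, (a, b) in enumerate(zip(theta_cos, theta_sin))
        ))
    return forecast


# Two trend coefficients give a straight line over a 4-step horizon.
trend = trend_basis([1.0, 2.0], horizon=4)
# A single zero-frequency harmonic gives a constant level.
level = seasonality_basis([3.0], [0.0], horizon=3)
print(trend, level)
```

Because the forecast is a fixed function of a handful of coefficients, each stack's contribution can be read as a trend or a seasonal component, which is what makes the outputs interpretable.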

figure 3

Doubly residual stacking is used to connect all 10 blocks of the proposed model, with \(g^{b}_{l}\) and \(g^{f}_{l}\) shared among different layers of the stacks for the hierarchical aggregation of the forecast. Hierarchical aggregation allows the design of deep neural networks with interpretable outputs. Forecast horizons H of 1, 7, 14, 21, and 30 have been considered to learn the energy consumption behavior of different customers. The residual outputs of the previous blocks are the input \(x_{l}\) 84 . The model contains a fully connected network whose prediction components are the forecast \(\theta ^{f}_{l}\) and backcast \(\theta ^{b}_{l}\) expansion coefficients. Every block has an output forecast \(\hat{y}_{l}\) and backcast \(\hat{x}_{l}\) , with the forecast having a length of H. The second prediction components, the backcast \(g^{b}_{l}\) and forecast \(g^{f}_{l}\) basis layers, accept the backcast \(\theta ^{b}_{l}\) and forecast \(\theta ^{f}_{l}\) expansion coefficients and produce the backcast \(\hat{x}_{l}\) and forecast \(\hat{y}_{l}\) outputs. The \(l\)-th block of N-BEATS can be described by Eqs. (4a), (4b), (4c), (4d), (4e) and (4f) 84 .
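Doubly residual stacking can be traced with toy blocks: each block subtracts its backcast from the input it received and passes the residual on, while the block forecasts are summed into the final forecast. This is a minimal sketch of the scheme described in Ref. 84, not the proposed model. The toy block, which explains only the mean of its input, stands in for the fully connected layers, and all names are hypothetical.

```python
def doubly_residual_forward(blocks, x):
    """Run blocks in sequence with doubly residual connections.

    Each block maps its input to (backcast, forecast). The next block
    receives input minus backcast, and the forecasts are added up.
    """
    residual = list(x)
    total_forecast = None
    for block in blocks:
        backcast, forecast = block(residual)
        residual = [r - b for r, b in zip(residual, backcast)]
        if total_forecast is None:
            total_forecast = list(forecast)
        else:
            total_forecast = [t + f for t, f in zip(total_forecast, forecast)]
    return residual, total_forecast


def make_mean_block(horizon):
    """Toy block: its backcast and forecast both equal the input mean."""
    def block(x):
        mean = sum(x) / len(x)
        return [mean] * len(x), [mean] * horizon
    return block


blocks = [make_mean_block(horizon=2) for _ in range(2)]
residual, forecast = doubly_residual_forward(blocks, [1.0, 2.0, 3.0])
print(residual, forecast)
```

The first block removes the mean of 2.0 and forecasts it; the second block sees a zero-mean residual and contributes nothing, so later blocks only model what earlier blocks left unexplained.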