
Hybrid BiLSTM for Electricity Forecasting

This research article presents a novel hybrid method for forecasting electricity loads and prices using a combination of ensemble empirical mode decomposition (EEMD) and a bidirectional long short-term memory with attention mechanism (BiLSTM-AM) model. The proposed method effectively predicts short- and medium-term spikes in electricity loads and prices, demonstrating superior accuracy compared to existing methods, with a mean absolute percentage error (MAPE) reduction of up to 60%. Validation with multiple datasets confirms the model's reliability and performance in capturing temporal variations in energy markets.


Hindawi

International Journal of Energy Research


Volume 2023, Article ID 3815063, 18 pages
[Link]

Research Article
Electricity Load and Price Forecasting Using a Hybrid Method
Based Bidirectional Long Short-Term Memory with Attention
Mechanism Model

William Gomez,1 Fu-Kwun Wang,1 and Zemenu Endalamaw Amogne1,2

1 Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan
2 Faculty of Mechanical and Industrial Engineering, Bahir Dar Institute of Technology, Bahir Dar University, Bahir Dar, Ethiopia

Correspondence should be addressed to Fu-Kwun Wang; fukwun@[Link]

Received 24 October 2022; Revised 6 December 2022; Accepted 15 December 2022; Published 3 February 2023

Academic Editor: Longxing Wu

Copyright © 2023 William Gomez et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.

The accuracy of forecasting short- or medium-term electricity loads and prices is critical in the energy market. The amplitude and duration of abnormally high prices and load spikes can be detrimental to retailers and production systems. Therefore, predicting these spikes to effectively manage risk is critical. In this paper, a novel hybrid method that combines the ensemble empirical mode decomposition (EEMD) algorithm and a bidirectional long short-term memory with attention mechanism (BiLSTM-AM) model is proposed to predict electricity loads and prices. A simple approach is proposed to determine the number of intrinsic mode functions (IMFs) into which EEMD decomposes the raw data, avoiding overdecomposition, irrelevant components, and high computational cost. Each selected mode is then modeled with BiLSTM-AM to obtain a predicted sequence. These sequences are summed and then reverse-normalized to obtain the final predicted value. The proposed method is validated using two datasets (PJM and Australian Energy Market Operator) with different time intervals to demonstrate the generality and robustness of the forecasts, especially in temporal valley or peak forecasting. The results show that the proposed method outperforms other methods in prediction accuracy and spike-capturing ability, with EEMD reducing the mean absolute percentage error (MAPE) by 53%, 54%, and 60%, respectively. In the three forecast periods, the average MAPE and R² are 0.097 and 0.92, respectively. Furthermore, we use the Kolmogorov-Smirnov predictive accuracy (KSPA) test and the model confidence set (MCS) test to validate the superiority of the proposed model. The results demonstrate its suitability, reliability, and performance in short- and medium-term forecasting.

1. Introduction

The energy market is one of the most important markets in the world, and its price fluctuations affect the global economy. In the energy industry, reliable power system forecasts allow market participants, especially production sectors, to base their scheduling and production plans on forecast information. Investors are interested in short-term and long-term economic price trends. In addition to helping reduce volatility in speculative markets, energy forecasts can also help predict major financial and economic events, such as the 2009 financial crisis or energy shortages from COVID-19, to ensure better industrial and economic progress.

The ability to forecast energy market loads and prices can enable consumers to adjust their consumption or supply while still achieving their economic and environmental goals. The production, transmission, distribution, and markets of electrical energy require accurate price and load forecasts from suppliers, investors, financial institutions, and others. Electricity operators can develop effective market plans to increase the benefits of energy management through accurate energy price forecasts [1]. Therefore, developing predictive models that can provide accurate prices is a major priority. Different types of literature recommend various forecasting methods, ranging from traditional statistical time series to modern data analytic models. However, the
ijer, 2023, 1, Downloaded from [Link] by Bangladesh Hinari NPL, Wiley Online Library on [27/03/2025]. See the Terms and Conditions ([Link] on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

performance of a single technique-based predictive model is insufficient due to inherent limitations. Energy prices are difficult to predict due to rapid peaks and troughs caused by changes in supply and demand due to intraday system constraints [2].

Therefore, a generalized forecasting model will help in grid monitoring, optimal dispatch, and energy production and conversion control. Few works have attempted to model electricity load and price with the same model to examine its generality, flexibility, and reliability. Datasets with different time ranges are also used to test the model's ability to handle data with different variational frequencies and to better capture valleys and peaks. We apply a simple technique to determine the number of IMFs to avoid over-decomposition and irrelevant components and to reduce computation time.

High demand for electric vehicles and microgrids will lead to increased energy distribution and frequent market surges, so electricity load and price forecasting is becoming more challenging. Much of the recent literature focuses on models that can improve accuracy and capture abnormal price situations.

Over the years, machine learning (ML) models have made great contributions to high-accuracy forecasting of electricity load and price [3]. The nonlinear learning capabilities and fault tolerance of these models make them excellent decision-making tools for rapidly changing energy markets [4]. Lu et al. [5] combined three machine learning methods: least absolute shrinkage and selection operator (LASSO) for feature extraction, principal component analysis (PCA) for dimensionality reduction, and Bayesian ridge regression (BRR) for load prediction. Khan et al. [6] applied a hybrid model combining multilayer perceptron, support vector regression (SVR), and CatBoost. Bhinge et al. [7] developed a nonparametric model based on a Gaussian process regression (GPR) approach to predict energy consumption with a relative error of less than 6% on test data. A gradient boosting regression trees (GBRT) method was developed for 15-minute electric load forecasting in time domain industry energy [8]. Zhao et al. [9] used a kernel extreme learning machine (KELM) model to improve predictive performance. Tschora et al. [10] applied some machine learning models called XAI to study the ability of different machine learning techniques to accurately predict electricity prices.

Recently, deep learning models have received more attention due to their ability to handle the nonlinear and chaotic properties of energy time-series data. Several methods, including long short-term memory (LSTM) [11], hybrid methods based on autoencoders and variational mode decomposition (VMD) [12], reinforcement learning [13], support vector machine (SVM) [14], multiobjective optimization [15], and a combination of convolution, gated recurrent unit (GRU), and LSTM [16], have been applied to short-term prediction.

Modeling data using attention mechanisms has achieved great success in many areas of deep learning [17], speech recognition [18], and energy time series data [19]. Recurrent neural networks with attention mechanisms are more accurate than traditional neural networks. Combining multiple models to form these methods shows that they perform better than a single model. Decomposition methods include frequency scale resolution using wavelet transform analysis [20] and empirical mode decomposition (EMD) [21] for nonlinear and nonstationary data. The main task in combining models through decomposition algorithms such as EMD is finding the hyperparameters and the recommended number of intrinsic mode functions (IMFs). This process involves an optimization analysis of the trade-off between accuracy and computation time. Determining spikes and peaks in the energy market is necessary for a wide range of decisions, such as risk management, transmission congestion, and outage monitoring, so we apply an ensemble empirical mode decomposition (EEMD) method.

Highly unpredictable price spikes can occur due to many complex factors affecting intraday grid conditions. We present a novel method based on ensemble empirical mode decomposition and bidirectional long short-term memory with attention mechanism (BiLSTM-AM) for the electricity load and price forecasting problem. BiLSTM-AM can capture nonlinear and temporal trends in energy prices and demand. The BiLSTM-AM model is used to predict normal economic situations and record price and load spikes above a predetermined threshold. The proposed model is validated using two electricity load datasets and one electricity price dataset. The contributions of this paper are summarized as follows:

(1) A Bayesian optimization (BO) algorithm based on random forest regressors is used to tune the hyperparameters of the proposed hybrid EEMD and BiLSTM-AM model

(2) The proposed model can be used to monitor electricity load and price

(3) Provide a simple and effective method to determine the number of IMFs, avoiding over-decomposition, irrelevant components, and high calculation costs

The rest of this article is organized as follows. The literature related to energy market forecasting models is reviewed in Section 2. Section 3 describes the proposed model. Section 4 describes the datasets, and Section 5 provides the analysis results. Finally, we draw conclusions and outline further research directions.

2. Related Work

Price and load spikes that can significantly impact economic activity occur frequently in the energy market. Therefore, predicting normal price and load fluctuations, as well as spike situations, helps gauge the risks to the economy associated with changes in energy markets. In energy markets, capturing spikes is vital as they are a core component of market attention. The BiLSTM model, coupled with an attention mechanism, allows us to model load and price spikes and provide accurate predictions. A number of methods in the recent literature have been used to address price or load spikes. Singhal and Swarup [22] applied an artificial neural network (ANN) to capture normal, small, and large price spikes to predict market-clearing prices

Table 1: Previous related work on electricity price and load.

Data type   Reference   Models                             Data source
Price       [1]         ALG-M̂ + ARIMA, ALG-M̂ + day ago     Southwest Power Pool (SPP) market
Price       [2]         F-ARIMA                            PJM
Price       [23]        SEMPP                              Australian Energy Market Operator (AEMO)
Price       [27]        FFNN                               Electricity markets of mainland Spain and California
Price       [28]        RNN                                PJM
Price       [32]        SEPNet                             New York, USA (ENGIE group database)
Load        [2]         DL-EVT                             PJM
Load        [13]        SD-RL-BPNN                         Korea
Load        [14]        SVM                                Eastern Province in Saudi Arabia
Load        [15]        MOFA-GRNN                          Australia
Load        [30]        SD-EMD-LSTM                        ISO New England

Note: F-ARIMA: Fourier-autoregressive integrated moving average; SEMPP: multivariate self-exciting point process model; FFNN: feedforward neural network; RNN: recurrent neural network; SEPNet: combining variational mode decomposition (VMD), a convolutional neural network (CNN), and gated recurrent unit (GRU); DL-EVT: deep learning with extreme value theory; SD-RL-BPNN: similar days reinforcement learning back-propagation neural network; SVM: support vector machine; MOFA-GRNN: modified generalized regression neural network with a multiobjective firefly algorithm; SD-EMD-LSTM: similar days-empirical mode decomposition-long short-term memory.

(MCPs) in day-ahead energy markets. Clements et al. [23] investigated the regional relationship of load spikes in Australia. Christensen et al. [24] used an autoregressive conditional hazard model to handle spikes by treating prices as discrete-time point processes.

Several methods, such as the similarity-day approach [13], SVM [14], and ANN and fuzzy logic methods [25], have progressed in load forecasting to help overcome the limitations of traditional forecasting models. Kiartzis et al. [26] proposed an expert system-based peak prediction model. Polson et al. [2] presented a combination of deep learning and extreme value theory (DL-EVT) to deal with sharp peaks, troughs, and sudden price changes in energy markets due to the fluctuations in demand and supply caused by intraday constraints. Still, this approach fails to capture the ups and downs in the levels. Radovanovic et al. [1] developed a holistic approach to recovering many energy market structures and forecasting node prices based solely on publicly available data, particularly historical prices, system loads, and a grid-wide mix of generation types. Huang et al. [21] presented an adaptive signal time-frequency processing method called EMD, which is particularly suitable for analyzing and processing nonlinear and nonstationary signals. Wang et al. [29] used the time-frequency spectrum obtained from the EEMD to overcome the mode mixing problem and more accurately reflect time series conditions. Wu et al. [30] presented an improved EEMD with LSTM for crude oil price forecasting. Zheng et al. [31] proposed empirical mode decomposition and long short-term memory (EMD-LSTM) with the XGBoost algorithm for short-term load prediction. Regarding the hybrid methods, Meng et al. [19] proposed a load prediction method based on EMD and long short-term memory with an attention mechanism. Huang [32] proposed a hybrid deep neural network model for short-term electricity price prediction. Bedi and Toshniwal [33] adopted a deep learning model based on EEMD to predict crude oil prices. Liu et al. [34] used EMD with ANN for wind speed prediction. Table 1 summarizes some previous related work on energy market forecasting.

2.1. LSTM and BiLSTM Models. The availability of data and computing power has made deep learning an integral part of time series forecasting. In machine learning, an LSTM model is a special type of recurrent neural network (RNN) developed by Hochreiter and Schmidhuber [35]. It quickly gained popularity, particularly for solving time series prediction issues in many research areas. These networks were introduced to eliminate long-term dependency and vanishing gradient problems. In regular RNNs, the problem mainly occurs when previous information is connected with new information. As a result, LSTM units are used instead of hidden layers to better process timing-related inputs. LSTM units consist of a forget gate ($f_t$), an input gate ($i_t$), a cell state ($\tilde{C}_t$), a memory unit ($C_t$), an output gate ($O_t$), and a hidden layer ($h_t$):

$$f_t = \sigma(W_{fh} h_{t-1} + W_{fx} x_t + b_f), \tag{1}$$

$$i_t = i_{1t} * i_{2t}, \tag{2}$$

where $i_{1t} = \sigma(W_{ih} h_{t-1} + W_{ix} x_t + b_i)$ and $i_{2t} = \tanh(W_{gh} h_{t-1} + W_{gx} x_t + b_g)$.

$$\tilde{C}_t = \tanh(W_{\theta h} h_{t-1} + W_{\theta x} x_t + b_\theta), \tag{3}$$

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t, \tag{4}$$

$$O_t = \sigma(W_{oh} h_{t-1} + W_{ox} x_t + b_o), \tag{5}$$

$$h_t = \tanh(C_t) * O_t. \tag{6}$$

These equations contain an offset term ($b$) and a weight coefficient matrix ($W$), a hyperbolic tangent

activation function ($\tanh$), and a sigmoid activation function ($\sigma$).

The concept of BiLSTM can be defined as making a neural network that has information in both directions, backward (from future to past) and forward (from past to future) [36]. Compared to a regular LSTM, where input can only flow in one direction, BiLSTMs let input flow in both directions [37]. The bidirectional recurrent neural network (BRNN) concept is simple to grasp: backward LSTMs are calculated similarly to forward LSTMs, except that the time direction is reversed. The forward layer, the backward layer, and the hidden layer's final output are

$$\begin{aligned}
\overrightarrow{h} &= f(\overrightarrow{w}_1 x_t + \overrightarrow{w}_2 h_{t-1}),\\
\overleftarrow{h} &= f(\overleftarrow{w}_1 x_t + \overleftarrow{w}_2 h_{t-1}),\\
y_t &= g(w_{o1} * \overrightarrow{h} + w_{o2} * \overleftarrow{h} + b_o),
\end{aligned} \tag{7}$$

where $\overrightarrow{h}$ is the forward output, $\overleftarrow{h}$ is the backward output, and $y_t$ is the final output of the LSTM. It is the concatenation of the two hidden states $\overrightarrow{h}$ and $\overleftarrow{h}$ that determines the final output $y_t$ of the BiLSTM.

In this study, the BiLSTM-AM model is used to capture spike situations in the energy market, the periodicity in the data, and past and future effects on current data.

2.2. Decomposition Algorithms. EMD is an adaptive time-space analytic approach for nonstationary and nonlinear series processing. EMD executes operations that partition a series into IMFs without leaving the time domain. Huang et al. [21] proposed the EMD method, which divides the complex original signal into a sequence of IMFs with varying amplitudes and a residual. In this case, the series is decomposed into a finite number of IMFs. Since numerous economic and social factors cause changes in energy market price and demand, the IMFs show various characteristics of the original signal at different time scales. The term IMF refers to a monocomponent function with one instantaneous frequency that must satisfy the following two conditions:

(i) The number of local extreme value points and the number of zero-crossing points must be equal, or differ by at most one, during the whole period

(ii) The mean of the envelopes defined by the upper and lower envelope is zero at every point in the time series

The EMD decomposition steps are as follows:

(1) All local maxima and minima in the time series $x(t)$ are identified and interpolated with cubic splines to form the upper envelope $U(t)$ and the lower envelope $L(t)$.

(2) The local mean of the upper and lower envelopes is calculated: $m(t) = (U(t) + L(t))/2$.

(3) $m(t)$ is subtracted from the original time series $x(t)$ to obtain the intermediate signal $h(t) = x(t) - m(t)$.

(4) The procedure is repeated on $h(t)$ until its mean approaches zero, and the $i$th IMF, denoted $C_i(t)$, is computed once it satisfies conditions (i) and (ii) above; for the first component, $h(t) = C_1(t) = \mathrm{IMF}_1$. If $h(t)$ does not meet the definition of an IMF, it is treated as the new signal and steps (1) to (4) are performed again; after $k$ repetitions, $\mathrm{IMF}_1$ is extracted as $h_k(t) = C_1(t) = \mathrm{IMF}_1$.

(5) A new signal $R(t) = x(t) - \mathrm{IMF}_1$ is obtained by subtracting $\mathrm{IMF}_1$ from the original signal $x(t)$. Steps (1) to (4) are then repeated on $R(t)$ to obtain the next IMF.

(6) Steps (1) to (5) are repeated using $R(t)$ as the input series until the termination condition is met, that is, until $R(t)$ is monotonic; the EMD decomposition of the original signal then ends.

Using the number of IMFs $= n$, the original signal $x(t)$ is reconstructed as follows:

$$x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + R(t), \tag{8}$$

where $R(t)$ is the last residue.

To overcome the mode mixing problem of the EMD method, Wu and Huang [38] suggested a noise-assisted EMD algorithm called ensemble empirical mode decomposition (EEMD). EEMD is an improvement on the traditional EMD algorithm. It introduces some noise into the process so that the resulting IMFs are more accurate. Unlike the standard EMD method, this method allows a better scale separation. With EEMD, white noise is added to the original signal before the EMD algorithm is applied. The added noise spreads the extreme point intervals evenly throughout the band, reducing mode mixing effects.

The EEMD algorithm [39] involves the following steps:

(1) Add white noise $\varepsilon_i(t)$, drawn from a standard normal distribution $N(0, 1)$, to the original signal $x(t)$: $Y_i(t) = x(t) + \varepsilon_i(t)$, for $i = 1, 2, \ldots, N$, where $N$ is the number of ensembles

(2) The EMD algorithm decomposes the transformed signal $Y_i(t)$ into a set of IMFs and a residual

$$Y_i(t) = \sum_{m=1}^{M-1} \mathrm{IMF}_m^{(i)}(t) + r_M^{(i)}(t), \tag{9}$$

where $M - 1$ is the total number of IMFs produced, $\mathrm{IMF}_m^{(i)}$ is the $m$th IMF, and $r_M^{(i)}$ is the residual at the $i$th trial

(3) Steps (1) and (2) are repeated for $N$ trials. A different white noise series $\varepsilon_i(t)$ is added to the original signal in each trial

(4) The final IMF of the EEMD algorithm ($\mathrm{IMF}_m^{\mathrm{ave}}$) is computed by averaging the $m$th IMF over the $N$ trials: $\mathrm{IMF}_m^{\mathrm{ave}}(t) = \frac{1}{N}\sum_{i=1}^{N} \mathrm{IMF}_m^{(i)}(t)$, therefore

Figure 1: Architecture of BiLSTM with attention mechanism. The inputs (load/price) X_{t-1}, X_t, X_{t+1}, X_{t+2}, ..., X_n feed a BiLSTM layer made of a forward LSTM layer and a backward LSTM layer, followed by an attention layer and an output layer (predicted load/price, Y_t).

$\mathrm{IMF}_1 = \frac{1}{N}\sum_{i=1}^{N} \mathrm{IMF}_1^{(i)}(t)$ and the residual is $r_1(t) = x(t) - \mathrm{IMF}_1$.

The results produced by EEMD depend on the number of ensembles $N$ and the added noise $\beta$, which should satisfy the following relation [18]:

$$\alpha = \frac{\beta}{\sqrt{N}}, \tag{10}$$

where $\alpha$ is the final standard deviation of the residual.

3. Proposed Method

We propose a novel deep learning framework that integrates ensemble empirical mode decomposition and bidirectional long short-term memory with attention to forecast energy market data. The purpose of combining the attention mechanism with BiLSTM is to capture the influential part of the input sequence during the prediction process. With the introduction of the attention mechanism, BiLSTM becomes more accurate by assigning different probability weights to inputs to highlight more important factors. The normalization term in the RNN keeps all summed inputs of the rescaling layer constant, making the hidden-to-hidden dynamics more stable [17, 18].

The many-to-one attention mechanism is calculated as follows:

$$\text{Attention weight:}\quad \alpha_{fb} = \frac{\exp(\mathrm{Score}(h_f, h_b))}{\sum_{i=1}^{n} \exp(\mathrm{Score}(h_f, h_b'))},$$

$$\text{Attention vector:}\quad \pi_t = \sigma(C_t h_t) = v\tanh(W h_t + W h_t + b), \tag{11}$$

where the context vector is

$$C = \sum_{i=1}^{n} \alpha_{fb} h_i, \tag{12}$$

and the score function is

$$\mathrm{Score}(h_f, h_b) = \begin{cases} h_f^{T} W h_b & \text{Luong's multiplicative style} \\ v_\alpha^{T} \tanh(W_1 h_f + W_2 h_b) & \text{Bahdanau's additive style} \end{cases} \tag{13}$$
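The attention weight in Eq. (11) and the context vector in Eq. (12) amount to a softmax over scores followed by a weighted sum of hidden states, with the score taken from either branch of Eq. (13). The following is a minimal pure-Python sketch of that computation, not the authors' implementation: the function names, the toy vectors, and the identity weight matrix are all illustrative assumptions.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Matrix-vector product with the matrix given as a list of rows."""
    return [dot(row, vector) for row in matrix]


def score_multiplicative(h_f, h_b, W):
    """Luong-style score h_f^T W h_b (first branch of Eq. (13))."""
    return dot(h_f, matvec(W, h_b))


def score_additive(h_f, h_b, W1, W2, v):
    """Bahdanau-style score v^T tanh(W1 h_f + W2 h_b) (second branch of Eq. (13))."""
    pre = [a + b for a, b in zip(matvec(W1, h_f), matvec(W2, h_b))]
    return dot(v, [math.tanh(p) for p in pre])


def attention(h_f, states, score_fn):
    """Return softmax attention weights over `states` and the context vector."""
    scores = [score_fn(h_f, h_b) for h_b in states]
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(states[0])
    context = [sum(w * h[k] for w, h in zip(weights, states)) for k in range(dim)]
    return weights, context


# Toy example (assumed values): two 2-dimensional hidden states.
query = [1.0, 0.0]
hidden_states = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

weights, context = attention(
    query, hidden_states, lambda q, k: score_multiplicative(q, k, identity)
)
print(weights)  # the state aligned with the query receives the larger weight
print(context)
```

Swapping the lambda for one that calls `score_additive` changes only how the scores are produced; the softmax and the weighted sum stay the same.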

Figure 2: Flow chart of the proposed method. Energy data (load/price) → normalized data → set initial number of IMFs using EEMD → training set and test set → IMFs and residual modeled with LSTM → compute RMSE_n and RMSE_{n+1} → if RMSE_{n+1} < RMSE_n (yes), continue with n + 1; otherwise (no) → Bayesian optimization (BiLSTM-AM) → trained model with sliding window approach → output predicted values (load/price).

The hidden layer state value of the final output is calculated as $H'_t = G(C, h_t, x_t)$.

As shown in Figure 1, the proposed BiLSTM-AM model consists of an input vector followed by five layers: forward LSTM hidden layer, backward LSTM hidden layer, attention layer, fully connected layer, and output layer. The input sequence data is passed to a forward LSTM hidden layer and a backward LSTM hidden layer, which combine to output a processed vector. The attention layer computes a weight vector based on the LSTM model and then combines the weight vector with the shallow output to compute the predictions of the fully connected layer. The structural architecture of the proposed BiLSTM-AM is shown in Figure 1.

We transform the data to the range [0, 1] using the min-max normalization method. Because time series data are noisy, we normalize the collected data and then decompose them using the EEMD method prior to modeling, which reduces the impact of noise on the forecasts. The normalization formula is

$$X_n = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{14}$$

where the normalized data is $X_n$, the overall sample minimum value is $X_{\min}$, and the overall sample maximum value is $X_{\max}$. After making predictions using EEMD-BiLSTM-AM, the sum of the predicted IMF values is restored to the original scale using the following equation:

$$Y_t = X'_t (X_{\max} - X_{\min}) + X_{\min}, \tag{15}$$

Input: Dataset X, m (default number of modes)

1: Divide the dataset into a training dataset X_tr and a test dataset X_te based on the prediction period P.
2: F_x(i) ← IMFs_x and r_x.
3: Assign: number of modes n = 1, ..., m, where m is the default number of IMFs.
4: for n = 1, ..., m do
5:   decompose F_x using the EEMD algorithm
6:   train IMFs_x and r_x using LSTM
7:   calculate the RMSE at each number of IMFs
8:   if RMSE_{n+1} ≤ RMSE_n then
9:     repeat steps 5 and 6
10:  else
11:    the decomposition will stop, and n will be the optimum number of IMFs
12:  end if
13: end for
Output: Optimum number of IMFs
Note: m = 12 for PJM price and load, m = 13 for NSW load.

Algorithm 1: The optimal IMF number selection procedure.
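The stopping rule in Algorithm 1 can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: `select_imf_count` and `rmse_for` are hypothetical names, and `rmse_for(n)` stands in for the expensive part (decomposing with EEMD into n IMFs plus a residual, training LSTM models, and computing the RMSE). Here it is replaced by a lookup of the PJM load RMSE values reported in Table 3(a).

```python
from typing import Callable


def select_imf_count(rmse_for: Callable[[int], float], max_modes: int) -> int:
    """Increase the IMF count while the RMSE does not get worse, then stop.

    rmse_for(n) is assumed to return the validation RMSE obtained with n IMFs
    plus one residual; max_modes plays the role of m in Algorithm 1.
    """
    best_n = 1
    best_rmse = rmse_for(1)
    for n in range(2, max_modes + 1):
        candidate = rmse_for(n)
        if candidate <= best_rmse:
            # RMSE_{n+1} <= RMSE_n: keep decomposing.
            best_n, best_rmse = n, candidate
        else:
            # RMSE increased: the previous count is taken as the optimum.
            break
    return best_n


# RMSE values for PJM load from Table 3(a), keyed by the number of IMFs.
pjm_load_2day = {1: 323.44, 2: 290.51, 3: 254.01, 4: 267.87}
pjm_load_10day = {1: 323.01, 2: 276.46, 3: 258.27, 4: 253.62, 5: 264.19}
pjm_load_30day = {1: 341.61, 2: 321.42, 3: 266.55, 4: 258.37, 5: 258.39}

print(select_imf_count(pjm_load_2day.__getitem__, 4))   # 3
print(select_imf_count(pjm_load_10day.__getitem__, 5))  # 4
print(select_imf_count(pjm_load_30day.__getitem__, 5))  # 4
```

Because the loop stops at the first increase in RMSE, it never evaluates larger decompositions, which is what keeps the computation time down.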

Figure 3: PJM electricity load and price graphs. (a) Load (MWh) versus time (hourly). (b) Price ($/MWh) versus time (hourly).



Figure 4: NSW electricity load and price graphs. (a) Price ($/MWh) and load (MWh) versus time (30 mins.). (b) Load (MWh) and price ($/MWh) versus time (30 mins.).

where $X'_t$ is the forecast value from the proposed model and $Y_t$ is the final reconstructed prediction value.

Figure 2 provides an illustration of the proposed method. The proposed method consists of four parts:

(1) The data is split into training and test sets, and the training data is preprocessed using Eq. (14). Then, we apply the EEMD to decompose the data

(2) Since the number of mode components is difficult to determine, much research has been done on how to determine the number of components to decompose from the original data. A simple approach based on the model evaluation method is proposed to determine the number of mode components. The iteration process applies the LSTM model to compare the root mean square error (RMSE) of the mode components ($MC_n$) and $MC_{n+1}$ by Eq. (16). When the $RMSE_{n+1}$ of $MC_{n+1}$ is greater than the $RMSE_n$ of $MC_n$, the iteration process is stopped and the optimal number of IMFs is obtained

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N} (y_t - f_t)^2}, \tag{16}$$

where $N$ is the total number of predicted values, $y_t$ represents the actual value, and $f_t$ is the predicted value. Algorithm 1 gives the optimal IMF number selection process.

(3) We model each sequence by using the BiLSTM-AM model. The predicted subseries are summed, and we use Eq. (15) for reverse normalization to obtain the predicted value

(4) The trained BiLSTM-AM model with the sliding window approach is used to obtain the final predicted value for the test data
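The data-handling side of these four parts can be sketched without any model: min-max scaling (Eq. (14)), summing the per-mode predictions and reversing the scaling (Eq. (15)), scoring with RMSE (Eq. (16)), and framing a series as sliding windows. The sketch below is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the two "mode predictions" are made-up numbers standing in for BiLSTM-AM outputs, and the window width of 2 is arbitrary because this section does not state the width used.

```python
import math


def normalize(values, x_min, x_max):
    """Eq. (14): scale values to [0, 1] with the sample minimum and maximum."""
    span = x_max - x_min
    if span == 0:
        raise ValueError("x_max must differ from x_min")
    return [(v - x_min) / span for v in values]


def denormalize(values, x_min, x_max):
    """Eq. (15): map normalized predictions back to the original scale."""
    return [v * (x_max - x_min) + x_min for v in values]


def sum_modes(mode_predictions):
    """Add the per-mode predicted sequences element by element."""
    return [sum(step) for step in zip(*mode_predictions)]


def rmse(actual, predicted):
    """Eq. (16): root mean square error between actual and predicted values."""
    squared = [(y - f) ** 2 for y, f in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def sliding_windows(series, width):
    """Pair each run of `width` consecutive values with the value that follows."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]


# Toy load series (assumed values, not taken from the datasets).
load = [20.0, 30.0, 50.0, 40.0]
x_min, x_max = min(load), max(load)
scaled = normalize(load, x_min, x_max)

# Stand-ins for the predicted sequences of two modes in normalized space.
mode_a = [0.0, 0.2, 0.7, 0.5]
mode_b = [0.0, 0.1, 0.3, 0.2]
forecast = denormalize(sum_modes([mode_a, mode_b]), x_min, x_max)

print(forecast)
print(rmse(load, forecast))
print(sliding_windows(scaled, 2))
```

The same minimum and maximum are used for scaling and for the reverse step, which is what makes the summed predictions comparable with the original load or price values.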

Table 2: Descriptive statistical analysis.

Data        Size    Mean       SD        Skewness   Kurtosis   Minimum    Maximum
PJM load    20591   30992.72   6233.56   0.85       0.86       19255.00   56391.00
PJM price   13122   22.60      14.74     9.37       198.66     0.04       537.79
NSW load    29007   7773.72    1264.87   0.80       3.59       5221.13    13634.63

Table 3: Experimental results of LSTM model with different numbers of IMFs. Columns give the IMF #; a dash marks a setting that is not reported.

(a) PJM load

Horizon   Metric   1+1      2+1      3+1      4+1      5+1
2-day     RMSE     323.44   290.51   254.01   267.87   -
2-day     Time     10.36    13.25    19.16    25.05    -
10-day    RMSE     323.01   276.46   258.27   253.62   264.19
10-day    Time     9.57     12.45    18.56    24.35    31.39
30-day    RMSE     341.61   321.42   266.55   258.37   258.39
30-day    Time     9.52     12.06    16.13    23.57    28.13

(b) PJM price

Horizon   Metric   1+1     2+1     3+1     4+1     5+1     6+1
2-day     RMSE     4.65    4.10    4.09    4.06    4.03    4.22
2-day     Time     7.47    11.54   15.14   21.16   27.37   35.32
10-day    RMSE     4.38    4.26    4.07    4.27    -       -
10-day    Time     7.54    11.45   14.32   20.19   -       -
30-day    RMSE     4.48    4.25    4.14    4.17    -       -
30-day    Time     5.57    9.64    13.29   18.11   -       -

(c) NSW load

Horizon   Metric   1+1     2+1     3+1     4+1     5+1
2-day     RMSE     65.96   44.78   44.58   44.23   45.88
2-day     Time     19.31   29.21   38.05   47.06   54.07
10-day    RMSE     57.47   44.74   44.47   44.22   45.36
10-day    Time     18.24   32.19   49.28   54.35   59.29
30-day    RMSE     59.25   45.75   44.02   43.96   44.56
30-day    Time     17.09   28.51   37.45   48.04   53.46

Table 4: Optimal hyperparameters of the proposed model for NSW 2 days prediction.

Parameters      IMF1       IMF2    IMF3      IMF4          Residual
Num_layers      2          1       1         3             1
Neurons         [128,32]   256     430       [200,64,16]   300
Dropout         0.01       0       0.1       0             0.04
Optimizer       Adam       Adam    Adam      Adam          Adam
Learning_rate   0.0001     0.001   0.00201   0.000742      0.005

4. Datasets

We use two datasets of hourly and half-hourly electricity loads and prices for model validation. The hourly dataset is from PJM [39]. The price of electricity in the real-time (RT) market reflects the actual cost of supplying electricity to a location, given the actual constraints of the grid. PJM's hourly RT energy market uses locational marginal pricing (LMP). Wholesale electricity prices are based on LMP, loads, generation patterns, and physical constraints on the transmission system, so they reflect the value of electricity in different regions. The raw data for day-ahead (DA) and RT prices range from January 1, 2020 to June 30, 2021. Figure 3(a) shows a plot of load data in megawatt-hour (MWh). Figure 3(b) shows the RT price diagram.

The second dataset is from the Australian Energy Market Operator (AEMO) [40] and consists of aggregated half-hourly demand and price data. Australia's AEMO manages electricity and gas networks in five states. We use the load data from New South Wales (NSW) for our predictive analysis. Data range from January 2020 to August 2021. Figure 4(a) shows the load plot for NSW half-hourly data, and Figure 4(b) compares the time-series relationship between price and load for the first ten days from January 1 to January 10, 2021. Both the NSW and PJM interconnection load diagrams depict relatively homogeneous and periodic characteristics, while the price graphs show non-homogeneous and rarely periodic characteristics.

Table 2 shows the summary statistics of the three datasets. The skewness values of all variables are greater than 0, indicating a right-skewed effect.

5. Analysis Results

Using our proposed method in Figure 2, the optimal numbers of IMFs for PJM load under the 2-, 10-, and 30-day forecast periods are 3, 4, and 4, respectively (see Table 3). For the 2-day forecast of PJM load, the RMSEs for 1, 2, 3, and 4 IMFs are 323.44, 290.51, 254.01, and 267.87, respectively. This shows that 3 IMFs provide the smallest RMSE for the PJM load. Therefore,

Table 5: Prediction results of different models for PJM load.

2-day 10-day 30-day


Models
MAPE RMSE R2 MAPE RMSE R2 MAPE RMSE R2
SVR 2.313 1178.131 0.880 1.966 1086.501 0.948 1.732 874.927 0.945
ARIMA 4.242 1427.782 0.714 2.307 1071.099 0.949 1.967 843.188 0.947
GBRT 2.829 1027.620 0.908 1.801 960.128 0.959 1.901 968.480 0.959
ELM 5.032 1570.296 0.786 3.861 1418.494 0.911 3.100 1140.230 0.907
BPNN 3.171 1241.558 0.866 2.432 1099.250 0.947 2.207 913.289 0.940
LSTM 1.906 1062.302 0.902 1.860 972.864 0.959 1.714 785.709 0.956
BiLSTM 1.547 959.596 0.920 1.492 903.385 0.965 1.699 725.910 0.963
BiLSTM-AM 2.265 1144.277 0.896 1.693 913.058 0.964 1.621 742.578 0.961
EMD-BiLSTM-AM 1.136 548.860 0.974 1.666 620.915 0.983 1.290 484.715 0.981
EEMD-BiLSTM-AM 0.775 232.989 0.996 0.729 258.473 0.997 0.636 228.945 0.996

Table 6: Prediction results of different models for NSW load.

2-day 10-day 30-day


Models
MAPE RMSE R2 MAPE RMSE R2 MAPE RMSE R2
SVR 1.438 136.066 0.983 1.317 150.174 0.976 1.476 141.756 0.985
ARIMA 1.386 142.645 0.982 1.305 149.570 0.980 1.326 153.025 0.982
GBRT 1.189 123.385 0.987 1.287 136.988 0.981 1.155 130.563 0.987
ELM 2.208 217.035 0.961 1.489 162.097 0.969 1.937 214.687 0.966
BPNN 1.541 155.548 0.980 1.579 174.986 0.960 1.475 159.355 0.981
LSTM 1.356 135.817 0.985 1.276 143.269 0.986 1.287 148.425 0.984
BiLSTM 1.279 132.497 0.986 1.248 141.162 0.987 1.377 147.741 0.984
BiLSTM-AM 1.108 133.766 0.989 1.312 144.34 0.986 1.368 148.16 0.984
EMD-BiLSTM-AM 0.731 68.782 0.994 0.805 93.998 0.994 1.088 139.421 0.986
EEMD-BiLSTM-AM 0.480 48.216 0.998 0.602 62.464 0.997 0.553 56.181 0.998
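The MAPE, RMSE, and R2 columns reported in Tables 5 and 6 follow the standard definitions given in Equation (17). As a minimal sketch of those definitions (not the authors' code), the functions below take the actual values y_t and the predictions f_t as plain Python lists; the function names and the toy numbers are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / n


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(y - f) for y, f in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error."""
    return sum((y - f) ** 2 for y, f in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def r2(actual, predicted):
    """Coefficient of determination."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 300.0, 420.0]
scores = {
    "MAPE": mape(actual, predicted),
    "MAE": mae(actual, predicted),
    "MSE": mse(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "R2": r2(actual, predicted),
}
```

Note that MAPE is undefined when an actual value is zero, so a guard would be needed for any series that can reach zero.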

the optimal number of IMFs for PJM load under 2-day prediction using the LSTM model is 3. The optimal numbers of IMFs for PJM price under the 2, 10, and 30-day forecast periods are 5, 3, and 3, respectively.

For a fair comparison, the hyperparameters of all models are tuned to achieve their best predicted values. Hyperparameters of the proposed model and of the other models used for comparison were tuned using a Bayesian optimization (BO) algorithm with a random forest regressor as the surrogate model. Table 4 lists the optimal hyperparameters for the two-day predictive analysis. In Table 3, N + 1 means that the number of IMFs is N and that 1 is the residual.

For predictive analysis, the data is first normalized by Equation (14) and then split into training and test data. A multistep prediction approach with a sliding window is used for the prediction. For example, for a 10-day forecast we take PJM price (Jan./1/2020–Jun./20/2021) as training data and (Jun./21/2021–Jun./30/2021) as test data, and NSW load (Jan./1/2020 at 0:30–Aug./20/2021 at 7:00) as training data and (Aug./20/2021 at 7:30–Aug./30/2021 at 23:30) as test data. The same segmentation procedure was applied for the other prediction periods, with the length of the prediction period as the test set and the rest as the training set.

All DL models use 10% of the training set as a validation set to evaluate the model's fitness while tuning its hyperparameters on the training data set. The $100/MWh level is generally considered to be the threshold for an extreme price event in Australia, so a spike is defined as $P_t \geq \$100$/MWh [23].

The performance of our proposed model and the benchmark models is evaluated using RMSE, mean absolute percentage error (MAPE), $R^2$, mean absolute error (MAE), and mean square error (MSE), where MAPE, $R^2$, MAE, and MSE are defined as

\[
\begin{aligned}
\mathrm{MAPE} &= \frac{1}{N}\sum_{t=1}^{N}\left|\frac{y_t - f_t}{y_t}\right| \times 100\%,\\
R^2 &= 1 - \frac{\sum_{t=1}^{N}\left(y_t - f_t\right)^2}{\sum_{t=1}^{N}\left(y_t - \bar{y}\right)^2},\\
\mathrm{MAE} &= \frac{\sum_{t=1}^{N}\left|y_t - f_t\right|}{N},\\
\mathrm{MSE} &= \frac{\sum_{t=1}^{N}\left(y_t - f_t\right)^2}{N},
\end{aligned}
\tag{17}
\]

where $N$ is the total number of predicted values, $y_t$ represents the actual electricity price or load, $f_t$ is the predicted value, $\bar{y}$ is the mean of all actual values, and $R^2$ is the

[Figure 5 plot: load (MWh) versus time (half-hourly) in panels (a), (b), and (c). Series: Actual, ARIMA, GBRT, SVR, ELM, BP, LSTM, BiLSTM, BiLSTM-AM, EMD-BiLSTM-AM, and EEMD-BiLSTM-AM.]

Figure 5: Prediction results for 2 days for NSW electricity load.

Table 7: Peak points and time difference analysis of 2-day prediction for NSW load.

Models Actual peak load Predicted peak Load difference Actual peak time Predicted peak time Time difference
SVR 10,151.67 10,529.31 -377.64 8/29/21 19 : 00 8/29/21 19 : 30 30 min
ARIMA 10,151.67 10,447.81 -296.139 8/29/21 19 : 00 8/29/21 20 : 00 60 min
GBRT 10,151.67 10190.85 -39.1778 8/29/21 19 : 00 8/29/21 19 : 00 —
ELM 10,151.67 10,448.04 -296.37 8/29/21 19 : 00 8/29/21 19 : 30 30 min
BP 10,151.67 10,501.46 -349.79 8/29/21 19 : 00 8/29/21 19 : 00 —
LSTM 10,151.67 10,393.57 -241.90 8/29/21 19 : 00 8/29/21 19 : 00 —
BiLSTM 10,151.67 10,370.99 -219.32 8/29/21 19 : 00 8/29/21 19 : 00 —
BiLSTM-AM 10,151.67 10,341.87 -190.20 8/29/21 19 : 00 8/29/21 19 : 00 —
EMD-BiLSTM-AM 10,151.67 10,071.37 80.30 8/29/21 19 : 00 8/29/21 19 : 00 —
EEMD-BiLSTM-AM 10,151.67 10,149.69 1.98 8/29/21 19 : 00 8/29/21 19 : 00 —
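The columns of Table 7 can be derived by locating the maximum of the actual and predicted series and comparing both amplitude and time index. The sketch below is one way to do this under stated assumptions: the function name `peak_analysis` is hypothetical, the load difference is taken as actual minus predicted (which matches the signs in Table 7), and the short series are toy values in which only the two peak amplitudes are borrowed from the SVR row.

```python
from datetime import datetime, timedelta


def peak_analysis(actual, predicted, start, step_minutes=30):
    """Compare the largest actual and predicted values and when they occur."""
    i_actual = max(range(len(actual)), key=actual.__getitem__)
    i_pred = max(range(len(predicted)), key=predicted.__getitem__)
    step = timedelta(minutes=step_minutes)
    return {
        "actual_peak": actual[i_actual],
        "predicted_peak": predicted[i_pred],
        # Negative values mean the model overestimates the peak.
        "load_difference": actual[i_actual] - predicted[i_pred],
        "actual_peak_time": start + i_actual * step,
        "predicted_peak_time": start + i_pred * step,
        "time_difference_min": abs(i_pred - i_actual) * step_minutes,
    }


# Toy half-hourly window starting at 18:00 on August 29, 2021.
start = datetime(2021, 8, 29, 18, 0)
actual = [9800.0, 10000.0, 10151.67, 9900.0]
predicted = [9700.0, 9950.0, 10300.0, 10529.31]
result = peak_analysis(actual, predicted, start)
```

In this toy window the actual peak falls at 19:00 and the predicted peak at 19:30, giving a 30-minute time difference and a load difference of about -377.64.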

coefficient of determination. Different evaluation measures tend to rank prediction results differently. Therefore, we also apply the model confidence set (MCS) test [41] and the Kolmogorov-Smirnov predictive accuracy (KSPA) test [42] to compare the predictive ability of each model.

To validate the proposed model and EMD-BiLSTM-AM, the predicted subsequences are summed using an ensemble summation strategy, and all predicted values are renormalized to the original data scale. The spike threshold for PJM price was set to $100 per MWh, the NSW load peak was set to 11,000 MWh, and the PJM load peak was set to 39,000 MWh, as shown in Figures 3 and 4.

We compared the proposed model with nine other models in terms of prediction accuracy and ability to capture peaks. The comparison models include four machine learning models (support vector regression (SVR), extreme learning machine (ELM), gradient boosting regression tree (GBRT), and back-propagation neural network (BPNN)), three deep learning models (LSTM, BiLSTM, and BiLSTM-AM), one classical autoregressive integrated moving average (ARIMA) model, and a hybrid model (EMD-BiLSTM-AM). The training lengths for the three forecast periods (2, 10, and 30 days) for PJM prices are 13,074 hours, 12,882 hours, and 12,402 hours, respectively.
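The training lengths quoted for PJM prices follow from holding out the final forecast period of the hourly series. As a minimal sketch of that arithmetic, the function below (a hypothetical helper, not the authors' code) subtracts the test length from the series size of 13,122 hours listed in Table 2.

```python
def split_lengths(n_samples, horizon_days, samples_per_day):
    """Hold out the last `horizon_days` of data as the test set."""
    n_test = horizon_days * samples_per_day
    if n_test >= n_samples:
        raise ValueError("forecast horizon is longer than the series")
    return n_samples - n_test, n_test


PJM_PRICE_SIZE = 13122  # series length from Table 2 (hourly data)
train_lengths = [split_lengths(PJM_PRICE_SIZE, days, 24)[0] for days in (2, 10, 30)]
# train_lengths is [13074, 12882, 12402], matching the hours quoted in the text.
```

For half-hourly data such as the NSW load, `samples_per_day` would be 48 rather than 24.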

[Figure 6 plot: price ($/MWh) versus time (hourly) with the threshold level marked; panel (a) spans hours 0 to 250 and panel (b) spans hours 180 to 200. Series: Actual, ARIMA, GBRT, SVR, ELM, BP, LSTM, BiLSTM, BiLSTM-AM, EMD-BiLSTM-AM, and EEMD-BiLSTM-AM.]

Figure 6: Prediction for PJM electricity price.
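Figure 6 marks the $100/MWh threshold level used to define a price spike. One simple way to check whether a forecast captures a spike is sketched below; the helper names and the toy prices are hypothetical, and treating "capture" as the forecast also reaching the threshold in the same hour is an assumption rather than the article's stated criterion.

```python
SPIKE_THRESHOLD = 100.0  # $/MWh, the spike threshold used for PJM price


def spike_indices(prices, threshold=SPIKE_THRESHOLD):
    """Return the time indices at which the price is at or above the threshold."""
    return [t for t, price in enumerate(prices) if price >= threshold]


def spike_hits(actual, predicted, threshold=SPIKE_THRESHOLD):
    """Actual spike hours that the forecast also places at or above the threshold."""
    predicted_spikes = set(spike_indices(predicted, threshold))
    return [t for t in spike_indices(actual, threshold) if t in predicted_spikes]


# Toy hourly prices for illustration only.
actual = [35.0, 60.0, 128.0, 100.0, 42.0]
predicted = [33.0, 70.0, 115.0, 92.0, 40.0]
actual_spikes = spike_indices(actual)
captured = spike_hits(actual, predicted)
```

Here hours 2 and 3 are actual spikes, and only hour 2 is captured because the forecast for hour 3 stays below the threshold.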

Table 8: The results of different models for PJM price under different forecast periods.

2-day 10-day 30-day


Models
MAPE RMSE R2 MAPE RMSE R2 MAPE RMSE R2
SVR 0.102 10.936 0.572 0.117 11.559 0.600 0.143 11.793 0.451
ARIMA 0.225 11.698 0.457 0.190 12.127 0.560 0.220 14.538 0.330
GBRT 0.203 17.802 0.241 0.121 12.002 0.568 0.132 11.368 0.489
ELM 0.180 11.899 0.494 0.172 11.937 0.574 0.203 11.661 0.463
BPNN 0.120 10.006 0.642 0.124 10.670 0.636 0.149 10.557 0.560
LSTM 0.154 11.280 0.545 0.164 10.930 0.643 0.145 10.419 0.571
BiLSTM 0.132 11.393 0.536 0.108 10.993 0.639 0.147 10.668 0.551
BiLSTM-AM 0.139 12.082 0.478 0.118 10.620 0.663 0.156 10.406 0.572
EMD-BiLSTM-AM 0.178 7.859 0.779 0.122 6.853 0.889 0.129 7.203 0.842
EEMD-BiLSTM-AM 0.089 5.134 0.906 0.101 4.890 0.929 0.100 4.392 0.924
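Relative improvements between two rows of Table 8 can be expressed as a percentage reduction in the error metric. The sketch below shows that calculation for the 2-day and 10-day MAPE entries of EMD-BiLSTM-AM and EEMD-BiLSTM-AM; the function name is hypothetical.

```python
def relative_improvement(baseline_error, new_error):
    """Percentage reduction of an error metric relative to a baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# MAPE values from Table 8: EMD-BiLSTM-AM (baseline) vs. EEMD-BiLSTM-AM.
gain_2day = relative_improvement(0.178, 0.089)
gain_10day = relative_improvement(0.122, 0.101)
```

These evaluate to about 50% and 17%, respectively.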

Table 9: Model computation time in seconds.

PJM price NSW load


Models
2 days 10 days 30 days 2 days 10 days
SVR 8.016 9.845 11.922 30.701 31.908
ARIMA 76.135 70.542 68.287 260.041 215.041
GBRT 24.750 25.000 24.625 22.82813 23.4375
ELM 4.313 3.953 7.031 12.945 14.870
BPNN 9.701 13.659 14.091 35.98067 37.89465
LSTM 479.035 505.580 593.804 1304.921 1263.869
BiLSTM 574.893 608.733 693.300 2594.816 2232.46
BiLSTM-AM 406.360 451.976 527.875 1984.828 1864.757
EMD-BiLSTM-AM 2495.809 2709.836 2907.448 6272.267 5803.947
EEMD-BiLSTM-AM 2593.908 2860.899 3009.294 6666.825 6298.189

Python is used to implement all analyses of the model. TensorFlow is installed in the Spyder 3.7.3 environment. Bayesian optimization (BO) based on random forest regression was used to tune the hyperparameters of all models except the ELM model, which was implemented with default parameters using the scikit-learn neural network package, and automatic ARIMA. The proposed model and the other DL models used for comparison apply a "ReLU" activation function, the mean square error (MSE) as the loss function, Adam as the optimizer, lookback = 6, epochs = 100, and a batch size of 4 for the load datasets and 3 for the price dataset. All experiments are performed on a desktop with the following configuration: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz and NVIDIA GeForce GTX 1660 Ti GPU; OS: Windows x64; Memory: 16.00 GiB.

Tables 5 and 6 show the prediction accuracy of the ten models under different forecast periods in terms of MAPE, RMSE, and R2 for PJM and NSW loads. The proposed model has a MAPE of less than 1% in all three forecast periods and the lowest RMSE for every forecast period. Its R2 values range from 0.996 to 0.998 across the three forecast periods. The results show that the proposed model achieves the best results on the benchmarks for all prediction periods, followed by the other deep learning models. All models can capture peaks in load data, while some cannot capture price spikes. Regardless of the time interval difference, EEMD-BiLSTM-AM accurately predicts the actual electricity load while better capturing peaks. Therefore, the proposed model can predict the electrical load at different time intervals.

On the MAPE metric for NSW load data, applying the EEMD method to BiLSTM-AM can improve the prediction results by 53%, 54%, and 60% for 2-day, 10-day, and 30-day forecasts, compared to using BiLSTM. Figure 5 shows the 2-day test set load forecast for NSW, with good accuracy in spike and peak capture for the proposed model. The machine learning models and the classical statistical model ARIMA perform poorly in valley-peak prediction compared to the hybrid models. SVR and ELM also miss the peak time. This suggests that hybrid deep learning models combined with decomposition methods can help to improve predictive performance, especially on high-frequency data, as shown in Figure 5 for the 30-minute interval data for NSW.

Table 7 shows each model's peak time and amplitude analysis for Figure 5(b). The table shows that the proposed model has the closest amplitude to the actual peak and captures the exact time of the peak. Apart from SVR, ARIMA, and ELM, the other models also captured the exact time index of the peak but failed to match its amplitude. The proposed model underestimates the load peak by only 1.98 MW, which shows the predictive power of the proposed model in capturing peaks and spikes.

As shown in Figure 6, the spike threshold level is set at $100/MWh, and the 10-day forecast from June 21 to June 30, 2021 contains a single spike, on June 29, 2021. ELM, EMD-BiLSTM-AM, SVR, and the proposed model can capture the spike. LSTM also captures the spike, but with a poorer amplitude than the others, while the remaining models fail to capture it. Compared with the other models in Table 8, the proposed model outperforms the comparative models in all forecast periods and better captures the peak time and amplitude, as shown in Figure 6. Although the EMD-BiLSTM-AM model captures the same peak level as the proposed model, it still has lower prediction accuracy than the proposed model.

On average, the MAPE and R2 of the proposed model for the three test periods were 0.097 and 0.920, respectively. EEMD-BiLSTM-AM outperforms the other models in all performance metrics used. We can see in Table 8 that using EEMD-BiLSTM-AM improves the MAPE of the prediction results by 50%, 17%, and 37% compared to EMD with BiLSTM-AM in 2-day, 10-day, and 30-day predictions, respectively.

From the above analysis, we conclude that the proposed method can predict data with different fluctuation frequencies, time intervals, and ranges, i.e., both load and price data types. Figures 5 and 6 compare the prediction accuracy of nine benchmark models and our proposed model on different data types and different time horizons. EEMD-BiLSTM-AM shows higher forecasting performance and a smaller peak-to-valley amplitude.

[Figure 7 plots: histograms of absolute errors (x-axis) against frequency (y-axis), one per model. Panel (a): statistical and machine learning models. Panel (b): deep learning and hybrid deep learning models (LSTM, BiLSTM, BiLSTM-AM, EMD-BiLSTM-AM, and the proposed model).]

Figure 7: Absolute error distribution plots for 2 days NSW forecast data.

Table 9 provides the computation times in Python, including hyperparameter tuning, model training, and prediction. Hybrid deep learning models take longer to compute than deep learning and machine learning models. However, the hybrid deep learning models outperform other models in terms of prediction performance and valley-peak capture.

The MCS test is used to select the best model with a given level of confidence [42]. The KSPA test is a nonparametric test based on the Kolmogorov-Smirnov (KS) test

[Figure 8 plots: empirical cumulative distribution functions Fn(x) of the absolute errors x for each model. Panel (a): 2 days NSW load forecast data. Panel (b): 30 days PJM price forecast data.]

Figure 8: Empirical cumulative distribution function of error plots.
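The curves in Figure 8 are empirical cumulative distribution functions of absolute forecast errors, and the KS statistic underlying the KSPA test is the largest vertical gap between two such curves. The sketch below computes both for toy error samples; the function names are hypothetical, and the p-value calculation is deliberately omitted, so this illustrates the statistic only and not the full test.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F_n, the empirical CDF of `sample`, as a callable."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_statistic(errors_a, errors_b):
    """Two-sample KS statistic: the largest gap between the two ECDFs."""
    f_a, f_b = ecdf(errors_a), ecdf(errors_b)
    # The gap can only change at observed values, so checking those is enough.
    points = sorted(set(errors_a) | set(errors_b))
    return max(abs(f_a(x) - f_b(x)) for x in points)


# Toy absolute errors for two hypothetical models.
errors_small = [1.0, 2.0, 3.0, 4.0]
errors_large = [3.0, 4.0, 5.0, 6.0]
d_overlap = ks_statistic(errors_small, errors_large)
d_disjoint = ks_statistic(errors_small, [10.0, 20.0, 30.0, 40.0])
```

A model whose ECDF rises faster, as `errors_small` does here, concentrates more of its errors near zero, which is how the plots in Figure 8 are read.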



Table 10: P-values of MCS test using different models for NSW load and PJM price.

2 days NSW load 10 days PJM price


Models
MAE MAPE RMSE MSE MAE MAPE RMSE MSE
SVR 0.0000 0.0000 0.0000 0.0000 0.0000 0.5128 0.2834 0.0000
ELM 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
GBRT 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
BPNN 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
ARIMA 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
LSTM 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
BiLSTM 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
BiLSTM-AM 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
EMD-BiLSTM-AM 0.0000 1.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000
EEMD-BiLSTM-AM 1.0000 1.0000 1.0000 1.0000 1.000 1.0000 1.0000 1.0000
Note: A higher p value means a more accurate model.
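The survival rule applied to Table 10 uses a p-value threshold of 0.25. A minimal sketch of that filter follows, using the MAPE-loss p-values from the PJM price columns of Table 10; the function name is hypothetical, and treating survival as a p-value strictly greater than the threshold is an assumption.

```python
MCS_THRESHOLD = 0.25

# MAPE-loss p-values for the PJM price columns of Table 10.
p_values_mape = {
    "SVR": 0.5128,
    "ELM": 0.0,
    "GBRT": 0.0,
    "BPNN": 0.0,
    "ARIMA": 0.0,
    "LSTM": 0.0,
    "BiLSTM": 0.0,
    "BiLSTM-AM": 0.0,
    "EMD-BiLSTM-AM": 1.0,
    "EEMD-BiLSTM-AM": 1.0,
}


def surviving_models(p_values, threshold=MCS_THRESHOLD):
    """Return the models whose MCS p-value exceeds the threshold."""
    return [model for model, p in p_values.items() if p > threshold]


survivors = surviving_models(p_values_mape)
```

Under the MAPE loss, SVR, EMD-BiLSTM-AM, and EEMD-BiLSTM-AM remain in the set, whereas under the MAE and MSE losses in Table 10 only EEMD-BiLSTM-AM does.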

[43]. This test was designed to determine whether the forecast error distributions are statistically significantly different and to distinguish between more predictive and less predictive models.

We provide statistical test results for the 2 days NSW load and 30 days PJM price forecast data. The KSPA error distributions and empirical cumulative distribution functions are shown in Figures 7 and 8, respectively. The two-sided p value of the KSPA test for all compared models was less than 0.01 for the 2 days NSW load data. This confirms the statistical difference between the proposed and competing models and shows that the proposed model outperforms the others. One-sided KSPA tests are also performed to show the low errors of the proposed model and its maximum prediction accuracy. The calculated statistical errors of the predictions show that the proposed model outperforms the comparison models. The random error deviation is better explained by the proposed EEMD-BiLSTM-AM model, which shows the smallest error and the best prediction performance. The p value results of the MCS test are shown in Table 10. We set a threshold of 0.25 for the p value to indicate that the model survived under the loss function, which is underlined. A larger p value means a more accurate model. According to the above analysis, the prediction ability of the EEMD-BiLSTM-AM model is the best.

6. Conclusions

Energy market participants have always been nervous about the economic consequences of soaring electricity prices. Many forecasting models in the energy market focus on predicting normal price and load movements without dealing with extreme load or price situations in the market. In this paper, we combine EEMD and BiLSTM-AM to predict electricity loads and prices while capturing spikes and peak points. The EEMD algorithm decomposes the data to reduce the effect of noise. We then apply the BiLSTM-AM model to obtain the occurrence and normal fluctuation values of price and load peaks from the decomposed data. Using three data series with different time horizons and different volatility patterns, we examine the predictive performance, efficiency, and consistency of the proposed model by forecasting the short to medium term. The proposed method is compared with nine benchmark methods, including four machine learning models (SVR, ELM, GBRT, and BPNN), three deep learning models (LSTM, BiLSTM, and BiLSTM-AM), a classical ARIMA model, and a hybrid method (EMD-BiLSTM-AM). The results show that the proposed method outperforms other models in terms of prediction accuracy and spike capture ability, with EEMD reducing the mean absolute percentage error (MAPE) by 53%, 54%, and 60% for the 2-day, 10-day, and 30-day NSW load forecasts, respectively. Over the three forecast periods for PJM price, the average MAPE and R2 are 0.097 and 0.92, respectively. Furthermore, the results of the MCS and KSPA tests show that the EEMD-BiLSTM-AM model has the best predictive ability.

The analysis results show that the price data exhibit large extreme volatility and uncertainty in peak behavior. Their heterogeneity and multiple seasonalities make modeling more challenging compared to loads.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Disclosure

The manuscript has not been previously published and is not currently submitted for review to any other journals. The manuscript is approved by all authors and tacitly or explicitly by the responsible authorities where the work was carried out. Submission also implies that, if accepted, it will not be published elsewhere in the same form, in English or in any other language, without the written consent of the publisher.

Conflicts of Interest

The author(s) declare(s) that they have no conflicts of interest.

Acknowledgments

This work was supported by the Ministry of Science and Technology, Taiwan [Grant MOST-109-2221-E011-098-MY3].

References

[1] A. Radovanovic, T. Nesti, and B. Chen, “A holistic approach to forecasting wholesale energy market prices,” IEEE Transactions on Power Apparatus and Systems, vol. 34, no. 6, pp. 4317–4328, 2019.
[2] M. Polson and V. Sokolov, “Deep learning for energy markets,” Applied Stochastic Models in Business and Industry, vol. 36, no. 1, pp. 195–209, 2020.
[3] H. S. Hippert, C. E. Pedreira, and R. C. Souza, “Neural networks for short-term load forecasting: a review and evaluation,” IEEE Transactions on Power Apparatus and Systems, vol. 16, no. 1, pp. 44–55, 2001.
[4] H. Wang, Z. Lei, X. Zhang, B. Zhou, and J. Peng, “A review of deep learning for renewable energy forecasting,” Energy Conversion and Management, vol. 198, article 111799, 2019.
[5] D. Lu, D. Zhao, and Z. Li, “Short-term nodal load forecasting based on machine learning techniques,” International Transactions on Electrical Energy Systems, vol. 31, no. 9, p. 31, 2021.
[6] P. W. Khan, Y.-C. Byun, S.-J. Lee, D.-H. Kang, J.-Y. Kang, and H.-S. Park, “Machine learning-based approach to predict energy consumption of renewable and nonrenewable power sources,” Energies, vol. 13, no. 18, p. 4870, 2020.
[7] R. Bhinge, J. Park, K. H. Law, D. A. Dornfeld, M. Helu, and S. Rachuri, “Towards a generalized energy prediction model for machine tools,” Journal of Manufacturing Science and Engineering, vol. 139, no. 4, p. 139, 2017.
[8] J. Walther, D. Spanier, N. Panten, and E. Abele, “Very short-term load forecasting on factory level - a machine learning approach,” Procedia CIRP, vol. 80, pp. 705–710, 2019.
[9] X. Zhao, J. Wang, T. Zhang, D. Cui, G. Li, and M. Zhou, “A novel short-term load forecasting approach based on kernel extreme learning machine: a provincial case in China,” IET Renewable Power Generation, vol. 16, no. 12, pp. 2658–2666, 2022.
[10] L. Tschora, E. Pierre, M. Plantevit, and C. Robardet, “Electricity price forecasting on the day-ahead market using machine learning,” Applied Energy, vol. 313, article 118752, 2022.
[11] G. Sun, C. Jiang, X. Wang, and X. Yang, “Short-term building load forecast based on a data-mining feature selection and LSTM-RNN method,” IEEJ Transactions on Electrical and Electronic Engineering, vol. 15, no. 7, pp. 1002–1010, 2020.
[12] J. Bedi and D. Toshniwal, “Energy load time-series forecast using decomposition and autoencoder integrated memory network,” Applied Soft Computing, vol. 93, article 106390, 2020.
[13] R. J. Park, K. B. Song, and B. S. Kwon, “Short-term load forecasting algorithm using a similar day selection method based on reinforcement learning,” Energies, vol. 13, no. 10, p. 2640, 2020.
[14] M. Mohandes, “Support vector machines for short-term electrical load forecasting,” International Journal of Energy Research, vol. 26, no. 4, pp. 335–345, 2002.
[15] L. Xiao, W. Shao, C. Wang, K. Zhang, and H. Lu, “Research and application of a hybrid model based on multi-objective optimization for electrical load forecasting,” Applied Energy, vol. 180, pp. 213–233, 2016.
[16] H. Li, H. Liu, H. Ji, S. Zhang, and P. Li, “Ultra-short-term load demand forecast model framework based on deep learning,” Energies, vol. 13, no. 18, p. 4900, 2020.
[17] J. Du, “power load forecasting using BiLSTM-Attention,” Environmental Sciences, vol. 440, no. 3, article 032115, 2020.
[18] G. Hinton, L. Deng, D. Yu et al., “Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, 2012.
[19] Z. Meng, Y. Xie, and J. Sun, “Short-term load forecasting using neural attention model based on EMD,” Electrical Engineering, vol. 104, pp. 1857–1866, 2022.
[20] I. Daubechies, J. Lu, and H. T. Wu, “Synchrosqueezed wavelet transforms: an empirical mode decomposition-like tool,” Applied and Computational Harmonic Analysis, vol. 30, no. 2, pp. 243–261, 2011.
[21] N. E. Huang, Z. Shen, S. R. Long et al., “The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis,” Proceedings of the Royal Society of London. Series A: mathematical, physical and engineering sciences, vol. 454, no. 1971, pp. 903–995, 1998.
[22] D. Singhal and K. S. Swarup, “Electricity price forecasting using artificial neural networks,” International Journal of Electrical Power & Energy Systems, vol. 33, no. 3, pp. 550–555, 2011.
[23] A. E. Clements, R. Herrera, and A. S. Hurn, “Modelling interregional links in electricity price spikes,” Energy Economics, vol. 51, pp. 383–393, 2015.
[24] T. M. Christensen, A. S. Hurn, and K. A. Lindsay, “Forecasting spikes in electricity prices,” International Journal of Forecasting, vol. 28, no. 2, pp. 400–411, 2012.
[25] A. Badri, Z. Ameli, and A. M. Birjandi, “Application of artificial neural networks and fuzzy logic methods for short term load forecasting,” Energy Procedia, vol. 14, pp. 1883–1888, 2012.
[26] S. J. Kiartzis, A. G. Bakirtzis, J. B. Theocharis, and G. Tsagas, “A fuzzy expert system for peak load forecasting: application to the Greek power system,” in 2000 10th Mediterranean Electrotechnical Conference. Information Technology and Electrotechnology for the Mediterranean Countries. Proceedings. MeleCon 2000 (Cat. No.00CH37099), pp. 1097–1100, Lemesos, Cyprus, 2000.
[27] J. P. S. Catalão, S. J. P. S. Mariano, V. M. F. Mendes, and L. A. F. M. Ferreira, “Short-term electricity prices forecasting in a competitive market: a neural network approach,” Electric Power Systems Research, vol. 77, no. 10, pp. 1297–1304, 2007.
[28] Y. Y. Hong and C. Y. Hsiao, “Locational marginal price forecasting in deregulated electric markets using a recurrent neural network,” in 2001 IEEE Power Engineering Society Winter Meeting. Conference Proceedings (Cat. No.01CH37194), pp. 539–544, Columbus, OH, USA, 2001.
[29] T. Wang, M. Zhang, Q. Yu, and H. Zhang, “Comparing the applications of EMD and EEMD on time-frequency analysis of seismic signal,” Journal of Applied Geophysics, vol. 83, pp. 29–34, 2012.
[30] Y. X. Wu, Q. B. Wu, and J. Q. Zhu, “Improved EEMD-based crude oil price forecasting using LSTM networks,” Physica A: Statistical Mechanics and its Applications, vol. 516, pp. 114–124, 2019.
[31] H. Zheng, J. Yuan, and L. Chen, “Short-term load forecasting using EMD-LSTM neural networks with a xgboost algorithm for feature importance evaluation,” Energies, vol. 10, no. 8, p. 1168, 2017.
[32] C. Huang, Y. Shen, Y. Chen, and H. Chen, “A novel hybrid deep neural network model for short-term electricity price forecasting,” International Journal of Energy Research, vol. 45, no. 2, pp. 2511–2532, 2021.
[33] J. Bedi and D. Toshniwal, “Empirical mode decomposition based deep learning for electricity demand Forecasting,” IEEE access, vol. 6, pp. 49144–49156, 2018.
[34] H. Liu, C. Chen, H. Tian, and Y. Li, “A hybrid model for wind speed prediction using empirical mode decomposition and artificial neural networks,” Renewable Energy, vol. 48, pp. 545–556, 2012.
[35] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[36] A. Graves and J. Schmidhuber, “Framewise phoneme classification with bidirectional LSTM and other neural network architectures,” Neural Networks, vol. 18, no. 5-6, pp. 602–610, 2005.
[37] Y. Li, Z. Zhu, D. Kong, H. Han, and Y. Zhao, “EA-LSTM: evolutionary attention-based LSTM for time series prediction,” Knowledge-Based Systems, vol. 181, article 104785, 2019.
[38] Z. Wu and N. E. Huang, “Ensemble empirical mode decomposition: A noise-assisted data analysis method,” Advances in Adaptive Data Analysis, vol. 1, no. 1, pp. 1–41, 2009.
[39] PJM, “Real-time hourly LMPs, data miner 2,” September 2022, [Link]
[40] Australian Energy Market Operator (AEMO), September 2022, [Link]
[41] P. R. Hansen, A. Lunde, and J. M. Nason, “The model confidence set,” Econometrica, vol. 79, no. 2, pp. 453–497, 2011.
[42] G. F. Fan, L. Z. Zhang, M. Yu, W. C. Hong, and S. Q. Dong, “Applications of random forest in multivariable response surface for short-term load forecasting,” International Journal of Electrical Power & Energy Systems, vol. 139, article 108073, 2022.
