I Introduction
Operational and planning problems in the power system domain often involve the assessment of (sub)system performance across a range of probabilistically modelled scenarios. For all but the simplest power system models, this cannot be done analytically, and Monte Carlo (MC) simulations are used instead. MC simulations are a powerful general purpose computation method with a long tradition in power system applications [1], but convergence to the correct answer may be slow. A number of different variance reduction methods exist to speed up convergence of Monte Carlo estimates, e.g. [1, 2]. One of these, importance sampling, has recently grown in popularity for power system applications, especially in combination with automatic tuning of model bias parameters using the crossentropy approach [3, 4]. However, implementing importance sampling typically requires deep insight into the model, and limits the design freedom, e.g. for simulations involving complex decision making or sequential actions.
The Multilevel Monte Carlo (MLMC) method was introduced in the context of computational finance to speed up averaging over sample paths, without compromising model detail or accuracy [5]. Initial applications involved the combination of multiresolution models (geometric sequences), but other applications have subsequently evolved. A good overview of the method and its applications is given in [5]. The MLMC approach has recently been used in a reliability context to speed up the estimation of the average mission time of large systems in [6]. In [7], electrical distribution system risk metrics were estimated using MLMC, using a multiscale approach to simulate component failures and repairs.
This paper considers how the MLMC framework [5] can be used to accelerate risk calculations, in particular in applications relating to system adequacy assessment of complex systems. The contributions of this work are as follows.

A concise overview of the MLMC approach to the estimation of risks is given. It is shown how the structure required for MLMC simulation naturally occurs in adequacy assessment problems, and can often be implemented with minimal changes to the constituent models. Two examples of common model patterns are given.

An intuitive speed metric is introduced that allows for fair comparison between Monte Carlo simulation approaches, and across risk measures.

Two case studies are presented, each representing one of the common model patterns. The MLMC approach results in large speedups, in one case speeding up simulations by a factor 2000 compared to conventional Monte Carlo sampling. The sensitivity of computational speed to the model stack is investigated.
Ii Methodology
Iia Mathematical problem statement
Power system performance indicators often take the form of risk measures that are expressed as the expectation^{1}^{1}1The framework of estimating expectation values is less limiting than it may seem. For example, if one is interested in estimating the distribution of in addition to the expectation , one can define a series of quantities , so that . of a performance indicator
(a random variable), i.e.
. Formally, the random variable may be seen as a function that associates a numerical outcome with every system state in a sample space . The probabilistic behaviour of the system, and therefore of, is defined by associating probabilities with events (sets of states)
.In the context of system adequacy assessment, the probabilistic behaviour of a power system is typically specified using a bottom up model that defines demand levels, component status, generator output levels, etc. This model generates both the sample space (the set of all possible combinations of component states) and the associated probabilities. The function deterministically evaluates any specific state and computes a numerical performance measure for that state. The risk measure is then the (probability weighted) average of the function over all states.
For even moderately complex systems, it is not possible to compute the quantity of interest analytically, nor can it be computed by enumeration of all states in . In such cases, it is common to resort to Monte Carlo simulation, in which power system states are generated using the probabilistic bottomup model and analysed to provide relevant outcomes . It should be noted that at any time, multiple outcomes can be measured simultaneously, at little to no extra cost. In the mathematical analysis that follows, only a single risk measure is discussed, but the methods can trivially be applied in parallel.
IiB Conventional Monte Carlo
A brief summary of conventional Monte Carlo simulation is given in this section, as a point of reference for following sections. In conventional Monte Carlo simulation, the quantity is approximated by the Monte Carlo estimator
(1) 
where represents a random sample^{2}^{2}2We use the statistics convention that a sample is a set of sampled values, rather than the computational science convention where each is a sample. from , with each independent and identically distributed to . Note that we distinguish the random variable that represents the th random draw from , and its realisation in a particular experiment or simulation run. The MC estimate for a simulation run is thus given by
(2) 
We proceed to use the generic expression (1) to reason about the convergence of the result. The error obtained in this approximation is
(3) 
The MC estimator
is unbiased, and, as a result of the central limit theorem, for a sufficiently large sample size
,is normally distributed, so that
(4) 
The variance of follows from the MC estimator (1):
(5) 
As a result, the standard error
, indicating the typical convergence of MC simulations.For quantification of the computational efficiency of an MC simulation, we denote by the average time required to generate a single realisation . The time spent to generate a sample of size is then
(6) 
Using this relation, the variance (5) can be expressed as
(7) 
IiC Multilevel Monte Carlo
For multilevel Monte Carlo (MLMC), we assume to have at our disposal a hierarchy of models that generate random outputs , which approximate with increasing accuracy. Specifically, we consider the case where the top level model is the model of interest, i.e. and . The lower level models are used to speed up the calculations.
The material in this section is generic, and can be found using slightly different notation in e.g. [5]. The basis for the MLMC method is the trivial identity that is the telescopic sum:
(8) 
The quantity of interest is decomposed into a crude estimate plus iterative refinements. In MLMC, each of the terms is independently estimated using (1). This results in the MLMC estimator
(9a)  
with  
(9b)  
(9c) 
An additional superscipt has been added to the random model output to denote the level pair it is associated with. The model outputs are distributed according to (equality in distribution) and they are assumed to be mutually independent, except for the pairs , which are jointly sampled from a common distribution in such a way that the marginal distributions are and , respectively.
The MLMC estimator is unbiased and asymptotically normally distributed, by virtue of its constituent MC estimators. Its variance follows from the MLMC estimator (9a) and the mutual independence of sampled values:
(10) 
(11) 
Here, the superscipt on the simulation outputs is maintained, because the covariance term depends on the joint sampling process of the pairs . Clearly, the variance is minimised if the sample pairs are highly correlated.
For a given set of models , the challenge is to optimally choose the samples sizes . Defining the average time to generate a single value as , the total time taken to produce an MLMC estimate is given by
(12) 
The optimal sample counts can now be determined by minimising the variance (10) with respect to while keeping constant. Using (12) to substitute and setting for results in optimal sample counts (ignoring their discrete nature)
(13) 
With this optimal choice of , the computational effort spent on each level pair is proportional to (see (12)), and the total variance (10) can be expressed as a function of computational time as
(14) 
IiD Measuring simulation speed
By comparing the expressions for the variance of the conventional and multilevel MC approaches, we can investigate the potential speedup resulting from the MLMC approach. Let us consider the times and required to converge to a given variance . Then, combining (5) and (14) results in the expression
(15) 
In practice, the variance of the lowest level is similar to that of the direct MC simulator, , and the cost of evaluating the highest level pair is at least that of a direct evaluation of the highest level, i.e. . However, considerable speedups are possible if for all . Intuitively, this occurs when each simplified model is much faster than the next level , but returns very similar results for the majority of samples. Examples where this occurs naturally in the context of power system adequacy assessment will be discussed in Sections IV and V.
In order to compare the compuational efficiency of various implementation, we require an operational definition of ‘computational speed’. Monte Carlo simulations are often run with the goal to estimate the quantity with a certain relative accuracy, expressed using the coefficient of variation . We note that both (7) and (14) can be brought into the form
(16) 
This implicitly defines the computation speed as
(17) 
This definition may be compared with the ‘figure of merit’ used in [8]. The inclusion of the quantity in (17) has a number of advantages, provided that . First, the speed has dimensions , independent of the measure . Second, speeds corresponding to different metrics are directly comparable. For example, when , this indicates that the LOLE estimator is the limiting factor in achieving convergence to a given coefficient of variation. And finally, the speed metric and the implied computational distance are easily interpretable in terms of simulation outcomes. For example, in order to achieve a coefficient of variation of 1% (‘distance’ 10,000) using a speed of , a simulation run of is required.
Iii Considerations for implementation
Iiia Joint sample spaces
The core of the MLMC algorithm is the joint generation of sample pairs , used in (9b), in such a way that they are maximally correlated. The random variables and have sample spaces and , respectively, which must be combined into a joint sample space . We highlight two common model patterns that naturally achieve this.
IiiA1 Pattern 1: component subsets
One common occurrence in system adequacy studies is that the lower level model omits components that are present in the higher level model . As a result, the sample space can be written as a Cartesian product
(22) 
where is the sample space of components present in but not in . We may then identify and . In practical terms this means that samples can be generated at the higher level and unused elements are discarded for the simpler models . An example of this design pattern is explored in Section IV.
IiiA2 Pattern 2: identical randomness
It is also easy to conceive of scenarios where and have identical sample spaces, so that
(23) 
This occurs when both models are driven by the same set of random inputs, but the higher level model performs more complex processing. An example of this model pattern is given in Section V.
IiiB Direct evaluation of expectations
Occasionally, the base model is sufficiently simple to permit direct computation of , either analytically or using a numerical approximation procedure. In those cases, the long run efficiency is enhanced by evaluating
directly instead of using its MC estimate. The standard deviation
is then equal to 0, or a value commensurate with the accuracy of the numerical approximation of . Although direct evaluation of the lowest level is nearly always preferred, there may be cases where the evaluation of is a comparatively timeconsuming operation and the optimal tradeoff is more complex. In the examples that follow in Sections IV and V, direct evaluation is always possible, and results in faster convergence of the overall MLMC estimator.The use of an analytical result at the lowest level also highlights a connection between the MLMC method and the control variate approach[5]. The control variate similarly makes use of a simplified model for which an explicit solution can be calculated. It can therefore be considered as a special case of a bilevel MLMC procedure where the value is known and the output is scaled for optimal convergence. The control variate approach was used in [2] to speed up composite system adequacy assessment  a problem that is also addressed in Section IV.
IiiC Implementation
Simulations were implemented in Python 3.7 and were run on an Intel i57360U CPU under macOS 10.14.6. A generic multilevel sampler was developed with specialisations for particular simulation studies. No effort was made to optimise the execution speed of individual models, because the aim of this paper is not to maximise execution speed per se, but to investigate the relative speed between sampling strategies.
All MLMC simulations started with an exploratory run in which a sample with fixed size is taken at each level set , in order to determine initial estimates of the evaluation cost and variance . This initial run is followed by a sequence of followup runs, each parameterised by a target run time . Given , optimal sample sizes at each level were determined using (13) and the most up to date estimates of evaluation times and variances . For all results in this paper, 10 runs with an estimated run time of 60 seconds (each) were used, for a total runtime of approximately 600 seconds.
One practical concern with determining optimal sample sizes using (13) is that the values of are estimated using relatively small data sets. In power system risk assessment, the simulation outputs often have longtailed distributions, so that there is a high probability that that (or even ). If the estimated value is used naively in (13), this leads to undersampling of , thereby exacerbating the problem because fewer samples are generated that can correct the estimate of . To mitigate this risk, the variance estimators were adjusted as follows. First, a conservative estimate for the variance of was obtained as
(24) 
It was then assumed that the decrease in true variance between subsequent levels was bounded by a factor , resulting in updated estimates
(25) 
for those pairs where is estimated by sampling. For the simulations, the value of
was heuristically set to 0.1.
Finally, in simulations, multiple risk measures were estimated in parallel. In determining optimal sample sizes, one of these was selected as the ‘target measure’ to optimise for, so that its mean and variance estimates were inserted in (13).
LOLP estimation  EPNS estimation  

estimator  run time [s]  LOLP  [1/s]  speedup  EPNS [MW]  [1/s]  speedup 
MC  582  0.31  n/a  0.238(24)  0.17  n/a  
MLMC (sampling)  627  0.79  2.5  1.73  10  
MLMC (with expectation)  601  1.04  3.3  2.54  15 
Iv Composite system adequacy assessment
The first case study is a system adequacy assessment of the single area IEEE Reliability Test System (RTS) [9]. A twolevel MLMC approach is used, where the upper level, i.e. the study of interest, is a hierarchical level 2 (HL2) study [1]: a composite system adequacy assessment that takes into account transmission line outages and constraints. The lower level HL1 is a single node assessment that omits the transmission system. This is in accordance with the subset model pattern in Section IIIA1.
Iva Models
IvA1 Composite system adequacy assessment (HL2)
The RTS model defines outage probabilities of generators and transmission lines, which were modelled as independent two state Markov models. Maintenance and transient outages were not considered. Load levels were sampled by uniformly selecting an hour from the annual demand trace and assigning loads to each node in proportion to the maximum nodal demands.
Therefore, at the upper level (), a sampled system state consists of: (i) the nodal demand for , the set of nodes; (ii) the generator status for , the set of generators in node ; (iii) the line status for , the set of transmission lines. Let generator and line flow limits be given by and . Then, the amount of curtailment
is computed by the linear program
(26) 
subject to
where the matrix relates bus injections and line flows, using the sampledependent linenode incidence matrix and the diagonal matrix of inverse line reactances. The elementwise constant ensures invertibility, eliminating the need for a designated slack bus. In cases where line outages resulted in multiple islands, problem (26) was formulated and solved for each island independently and the curtailments were summed to obtain the total system curtailment. Linear optimisation was performed using scipy.optimize.linprog, with the revised simplex method.
IvA2 Generation adequacy assessment (HL1)
For HL1 analysis, a singlenode generation adequacy analysis is performed, without transmission line constraints and outages. The lower level system state can thus be obtained from by omitting the line status variables. For this HL1 study, the curtailment is calculated as
(27) 
IvA3 Risk measures
IvB Results
For all runs, an initial exploratory run with was used, followed by 10 runs of approximately 60 seconds. The target risk measure for sample size optimisation was EPNS. Unless stated otherwise, thermal line ratings were scaled to 80% of the nominal values, to tighten network constraints. Throughout, Monte Carlo estimates of risk measures are given with the relevant number of significant digits, followed by the estimated standard error in parentheses. Thus, stands for an estimate of with a standard error of .
Table I compares the results of three different estimators. The top row is the conventional Monte Carlo estimator that directly performs the HL2 study. The middle row represents a twolevel MLMC approach where HL2 sampling is combined with HL1 sampling, immediate leading to significant speedups of 2.5 (for LOLP) and 10 (for EPNS). In the third configuration (bottom row), further speedups are obtained by eliminating sampling of the lower level model, and computing the lower level estimates and directly by convolution using 1 MW discretisation steps.
An interesting observation is that, for the regular MC sampler, the speed of LOLP estimation (0.31 s) is larger than that of EPNS estimation (0.17 s). However, the MLMC sampler sees much more substantial speedups for EPNS estimation than for LOLP estimation. This is only partially caused by the EPNSfocused sample size optimisation. The other factor is that the discontinuous LOL performance measure (28) is less amenable to successive approximation [5].
term  LOLP  EPNS [MW]  [ms]  

93 158  
4 380 194  
sum 
Table II gives insight into the multilevel structure of the regular MLMC estimate. For both LOLP and EPNS, the refinement term is substantially smaller than the crude estimate . More importantly, sampling from the HL1 model is substantially faster (0.023 ms per evaluation) than the HL2HL1 difference term (5.4 ms per evaluation), due to the linear program (26) involved in the latter. The MLMC algorithm adapts to this difference by invoking the HL1 model nearly 50 times as often.
Finally, Table III shows the impact on convergence speed of varying the thermal line ratings between 80% and 100% of the nominal values. Higher line ratings cause fewer constraints, which results in a slight reduction in speed for the regular MC sampler. On the other hand, the MLMC sampler experiences very large speedups as the difference between the results from the HL1 and HL2 models becomes smaller, so that fewer (expensive) HL2 evaluations are required. Once again, the gains in EPNS estimation speed exceed the gains in LOLP estimation speed.
line  LOLP estimation  EPNS estimation  

scaling  speedup  speedup  
0.8  0.31  1.04  3.3  0.17  2.54  15 
0.9  0.26  1.38  5.3  0.14  4.69  34 
1.0  0.25  2.11  8.6  0.12  16.7  143 
V Dispatch of storage
The second example concerns the assessment of system adequacy in the presence of energyconstrained storage units (e.g. batteries). The energy constraints couple decisions in subsequent time slots, thus necessitating the use of timesequential Monte Carlo simulations. Convergence for timesequential simulations tends to be much slower than for snapshot problems, due to significant correlations in visited system states. An additional complication is deciding an appropriate dispatch strategy for energy storage units. A greedy EENSminimising discharging strategy was recently proposed in [10], as a reasonable default dispatch strategy for adequacy studies.
Va Models
The Great Britain (GB) adequacy study from [10] is reproduced here, with an eye on speeding up estimation of LOLE and EENS risks using the MLMC approach. Individual simulations are run for a sequence of 8760 hours (1 year). The system performance in a simulated year is driven entirely by the net generation margin trace
(30) 
where the sampled state consists of the demand trace , wind power trace and conventional generation trace . Annual demand traces are chosen randomly from historical GB demand measurements for 20062015 (net demand, [11]). Annual wind traces are similarly sampled from a synthetic data set for hypothetical GB wind power output for the period 19852014, derived from MERRA reanalysis data and an assumed constant distribution of wind generation sites with an installed capacity of 10 GW [12]. Conventional generation traces are generated using an assumed diverse portfolio of thermal units, the portfolio of 27 storage units was based on storage units contracted in the GB 2018 T4 capacity auction. The reader is referred to [10] for further details.
We consider three different storage dispatch models. The resulting storage dispatch (with sign convention that consumption is positive) is denoted by , and is entirely determined by the net generation margin . All three models are defined on the same sample space , providing an example of the model pattern described in Section IIIA2. However, the models differ tremendously in computational complexity, as is clear from the descriptions below.
LOLE estimation  EENS estimation  

estimator  run time [s]  LOLE [h/y]  [1/s]  speedup  EENS [MWh/y]  [1/s]  speedup 
MC  620  0.105  n/a  2 100(400)  0.053  n/a  
MLMC (3 layer with nostore)  636  1.10  10  1.61  30  
MLMC (2 layer with average)  618  1.88  18  38.1  719  
MLMC (3 layer with average)  615  6.88  66  112  2 113 
VA1 EENSoptimal dispatch
The storage dispatch is computed using the algorithm given in [10]. It is sequential and requires complex logic for each step.
VA2 Sequential greedy dispatch
The storage dispatch is computed using a heuristic approximation of the EENSminimising policy. Storage units are sorted by decreasing time to go (from full) , where and are energy and discharge power ratings, respectively. Then, a sequential greedy dispatch is performed, charging when possible, and discharging only when load curtailment is the alternative. Evaluating this model requires one sequential pass per storage unit, but the simulation steps are trivial.
VA3 Constant peakshaving dispatch
For this model, the storage fleet is optimistically approximated by a single storage unit with and . The historical mean daily demand profile is used to compute a peakshaving dispatch, using
(31a)  
subject to (for )  
(31b)  
(31c) 
This quadratic optimisation problem is solved using the Python quadprog package. The resulting storage dispatch is
(32) 
Because is a deterministic load offset, risk measures for this model can be computed by convolution.
VA4 Risk measures
The net generation margin and storage dispatch result in a curtailment trace as follows
(33) 
The loss of load expectation (LOLE) and expected energy not supplied (EENS) risk measures can be computed using the performance measures
(34)  
(35) 
VB Results
For all simulations, an exploratory run with was used, followed by 10 runs of 60 seconds, where sample sizes were optimised for the EENS risk measure. In all cases, the crude estimate was evaluated using a convolution approach. Results are shown in Table IV, comparing the performance of three MLMC architectures with direct MC simulation. A threelayer architecture using a model without storage as a bottom layer achieved speedups of 10 (LOLE) and 30 (EENS), but much better results were obtained when the daily average dispatch was used as the crude model  even when a twolayer MLMC stack was created by omitting the intermediate model (sequential greedy dispatch).
The results show that the MLMC performance is very sensitive to the choice of levels, but robust speedups are available even for suboptimal model choices. The best performing architecture is further analysed in Table V. It can be seen that the contribution from the final refinement is minimal, i.e. the heuristic model is very accurate, which is key to the observed speedup of . The MLMC algorithm dynamically adjusted sample sizes to generate more samples evaluating than on the costly evaluation of (accurate model).
term  LOLE [h/y]  EENS [MWh/y]  [ms]  
190  
1 771  
2.14  n/a  n/a  
sum 
Vi Conclusions and future work
This paper has set out how the MLMC approach can be applied to power system risk analysis, and specifically to system adequacy assessment problems. Common model patterns were identified that are particularly amenable to MLMC implementation, and a computational speed measure (17) was introduced to quantify simulation speed in a way that is easily comparable across tools, Monte Carlo methods and risk measures. Two case studies illustrate the potential for speeding up estimation of risk measures, and the ability to apply the method to complex simulations.
In future work, we will consider automatic selection of optimal model stacks, and explore the scope for the application of multiindex Monte Carlo [13] schemes.
Acknowledgments
We thank Kate Ward and Iain Staffell for the provision of GB demand and wind power data, and Michael Evans for sharing Python code for the EENSminimising dispatch. We are also grateful for insightful discussions with Chris Dent, Matthias Troffaes, Amy Wilson and Stan Zachary.
References
 [1] R. Billinton and W. Li, Reliability Assessment of Electric Power Systems Using Monte Carlo Methods. Boston, MA: Springer US, 1994.
 [2] R. Billinton and A. Jonnavithula, “Composite system adequacy assessment using sequential Monte Carlo simulation with variance reduction techniques,” IEE Proceedings  Generation, Transmission and Distribution, vol. 144, no. 1, p. 1, 1997.
 [3] F. Belmudes, D. Ernst, and L. Wehenkel, “CrossEntropy Based RareEvent Simulation for the Identification of Dangerous Events in Power Systems,” Probabilistic Methods Applied to Power Systems, 2008. PMAPS ’08. Proceedings of the 10th International Conference on, pp. 1–7, 2008.
 [4] A. Leite da Silva, R. Fernandez, and C. Singh, “Generating Capacity Reliability Evaluation Based on Monte Carlo Simulation and CrossEntropy Methods,” IEEE Transactions on Power Systems, vol. 25, no. 1, pp. 129–137, feb 2010.
 [5] M. B. Giles, “Multilevel Monte Carlo methods,” Acta Numerica, vol. 24, pp. 259–328, may 2015.
 [6] L. J. Aslett, T. Nagapetyan, and S. J. Vollmer, “Multilevel Monte Carlo for Reliability Theory,” Reliability Engineering & System Safety, vol. 165, pp. 188–196, sep 2017.
 [7] A. Huda and R. Živanović, “Improving distribution system reliability calculation efficiency using multilevel Monte Carlo method,” International Transactions on Electrical Energy Systems, vol. 27, no. 7, p. e2333, jul 2017.
 [8] P. Henneaux and P.E. Labeau, “Improving Monte Carlo simulation efficiency of levelI blackout probabilistic risk assessment,” in 13th International Conference on Probabilistic Methods Applied to Power Systems (PMAPS 2014), Durham, 2014.
 [9] C. Grigg, P. Wong, P. Albrecht, R. Allan, M. Bhavaraju, R. Billinton, Q. Chen, C. Fong, S. Haddad, S. Kuruganty, W. Li, R. Mukerji, D. Patton, N. Rau, D. Reppen, A. Schneider, M. Shahidehpour, and C. Singh, “The IEEE Reliability Test System1996. A report prepared by the Reliability Test System Task Force of the Application of Probability Methods Subcommittee,” IEEE Transactions on Power Systems, vol. 14, no. 3, pp. 1010–1020, 1999.
 [10] M. P. Evans, S. H. Tindemans, and D. Angeli, “Minimizing Unserved Energy Using Heterogeneous Storage Units,” IEEE Transactions on Power Systems, vol. 34, no. 5, pp. 3647–3656, sep 2019.
 [11] I. Staffell and S. Pfenninger, “The increasing impact of weather on electricity supply and demand,” Energy, vol. 145, pp. 65–78, feb 2018.
 [12] ——, “Using biascorrected reanalysis to simulate current and future wind power output,” Energy, vol. 114, pp. 1224–1239, nov 2016.
 [13] A.L. HajiAli, F. Nobile, and R. Tempone, “Multiindex Monte Carlo: when sparsity meets sampling,” Numerische Mathematik, vol. 132, no. 4, pp. 767–806, apr 2016.
Comments
There are no comments yet.