Introduction
Security and privacy in smart grids are becoming an increasingly important issue. An adversary may violate customers’ privacy in smart grids, because they are more than computerized power grids. They may monitor and manage energy usage by means of many sensors that collect information about the flow of electricity. They are in majority composed by smart meters installed at the customers’ household. Such meters are able to reveal customers’ habits, e.g., the specific pattern of power consumption can reveal the TV program that someone is watching [1]. Without security and privacy, it is infeasible to deploy smart meters, whose society can take many benefits, mainly environmental and economic.
On one hand, the transition from power grid to smart grid is driven by climate change and economic concerns, e.g., reducing CO2 emissions and avoiding the need for backup power plants, respectively. The deployment of smart meters is essential to achieve continuous electrical load balance, which gives us these benefits and is considered the main aspect of smart grid [2]. On the other hand, many customers are against the installation of smart meters, they are inhibiting the migration from an old power grid to a smarter grid [3]. For such customers, the lack of satisfactory security and privacy in the use of smart meters is unacceptable.
In this paper, we present the efficient privacy-preserving protocol for smart metering systems (EPPP4SMS) to preserve customers’ privacy and security in the network data communication. Thus, EPPP4SMS releases energy suppliers from handling their customers’ private data. Therefore, energy suppliers might outsource some services. Besides the description of the EPPP4SMS and its theoretical analysis, we also present a simulation using real-world measurement data. The simulation validates the theory and gives us a more tangible comparison with other protocols.
EPPP4SMS runs in the communication between smart meters and their energy supplier. We use an additive homomorphic encryption to create a different cryptographic primitive. Similar to other protocols [4], EPPP4SMS also protects the privacy using aggregation of customer data. Nevertheless, our approach has many benefits. For example, homomorphic encryption allows the decryption of a single measurement, while EPPP4SMS allows only the decryption of the aggregated measurements. In other words, whoever holds the private key can decrypt only the sum of the measurements. Consequently, smart meters can send their signed and encrypted measurements directly to their energy supplier. Since the encryption is end-to-end, the smart meters and their energy supplier can directly identify troubles in the communication.
EPPP4SMS is scalable and faster in the whole round than others are. Since the energy supplier receives the encrypted measurements and computes their aggregation, each smart meter spends the same processing time, independent of the number of measurements considered in the aggregation. When we increase the size of the keys, EPPP4SMS becomes increasingly faster than others with equivalent key size. EPPP4SMS only requires that smart meters hold merely a permanent private key to protect customers’ privacy, i.e., smart meters encrypt their measurements with the same key during every new round of measurements. Moreover, customers and their energy supplier can verify measurements and invoices keeping the customers’ privacy secure. The possibility of verification is important because machines might fail, as we find in the dataset of the simulation. Smart meters send only one message per measurement and an extra message per verification. The energy supplier can ask for verification when the sum of the measurements is different from the expected. Since the energy supplier knows the amount of generation and consumption of its customers, its staff can detect energy loss and fraud, which has been a problem in power grids [5].
Currently, many areas need privacy-enhancing protocols. These areas may use EPPP4SMS and its ideas. It is a protocol to a smart grid scenario, but its features improve significantly the techniques found in the literature.
The remainder of this paper is structured as follows. Section II presents the background necessary to understand this proposed protocol—especially related work. After that, the construction of EPPP4SMS is described in Section III and, thereafter, protocol analysis in Section IV and comparison with related work is given in Section V. Section VI presents how the simulation with real-world data was developed and its results. Section VII presents the conclusion about this research.
Background
In this section, we present definitions for some terms and cryptographic primitives used in this paper. We also present a review of related work.
A. Definitions
In the description of related work and the remainder of this paper, we use the following nomenclature.
Supplier is responsible to manage the power grid.
Customer is a household buyer or seller of electricity.
Meter is a temper-resistant device installed in customer residence, and that measures the flow of electric energy.
Consolidated measurement is the sum of the measurements of
meters, i.e.,N .m_{1}+m_{2}+\cdots +m_{N}
B. Cryptographic Primitives
Many privacy-enhancing protocols use Paillier’s scheme [6] because it is an additive homomorphic encryption. Thus, if two meters encrypt their measurements, \begin{equation} {\mathsf {Enc}}(m_{1})\cdot {\mathsf {Enc}} (m_{2})= {\mathsf {Enc}}(m_{1}+m_{2}). \end{equation}
The encryption function Enc with the additive homomorphic property from (1) presented by Paillier is given by \begin{equation} \begin{split} {\mathsf {Enc}}\,:\,\mathbb {Z}_{n}\times \mathbb {Z}^{*}_{n} &\longmapsto \mathbb {Z}_{n^{2}} \\ {\mathsf {Enc}}(m,r) &\longmapsto g^{m}\cdot r^{n} \mod n^{2} \end{split} \end{equation}
The decryption function Dec is given by \begin{equation} \begin{split} {\mathsf {Dec}}\,:\,\mathbb {Z}_{n^{2}} &\mapsto \mathbb {Z}_{n} \\ {\mathsf {Dec}}(c) &\mapsto \frac {L\left (c^\lambda \mod n^{2}\right )}{L\left (g^\lambda \mod n^{2}\right )} \mod n \end{split} \end{equation}
Notice that Dec does not depend on the random numbers. Therefore, we can write
C. Related Work
There is much related work in the literature. Here, we present the most relevant to our contribution in this paper. For simplicity, we show the protocols only for one round of measurements.
1) In-Network Aggregation:
Li et al. [7] introduced a privacy-enhancing protocol based on data aggregation, which uses Paillier’s scheme. To solve the malleability introduced by Paillier’s scheme, Li and Luo [8], introduced the use of homomorphic signature allowing verification to confirm the data aggregation was correct. In both papers, the protocols use in-network data aggregation, i.e., the data aggregation happens with participation of meters. Ruj and Nayak [9] presented a different case of in-network aggregation based on access control. The aggregation also uses Paillier’s scheme, but occurs in network devices, i.e., in home area network (HAN), building area network (BAN), and neighboring area network (NAN).
2) Measurements Directly to the Supplier:
Mármol et al. [10] presented a protocol in which each meter adds its key to its measurement and sends the result to the supplier. The result of the addition is the encrypted measurement, but the protocol also uses key agreement and authentication to establish a secure channel. After that, a meter aggregates all keys and sends their sum to the supplier. Such meter is called the key aggregator. The supplier decrypts the sum of all encrypted measurements, subtracting the value received from the key aggregator. Mármol et al. [10] emphasize that their work allows each meter to send its measurement directly to the supplier.
3) No Online Aggregator:
Erkin and Tsudik [11] presented a protocol based on Paillier’s scheme in which each meter sends a random number to each other meter. To avoid the overhead of messages, meters can share seeds to generate the numbers using a pseudorandom number generator. The main idea is that all random numbers can cancel each other when Paillier’s scheme is applied. For instance, if
4) EPPA: Efficient and Privacy-Preserving Aggregation:
Lu et al. [12] presented a variation of Paillier’s scheme [6]. They join
5) Aggregation of Time-Series Data:
Recently, Joye and Libert [13] presented an interesting scheme based on the idea of splitting the exponent from [11], like EPPP4SMS. To encrypt the measurements, their scheme computes
Their paper was inspired by a nonscalable privacy-preserving aggregation of time-series data [14]. There is already a line of work on privacy-preserving aggregation of time-series data [7]–[14]; compared to these approaches, our solution is novel because we address two dimensions [4], namely: consolidated consumption and billing information.
Our Construction: EPPP4SMS
We assume the existence of a public key infrastructure (PKI) in order for meters to sign their measurements. The signature Sign ensures that a message comes from a legitimate meter, and also ensures that no adversary changed the message. In summary, the signature is used for security reasons. The function Enc is responsible to ensure privacy.
In the initial set up, the supplier chooses two safe primes
Similar to Paillier’s scheme, the supplier public key is \begin{align} {\mathsf {Enc}}_{i}\,:\,\mathbb {Z}_{n}\times \mathbb {Z}^{*}_{n^{2}}\times \mathbb {Z}^{*}_{n^{2}} \mapsto & \mathbb {Z}_{n^{2}} \notag \\ {\mathsf {Enc}}_{i}(m,h_{1_{t}},h_{2_{t}}) \mapsto & g^{m}\cdot h_{1_{t}}^{k_{1_{i}}}\cdot h_{2_{t}}^{k_{2_{i}}} \mod n^{2}\quad \end{align}
We can assume the existence of a secure hash function H and define
The hash function ensures that previous encrypted measurements will not be correlated to obtain information from the measurements. We present in Section IV-B details about the importance of using a hash function, such that
Due to the exponents
Given the family of functions from (4), each meter
To decrypt the consolidated consumption, the supplier needs to multiply all encrypted measurements, that is \begin{equation*} \prod _{i=1}^{N} {\mathsf {Enc}}(m_{i})= g^{m_{1}+m_{2}+\cdots +m_{N}}h_{1}^{n-d_{1}}h_{2}^{n-d_{2}} \mod n^{2} \end{equation*}
\begin{equation*} {\mathsf {Dec}}\left (h_{1}^{d_{1}}h_{2}^{d_{2}}\prod _{i=1}^{N} {\mathsf {Enc}}(m_{i}) \mod n^{2} \right )= \sum _{i=1}^{N}m_{i} \mod n. \end{equation*}
\begin{equation*} \forall ~w\in \mathbb {Z}^{*}_{n^{2}},~\begin{cases} w^\lambda \equiv 1 \mod n,\\ w^{n\lambda }\equiv 1 \mod n^{2}. \end{cases} \end{equation*}
We show in Section IV that EPPP4SMS enables the verification of measurements. The same verification can be used to verify the invoice values. For example, the meters can send their billing information at the end of a period—perhaps a day or a week—and, then, the supplier and the meters can verify whether the sum of the measurements is right or wrong. If at the end of a day the electric energy has three different prices, the meters need to send their consumption for each price. Another option is to set up the meters to send the measurements in financial values instead of watts.
Protocol Analysis
In this section, we present how meters and their supplier can use the property of verification. We also present analyses of privacy and security. We finish this section with a discussion about parameters that may be used in EPPP4SMS and their implications in the performance of processing time.
A. Verification Property
In case of defect, if a meter did not send its measurement or send an invalid-signed message, the supplier knows exactly which meter it is. Thus, the supplier can request the meter to resend the message or provide a fix to hardware problems. In the meantime, the supplier cannot read consolidated consumption. Nevertheless, meters can cooperate to disclose the key of the damaged meter. The meter cooperation can disclose the sum of the keys of others, but they cannot disclose individual keys, unless
If the supplier identifies that something is wrong and asks for verification, a meter
B. Privacy Analysis
Suppose that an adversary has access to the private key \begin{equation} \frac {L\left (\left (g^{m_{i_{t}}}h_{1_{t}}^{k_{1_{i}}}h_{2_{t}}^{k_{2_{i}}}\right )^\lambda \mod n^{2}\right )}{L(g^\lambda \mod n^{2})} \mod n \end{equation}
\begin{equation} \frac {L\left (\left (h_{1_{t}}^{k_{1_{i}}}h_{2_{t}}^{k_{2_{i}}}\right )^\lambda \mod n^{2}\right )}{L((h_{1_{t}}h_{2_{t}})^\lambda \mod n^{2})} \mod n. \end{equation}
If some adversaries try to use encrypted measurements from the same meter \begin{equation} g^{m_{i_{1}}+m_{i_{2}}}h_{1_{1}}^{k_{1}}h_{2_{1}}^{k_{2}}h_{1_{2}}^{k_{1}}h_{2_{2}}^{k_{2}} \end{equation}
\begin{equation} g^{m_{i_{1}}-m_{i_{2}}}h_{1_{1}}^{k_{1}}h_{2_{1}}^{k_{2}}h_{1_{2}}^{-k_{1}}h_{2_{2}}^{-k_{2}} \end{equation}
The adversary can collect the encrypted measurements from all meters in the same round \begin{equation} g^{m_{1}+m_{2}+\cdots +m_{N}-m_{i}}h_{1_{t}}^{n-d_{1}-k_{1_{i}}}h_{2_{t}}^{n-d_{2}-k_{2_{i}}} \end{equation}
Considering
C. Security Analysis
If an adversary cannot decrypt an individual measurement with the knowledge of the private key of the supplier, it is much harder to decrypt a measurement without the key. In addition, if a meter
As only the consolidated consumption can be decrypted, the supplier might outsource the aggregation. An adversary cannot use malleability to change the consolidated consumption, i.e., an adversary cannot aggregate twice the same encrypted measurement
To obtain
An important question may arise: is an adversary able to solve the discrete logarithm modulo a composite? That is
One can argue that an adversary can select special values to
D. Parameters and Performance
Secure parameters that keep a good performance depend on the best algorithms to solve the integer factorization problem and the discrete logarithm problem. The first may be solving with general number field sieve (GNFS) [17]. The second may be solving with brute force attack, or with some algorithm that depends on the factors of
Asymptotically, the GNFS still has the best heuristic time: \begin{equation} \exp \,\left (\left (64/9\right )^{1/3}\log (n)^{1/3}\log (\log (n))^{2/3}\right )=2^{k/2} \end{equation}
The most costly operation in Paillier’s scheme and in EPPP4SMS is the modular exponentiation, which is processed in
On one hand, EPPP4SMS is much faster to encrypt on the meter side that has many meters with constrained hardware. On the other hand, EPPP4SMS is around three times slower to decrypt on the supplier side; nonetheless, such decryption happens only once for every round.
Relation to Previous Work
We present the best possibility for each protocol in Table I. The column setup presents the complexity of the number of messages necessary to establish the keys to provide privacy or authentication. The column key shows the total number of keys that each meter uses to encrypt its measurements to provide privacy. The column previous aggregation indicates if the protocol uses previous aggregation per round, i.e., before the protocol sends the measurements to the supplier. The column messages presents the total number of messages that the meters need to send by round. The column direct indicates whether the protocol sends messages directly to the supplier, or not. The column verification shows which protocols allow meters and their supplier to certify that their measurements are correct.
Regarding the processing time, we classify the protocols in two sets, based on either modular exponentiation or ECC, which is normally faster. EPPP4SMS is based on modular exponentiation as well as [12], [13], and Paillier’s scheme [6] that is used in [7], [9], and [11]. A key exchange that might be modular exponentiation or ECC is used in [10]. A bilinear map is used in [8], namely, in ECC. Therefore, using the best configuration of other protocols, only two of them might be faster than EPPP4SMS for encrypting individual measurements. Nevertheless, we may see in Table I that EPPP4SMS has at least two advantages over the others.
EPPP4SMS is more robust than the others are. For example, in [7]–[9] and [12], if an adversary has access to the private key of the supplier and can intercept encrypted measurements, the adversary can decrypt them. In [10], key agreement and authentication schemes are used to protect the communication channel, because if an adversary has access to the communication channel, all messages can be decrypted and disclosed. Another weakness is if an adversary has access to the key aggregator and to the messages sent to the supplier. In [11], it is only possible to reduce the huge number of keys, if we leave security behind and assume trust between the meters. The number of keys is huge because the total number of keys in the set of meters grows quadratically in relation to the number of meters
Only [8] and EPPP4SMS provide verification. In [8], the verification is incremental; thus, it is necessary to store the messages and search them with complexity
Besides verification, we can detect energy loss in
Simulation With Real-World Data
Though there is a discussion about the complexity of EPPP4SMS in relation to Paillier’s scheme in Section IV-D, and though there is a comparison with related work in Section V, we cannot estimate in seconds the difference in processing times, which depends on the measurements, number of meters, rounds and other factors. For this reason, we present the performed simulation with real-world data and the obtained results in this section.
The dataset used in the simulation has 157 992 996 measurements, which were collected from 6435 meters. They reported consumption with intervals of 30 min. The measurements start to register the consumption from 00:00:00 to 00:29:59 on Tuesday on July 14, 2009, and the last consumption from 23:30:00 to 23:59:59 on Friday on December 31, 2010. The meters that reported the measurements were installed in Irish homes and businesses.
A. Verification
We constructed a protocol that enables verification of measurements because machines might fail. Indeed, we detected in the dataset that two meters presented errors. They sent two measurements in the same round for six times. Moreover, the values of the reported measurements are different from each other in the same round. Thus, the problem was not only a duplicated message, and each of the two meters repeats the same error in six rounds. In total, we have 24 measurements in which at least 12 are certainly wrong, but we cannot verify from the dataset which one is correct or wrong. If the meters were running EPPP4SMS, we could verify. As we cannot verify or ask for new measurements, we decided to eliminate these 24 measurements.
We also detected some issues in the timestamp. The firsts 30 min are coded as 1 in the dataset, namely, the interval from 00:00:00 to 00:29:59 is coded as 1. The interval from 00:30:00 to 01:00:00 is coded as 2, and so on. As the day has 24 h, the time code of a normal day goes from 1 to 48. We also detected 25 002 messages with time code bigger than 48. The majority, i.e., 24 462 measurements were coded as 49 and 50. We detected that 12 568 and 11 874 messages are due to daylight saving time on last Sundays of October 25, 2009, and October 31, 2010. However, 20 measurements were coded as 49 and 50 on normal days. Moreover, we found 540 messages with time code from 51 to 95 that may be errors. No time code was coded as zero. Again, we cannot verify, thus we decide to erase the messages with errors and code time bigger than 48 from the dataset.
Although the total number of error seems to be big, it is small in relation to the total number of measurements. These failures in the collected data are very important to recall our attention to the fact that machines might fail and a real-world dataset guides us to develop protocols with verification.
B. Complementation and Privacy
To preserve privacy, every one of the 6435 meters in the set should report its consumption in every new round. If a meter did not send its consumption because the value is zero, an adversary can know when there is consumption or not.
After we eliminate the lines with problems in the dataset, we have 1 557 216 measurements reporting zero consumption. However, not all meters send their consumption in all rounds. For this reason, we complement the missing measurements with zero. In the total, we insert 7 578 840 new zeros, reaching a total of 9 136 056 zeros.
C. Implementation of the Core Algorithms
We implemented the source code for EPPP4SMS and Paillier’s scheme that is used in many protocols. Both implementations were written in C and compiled with GCC version 4.4.6 for GNU/Linux with Red Hat distribution. We also used the GNU multiple precision arithmetic library (GMP) to manipulate big numbers and the open multi processing (OpenMP) to parallelize the encryption.
The computer used was an Intel® Xeon®, CPU E5-2620 of 2.00 GHz, 12 recognized cores, and 70 Gigabytes of shared memory.
D. Simulation Parameters
After we sanitized the dataset, we have 6435 meters times 25 726 timestamps that gives us 165 546 810 measurements. While we should aggregate the measurements of 6435 meters per round, we decided to parallelize the encryption in 11 threads and leave the aggregation and decryption in one thread.
Since we used real-world data, we only needed to determine the primes,
E. Simulation Results
We collected the processing time of the encryption, aggregation, and decryption. With their data and the programming language R, we present box plots of the observed processing time in Fig. 2.
Box plots of the processing time spent in parallelized encryption, aggregation, and decryption of the consolidated consumption.
As expected, EPPP4SMS is much faster in encryption than many protocols that use Paillier’s scheme. The simulation of EPPP4SMS ran in a mean time of 1284 ms, while the Paillier’s scheme ran in 3514 for encryption. The cost of the aggregation is equivalent in both protocols, namely, 35.08 against 36.88 ms on average. For decryption, Paillier’s scheme is around three times faster than EPPP4SMS. The mean time of the former was 13.47 ms, and for the latter was 4.71 ms. Nevertheless, the encryption should run on the meter with constrained resources and the decryption should run on the supplier side. Moreover, the benefit of around 2230 ms using 11 threads in the encryption is much bigger than the drawback of around 9 ms using 1 threads in the decryption.
Conclusion
EPPP4SMS has three important results. First, it simultaneously provides the best features of many protocols. Second, it provides verification, i.e., the supplier can verify the measurements and the customers can prove whether their invoices are correct or not. An adversary cannot decrypt or falsify an encrypted measurement. Moreover, this is the only one that provides verification of the measurements without digital signature. Third, it is scalable—the processing time of the encryption does not depend on the number of meters—and faster than other protocols. Moreover, the difference of performance increases significantly with the level of security. Decryption is slightly slow in the simulation. However, the decryption might be as fast as other protocols, depended on the number of meters, i.e., for huge numbers.
To provide privacy, EPPP4SMS requires only a permanent key per meter, sends only a message per measurement and an extra message per verification, when the supplier requires proves. Meters might send the messages directly to their supplier or an outsourced service provider.
Similar approaches can also be used in many applications of additive homomorphic encryption—for instance: mobile sensing, sensor networks, and e-cash.
ACKNOWLEDGMENT
The authors would like to thank the Irish Social Science Data Archive and “The Commission for Energy Regulation, Electricity Customer Behaviour Trial” for the source of the dataset smart meter electricity trial data, issued by The Research Perspective Ltd., on 12-03-2012. They also thank the editor and anonymous reviewers for their helpful comments, and R. and L. Hamtil for proofreading.