Occupancy detection of residential buildings using smart meter data: A large-scale study
Introduction
Nearly 40% of the world's total energy is consumed in buildings to provide a comfortable indoor environment for occupants [1]. As part of the growing effort to move towards energy sustainability, more attention is now being paid to making buildings more energy efficient. Reliable information about the occupancy status of buildings plays a crucial role in achieving this objective. For example, occupancy information can be used to optimize the operation of Heating, Ventilation, and Air Conditioning (HVAC) systems, reducing overall energy consumption without compromising the comfort of the occupants. The study in [2] suggests that occupancy-based control of indoor HVAC systems can reduce electricity usage in buildings by up to 40%. Similarly, the authors in [3] demonstrated that adaptive occupancy-based lighting control could reduce the electricity used for lighting by up to 76%. Moreover, occupancy status information can be used for fault detection of electrical appliances in buildings [4].
The recent deployment of Advanced Metering Infrastructures (AMIs) to monitor electrical power consumption over fine-grained time intervals has made it possible to infer building occupancy status from high-resolution meter readings. Inferring building occupancy from meter data in real time for controlling appliances can be quite useful, as it does not require additional sensors (e.g., CO2 measurement or infrared sensors) to be installed. At the same time, however, this poses a risk of burglary, along with leakage of information about the working, dining, and holidaying habits of households, if the meter readings are compromised [5]. Such concerns have resulted in the formation of opposition and advocacy groups, legal cases in superior courts, and new legislation [6]. Consequently, it is vital to understand the extent to which smart meter data can be predictive of the occupancy status of households. If meter data is found to be highly predictive of building occupancy status, this would be promising for adaptive control of appliances, but it would also imply the need for additional measures to protect households' privacy.
Various machine learning algorithms have been developed in past studies to detect occupancy patterns from smart meter data [7], [8]. The representation of the data in the form of variables, however, has been shown to have a significant impact on the performance of most machine learning methods [9], [10]. Smart meter data include large volumes of time-series measurements that are subject to noise and may not be effective when used in raw form [11]. Feature engineering (FE) involves creating derived variables from raw attributes to enhance the performance or interpretability of machine learning models. Without a proper feature engineering process, machine learning models are likely to underperform, to overfit the training data, and to be unstable over time [9]. Previous studies either have not used any form of feature engineering or have deployed only a few manually crafted features. The process of manually creating features is tedious and subjective, and the resulting set of features is likely to be sub-optimal. This is especially true considering that the optimal set of variables may vary from one machine learning algorithm to another. As part of this study, a dynamic feature engineering mechanism for occupancy detection from meter data is introduced. The proposed method deploys a Genetic Programming (GP) approach that is agnostic to specific machine learning models and hence can be used alongside existing solutions.
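To make the notion of derived variables concrete, the following is a minimal sketch of hand-crafted features computed over a trailing window of meter readings. It illustrates the kind of manual feature engineering discussed above and is not the GP-based method proposed in this paper; the function name `window_features`, the chosen statistics, and the demonstration readings are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def window_features(readings, width=4):
    """Summarise the trailing `width` readings for every time slot.

    `readings` is a list of consumption values (e.g. kWh per slot).
    Hypothetical helper for illustration only.
    """
    if width < 2:
        raise ValueError("width must be at least 2 to compute a delta")
    rows = []
    for i in range(width - 1, len(readings)):
        window = readings[i - width + 1 : i + 1]
        rows.append({
            "slot": i,                              # index of the latest reading
            "mean": mean(window),                   # average load in the window
            "std": pstdev(window),                  # variability of the load
            "range": max(window) - min(window),     # spread between peak and trough
            "delta": window[-1] - window[-2],       # change since the previous slot
        })
    return rows


# Made-up readings: a quiet period followed by a burst of activity.
demo = [0.2, 0.2, 0.3, 0.2, 1.1, 1.4, 0.9, 0.3]
features = window_features(demo, width=4)
```

Each row could then be passed to a classifier in place of the raw reading. Which statistics and window widths are useful is exactly the subjective choice that an automated feature engineering process aims to remove.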
More importantly, the current literature examines the possibility of detecting building occupancy status from meter data only in the past or in real time. We hypothesize that, since most households have regular daily and weekly routines, meter data may not only establish home occupancy at the time of the meter readings but also, given sufficient data, be predictive of the future occupancy status of households at specific time-slots. If this hypothesis can be validated, the privacy implications of smart meter data would be much more significant. Consequently, the key questions addressed by this paper are:
1. Can machine learning models be trained to reveal occupancy status of a given household from their meter data without requiring their past occupancy information?
2. What length of a household’s smart meter demand data is sufficient for predictive machine learning algorithms to infer future occupancy status accurately?
3. How far in the future can predictions be made with an acceptable level of accuracy?
4. How many “out-of-home” time-slots in a week can be correctly identified by predictive models?
5. Which household segments are more vulnerable to such building occupancy detection attacks?
To answer these questions, this study analyzes the electricity consumption behavior of more than 5000 households over an 18-month period. To ensure that the reported accuracy values were near the upper bound of the information that could be extracted from meter data, the proposed GP-based feature engineering approach was deployed along with a range of machine learning algorithms. A joint optimization framework was used for feature engineering and tuning the hyper-parameters of machine learning models.
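Questions 2 and 3 can be read as choosing two parameters of a supervised learning task: the length of the meter-data history given to the model and the horizon at which occupancy is predicted. The sketch below shows one way such samples could be assembled. It is an assumption-laden illustration rather than the pipeline used in this study; `make_samples`, `history`, `horizon`, and the toy data are hypothetical names and values.

```python
def make_samples(load, occupied, history, horizon):
    """Pair each window of past load with an occupancy label `horizon` slots ahead.

    load     -- list of meter readings, one per time slot
    occupied -- list of 0/1 occupancy labels aligned with `load`
    history  -- number of past readings used as input (question 2)
    horizon  -- slots into the future to predict; 0 means the current slot (question 3)
    """
    if len(load) != len(occupied):
        raise ValueError("load and occupied must have the same length")
    if history < 1 or horizon < 0:
        raise ValueError("history must be >= 1 and horizon must be >= 0")
    samples = []
    # t is the index of the most recent reading available to the model.
    for t in range(history - 1, len(load) - horizon):
        x = load[t - history + 1 : t + 1]   # the last `history` readings up to t
        y = occupied[t + horizon]           # label at the target slot
        samples.append((x, y))
    return samples


# Toy data for illustration only.
load = [0.2, 0.3, 1.1, 1.4, 0.9, 0.3, 0.2, 0.2]
occupied = [0, 0, 1, 1, 1, 0, 0, 0]

now_samples = make_samples(load, occupied, history=3, horizon=0)
future_samples = make_samples(load, occupied, history=3, horizon=2)
```

Sweeping `history` and `horizon` and comparing the resulting classifier accuracy is one simple way to explore how much data is needed and how far ahead predictions remain usable.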
The remainder of this paper is organized as follows. Section 2 presents a review of the existing research. Section 3 describes the methodology of this study, including an overview of the data used, the GP-based feature engineering approach, and details of the machine learning models and performance metrics. The results are then presented and discussed in Section 4. Finally, Section 5 presents the conclusion and final remarks.
Related studies
The potential applications of building occupancy detection to control electric appliances have been explored in a number of previous studies [2], [3], [4], [12]. In [13], aggressive duty-cycling of building HVAC systems based on occupancy status is proposed, and practical aspects of implementing the proposed solution are discussed. Similarly, the authors in [14] used a moving window Markov Chain approach to establish the occupancy status of buildings and used that
Methods
This section starts with the overview of the data sources used in this study and then proceeds with the discussion of the data pre-processing and feature engineering methods. Finally, the machine learning algorithms and performance measures that are used throughout the study are described and discussed.
Results and discussions
This section presents the results of different models for predicting home occupancy from meter readings. It starts with the detection of past and present home occupancy, and then proceeds with a discussion of how future occupancy can be inferred from past meter data. Fig. 3 illustrates a high-level overview of these classification tasks. For the identification of home occupancy in the past and present, classifications are made for a specific timeslot (T0). For predicting future occupancy
Conclusion
The introduction of smart meters within an AMI has brought potential advantages for both energy providers and consumers. However, the possibility of high-frequency meter readings raises the question of household privacy, especially in relation to buildings’ occupancy detection. This paper introduces a dynamic genetic programming based feature engineering process for detecting occupancy of residential buildings from their smart meter data. The study shows that using this method, machine learning
Conflicts of interest statement
The authors whose names are listed immediately below certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in
References (55)
- et al., The human dimensions of energy use in buildings: a review, Renewable Sustain. Energy Rev. (2018)
- et al., Predictive control of indoor environment using occupant number detected by video data and CO2 concentration, Energy Build. (2017)
- Adaptive occupancy-based lighting control via grey prediction, Build. Environ. (2005)
- et al., A rule-based fault detection method for air handling units, Energy Build. (2006)
- et al., Building occupancy estimation and detection: a review, Energy Build. (2018)
- et al., A novel feature selection framework with hybrid feature-scaled extreme learning machine (HFS-ELM) for indoor occupancy estimation, Energy Build. (2018)
- et al., Comprehensive feature selection for appliance classification in NILM, Energy Build. (2017)
- et al., Strategic opportunities (and challenges) of algorithmic decision-making: a call for action on the long-term societal effects of datification, J. Strategic Inform. Syst. (2015)
- et al., Occupancy detection in the office by analyzing surveillance videos and its application to building energy conservation, Energy Build. (2017)