Introduction
The term Internet of Things (IoT) has been used for general connectivity between physical and digital worlds [1]. IoT envisions a massively connected set of entities involving anything, anytime, anywhere, any service and any media [2]. Over the past two decades, traditional problem-driven machine-to-machine (M2M) communications have been evolving into modern information- and innovation-driven IoT technologies, benefiting from improved networking capabilities, continuously falling costs, the maturation of the Internet infrastructure, and open standards [3]. The emergence of cloud computing has become a key enabler that accelerates this migration towards the IoT era.
The essential goal of IoT is data and the intelligence gained from the data. To transparently and seamlessly incorporate a variety of end devices and systems, while providing digital services based on the large quantities of accumulated data, a uniform procedure for data collection, aggregation, transmission, storage, processing and visualization must be defined. As shown in Fig. 1, major public cloud suppliers, such as Amazon, Microsoft and Google, have provided infrastructures to satisfy the increasing requirement for IoT support. Moreover, the effort to support IoT has been extended from the cloud platform to the edge device level.
However, a number of obstacles and open challenges still hinder the industry from reaching a data-centric IoT framework that can be adapted to a broad range of application domains and scenarios. First, the proliferation of IoT originates from a multitude of established application-specific standards and protocols. As a consequence, interoperability and compatibility between different link layer protocols, sub-systems and back-end services pose a significant challenge to the establishment of a universal IoT framework. Solving this communication challenge is vital to unleashing the full power of IoT, yet it has not been tackled by the cloud suppliers. Their efforts are mainly put into software as a service (SaaS) and platform as a service (PaaS) to enhance computing and intelligence abilities, but they can hardly reach the communications among end devices due to the heterogeneity of hardware and protocols. Second, due to the diversity of application scenarios and applied technologies, device commissioning, communication, authorization, and identity registration and management can hardly be integrated into a uniform security scheme to mitigate the rapidly growing number of device compromises and other security issues. Furthermore, with the advancement of machine learning and data science, one of the ultimate goals of IoT is to harvest intelligence by applying data mining to the large volume of data in the IoT world, which is, to a large extent, restricted by the heterogeneous data formats, speeds, and storage mechanisms of different applications and fields. Therefore, in the context of IoT, a more feasible approach is to find a general framework that caters to a category of application scenarios sharing the same performance requirements while still maintaining generality, as pointed out in [4].
In [5], IoT is categorized into four segments, namely massive IoT, broadband IoT, critical IoT and industrial automation IoT. Massive IoT targets low-complexity, low-cost devices deployed in huge volumes with long message update intervals, such as sensors, meters, wearables and trackers, whereas broadband IoT offers lower latency, higher throughput and larger data volumes than massive IoT. A series of typical IoT application scenarios and their performance requirements in the smart city paradigm have been identified in [4] and [6], which suggest that the majority of smart city services, such as monitoring, metering, automatic vending, building automation, fleet management, and smart parking, are characterized by a low traffic rate (from one packet per 30 seconds to one packet per 12 hours) and a tolerable delay (from 10 seconds to 30 minutes).
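To give a sense of scale, the reporting intervals quoted above can be converted into per-device daily packet counts. The short sketch below does only this arithmetic; the function name is illustrative and does not come from the cited studies.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def packets_per_day(interval_seconds: float) -> float:
    """Packets sent per day by one device reporting at a fixed interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return SECONDS_PER_DAY / interval_seconds


# The two ends of the traffic-rate range quoted in the text.
busiest = packets_per_day(30)           # one packet every 30 seconds
quietest = packets_per_day(12 * 3600)   # one packet every 12 hours
```

Under these assumptions a single device produces between 2 and 2880 packets per day, which illustrates why such services fit the massive IoT segment.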
In light of the current IoT development trend, we propose a data-centric IoT framework based on the Azure cloud platform to address the widespread use cases in the massive IoT and broadband IoT segments. It features three promising communication protocols, namely WiFi, Thread and LoRaWAN, to satisfy both wired and wireless, wide band and narrow band, low data rate and high speed, as well as unidirectional and bidirectional communication requirements. Profiting from the maturity of the Azure public cloud platform, this framework is able to tackle the aforementioned interoperability, device management and data consistency challenges encountered in the practice of IoT technologies. Moreover, we substantiate the framework with a case study in which it is applied to the indoor plant industry to realize intelligent indoor climate monitoring and control. The contributions of this study are as follows:
Proposed a data-centric IoT framework exemplified with three representative protocols that offer inherently high-level security, catering to local area network (LAN), personal area network (PAN) and wide area network (WAN) use cases.
Eased the interoperability, co-existence, device management and data consistency challenges in the IoT practice by taking advantage of the Azure cloud infrastructure.
Validated the framework with a reference implementation and a practical case study to prove the feasibility.
The rest of the paper is organized as follows. In Section II we review the state of the art of IoT frameworks proposed by preceding studies and highlight the motivation of this study. In Section III we detail the proposed data-centric IoT framework and the technologies it uses. In Section IV we describe the implementation of the framework as a proof of concept, and apply the proposed framework to a green plant wall-based indoor climate monitoring and control application as a case study. In Section V we present the implementation results. Section VI concludes the paper.
Related Work
Many efforts have been put into pursuing a general solution that can be adapted to multiple IoT use cases or application areas. In this section, we review a series of precedent studies and discuss their achievements and deficiencies.
As a cornerstone of M2M communications, plenty of application-specific communication protocols have been pervasively deployed in various application scenarios. For instance, KNX [7], BACnet [8] and LonWorks [9] target building automation systems; Modbus [10] aims at connecting industrial electronic devices; DALI was established for lighting equipment control; and WirelessHART [11] was designed to support industrial process field devices. To ease the interoperability challenges, the industry has reached a consensus to adopt the Internet Protocol (IP) to bridge the gaps, and a migration towards native Internet Protocol (NIP) connectivity has been observed in the majority of the traditional standards, such as BACnet/IP, KNXnet/IP, HART IP, Modbus TCP/IP, ZigBee IP, and 6LoWPAN-over-BLE [12]. However, the standardization efforts are still fragmented across isolated areas.
With the advancement of telecommunications, the cellular network has been regarded as a potential alternative to address the requirements of the four IoT segments proposed by the telecommunication industry [5]. Nevertheless, it still falls short in three respects. First, broadband support in the cellular network relies heavily on the 5G standard, which demands large-scale deployment of new network infrastructure that can be a heavy burden for any country or company. Therefore, it can hardly become a widespread solution in the near future. Second, the subscription cost can become very high considering the large quantity of deployed low-power and low data rate sensors and actuators. Last, compatibility with the massively deployed devices that only have traditional communication protocol interfaces is another challenge to overcome, as upgrading cannot be completed in the short term due to cost and time limitations.
Recent research on application-oriented frameworks has been conducted to promote the adoption of IoT and cloud technologies in various application scenarios. In [13], Yang et al. proposed a remote pain monitoring system using IoT and cloud. In their work, a sensor node with a WiFi module is embedded in a facial mask to monitor the pain intensity of patients. The biosignal is digitized locally and transmitted to the remote cloud via a WiFi gateway using the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP). The streamed data are forwarded to a cloud database for signal processing, and medical analytics can be visualized with a mobile web application. Similarly, Monteiro et al. [14] proposed an e-health framework that utilizes fog and cloud computing to compensate for the inability of embedded devices to guarantee quality of service. The fog layer classifies the collected data and guarantees the rapid response of a healthcare system, while a public cloud platform is in charge of long-term data storage and large-scale data analysis. However, an implementation as a proof of concept is still lacking. Barsocchi et al. [15] developed a framework for long-term urban structural health monitoring based on the Message Queuing Telemetry Transport (MQTT) protocol. The three-axial acceleration, temperature, and humidity data are sensed with wired sensors and published to an MQTT broker, and then relayed to cloud MySQL and MongoDB databases. A local module with a message buffering mechanism was implemented as a connectivity failover. In [16]–[18] Zhang et al. developed a remote climate control system for cultural buildings in Sweden, such as churches and museums, based on the ZigBee protocol. The sensor readings are continuously sent to a local server via the ZigBee coordinator and the heating actuator is automatically controlled according to the sensor data. The local server is periodically synchronized to a cloud server using the Internet Protocol, which enables data visualization and management of the control settings. In [19], Al-Masria et al. proposed a waste management framework based on the Microsoft Azure infrastructure. The adoption of IoT Hub simplifies device management while edge computing improves the local processing capabilities. However, the detailed implementation of the low-level hardware and communications was not disclosed. Ahmed et al. [20] proposed an IoT framework for agriculture and farming. It incorporates the 6LoWPAN and Long-range WiFi protocols to enable long-range communications. The Long-range WiFi gateway acts as a fog device to reduce the traffic and mitigate the latency. Celesti et al. [21] designed a cloud-based IoT framework aimed at traffic monitoring and alert notification based on OpenGTS and MongoDB. The GPS information of private vehicles is transmitted to the cloud over the 4G network. Containers are exploited to enable elastic microservices for data query and data processing.
A series of service-oriented IoT frameworks were proposed to serve the fragmented services that constitute IoT technology. Pellegrino et al. [22] focused on solving the interoperability issue in the field of lighting management systems by introducing a newly developed middleware framework. In [23] Vögler et al. proposed a scalable framework for IoT provisioning in large-scale deployments. In [24], Inui et al. put forward a software framework to improve the productivity of IoT software development.
As a precursory study, Zanella et al. discussed the typical smart city applications and outlined a general framework of urban IoT network based on the web service approach in [4]. It details the architecture of the web service in the cloud and alternative technologies that can be applied to different services and components, such as data exchange, application and transport layers, network and link layers, backend servers, gateway and edge devices, and IoT peripheral nodes. A pilot smart city project was presented to exemplify a possible implementation of an urban IoT.
The aforementioned application-oriented studies have achieved their specific goals, but none of the frameworks used in these solutions can be generalized as a universal IoT framework for massive IoT and broadband IoT usage. The majority of the frameworks [13]–[19], [21] adopt a single protocol (e.g., wired communication, WiFi or ZigBee), which can hardly satisfy the diverse requirements with regard to bandwidth, data rate, power consumption and so on. The integration of Long-range WiFi with 6LoWPAN in [20] provides a certain degree of flexibility. However, the poor signal penetration, the limited popularity and the use of directional antennas, which are prone to violating regulations, make it impossible for Long-range WiFi to become a prevalent solution in an urban environment [25]. For a universal IoT framework, the support of bidirectional communications between sensors/actuators and the cloud is essential to implementing remote monitoring and control functions, yet it is neglected in [13], [15], and [21]. The edge component is also an indispensable part that improves the processing capability of the field network and the flexibility of the framework, but it is not covered by [13], [15]–[18]. Most of the frameworks [13]–[18], [20] are built upon private cloud platforms, which lack security, scalability, and reliability compared to a public cloud platform. Besides, none of the works demonstrates how to integrate intelligence into the framework by utilizing the collected data. The work in [22]–[24] can be an effective reference for reinforcing relevant services when designing IoT solutions, but can hardly play the role of a general IoT framework, since these studies focus only on a specific layer, while an IoT framework must provide guidelines for the complete life cycle of data, from data collection in devices to data storage in the cloud.
The framework in [4] can be regarded as a high-level conceptual framework and a guideline for technology selection, but the lack of implementation details, such as protocol adaptation and device management, makes it insufficient as an out-of-the-box framework that can facilitate the deployment of practical use cases. Moreover, the essential characteristics of an edge component are not considered at all.
In this study, we aim to address these shortcomings with a data-centric IoT framework that incorporates three promising communication protocols, WiFi, LoRaWAN and Thread, to meet the typical communication needs of indoor/outdoor, narrow band/broad band, and high speed/low data rate use cases. The framework takes advantage of the Azure public cloud and IoT Edge infrastructures to enable bidirectional communication and to unify device management, data storage and visualization. Machine learning-based data analytics is presented as a reference for general IoT solution design.
Data-Centric IoT Framework
A. Overview
The proposed data-centric IoT framework for massive IoT and broadband IoT is depicted in Fig. 2. The framework is split into two parts: the local field and the cloud platform. In the local field, connectivity among sensors, actuators and gateways is established using the WiFi, Thread and LoRa communication protocols to guarantee support for local area, personal area and wide area networks, which have become the cornerstones for covering typical IoT applications. A local area network extends the Internet connection to a local field and caters to both low data rate and broadband applications, e.g., smart electronics and surveillance systems. A personal area network targets low data rate and low-power devices that are used within a person’s living space, such as building automation systems, wearables and e-health systems. A wide area network supplies network infrastructure over a wide area and has found use in farming, logistics, metering and the smart city paradigm. The sensor data that are continuously collected from the WiFi, Thread and other wire-based sensor networks are forwarded to the cloud via a WiFi access point to ensure high data rate transmission and bidirectional communication between the cloud platform and the local networks. An edge device can optionally be deployed to allow data cleaning and local computing services. Messages from low-power and low-complexity sensors that are deployed over a wide area are transmitted using the LoRa network, taking advantage of its distinctive low power consumption and long-range communication features.
The cloud part consists of data processing, storage, and presentation and management units. The core component of the cloud platform is an IoT Hub service, which is responsible for device provisioning, device identity management and data routing. According to practical needs, a series of services implemented as applications can be flexibly integrated into the IoT Hub infrastructure as plugins. All incoming sensor data from different devices and locations pass through a cloud gateway in the IoT Hub for authentication and are then routed to different services for further processing, storage and visualization, according to their contents or properties. A web application is developed and hosted by a container service to provide real-time visualization, historical data display and administrative management functions for end users.
B. Azure Cloud
The proposed framework capitalizes on the IoT Hub infrastructure in Azure to realize a centralized IoT management service that addresses the challenges of interoperability, device management and data format consistency. The connections between the services are shown in Fig. 3.
1) IoT Hub
IoT Hub sits at the center of the cloud platform in the framework and offers fully managed services such as device provisioning, authentication, identity management as well as data ingestion and routing. It can simultaneously support secure and bidirectional communications with a large quantity of IoT devices using the HTTPS, Advanced Message Queuing Protocol (AMQP) or Message Queuing Telemetry Transport (MQTT) messaging protocols. IoT Hub enables direct connection with IP-capable devices while still maintaining the possibility of establishing connections to low-power and resource-constrained devices, or devices with other protocols, via a field gateway that plays an agent role [26].
2) Peripheral Services
In the framework, a series of peripheral services are seamlessly plugged into the IoT Hub infrastructure to accelerate data analytics, storage or presentation of collected telemetries.
a: Function Service
The function service provides the flexibility to execute a specific task with a small piece of code, without developing a whole application or project. Functions can be written in a number of high-level programming languages, such as C#, Node.js and Python. In the framework, function services are regarded as the glue between the data ingestion and storage services. They assist the storage procedure by removing the constraints of heterogeneous data sources and destinations, i.e., data from different devices or locations can be stored in different databases according to their contents or properties.
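The glue role described above can be illustrated with a minimal sketch, assuming plain Python rather than the Azure Functions runtime. The in-memory `STORES` dictionary stands in for the real storage services, and the routing rules and the field names `contentType` and `schema` are hypothetical; none of them are taken from the reference implementation.

```python
import json
from typing import Any, Dict, List

# Hypothetical in-memory stand-ins for the relational, document and blob stores.
STORES: Dict[str, List[Any]] = {"sql": [], "document": [], "blob": []}


def choose_store(message: Dict[str, Any]) -> str:
    """Pick a destination from the message content (illustrative rules only)."""
    if message.get("contentType") == "binary":
        return "blob"
    if message.get("schema") == "relational":
        return "sql"
    return "document"


def handle_message(raw: str) -> str:
    """Parse one JSON message, store it, and return the chosen store name."""
    message = json.loads(raw)
    store = choose_store(message)
    STORES[store].append(message)
    return store


first = handle_message('{"deviceId": "d1", "temperature": 21.5}')
second = handle_message('{"deviceId": "d2", "schema": "relational", "count": 3}')
```

In a real deployment the selection rule would more likely be driven by IoT Hub routing queries or by function bindings; the point here is only that a small function can decouple data sources from storage destinations.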
b: Storage Service
Data storage service can vary a lot depending on the type of target data. This framework includes four storage services to cater to the majority of storage requirements in IoT use cases. SQL database is used for storing relational data while DocumentDB features the light weight, fast query speed and simple data format (JSON). Table storage is reserved for storing table structured data. Blob storage is deployed for storing unstructured data such as uploaded video, image, raw data file, or device specific files.
c: Visualization Service
Although Azure supplies out-of-the-box visualization utilities, this framework relies on a web application-based human-machine interface (HMI) to leave more freedom for visualization customized to different data sources. The web application is hosted in a container service, which allows the user interface to be upgraded flexibly by simply updating the connected container registry. It fetches data from IoT Hub and the databases, allowing end users to inspect data and control remote devices.
d: Intelligence Service
Intelligence is generated by processing collected data, inserting analytic and logic units, and applying machine learning algorithms. In this framework, stream analytic services combined with built-in machine learning services can be integrated to accelerate the time series data analytics. A logic application is applied to a service bus queue to enable alert functions. Besides, more advanced and flexible application-dependent machine learning and deep learning models can be deployed in the container service to realize customized intelligent functions.
C. Local Infrastructure
1) Communication Standards
To enable interoperability among devices from different networks, several data link layer and application layer communication protocols are adopted in the framework, as listed in Table 1.
The WiFi protocol, which is based on the IEEE 802.11 standard, has become the most widely used wireless local area network standard since its first release in 1997. After several generations of evolution, the performance of WiFi in terms of capacity, throughput, coverage and power consumption in dense environments has been greatly improved and verified in the market. Its performance and versatility make WiFi the predominant wireless Internet access technology and a solid basis for innovative applications in various fields such as smart home, wearables, virtual reality (VR) and augmented reality (AR), which create considerable economic and societal value. The next generation of WiFi, namely WiFi 6, which features a higher data rate, larger network capacity, and better power efficiency than current WiFi, has become an anticipated step in the IoT evolution [27].
Thread is a newly emerging mesh networking protocol that is designed for building automation networks (BAN). It adopts the IEEE 802.15.4 standard for the physical and media access control (MAC) layers while providing native support for IPv6 connectivity by introducing a 6LoWPAN layer between the MAC layer and the network layer [28]. With native IP support, Thread allows the building automation industry to benefit from the massive innovations evolving in the Internet world. Guaranteed low latency [29] and low power consumption [30] are additional merits promised by the Thread standard. As a personal area network, a single Thread network is capable of supporting 32 routers and more than 16000 end devices, which is sufficiently large for ordinary sensing and actuating purposes. Compared to other mesh network protocols, e.g., ZigBee and Bluetooth Mesh, Thread performs better in terms of throughput, latency and reliability in both small and large networks [31].
LoRaWAN [32] is an evolving low power wide area network (LPWAN) protocol specifically designed for low power or battery-driven devices which are expected to be operating for more than a decade. Characterized by long range communication, end-to-end security, mobility and positioning capabilities, low data rate and ultra-low power consumption, LoRaWAN has been used in various areas such as agriculture, environment monitoring, healthcare and logistics, etc. LoRaWAN theoretically supports bidirectional communications. However, the majority of the use cases only adopt the up-link communication while down-link communication is rarely used, due to the lack of a centralized consensus mechanism in the design of LoRaWAN protocol [33].
2) WiFi Network Connectivity
As shown in Fig. 4, in the proposed framework, WiFi serves as the backbone network that provides connectivity between the local infrastructure and the cloud unit. Benefiting from native IP support, WiFi-enabled sensors can directly establish a secure connection to the cloud gateway via an access point using any of the supported messaging protocols. In parallel, for resource-constrained devices, the message exchange with IoT Hub can be bridged by the edge device. The bidirectional communication traffic between the edge device and WiFi-enabled sensors is carried by lightweight application layer protocols such as the Constrained Application Protocol (CoAP) and MQTT to satisfy the low-power requirement. The edge device can either actively retrieve and update the WiFi nodes using CoAP commands or perform device management via an MQTT broker.
3) Thread Network Connectivity
Thread is introduced into the framework as a complement to the WiFi network, so as to cover application scenarios in which WiFi is unavailable and a high data rate is not required, such as metering services. Meanwhile, since building automation is natively supported by Thread, the framework is able to seamlessly incorporate building automation systems using the same cloud infrastructure.
As seen in Fig. 4, in this framework, the connectivity between the Thread network and the edge device is enabled by a Thread border router. All IP traffic from outside a Thread network is adapted to Thread network packets, with the physical layer, MAC layer and network headers added by the border router, and vice versa. The details of the connectivity are shown in Fig. 5. A host-NCP (network co-processor) architecture is used to implement the border router functions. A Thread network co-processor implements the full Thread stack, i.e., Thread drivers, Thread stack core functions, and Thread stack APIs, while the application layer is hosted on the edge device. The network co-processor acts as an agent of the edge device, which is in charge of creating a Thread mesh network and listening to the network traffic through the Thread interface on the network co-processor. After network creation, other Thread devices, such as routers and router-eligible end devices (REED), can be commissioned to join the network using the standard Thread commissioning procedure. The edge device and the network co-processor communicate via a serial bus that is managed by serial controllers in both devices. All packets from the Thread network are routed to application layer programs in the edge device through a network driver and a tunnel driver. The network driver is a user-space program that provides a native IPv6 interface to the network co-processor device, while the tunnel driver is in charge of packet routing between different user-space programs.
In the application layer, the interaction between a Thread device and the edge device is through the MQTT protocol and an MQTT broker, similar to WiFi nodes. Since Thread only supports UDP as transport layer protocol while MQTT runs on TCP/IP stack, in this framework, a lightweight MQTT protocol, namely MQTT for sensor networks (MQTT-SN) is utilized to bridge the gap and enable Thread devices to communicate in the same way as using MQTT. MQTT-SN enables Thread devices with an MQTT broker-based unified approach to perform message delivery and device management, which largely improves the interoperability among the local field devices.
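To make the bridging role of MQTT-SN more concrete, the following sketch encodes and decodes a single PUBLISH message that could be carried in one UDP datagram. It assumes the PUBLISH layout of the MQTT-SN v1.2 specification (Length, MsgType, Flags, TopicId, MsgId, Data) and only covers the short one-octet length form with a pre-registered topic ID. Topic registration, gateway discovery and the UDP socket itself are omitted, and the function names are illustrative.

```python
import struct

MSG_TYPE_PUBLISH = 0x0C  # MQTT-SN v1.2 message type code for PUBLISH


def encode_publish(topic_id: int, payload: bytes, msg_id: int = 0,
                   qos: int = 0, retain: bool = False) -> bytes:
    """Build a PUBLISH packet using a normal (pre-registered) topic ID."""
    if qos not in (0, 1, 2):
        raise ValueError("qos must be 0, 1 or 2")
    # Flags octet: QoS in bits 6-5, Retain in bit 4, TopicIdType 0b00 in bits 1-0.
    flags = (qos << 5) | (0x10 if retain else 0x00)
    body = struct.pack("!BBHH", MSG_TYPE_PUBLISH, flags, topic_id, msg_id) + payload
    length = len(body) + 1  # the length octet counts itself
    if length > 255:
        raise ValueError("payload too large for the one-octet length form")
    return bytes([length]) + body


def decode_publish(packet: bytes) -> dict:
    """Parse a PUBLISH packet produced by encode_publish."""
    if len(packet) < 7 or packet[0] != len(packet):
        raise ValueError("malformed or truncated packet")
    _, msg_type, flags, topic_id, msg_id = struct.unpack("!BBBHH", packet[:7])
    if msg_type != MSG_TYPE_PUBLISH:
        raise ValueError("not a PUBLISH message")
    return {
        "topic_id": topic_id,
        "msg_id": msg_id,
        "qos": (flags >> 5) & 0x03,
        "retain": bool(flags & 0x10),
        "payload": packet[7:],
    }


packet = encode_publish(topic_id=1, payload=b"21.5")
decoded = decode_publish(packet)
```

The seven-octet header and the two-octet topic ID in place of a topic string illustrate why the protocol suits constrained, UDP-only devices.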
4) LoRaWAN Connectivity
As seen in Fig. 4, a LoRaWAN network is involved in the framework as a complementary solution to LAN and PAN networks, catering to the large quantity of low power and low complexity devices which are usually deployed in a wide scope to perform metering, tracking or other monitoring tasks. The present framework only utilizes the single directional up-link communication of LoRaWAN, while the down-link is neglected for reliability consideration. The services are recommended to be built upon commercial LoRaWAN operators’ network due to their large scale deployments of gateways which effectively improves the network capacity and reduces collisions. Besides, the superior compatibility between LoRaWAN servers and public cloud platforms is another merit.
5) Edge Device
As seen in Fig. 6, an edge device plays an essential role in the proposed framework which is directly in charge of the communication between local segmented networks and the remote cloud server. An edge device behaves as a local hub where all device telemetries are aggregated and then forwarded to the cloud. Meanwhile, cloud to device messages and commands are first pushed to the edge device and then relayed to target devices. In this framework, several necessary components are pre-defined to constitute the edge device and consolidate the fundamental functionalities.
a: Standalone Application
A standalone application treats the edge device as an ordinary end device. It runs in an isolated process in parallel with other components, operates the hardware directly through the operating system, and interacts with sensors and actuators that are connected to the edge device using any of the supported wired communication protocols, e.g., serial, I2C, and SPI. A standalone application maintains a single device identity that is registered in the cloud to represent the edge device itself.
b: Broker
A broker is a message server that delivers messages from publishers to subscribers. It supports one-to-one, one-to-many and many-to-one messaging models. The broker server in the framework is vital to enable the interoperability between the edge device and Thread/WiFi devices. For all Thread and WiFi nodes that are not able to establish a direct link to the cloud, a series of device-specific topics shall be registered in the broker server for each individual device, which include telemetry, command and property topics. In this approach, end devices are able to transmit sensory telemetries and receive remote commands by publishing content to the telemetry topics and subscribing to the command topics. Property topics are used to modify end device local settings or configurations which shall be always synchronized to the device properties in the cloud.
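The per-device topic scheme can be sketched with a toy in-memory broker. The topic naming pattern `devices/<id>/<kind>` and the class below are assumptions made for illustration; an actual deployment would use an MQTT broker, and the topic names of the reference implementation are not given in the text.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Callback = Callable[[str, str], None]


def device_topics(device_id: str) -> Dict[str, str]:
    """Hypothetical naming scheme for the three per-device topics."""
    base = f"devices/{device_id}"
    return {kind: f"{base}/{kind}" for kind in ("telemetry", "command", "property")}


class InMemoryBroker:
    """Toy publish/subscribe broker with exact-match topics only."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to every subscriber; return the number of deliveries."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = InMemoryBroker()
topics = device_topics("thread-node-01")
received: List[Tuple[str, str]] = []

# The end device listens for commands; the edge side publishes one.
broker.subscribe(topics["command"], lambda topic, payload: received.append((topic, payload)))
delivered = broker.publish(topics["command"], '{"fan": "on"}')
```

Keeping one topic set per device lets the edge side address any Thread or WiFi node in the same way, regardless of the underlying link.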
c: Adaptation Server
The adaptation server runs a set of application instances, each of which represents an end device. An application instance locally maintains a single device identity object, which has been provisioned in IoT Hub and is always synchronized to the device identity in the cloud. An application instance shall implement a complete set of functions to be able to transmit messages to the cloud and receive direct method invocations from the backend. An application instance interacts with the corresponding device through the broker server. It subscribes to the relevant device telemetry topics to get updated sensor data and relays direct methods by publishing them to the command topics.
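A minimal sketch of one application instance is given below. It assumes that the broker and the cloud client are supplied as plain callables, so no real MQTT or Azure SDK calls are involved; the class name, method names, topic pattern and message envelope are hypothetical.

```python
import json
from typing import Callable, List, Tuple


class ApplicationInstance:
    """Represents one end device towards the cloud (hypothetical interface)."""

    def __init__(self, device_id: str,
                 broker_publish: Callable[[str, str], None],
                 send_to_cloud: Callable[[str], None]) -> None:
        self.device_id = device_id
        self._broker_publish = broker_publish
        self._send_to_cloud = send_to_cloud
        self.telemetry_topic = f"devices/{device_id}/telemetry"
        self.command_topic = f"devices/{device_id}/command"

    def on_telemetry(self, payload: str) -> None:
        """Called when the broker delivers a message on the telemetry topic."""
        reading = json.loads(payload)
        envelope = {"deviceId": self.device_id, "body": reading}
        self._send_to_cloud(json.dumps(envelope))

    def on_direct_method(self, name: str, arguments: dict) -> None:
        """Called when the backend invokes a direct method for this device."""
        command = {"method": name, "args": arguments}
        self._broker_publish(self.command_topic, json.dumps(command))


cloud_outbox: List[str] = []
broker_outbox: List[Tuple[str, str]] = []

instance = ApplicationInstance(
    "wifi-node-07",
    broker_publish=lambda topic, payload: broker_outbox.append((topic, payload)),
    send_to_cloud=cloud_outbox.append,
)
instance.on_telemetry('{"temperature": 22.1}')
instance.on_direct_method("setFan", {"speed": 2})
```

Injecting the two transports as callables keeps the instance independent of the concrete broker and cloud client, which is one possible way to run many instances side by side in a single adaptation server.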
d: Internet Module
An edge device shall be able to establish a direct communication channel with the cloud platform using the Internet Protocol, so an Internet module is mandatory. In most cases a WiFi module is required, while an Ethernet interface can also be considered.
e: Edge Computing Unit
An edge computing unit releases the potential of the edge device and brings intelligence closer to the local infrastructure. It can act as a local gateway for device provisioning, data cleaning or lightweight analytics, and reduce the amount of data transferred to the cloud. In this framework, an edge computing unit can optionally be deployed when needed.
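As one possible illustration of data cleaning and traffic reduction at the edge, the sketch below drops missing or implausible readings and forwards a value only when it differs from the last forwarded one by more than a deadband. The function name, the thresholds and the sample values are assumptions and are not part of the reference implementation.

```python
from typing import Iterable, List, Optional


def clean_and_thin(readings: Iterable[Optional[float]],
                   low: float, high: float, deadband: float) -> List[float]:
    """Discard invalid readings and forward only significant changes."""
    forwarded: List[float] = []
    last: Optional[float] = None
    for value in readings:
        # Data cleaning: skip missing samples and values outside the valid range.
        if value is None or not (low <= value <= high):
            continue
        # Traffic reduction: forward only if the change exceeds the deadband.
        if last is None or abs(value - last) > deadband:
            forwarded.append(value)
            last = value
    return forwarded


samples = [21.0, 21.1, None, 85.0, 21.6, 21.7, 20.9]
to_cloud = clean_and_thin(samples, low=-10.0, high=50.0, deadband=0.5)
```

With these illustrative settings, three of the seven samples are forwarded, so fewer than half of the raw readings reach the cloud.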
6) External Broker
An external broker service is included in the framework to improve the compatibility with other protocols, which is a significant feature for those devices that communicate with traditional field network protocols. The sensory messages can be published to an external broker server, from which the sensor data update can be fetched by a function application that is running in the cloud and has subscribed to the broker. The data can be conveniently stored into the same database as data generated from other first class devices.
D. Device Model
This framework relies on IoT Hub to manage device identities for all the registered devices so as to guarantee a unified device model, as shown in Fig. 7. Despite the heterogeneity of the hardware, common device-specific information such as device ID, authentication key and device status code is stored in the cloud device identity registry upon provisioning in a standard way. The device management procedures are unified into three patterns: property update, bidirectional message and direct method. Each device has two identical conceptual objects, namely device twins, one in the registered IoT Hub and one in the local device. A device twin is a virtual device that contains device tags, desired properties and reported properties. Device tags record device labels such as device location, manufacturer, and type. A device always synchronizes its local desired properties with the corresponding cloud desired properties and pushes its latest local reported properties to the cloud reported properties. Therefore, the backend is able to update the device by modifying desired properties and to check the device status by querying reported properties. Besides, IoT Hub also offers native support for device-to-cloud (D2C) messages and cloud-to-device (C2D) messages (if bidirectional communication is supported) to enable telemetry and notification delivery. A direct method grants the backend the authority to instantly invoke specific commands on the end device. With the aforementioned device twin object and three device management patterns, this framework can treat all devices with a single device model.
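The desired/reported property exchange described above can be illustrated with a minimal Python sketch. It models a device twin as a plain dictionary and is not tied to any SDK; the field name `updateInterval`, the device ID `plantwall-01`, and the `apply_setting` callback are hypothetical names introduced only for illustration.

```python
def make_twin(device_id, tags):
    """Create an illustrative device twin with empty property sets."""
    return {
        "deviceId": device_id,
        "tags": dict(tags),
        "properties": {"desired": {}, "reported": {}},
    }


def apply_desired(local_twin, desired_patch, apply_setting):
    """Apply a desired-property patch locally and return the reported patch.

    apply_setting(name, value) is a hypothetical callback that changes the
    device setting and returns the value that is actually in effect.
    """
    reported_patch = {}
    for name, value in desired_patch.items():
        # Mirror the cloud-side desired property in the local twin.
        local_twin["properties"]["desired"][name] = value
        # Change the device setting and record what is now in effect.
        effective = apply_setting(name, value)
        local_twin["properties"]["reported"][name] = effective
        reported_patch[name] = effective
    return reported_patch


twin = make_twin("plantwall-01", {"location": "classroom", "type": "edge"})
patch = apply_desired(twin, {"updateInterval": 60}, lambda name, value: value)
print(patch)
```

In a real deployment the returned patch would be sent to the cloud as a reported property update, which lets the backend confirm that the requested change took effect.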
E. Data Model
1) Data Format
To ensure data format consistency, the framework takes full advantage of the JavaScript Object Notation (JSON) encoding format due to its ubiquity and broad library and platform-inherent support [26]. Azure IoT Hub has fundamental support for processing JSON-encoded messages. Incoming JSON-formatted messages can be automatically routed to different endpoints according to message contents without additional effort. Moreover, a unified JSON format also benefits further data analytics and can be flexibly stored in either a relational database (e.g., MySQL) or a non-relational database (e.g., DocumentDB). In this framework, all the sensory messages and device twin objects are encoded in the JSON format to guarantee interoperability. For Thread and WiFi-enabled devices, the sensory data are first published to the broker. After being notified of the sensor data update by the broker, the corresponding application instances construct data telemetries in JSON format and then transmit them to the cloud. For LoRaWAN devices, the sensor data are transmitted to the gateway and relayed to a LoRa backend server in raw hexadecimal format. After that, the LoRa backend server uses JSON to encode the sensor data together with device ID, gateway ID and other radio connection parameters into a telemetry, which is sent to the cloud using a device policy connection string.
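As a rough illustration of the JSON telemetry construction described above, the following Python sketch encodes a set of sensor readings together with a device ID. The key names (`deviceId`, `timestamp`, `sensors`) and the device ID `wifi-node-01` are assumptions made for this example; the exact message schema used by the framework is not specified here.

```python
import json
import time


def build_telemetry(device_id, readings, timestamp=None):
    """Encode sensor readings as a JSON telemetry string."""
    message = {
        "deviceId": device_id,
        # Use the current time unless a timestamp is supplied explicitly.
        "timestamp": int(time.time()) if timestamp is None else timestamp,
        "sensors": dict(readings),
    }
    return json.dumps(message, sort_keys=True)


telemetry = build_telemetry(
    "wifi-node-01", {"temperature": 21.5, "humidity": 40}, timestamp=0
)
decoded = json.loads(telemetry)
print(telemetry)
```

Keeping the device ID inside the message body is one way to let a cloud-side function route or post-process messages per device.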
2) Data Flow
In this framework, the data flow between the cloud and end devices involves D2C telemetry, C2D direct method, desired property update, reported property update, and unstructured data upload. Fig. 8 depicts the supported data flow for a standalone application running on the edge device while Fig. 9 details the data flow for Thread, WiFi and LoRa nodes.
As shown in Fig. 8, a D2C telemetry is a JSON encoded message which contains device ID and sensor data. It is mainly used for reporting local sensor measurements, metering values or device operating status. The backend can manipulate a local device either by sending a direct method to the device to trigger an action, or by requesting a desired property update to change device local settings. In either case, the end device will execute the corresponding task by calling direct method callback functions or updating corresponding properties, and then inform the backend of the execution results with a reported property update. These four categories of data flow cater to the need for monitoring and remote control in the Massive IoT use cases while unstructured data upload corresponds to broadband IoT requirement when a large amount of unstructured data need to be transferred to the cloud blob storage. The backend is able to access the blob storage service so as to fetch the uploaded data.
In Fig. 9, unlike a standalone application, a Thread or WiFi end device must publish sensor data to the relevant sensor topics in the MQTT broker so that the corresponding application instance is able to construct D2C telemetries for the Thread/WiFi node upon receipt of a sensor topic update from the broker. Sensor measurements from a LoRa end device are first reported to a LoRa backend server, which encapsulates the received sensor data into D2C messages and then transmits them to the cloud. Upon a direct method or desired property update notification, a Thread/WiFi node cannot take action immediately; instead, these notifications are relayed by the corresponding application instance and delivered to the Thread/WiFi end device using MQTT broker-based method/property topic updates. In particular, the file upload data flow is only enabled for standalone applications because they have direct access to the broadband WiFi network, whereas MQTT-based communication is not optimal for file transfer.
F. Security
Security consideration is critical when an IoT solution is designed. In this subsection, we detail the fundamental security schemes that are supported in the Azure cloud and in the local communication protocols in this framework.
1) Cloud Security
In the cloud platform where IoT data are transported between different cloud services and databases, the security is guaranteed by Azure security infrastructure which provides high levels of enhanced security, privacy, compliance, and threat mitigation practice [34].
IoT Hub uses per-device authentication to guarantee security isolation between devices, i.e., the compromise of a single device cannot expose other devices' secret credentials. Moreover, each security key is bound to an access policy when it is generated, which greatly improves the security of authorization. For instance, a device bound to a DeviceConnect policy can only send D2C messages and receive C2D notifications and direct methods, but has no access to the device registry or any other IoT Hub settings. Communication between IoT Hub and devices is secured with transport layer security (TLS) based handshake and encryption, which is the cornerstone of security on the current Internet.
2) Local Security
a: Security in WiFi
For WiFi networks, the vast majority of devices are secured with the WiFi Protected Access 2 (WPA2) encryption method, in which vulnerabilities have been found, e.g., it is susceptible to the key reinstallation attack (KRACK) [35]. The WiFi Alliance has announced the next-generation WiFi protocol, 802.11ax, together with an enhanced encryption method, namely WPA3 [27]. WPA3 mitigates flaws in WPA2 such as the offline dictionary attack and KRACK. Forward secrecy is another highlighted security feature of WPA3 that prevents previously captured data from being disclosed by later attacks. In general, the new WiFi release is expected to raise WiFi security considerably.
b: Security in Thread
Thread is known for having superior security consideration. Commissioning and security are highly prioritized in the specification of Thread. The device commissioning in Thread is protected with a datagram transport layer security (DTLS) session which fulfills online and offline dictionary attack resistance, forward secrecy and known session security. The communication is protected by an AES-CCM security suite that ensures confidentiality, integrity and authenticity. Our previous research [36] has worked out a security assessment taxonomy for building automation networks, which was applied to a comprehensive security analysis on the Thread protocol. The results show that Thread has superior security mechanism support which covers a whole life cycle of a Thread device, i.e., from a device discovering and joining the network to leaving the network. A series of novel network attacks towards Thread are also identified in [36] and it turns out Thread is robust to the majority of the identified attacks except for some radio jamming attacks that can generally affect all networks.
c: Security in LoRaWAN
In LoRaWAN, end-to-end security is enabled and two types of session keys are utilized to guarantee a secure channel established between an end device and a LoRa application server. A 128-bit AES security key is bound to each LoRa device to adopt AES cryptographic primitive with several operation modes, e.g., cipher-based message authentication code (CMAC) mode for integrity protection and counter (CTR) mode for encryption. LoRaWAN implements integrity protection in a hop-by-hop manner, i.e., one hop over the air is guaranteed by LoRaWAN protocol and the other hop between the network and LoRa server is protected by secure transport solutions such as HTTPS and VPNs [37].
Security in IoT is a comprehensive subject in which all dimensions need to be taken into consideration. In [36], security in wireless LPWAN and field networks is categorized into network security and device security. Network security shall guarantee secure commissioning, secure communication and secure device leaving. Device security highlights firmware security and the ability to resist device tampering attacks.
By taking advantage of the existing security schemes of the Azure public cloud and of the WiFi, Thread and LoRaWAN protocols, the proposed framework addresses network security. Device security aspects such as a secure operating system, a secure application layer, and hardware tamper-proofing depend more on the customized implementation and remain to be refined.
Implementation
In this section, a reference implementation is described as a proof of concept to the proposed IoT framework and a guideline to any solution design. Furthermore, a case study of applying the framework to an indoor climate control system with green plant walls is presented to validate the entire concept.
A. Hardware
The hardware sets used in the implementation are shown in Fig. 10.
1) Edge Device
In this implementation, a Raspberry Pi 3 (Rpi3) Model B+ is selected as the edge device. It features a 64-bit Cortex-A53 SoC running at 1.4 GHz, 1 GB LPDDR2 SDRAM, 2.4 GHz and 5 GHz IEEE 802.11b/g/n/ac wireless LAN, and rich GPIO pin support, which makes it sufficiently powerful to serve as an edge device to the cloud.
2) Thread Device
The Thread sets used in our evaluation are the NXP FRDM-KW41Z development board and the Nordic Semiconductor nRF52840 DK board. The KW41Z has a Cortex-M0+ processor with an integrated 2.4 GHz transceiver that supports Thread, Bluetooth Smart/Bluetooth Low Energy (BLE) v4.2, Generic FSK, and IEEE 802.15.4. The nRF52840 DK is designed to facilitate the development of Bluetooth 5, Bluetooth mesh, Thread, Zigbee, IEEE 802.15.4, and 2.4 GHz proprietary applications on the nRF52840 SoC. One significant feature of the nRF52840 DK is its low power consumption, which allows it to be powered by a CR2032 battery.
3) WiFi Device
Arduino Uno WiFi rev2 is used to demonstrate the WiFi sensors. Arduino Uno WiFi is an enhanced Arduino Uno board equipped with a WiFi module. The WiFi Module is a self-contained SoC with integrated TCP/IP protocol stack, which enables Arduino to either establish a direct link with IoT Hub or connect to the edge device using an application layer messaging protocol.
4) LoRaWAN Device
A LoRaWAN network is demonstrated using a MultiConnect xDot LoRa node and MultiConnect Conduit AP gateway that are manufactured by MultiTech. The xDot node is LoRaWAN 1.0.2 compliant, providing bidirectional data communication up to 15 km line-of-sight and 2 km into buildings. The Conduit AP gateway features deep in-building penetration and connectivity to thousands of LoRa end devices, and provides Ethernet and 4G-LTE interfaces for flexible backhaul choice.
B. Software
1) Standalone Application
The architecture of the standalone application is shown in Fig. 11. It runs as a user-space program directly on the edge device, i.e., a Raspberry Pi board, which is powered by the Raspbian Linux operating system. In the hardware abstraction layer, the application relies on the MRAA library to execute I/O communication tasks, such as initializing I/O pins, fetching analog signals, reading or writing digital signals, and communicating over buses, which enables the application to interact with sensors and actuators either through I/O pins or through a microcontroller using a communication bus. The Azure IoT device SDK is used to connect the application to IoT Hub. It provides APIs for device authentication, message exchange, direct method invocation and property update.
A reference device twin object is exhibited in Fig. 12. This device twin can be used to interact with several different types of sensors and actuators. For sensors that are read periodically, update interval is recorded in the properties while for sensors that are read according to a time schedule, trigger time property shall be recorded. Actuators are usually made up of controllers with a specific time schedule. For switch actuators, the switch on and off time schedules are marked, whereas for complex actuators that need a new thread to be created, the
The implemented direct methods that are common to most cases are listed below.
Sensors read method
Single thread actuators turn on method
Single thread actuators turn off method
Multi-thread actuators turn on method
Reboot method
Firmware upgrade method
Network report method
Program halt method
Program resume method
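One common way to organize such direct methods is a dispatch table that maps a method name to a handler. The Python sketch below shows the idea only; the method names `readSensors` and `reboot`, the handler bodies, and the HTTP-like status codes 200 and 404 are assumptions for illustration, not the authors' implementation.

```python
def read_sensors(payload):
    # Placeholder: a real handler would query the attached sensors.
    return 200, {"result": "sensors read"}


def reboot(payload):
    # Placeholder: a real handler would schedule a device reboot.
    return 200, {"result": "reboot scheduled"}


# Hypothetical mapping from direct method names to handler functions.
METHOD_HANDLERS = {
    "readSensors": read_sensors,
    "reboot": reboot,
}


def handle_direct_method(name, payload=None):
    """Dispatch a direct method call and return (status, response body)."""
    handler = METHOD_HANDLERS.get(name)
    if handler is None:
        return 404, {"error": "unknown method: " + name}
    return handler(payload)


print(handle_direct_method("reboot"))
print(handle_direct_method("selfDestruct"))
```

Adding a new direct method then only requires registering one more entry in the table, which keeps the callback registered with the cloud client unchanged.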
2) MQTT Broker
Eclipse Mosquitto is installed on the Raspberry Pi to act as the MQTT broker in this study. Mosquitto runs on top of the TCP/IP stack and listens on port 8883, which is the default for TLS-encrypted connections to the broker. Each Thread and WiFi end device establishes a connection to the broker by sending a CONNECT message to Mosquitto and receiving a CONNACK message in return, after which it is able to publish and subscribe to device-specific topics to exchange messages with the edge device.
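To make the CONNECT/CONNACK exchange concrete, the following Python sketch builds a CONNECT packet and checks a CONNACK reply at the byte level. It assumes MQTT version 3.1.1 with only the clean-session flag set, and the client ID `node1` is hypothetical. In practice a client library handles this framing, the bytes travel inside a TLS session on port 8883, and Thread nodes reach the broker through the MQTT-SN gateway described later, whose packet format differs.

```python
import struct


def encode_remaining_length(length):
    """Encode the MQTT variable-length 'remaining length' field."""
    encoded = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        encoded.append(digit)
        if length == 0:
            return bytes(encoded)


def build_connect(client_id, keepalive=60):
    """Build an MQTT 3.1.1 CONNECT packet with a clean session."""
    client = client_id.encode("utf-8")
    variable_header = (
        struct.pack("!H", 4) + b"MQTT"  # length-prefixed protocol name
        + bytes([4])                     # protocol level 4 = MQTT 3.1.1
        + bytes([0x02])                  # connect flags: clean session only
        + struct.pack("!H", keepalive)   # keep-alive interval in seconds
    )
    payload = struct.pack("!H", len(client)) + client
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


def connack_accepted(packet):
    """Return True if the packet is a CONNACK with return code 0."""
    return (
        len(packet) == 4
        and packet[0] == 0x20
        and packet[1] == 0x02
        and packet[3] == 0x00
    )


connect_packet = build_connect("node1")
print(connect_packet.hex())
```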
3) Thread
In this study, OpenThread is adopted to implement a Thread network. OpenThread is released by Google Nest as an open source project to accelerate the spread of Thread standard. It implements full features of the Thread specification and has native support for various operating systems and hardware platforms, including FRDM-KW41Z and nRF52840 DK boards.
a: Device Connectivity
As shown in Fig. 5, the host-NCP architecture is natively supported by OpenThread. One KW41Z board is selected as the NCP and is programmed as a full Thread device (FTD) with a complete Thread stack. The Raspberry Pi takes the host role and manages the Thread network through a network driver, namely wpantund, which provides a native IPv6 interface to the NCP device. The host and the NCP communicate via UART using the Spinel protocol, which is specifically designed for the control and management of an IPv6 interface. The Thread network is created by the NCP device with predefined network parameters. After that, a Thread router device (i.e., KW41Z) and a router eligible end device (REED) (i.e., nRF52840 DK) are commissioned to join the network and form a Thread mesh network.
b: Thread Application
To enable MQTT communication, a Paho MQTT-SN gateway is deployed on the Raspberry Pi to perform the protocol translation between MQTT-SN and standard MQTT.
The Thread end device demonstrated in this implementation is an nRF52840 DK board. With the aid of the nRF5 SDK for Thread and ZigBee, an MQTT-SN client is developed that runs on top of the OpenThread stack. A state machine of the application is shown in Fig. 13. Upon program startup, the client enters an initialization state, which involves a series of functions that initialize the hardware, scheduler, timer, device twin structure, Thread stack and MQTT-SN client. Once the board has joined the Thread network, the MQTT-SN client broadcasts a query to look for an active gateway in the network. After a successful query, it registers its identity with the broker by subscribing to the desired property and direct method topics, and publishing to the reported property and sensor value topics. All the topics that belong to a specific device start with the device ID, followed by sensors, properties or methods, as shown below. Specifically, the methodName topic is recorded in binary format so that a direct method can be invoked by toggling the binary value.
\begin{align*}&deviceId/sensor/sensorName \\&deviceId/property/desiredProperty/\ldots \\&deviceId/property/reportedProperty/\ldots \\&deviceId/method/methodName\end{align*}
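The topic layout above can be generated and parsed with a few lines of Python. This is a sketch only: the device ID `node42` and the sensor, property and method names are hypothetical, and the sub-levels below `desiredProperty` and `reportedProperty` are filled with a single property name purely as an assumption, since the layout leaves them open.

```python
def device_topics(device_id, sensors, methods, properties):
    """Build per-device topic names following the layout above."""
    return {
        "sensors": [f"{device_id}/sensor/{name}" for name in sensors],
        "desired": [
            f"{device_id}/property/desiredProperty/{name}" for name in properties
        ],
        "reported": [
            f"{device_id}/property/reportedProperty/{name}" for name in properties
        ],
        "methods": [f"{device_id}/method/{name}" for name in methods],
    }


def parse_topic(topic):
    """Split a topic into (device_id, category, remainder)."""
    device_id, category, rest = topic.split("/", 2)
    return device_id, category, rest


topics = device_topics("node42", ["temperature"], ["reboot"], ["updateInterval"])
print(topics["sensors"], topics["methods"])
print(parse_topic("node42/property/desiredProperty/updateInterval"))
```

Prefixing every topic with the device ID means a subscriber can recover the originating device from the topic alone, without inspecting the payload.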
The program stays in a loop in which it keeps publishing sensor data to the corresponding sensor topics and polling the subscribed topics for updates so as to either enter a method callback or update a local desired property. The results are published to the reported property topics to inform the cloud. In this way, a Thread end device can be seamlessly integrated into the framework following the standard IoT Hub device management procedure. Detailed program log information is shown in Fig. 14, which covers the whole procedure from Thread network establishment and MQTT-SN gateway discovery to publishing sensor data and receiving method topic updates.
4) WiFi Node
The WiFi node is demonstrated with an Arduino Uno WiFi board. There is mature library support for WiFi connection and the MQTT protocol in the Arduino community. In the program setup, the WiFi connection to the access point is first established, and then a series of device topics are created and subscribed to. The device finally enters a loop function using the same logic as a Thread end device. It periodically publishes new sensory values to the sensor topics. Meanwhile, a keep-alive message is periodically sent to the broker in order to receive updates on the subscribed topics and prevent disconnection from the broker.
5) Application Instance
Each application instance corresponds to an end device that is incapable of creating direct link to IoT Hub. The structure of an application instance is similar to a standalone application, except that it must subscribe to device sensor value and desired property topics to get informed of the latest update of a device status. The demonstration is written in Python with the support of Azure IoT device SDK and paho MQTT Python library, and an application state machine is shown in Fig. 15. An instance initializes the same device twin object as its binding device and maintains its connection to the cloud. A separate thread is created to maintain the connection to the broker and listen to the updates from the subscribed topics. On receiving update of sensor values or reported properties, the application instance will accordingly modify its local values and relay the update to the cloud using IoT Hub SDK APIs. In case a desired property update notification or a direct method invocation from the cloud is received, the instance will publish the latest content to the desired property or method topics respectively to inform the device of the tasks.
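The relay behavior of an application instance can be sketched in Python without either SDK by injecting two callables: `send_to_cloud` stands in for the IoT Hub client and `publish` stands in for the MQTT client. Both callables, the class name `ApplicationInstance`, and the message shapes are assumptions made for this illustration rather than the demonstrated implementation.

```python
import json


class ApplicationInstance:
    """Relay updates between one end device (via broker) and the cloud."""

    def __init__(self, device_id, send_to_cloud, publish):
        self.device_id = device_id
        self.send_to_cloud = send_to_cloud  # stand-in for the IoT Hub client
        self.publish = publish              # stand-in for the MQTT client
        self.reported = {}

    def on_broker_message(self, topic, payload):
        """Relay a sensor value or reported property from the broker upstream."""
        _, category, rest = topic.split("/", 2)
        if category == "sensor":
            message = {"deviceId": self.device_id, rest: payload}
            self.send_to_cloud(json.dumps(message))
        elif category == "property" and rest.startswith("reportedProperty/"):
            name = rest.split("/", 1)[1]
            self.reported[name] = payload
            self.send_to_cloud(json.dumps({"reported": {name: payload}}))

    def on_desired_update(self, patch):
        """Forward a desired-property patch from the cloud to the device."""
        for name, value in patch.items():
            topic = f"{self.device_id}/property/desiredProperty/{name}"
            self.publish(topic, value)

    def on_direct_method(self, method_name, payload):
        """Forward a direct method invocation to the device's method topic."""
        self.publish(f"{self.device_id}/method/{method_name}", payload)


cloud, broker = [], []
instance = ApplicationInstance(
    "node42", cloud.append, lambda topic, payload: broker.append((topic, payload))
)
instance.on_broker_message("node42/sensor/temperature", 21.5)
instance.on_broker_message("node42/property/reportedProperty/updateInterval", 60)
instance.on_desired_update({"updateInterval": 30})
instance.on_direct_method("reboot", 1)
print(cloud, broker)
```

Injecting the two clients keeps the relay logic testable in isolation; real clients would be wired in through the same two parameters.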
6) LoRaWAN Node
The LoRaWAN node used in this study is a MultiConnect xDot development kit. The xDot device is powered by the Arm Mbed 5 operating system combined with the libxdot library, which supplies LoRaWAN stack support. The main operation of the device is to send a sensory data message to the LoRa gateway periodically. A state machine of the application is shown in Fig. 16. In the initialization state, the program starts with a network reset to ensure that it enters a known state. After that, network parameters such as network ID, network key and frequency band are configured, and a link check is set up to maintain the connection to the gateway. Finally, it enters a loop where sensor data are read and transmitted, and the network join status is periodically checked to ensure that the device is connected to the network. The messages are transmitted to a Conduit AP LoRaWAN gateway that has been pre-configured to relay network messages to a LoRa backend server. In this implementation, the backend server is provided by Blink Services, a commercial LoRaWAN operator that has deployed a large number of LoRa gateways in Sweden to facilitate the use of LoRa applications. A connection string bound to the device connect policy in our IoT Hub is generated and stored in the backend server so that all the sensor data can be seamlessly routed to IoT Hub as a JSON object with attached information such as device ID, gateway ID, timestamp, RSSI, and SNR.
7) Cloud Storage
In this study, the storage functions are demonstrated with an Azure SQL database and a blob storage service. These services can be activated through the Azure control panel in a straightforward way. A function application written in Node.js is created to process the incoming messages. It is configured to be triggered by the IoT Hub event endpoint, i.e., every time a message is passed into IoT Hub, the function application is automatically invoked to read the message content and save it to the SQL database using the mssql library. In parallel, a time-triggered Node.js function application is deployed to periodically query an external broker server. With this function, sensor data that are transmitted by other protocols can also be fetched and saved to the same database. Whenever a file upload is requested from IoT Hub, the upload stream is routed to the pre-defined blob storage so that unstructured files can be properly stored.
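The message-to-database step can be illustrated with a small Python sketch. Note the substitution: the implementation described above uses a Node.js function with the mssql library and an Azure SQL database, whereas this example uses Python's built-in `sqlite3` module with an in-memory database as a stand-in. The table name `telemetry`, its columns, and the flat message layout are assumptions for illustration.

```python
import json
import sqlite3


def store_message(conn, raw_message):
    """Parse a JSON telemetry message and insert one row per sensor value."""
    msg = json.loads(raw_message)
    device_id = msg.pop("deviceId")
    rows = [(device_id, name, float(value)) for name, value in msg.items()]
    conn.executemany(
        "INSERT INTO telemetry (device_id, sensor, value) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# In-memory database used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE telemetry (device_id TEXT, sensor TEXT, value REAL)")

stored = store_message(
    conn, '{"deviceId": "node42", "temperature": 21.5, "humidity": 40}'
)
print(stored, conn.execute("SELECT * FROM telemetry ORDER BY sensor").fetchall())
```

Storing one row per sensor value lets devices with different sensor sets share a single table, at the cost of a join or pivot when several sensors are plotted together.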
8) Cloud Visualization
The power of IoT stems from those extensively collected sensory data that are mostly presented in a sequential manner. Visualization plays an essential role in understanding the patterns and relationships between different time series variables. In our framework, a web-based visualization application is developed to better analyze and find the underlying stories behind the data, and simplify the interaction with the actuators. In the Azure platform, a web application service is created to enable the visualization application while a container registry service is required to host the docker image repository so that the web application can be upgraded by pushing a new docker image to the container registry.
The presented application is divided into two focus areas: live stream data analysis and historical data analysis. Three visual representations commonly used within the data visualization community, namely Line graph, Stacked Area graph and Horizon graph, are implemented to provide the user with a variety of methods to analyze the data. With this setup, the user can focus on the current status of each IoT device, operate actuators, and analyze historical data to better understand patterns.
C. Case Study: Green Indoor Climate Control System
The proposed framework has been successfully applied to an indoor climate control solution, i.e., a green plant wall system. A green plant wall is a vertical wall with vegetation planted on the surface. It incorporates watering, lighting and ventilation systems to support plant growth so as to purify the indoor climate. Several environmental sensors that measure temperature, humidity, luminance, water level in the tank, particulate matter (PM), carbon dioxide (CO2), and eight other gas concentrations are deployed on the plant wall. The sensors continuously measure indoor climate parameters and transmit them to the cloud for storage and display. The monitoring and control functions are implemented as a standalone application running on the edge device, i.e., a Raspberry Pi in this case. The details of the case study are presented in our previous work [38]. Fig. 17 shows the deployment of the framework to a green plant wall system. The green plant wall systems powered by this IoT solution have been deployed in our university lab, a classroom, an elderly home, workshops and other places in Sweden by Vertical Plants System AB, a green plant wall manufacturer. The systems have been running stably for more than one year.
As a continuation, several enhancements have been made to the previous system to better demonstrate the flexibility and applicability of the proposed framework.
New direct methods such as Firmware upgrade method, Network report method and Arduino reset method are developed to strengthen and simplify remote management. Take photo method is added to provide visual monitoring of the plants. Network report method and Take photo method take advantage of the blob storage by uploading the network status file and image files to the subscribed blob storage while Firmware upgrade method triggers the local device to download a new application executable from the blob storage. Therefore, three blob containers are created in the blob storage service, catering to these three cases.
In addition to a static sensor box that is controlled by an edge device, LoRa-based ambient luminance sensor and WiFi-based temperature and humidity sensors are added to the system, which extends the climate monitoring space to a large extent. More environmental data can be collected simultaneously to guarantee the precision of the analytics.
Some plant wall systems deployed a few years ago use the Modbus protocol to transmit pressure sensor signals. To integrate these older products into the Azure cloud, an external MQTT broker, namely Cloudmqtt, is used. The local sensor data are sent to a Modbus gateway and then relayed to the broker periodically. A function application is deployed in Azure and set to be triggered automatically every 30 seconds. Upon trigger, the function connects to the Cloudmqtt broker and waits for the latest telemetry update. After receiving the update, it saves the sensor data to the same database and then terminates.
The proposed web-based visualization application is applied to all the plant walls. The live stream and historical data of the sensors can be visualized and analyzed using the aforementioned Line graph, Stacked Area graph and Horizon graph.
Results
In this section, we present our results of applying this data-centric framework to massive and broadband IoT use cases as well as a comparison to other precedent studies.
A. Massive IoT
In a data-centric framework, the data are the ultimate target. An example of the data flow in the implementation is demonstrated in Fig. 18. Raw sensor monitoring values from Thread and WiFi nodes are published to the Mosquitto broker topics. These values, together with device IDs, are used by the corresponding application instances to construct JSON format messages. Both a standalone application and application instances send D2C telemetries to IoT Hub using a standard JSON format message. For a LoRa node, the sensor reading is converted to a hexadecimal value and put into the payload that is sent to the LoRa backend server. At the server, the original payload combined with device ID, radio status and other information is encapsulated in a JSON message and transmitted to IoT Hub. All the JSON messages are processed by one or several function applications that are triggered by an IoT Hub event. In addition to extracting sensor values from the messages, the function applications also process the messages according to different device IDs. For instance, the LoRa luminance sensor value 785 is recovered from the hexadecimal format “02f6”; the water level values from a local standalone application and a Modbus device are measured by different sensors and recorded on different scales. The function applications calculate the real water level values accordingly. After these processing steps, the data are inserted into the same database in a uniform way so that they can be visualized using a uniform web application-based visualization interface, as shown in Fig. 19.
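The per-device post-processing can be sketched in Python as two small helpers: one that decodes a hexadecimal payload and one that maps a raw reading onto a common scale. Both the big-endian unsigned interpretation and the linear rescaling are assumptions, since the exact payload encoding and calibration are not specified here. Under this plain interpretation the payload "02f6" decodes to 758, so the encoding used in the implementation may differ from the one assumed in this sketch.

```python
def decode_hex_payload(payload_hex):
    """Interpret a hex payload as a big-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(payload_hex), byteorder="big")


def rescale(raw, raw_min, raw_max, out_min, out_max):
    """Linearly map a raw reading onto a common output scale."""
    return out_min + (raw - raw_min) * (out_max - out_min) / (raw_max - raw_min)


luminance_raw = decode_hex_payload("02f6")
# Hypothetical raw range 0..10 mapped to a 0..100 percent water level.
water_level = rescale(5, 0, 10, 0.0, 100.0)
print(luminance_raw, water_level)
```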
Fig. 19. Web-based visualization application. (a) Live stream and control functions. (b) Historical data.
As a typical massive IoT use case of the data-centric framework, some of the environmental data collected by the green plant wall in a classroom at our university are presented in Fig. 20. The data were collected from April 2018 to the present. Fig. 20a depicts the CO2 level in the classroom, which in general fluctuates around 400-500 ppm and differs between day and night. Fig. 20b shows the temperature changes during this period, in which the temperature variation can be clearly observed. The average temperature tends to increase during summer and starts to decrease during autumn and winter. The water level is shown in Fig. 20c, from which the water consumption pattern of the plant wall can be easily observed; when the water level falls below a pre-defined threshold, a red alarm can be reported.
Fig. 20. Massive IoT use case: historical data collected using the framework. (a) CO2 data. (b) Temperature data. (c) Water level data.
B. Broadband IoT
Fig. 21 showcases the broadband IoT support of the proposed framework. The green plant wall system is able to upload unstructured files, namely images and network record files, to the blob storage containers and to download new firmware and connection strings from them. This feature can be applied to other use cases that need to transfer large data or files.
C. Performance Comparison
A comparison of our proposed framework and precedent studies is presented in Table 2. The comparison shows that this framework goes beyond precedent studies by supporting both massive and broadband IoT use cases with three protocols that provide built-in security schemes. It relies on the Azure public cloud platform for reliability, scalability, availability and low cost. By defining a uniform device model and data model, the framework offers better interoperability than the other studies. Finally, a complete implementation and case study have demonstrated the feasibility, though performance benchmarking of the proposed framework, such as latency, data rate limitation, and valid coverage range, is worthy of further exploration in the future.
Conclusion
In this study, we aim to overcome some critical obstacles that hinder practitioners from utilizing and harvesting the fruit of IoT development, i.e., the data and the intelligence. We proposed a data-centric IoT framework that incorporates the WiFi, Thread and LoRaWAN protocols, based on the Azure cloud platform with inherent support for a high-level security scheme. The framework addresses the interoperability challenge among different protocols and unifies the device management procedures and data models by taking advantage of the Azure IoT Hub infrastructure so as to realize a data-centric system. A reference implementation of the framework is demonstrated, which covers WiFi, Thread, LoRa and edge device connectivity and communication. A case study of applying the framework to a green plant wall-based indoor climate control system is presented. The concept is thereby validated and the feasibility of the proposed framework is demonstrated. As a general-purpose framework that supports LAN, PAN and LPWAN networks, it has shown the potential to be flexibly applied to many use cases in both the massive and broadband IoT categories.
ACKNOWLEDGEMENT
The authors thank Pär Håkansson at Nordic Semiconductor for providing Thread Development Kit to speed up our demonstration, thank Jakob Blomberg at Blink Services AB for providing LoRaWAN Gateway and Server to us to build the LoRaWAN testbed, and thank Ola Weister at Vertical Plant Systems AB for cooperation and providing green plant walls as testbeds. Adam Rohdin and Karim Samim are acknowledged for developing the prototype of the web application for visualization. We also thank Gustav Knutsson for his work on maintaining the green plant wall lab.