ViSnow: Snow-Covered Urban Roads Dataset for Computer Vision Applications

Abstract:

Road surface condition estimation is an important task in fields related to transportation systems and road maintenance, especially in adverse weather conditions, such as snowfall. In this article, we introduce an image dataset for snow-covered roads in an urban context. The dataset is an extensive collection of images captured by traffic monitoring cameras in Montreal, QC, Canada, during the winters of 2022 and 2023. We detail the process of acquiring the dataset, including the source and the methodology to enable the replication of such a process. We also present an exploratory dataset description to showcase its rich contextual representation of the urban winter scene at different times, locations, and weather conditions. We also establish a benchmark problem for the dataset that consists of automating its annotation process. This process should add value to the dataset by attributing a label describing the snow level covering the road for each image. Finally, we discuss potential applications the dataset can enable in fields, such as transportation and winter road maintenance.

Published in: IEEE Open Journal of Systems Engineering (Volume: 2)

Page(s): 62 - 70

Date of Publication: 19 April 2024

Electronic ISSN: 2771-9987

DOI: 10.1109/OJSE.2024.3391315

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

In recent years, the field of transportation has seen a breakthrough with the emergence of intelligent transportation systems (ITSs). This development has been mainly supported by computer vision applications powered by large image datasets and a rise in computational power [1]. To be suitable for deployment, ITSs should maintain their performance regardless of the situation, especially in adverse weather conditions. In fact, winter weather poses significant challenges to vision applications, resulting in a degradation in overall performance. During winter weather events, such as snow, heavy rain, or fog, roads and vehicles become less visible. Vision models are often biased toward the normal weather conditions that represent the majority of their training data; they therefore experience a domain shift and fail to perform well. In addition, ITSs should have continuous information about their surroundings. During winter weather, the road surface condition (RSC) is crucial information for providing safe ITSs that can operate on wet and slippery roads.

Multiple research works contributed to mitigating the effects of winter adverse weather on ITS vision models. The research efforts mainly focused on collecting datasets that represent different winter weather conditions. Kenk and Hassaballah [2] presented DAWN, a benchmark image dataset from traffic environments in adverse weather, including snow, rain, and fog. The dataset consists of 1000 images showcasing varying levels of road visibility. Unlike the DAWN dataset, which provides only images, most of the contributions consist of multimodal data, where different sensors are used to collect data. Bijelic et al. [3] presented the SeeingThroughFog multimodal dataset, where they mount different cameras (stereo, gated, far infrared (FIR)), light detection and ranging (LiDAR), and radio detection and ranging (RADAR) on a vehicle to record driving scenes in winter weather. The dataset offers a benchmark for object detection tasks for autonomous driving. In another work, Pitropov et al. [4] introduced a similar dataset for autonomous driving in Canadian snowy weather. The dataset includes a collection of scenes with a variety of vehicles and traffic levels with 3-D object detection annotations. However, most of these datasets are dedicated only to autonomous driving tasks and are taken from a vehicle perspective.

To push forward the development of other robust vision-based applications in the context of winter weather, specifically in snow, we present a novel dataset of snow-covered roads in urban scenes. The proposed dataset consists of a large collection of images captured by CCTV cameras during the snowfall season in the city of Montreal, QC, Canada. This dataset encompasses road images with a variety of snow-cover levels within a real-world urban setting. In addition, the captured scenes offer rich contextual information as they represent different weather conditions, time periods, area types, and traffic environments. By collecting this dataset, we aim to enable different applications from a new vision perspective, i.e., CCTV cameras. From a systems engineering perspective, the collection of snow-covered road images along with the corresponding metadata paves the way toward establishing comprehensive frameworks in the domain of intelligent transportation. Several system designs and strategies could be developed using the collected data to address traffic challenges during adverse weather conditions. When integrated with other data, such as traffic data and weather forecasts, the present dataset can provide a richer decision-making tool for system designers. The contributions of this article are summarized as follows.

We collect a large dataset for snow-covered roads in an urban context. The dataset is composed of images and metadata (weather conditions and timestamps).
We provide a description of the collected dataset, showcasing its characteristics and rich contextual features.
We present benchmark problems based on the collected dataset, including automated annotation for snow-covered road levels, and other practical applications.

In the rest of this article, we review some relevant works in Section II. Section III introduces the dataset collection process and provides a description of its characteristics and challenges. Next, we propose a benchmark problem for automated dataset annotation, and discuss possible applications related to winter road maintenance and mobility in Section IV. Finally, Section V concludes this article.¹

SECTION II.

Literature Review

RSC estimation is the task of identifying the status of the road's surface. The status can concern the road cover type (water, snow, dust), its depth, or the pavement condition (cracks and potholes). Multiple related studies were carried out in the context of snowy weather. These works introduced different datasets to tackle the problem, using different resources and infrastructures for collection. Khan and Ahmed [5] collected an image dataset from cameras installed on an interstate road and annotated it manually based on defined visual characteristics. The dataset comprises 15 000 images describing weather conditions (clear, light snowfall, and heavy snowfall) and 15 000 images describing RSCs (dry, wet, and snowy). In another study, Carillo et al. [6] collected a dataset comprising 14 000 images from road weather information systems (RWIS) across the Province of Ontario, Canada. The images are classified into the following three categories: bare pavement, partial snow cover, and full snow cover. The image dataset is associated with weather measurements, such as temperature, humidity, and pressure. Similarly, Pan et al. [7] combined three datasets of 33 000 images to distinguish RSCs based on the percentage of the covered surface. The images are collected from highways and rural roads. Landry et al. [8] introduced a more granular class definition of snow cover levels using increments of 10%. Unlike previous works, Grabowski and Dariusz [17] annotated their image dataset using sensor measurements and captured 2100 images from CCTV cameras. The dataset describes the following three road conditions: dry, snow, and wet. In recent work, the authors [18], [19] assumed that the road surface could have regions with different status. They approached the RSC estimation problem as a semantic segmentation task, where the road is segmented into regions of different covers. On the other hand, Choi et al. [20] used a generative adversarial network to provide an augmented dataset for snowy road surface detection. The dataset is augmented based on images captured in snowy weather and clear weather, in order to provide large and balanced samples for each class.

Table 1 shows the characteristics of relevant datasets compared to our proposed dataset, including their size, acquisition method, environment, and availability. The datasets presented in this review share a few characteristics. The majority of images are taken on highway roads and have little to no representation of urban scenes. In addition, the images mainly focus on the road pavement, with less focus on other environmental elements. Finally, we notice that most of the datasets are not publicly available, which is not suitable for benchmarking tasks.

TABLE 1 Comparison of Different Image Datasets for RSC Estimation in Snowy Weather

SECTION III.

Data Collection and Description

In this section, we present the dataset, its source, and its characteristics, and we explain the collection process. Finally, we mention the potential challenges the dataset may present.

A. Data Source

The City of Montreal has installed a set of cameras at different road intersections to monitor the road network and the traffic volume. These cameras provide a real-time view of strategic roads, helping to reduce congestion and accelerate interventions in case of accidents. Besides being a tool for the local authorities, these cameras are made available to the public through the city website² so that anyone can observe the traffic in certain areas, as image snapshots rather than real-time video streams. There are 581 installed cameras across the city, covering numerous urban scenes, such as highways, residential, commercial, and industrial areas. Fig. 1 shows the distribution of the cameras across the city. These cameras are pan–tilt–zoom (PTZ) cameras, meaning that the position of the cameras can be remotely changed, which allows the monitoring of all intersection directions. Considering the distribution and mobility of the cameras, they represent a rich source of images that could be leveraged to collect a dataset.

In addition to providing public access to this data source, Montreal is one of the largest urban areas that witnesses long winters with several snowstorms. This presents an opportunity to leverage the existing cameras to collect a dataset for snow-covered roads in an urban context.

To access the camera image feed, the City of Montreal provides a list of all cameras in a Geographic JavaScript Object Notation (GeoJSON) file. For each camera, the JSON fields contain a unique identifier (ID), geolocation coordinates (latitude and longitude), road intersection names, a URL to the live feed, and URLs for reference images of the four intersection directions.
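The camera list described above can be read with a few lines of standard-library Python. This is only a sketch: the property keys (`id`, `intersection`, `live_url`) and the sample record are hypothetical placeholders, since the actual key names in the city's file are not given here. The only firm assumption is the GeoJSON convention that a Point stores its coordinates as longitude followed by latitude.

```python
import json
from typing import NamedTuple


class Camera(NamedTuple):
    cam_id: int
    latitude: float
    longitude: float
    intersection: str
    live_url: str


def parse_cameras(geojson_text: str) -> list[Camera]:
    """Extract one Camera record per feature of a GeoJSON FeatureCollection."""
    collection = json.loads(geojson_text)
    cameras = []
    for feature in collection["features"]:
        # GeoJSON stores Point coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        props = feature["properties"]
        cameras.append(Camera(
            cam_id=int(props["id"]),             # hypothetical key name
            latitude=lat,
            longitude=lon,
            intersection=props["intersection"],  # hypothetical key name
            live_url=props["live_url"],          # hypothetical key name
        ))
    return cameras


# Made-up record used only to exercise the parser.
SAMPLE = """{
  "type": "FeatureCollection",
  "features": [{
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-73.5673, 45.5017]},
    "properties": {"id": 207, "intersection": "Example St / Sample Ave",
                   "live_url": "https://example.org/cam/207.jpeg"}
  }]
}"""

cameras = parse_cameras(SAMPLE)
```

Keeping the parsed records in a typed tuple makes the later scraping loop independent of the exact key names, which would only need to be adjusted in one place.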

Images provided by the cameras have two resolutions: 480 × 720 or 480 × 704 (height × width). Two labels are added to the image, the first indicating the time the snapshot was taken (bottom right corner) and the second indicating the camera ID and the intersection name (top left corner). Fig. 2 shows an example of an image taken from a camera with ID 207.

Figure 1. Distribution of the PTZ cameras in the city of Montreal.

Figure 2. Example of an image captured by a traffic monitoring camera installed in Montreal.

Figure 3. Flowchart of the image scraping process.

Figure 4. Sample images of the different snow cover classes under different illumination and weather conditions. (a) Clear surface. (b) Light-covered surface. (c) Medium-to-heavy covered surface. (d) Plowed surface.

Figure 5. Image samples collected during different periods of the day. (a) Day. (b) Night. (c) Twilight.

Figure 6. Image samples collected during different weather events. (a) Clear weather. (b) Cloudy. (c) Rainy weather. (d) Snowfall weather.

Figure 7. Image samples collected from different areas across Montreal. (a) Residential. (b) Downtown. (c) Industrial area. (d) Highway.

Figure 8. Distribution of the images across different winter weather conditions and periods based on the collected metadata.

Figure 9. Examples of challenging images within the dataset. These images require filtering as they do not offer insight about the road surface. (a) Snow-covered camera. (b) Bad camera direction. (c) Flare light. (d) Blurry image.

B. Image Collection

After identifying the data source, we proceed to automate the acquisition of the images. We perform image scraping during the months of January and February of the years 2022 and 2023, as these periods coincide with important snowfall. The image scraping process is accomplished in a discontinuous way. We target both clear and snowy weather during the day and night times.

To collect the images, we use the GeoJSON file to iterate through all the cameras, accessing their unique feed URLs to capture the current snapshot image. The cameras have a nominal refresh rate of roughly 5 to 6 min, but the interval can be longer in practice. To avoid storing identical images, we compare, for each camera, the last stored image with the most recent snapshot using the structural similarity index method (SSIM). A match corresponds to SSIM $= 1$. We also check for the availability of the image by comparing it, using SSIM, to a provided template indicating unavailability. The process is summarized in Fig. 3. We define a directory hierarchy that separates the cameras to store the collected images effectively. The image nomenclature combines the camera ID and the time of acquisition as cam-ID_DD-MM-YY_hh-mm.jpeg. For instance, the image file cam2_04-02-22_07-39.jpeg corresponds to an image scraped from the camera with ID 2 on February 4, 2022 at 07:39 EST. By the end of the image scraping, we had collected 294 000 images from 528 functional cameras.
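The duplicate check and the file naming can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: images are represented as flat sequences of grayscale values, and `global_ssim` is a simplified single-window variant of SSIM (the usual formulation averages the same statistic over local windows). The helper names `image_filename`, `global_ssim`, and `is_new_snapshot`, and the small tolerance on the SSIM $= 1$ test, are introduced here for illustration.

```python
from datetime import datetime


def image_filename(cam_id: int, acquired: datetime) -> str:
    """Build a name following the cam-ID_DD-MM-YY_hh-mm.jpeg pattern."""
    return f"cam{cam_id}_{acquired:%d-%m-%y_%H-%M}.jpeg"


def global_ssim(img_a, img_b, dynamic_range=255.0):
    """Single-window SSIM over two equal-length sequences of gray values.

    Simplification: computed once over the whole image instead of being
    averaged over local windows.
    """
    if len(img_a) != len(img_b) or not img_a:
        raise ValueError("images must be non-empty and of equal size")
    n = len(img_a)
    mean_a = sum(img_a) / n
    mean_b = sum(img_b) / n
    var_a = sum((p - mean_a) ** 2 for p in img_a) / n
    var_b = sum((q - mean_b) ** 2 for q in img_b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(img_a, img_b)) / n
    # Standard stabilizing constants of the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_a * mean_b + c1) * (2 * cov + c2)
    denominator = (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def is_new_snapshot(last_stored, snapshot, tol=1e-9):
    """Return True when the snapshot differs from the last stored image."""
    if last_stored is None:  # nothing stored yet for this camera
        return True
    # A match corresponds to SSIM = 1 (up to floating-point tolerance).
    return global_ssim(last_stored, snapshot) < 1.0 - tol


previous = [10, 50, 200, 120]
unchanged = [10, 50, 200, 120]
changed = [200, 10, 30, 90]
name = image_filename(2, datetime(2022, 2, 4, 7, 39))
```

The same comparison against the "unavailable" template image would flag feeds that are down, so both filters can share one similarity function.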

C. Expected Classes

To add value to the collected images, we suggest attributing to each image a label describing the snow level covering the road. Estimating snow levels in a unit of measurement directly from images is a very hard task. For this reason, we opt for a categorical description of snow levels that relies on visual features. We define four classes that describe the road's surface in accordance with snow levels defined by the city of Montreal to carry out snow removal operations. Other cities in Canada (such as Toronto, Quebec City, and Laval) and worldwide (such as Helsinki) have similar policies regarding snow levels with slight differences. Generally, plowing starts between 2 cm and 3 cm of accumulation. However, the visual characteristics of snow cover would be similar for a small variation, which can help generalize and align the label definition for different locations.

The classes are the following: “clear surface,” “light-covered surface,” “medium-to-heavy-covered surface,” and “plowed surface.” Table 2 mentions the visual characteristics of each label and the corresponding removal operation. Fig. 4 illustrates different samples of mentioned classes.

TABLE 2 Classes Describing Snow Cover Level

D. Metadata Collection

In addition to image scraping, we log other information related to the images to provide more context for the case in which the images were taken, such as time and weather. Each image has a corresponding log file in JSON format. This file has the same nomenclature as the image and is stored under the same directory hierarchy.

1) Weather Conditions

To log weather data, we use the OpenWeatherMap weather API.³ This API offers access to current weather indicators for a specific location. For each image, we record the following measurements:

main description (clear, rain, snow, etc.);
temperature in Celsius ( $^\circ$ C);
precipitation of snow and rain, if existing, over the last hour in millimeters (mm).
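The three measurements listed above can be pulled out of a weather response with a small helper. The sketch below assumes the response has already been fetched with metric units and decoded into a dictionary, and that it follows the layout of the OpenWeatherMap current-weather response as we understand it (`weather[0].main`, `main.temp`, and optional `rain`/`snow` objects with a `1h` entry). The sample payload is made up, and the output key names are hypothetical.

```python
def extract_weather(response: dict) -> dict:
    """Keep only the measurements logged for each image."""
    return {
        "description": response["weather"][0]["main"],
        "temperature_c": response["main"]["temp"],
        # Precipitation objects are absent when there is no precipitation.
        "rain_1h_mm": response.get("rain", {}).get("1h", 0.0),
        "snow_1h_mm": response.get("snow", {}).get("1h", 0.0),
    }


# Made-up payload used only to exercise the helper.
SAMPLE_RESPONSE = {
    "weather": [{"id": 600, "main": "Snow", "description": "light snow"}],
    "main": {"temp": -8.4},
    "snow": {"1h": 0.6},
}

record = extract_weather(SAMPLE_RESPONSE)
```

Defaulting missing precipitation to zero keeps every log record on the same schema, which simplifies later aggregation by weather condition.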

2) Timestamping

Despite the presence of a timestamp labeled on the images, we log a timestamp representing the moment the image was scraped. This is easier than extracting it from the image and does not affect time precision as the snapshot moment and the scraping moment are close. In addition to the timestamps, we record the period the image was taken as follows.

Day: The period between the sunrise and the sunset.
Night: The period between the astronomical twilight end and the astronomical twilight start.
Twilight: The period between the astronomical twilight start and the sunrise and between the sunset and the astronomical twilight end.
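The three period definitions above translate directly into a comparison against four daily boundaries. The following sketch assumes the boundaries are already available as datetimes on the same calendar day and in the same time zone as the timestamp; the function name and the boundary values in the example are illustrative, not taken from the dataset.

```python
from datetime import datetime


def classify_period(timestamp, twilight_begin, sunrise, sunset, twilight_end):
    """Label a timestamp as 'day', 'twilight', or 'night'.

    All arguments are datetimes for the same day: astronomical twilight
    start, sunrise, sunset, and astronomical twilight end, in that order.
    """
    if sunrise <= timestamp < sunset:
        return "day"
    if twilight_begin <= timestamp < sunrise or sunset <= timestamp < twilight_end:
        return "twilight"
    # Before the morning twilight starts or after the evening twilight ends.
    return "night"


# Illustrative boundaries for a winter day (values are assumptions).
day = dict(
    twilight_begin=datetime(2022, 2, 4, 5, 30),
    sunrise=datetime(2022, 2, 4, 7, 10),
    sunset=datetime(2022, 2, 4, 17, 5),
    twilight_end=datetime(2022, 2, 4, 18, 45),
)

period = classify_period(datetime(2022, 2, 4, 7, 39), **day)
```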

Astronomical twilight refers to the two periods separating day and night (dawn and dusk), when illumination is neither as bright as during the day nor as dark as during the night. These periods change daily, so we track them using the sunrise–sunset API.⁴ Finally, the JSON log file stores the weather measurements, the timestamp, and the period for each image.

E. Dataset Description

In this part, we showcase the different contexts the collected images represent. Figs. 5–7 illustrate images from different periods of the day, different weather conditions, and different areas with varying traffic volumes, respectively. We confirm that the collected dataset holds rich visual information and is well-representative of the urban scene in winter weather. Fig. 8 shows the distribution of image instances across weather conditions and periods.

F. Image Challenges and Limitations

Collecting images from a raw data source has no guarantee of quality or sanity. We may encounter many examples of bad images that do not hold useful information, such as:

camera lens covered with snow or rain;
flare and light reflection;
distorted or bad quality images;
camera not pointing towards the road.

Fig. 9 depicts examples of collected images that have poor quality or do not represent the road surface cover. Typically, such images should be identified and eliminated from the dataset. However, due to the large size of image feeds, it is impractical to do so manually.

Traffic cameras can go down for maintenance or break down. In this case, the live traffic feed shows a blank image with the label “image unavailable.” We discard these images during the collection process. Another challenge related to the nature of the cameras is their changing view. The cameras are operated by agents who can turn the cameras in different directions or zoom in and out. The challenge here is to identify which direction the camera is facing using reference images. Finally, due to the nature of winter weather events, their frequency of occurrence may not be comparable. For example, it is expected to have more fully covered roads than light-covered roads. This may result in a class-imbalanced dataset where one or more classes are the majority and others are the minority.

SECTION IV.

Benchmark Problems

In this section, we formulate a benchmark problem based on our dataset. The problem concerns the dataset annotation to add value to it. Next, we mention other practical smart transportation use-cases that can be leveraged by the dataset once it is annotated. We finally highlight the relevance of the dataset and the practical use-cases to systems engineering (SE).

A. Snow-Covered Roads Image Annotation

The majority of computer vision models rely on annotated images with different levels of supervision (fully, semi, or weakly supervised learning). This makes it necessary to annotate our dataset with labels that describe the snow-covered roads. Furthermore, the labels should add real value to the dataset and be in accordance with their potential applications. Therefore, we suggest annotating the dataset according to the classes mentioned in Section III-C.

Since the proposed dataset is large (294 000 images), manual annotation is not a practical option, as it would be an expensive, time-consuming, and tedious task. Instead, we propose developing a system that automates this process. This system should be able to take unlabeled images as input and generate for each image one label out of four possible labels representing snow levels. The images can be fed to the system in sequence, in camera batches, or in totality.

We can formulate this benchmark problem as a clustering problem, where the goal is to group similar images together based on their visual features into four macroclusters $C =\lbrace c_{1}, c_{2}, c_{3}, c_{4}\rbrace$. Let $X = \lbrace x_{1}, x_{2}, \ldots, x_{n}\rbrace$ be the set of $n$ unlabeled images. $X$ can represent the totality of the dataset or a partition of it (camera batch, sequence, etc.). Each $x_{i}$ corresponds to an image or a representation of it in a $d$-dimensional space, where $x_{i}\in \mathbb{R}^{d}$. The annotation system should optimize an objective function $F$ such that

$\begin{equation*} F = \min _{x_{i}\in X, c_{j} \in C} f(x_{i},c_{j}) \end{equation*}$

where $f$ is a criterion function that defines the quality of the clustering. The criterion $f$ varies depending on the clustering algorithm.
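As an illustration only (the article does not prescribe a clustering algorithm), choosing $k$-means with one centroid $\mu_{j}$ per macrocluster $c_{j}$ would make the criterion the squared distance between an image representation and the centroid of its assigned cluster, and the quantity minimized over all assignments would be the within-cluster sum of these distances. The symbol $\mu_{j}$ is introduced here and is not part of the original formulation.

$\begin{equation*} f(x_{i},c_{j}) = \lVert x_{i} - \mu_{j} \rVert ^{2}, \qquad \mu_{j} = \frac{1}{|c_{j}|}\sum _{x_{i}\in c_{j}} x_{i}, \qquad \sum _{j=1}^{4}\sum _{x_{i}\in c_{j}} f(x_{i},c_{j}) \rightarrow \min \end{equation*}$

Other algorithms would replace this distance-based criterion with their own, for example a density or likelihood term.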

To evaluate the annotation system, we define metrics that characterize the performance of label assignment. We propose to sample an image batch and annotate it manually. This image set will serve as ground truth (GT) to compare the accuracy of assigned labels as follows:

$\begin{equation*} \text{Accuracy} = \frac{\text{Correctly annotated images}}{\text{Total GT images}}. \end{equation*}$

However, there are many other metrics for evaluating clustering methods that we will not discuss in this part, as they are more related to the nature of the algorithms.
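A clustering system outputs anonymous cluster indices, so computing the accuracy above first requires mapping each cluster to one of the four class names. The sketch below uses a majority vote over the manually annotated GT batch for that mapping; this mapping step is an assumption added for illustration and is not described in the article. The toy labels are made up.

```python
from collections import Counter


def map_clusters_to_labels(cluster_ids, gt_labels):
    """Assign to each cluster the most frequent GT label among its members."""
    votes = {}
    for cluster_id, label in zip(cluster_ids, gt_labels):
        votes.setdefault(cluster_id, Counter())[label] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}


def annotation_accuracy(predicted, ground_truth):
    """Fraction of GT images whose assigned label matches the manual label."""
    if len(predicted) != len(ground_truth) or not ground_truth:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(predicted, ground_truth))
    return correct / len(ground_truth)


# Toy GT batch: six images, three clusters found by the annotation system.
clusters = [0, 0, 0, 1, 1, 2]
gt = ["clear", "clear", "light", "plowed", "plowed", "light"]

mapping = map_clusters_to_labels(clusters, gt)
predicted = [mapping[c] for c in clusters]
accuracy = annotation_accuracy(predicted, gt)
```

In this toy batch, one "light" image falls into the cluster dominated by "clear" images, so five of the six images receive the correct label.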

Another method we propose to evaluate the annotation efficiency is to benchmark the performance of deep learning models trained using the annotated dataset. The idea is that deep learning models will show poor performance when trained on poorly annotated data. We can use state-of-the-art models that have already proven to be efficient in similar road surface classification tasks to decrease the chances of model-related poor performance. There are many possible metrics to evaluate classification models. However, the snow-covered road classification problem is related to weather events that do not occur at the same frequency. For this reason, we expect the resulting dataset to have an imbalanced representation of the different classes. Thus, we propose the $F1$-score as an evaluation metric, as follows:

$\begin{equation*} F1 = \frac{2\times \text{Precision} \times \text{Recall}}{\text{Precision}+\text{Recall}}. \end{equation*}$
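For a four-class problem, the formula above is applied per class and the results are then aggregated. The sketch below computes per-class precision, recall, and $F1$, and averages them with equal class weights (macro averaging). The choice of macro averaging is an assumption made here because it weights minority classes equally; the article does not specify an aggregation. The toy labels are made up.

```python
CLASSES = ["clear", "light", "medium-to-heavy", "plowed"]


def f1_per_class(y_true, y_pred, classes=CLASSES):
    """Return a dict mapping each class to its F1-score."""
    scores = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall (0 when both are 0).
        scores[c] = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
    return scores


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of the per-class F1-scores."""
    scores = f1_per_class(y_true, y_pred, classes)
    return sum(scores.values()) / len(classes)


y_true = ["clear", "clear", "light", "plowed", "plowed", "light"]
y_pred = ["clear", "clear", "clear", "plowed", "plowed", "light"]
per_class = f1_per_class(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

A class that never appears in the evaluated batch scores zero under this convention, which pulls the macro average down; reporting the per-class scores alongside the average makes that effect visible.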

B. Practical Use Cases

In this part, we introduce potential use cases that can benefit from our proposed dataset. These applications can leverage CCTV images to assist in decision-making in fields related to winter transportation and mobility. Table 3 depicts other types of data that can be involved in these tasks, ranging from visual data, contextual data (weather and general transit feed specification (GTFS)), and spatial data (OpenStreetMap).

TABLE 3 Complementary Data Sources for the Proposed Applications

1) Vehicle Routing for Snow Removal Operations

One promising application our snow-covered roads image dataset can enable is the automation of snow removal operations planning. Taking the city of Montreal as an example, the CCTV cameras provide real-time coverage of a large portion of the road network. Once annotated, the image dataset can allow the training of deep learning classification models to distinguish different snow levels covering the road. The output of the classifier is matched with the appropriate snow removal operation. In addition, the metadata associated with the images can help create simulations for real-world scenarios, where weather and time are key factors for snow removal planning. Weather and time data add context to the snow cover information extracted from the images. A combination of this data with OpenStreetMap spatial data can yield priority levels for road segments for plowing. The dynamic nature of time and weather data would help adjust the priority levels for different situations.

2) Trip Planning During Adverse Weather

The mobility of individuals, whatever mode they use (walking, private vehicles, or public transport), becomes limited and very difficult during heavy snowfall events. The roads often become blocked, traffic volume increases, and public transport experiences delays or trip cancellations. In this context, we can leverage the dataset to help individuals plan their trips and commute by choosing the best route. Vision models can provide real-time road surface (snow cover), traffic volume, and road blockage information. Based on this extracted information, we can provide users with the best itinerary according to their means of transport, thus facilitating their mobility.

3) Public Transport Delay Analysis

During the winter, multiple perturbations affect public transport trips, resulting in a degradation of the quality of service. The images in our dataset are taken from road intersections that often coincide with bus stops or bus routes. A camera-based system can be developed to identify and assess hot spots that cause interruptions in order to reroute the buses or prioritize snow removal in these sectors. These spots are associated with traffic bottlenecks, road blockades, snow-covered roads, and snow removal operations. The proposed dataset would be used eventually to train models to detect these cases after adequate annotation. Traffic information can be obtained from GTFS real-time data for public transportation, which provides delay information and service quality indicators. Camera feed can be associated with GTFS Realtime to identify patterns and the causes of delays.

4) Winter Traffic Analysis

Our proposed dataset can provide insights for winter traffic analysis systems. It can be used to estimate traffic volume using a vehicle counting algorithm that can perform accurately during snowfall. This allows the detection of roads and intersections with heavy traffic. The task of vehicle counting and traffic flow estimation using object detectors is well explored. However, most models are trained using datasets captured in normal conditions and would experience performance degradation in winter weather. Our dataset can help improve these models' performance by adapting them to the winter visual setting (less visibility, different lighting conditions, and different scenes). This is an unsupervised domain adaptation that does not need vehicle annotation to fine-tune the object detectors for this new visual setting. In addition, we can detect which types of vehicles are causing traffic jams the most by matching the vehicle detection output with traffic delay data. It can also help understand the effect of snow removal operations on improving traffic flow.

C. ViSnow, AI, and SE

The proposed ViSnow dataset presents important potential for the advancement of system designs, especially for ITS. This potential establishes a link to the theme of artificial intelligence for systems engineering. ITS can benefit from the dataset by leveraging deep computer vision methods to adapt more to particular cases of adverse snowy weather conditions. From an SE perspective, this allows for robust designs where the system can maintain its performance under different conditions. In addition, integrating deep learning components into existing systems, such as in the case of snow removal or trip planning, enables more precision, responsiveness, and dynamicity in these complex systems through real-time decision-making.

On the other hand, the benchmark problem we suggest in Section IV-A involves SE practices to develop its solution. The annotation process, which is essentially a clustering problem, relies mainly on optimization and validation methods [21]. Optimization methods allow the development of an AI-based system capable of efficiently generating image annotations. In addition, the development of the discussed applications necessitates SE frameworks to deploy and integrate these complex systems on a large scale and ensure their continuous maintenance, especially in the case of varying data flow, such as in the case of traffic analysis. These SE frameworks represent a link to systems engineering for AI (SE4AI). Future works that are based on the ViSnow dataset can leverage SE4AI to methodologically ensure the reliability of smart components.

SECTION V.

Conclusion

In this article, we have introduced a novel dataset for snow-covered roads in urban areas. The dataset comprises 294 000 images that were collected from CCTV cameras during the winter seasons of 2022 and 2023 in Montreal, QC, Canada. Unlike other datasets found in the literature, our proposed dataset offers a large collection of images with varying contextual settings. The dataset aims to generalize the description of snow-covered roads with scenes from over 500 locations in the city during different periods and weather conditions. We also provide a comprehensive description of the data collection process to enable its replication.

In addition, we define a benchmark problem based on the dataset. The problem consists of automating the annotation of the image dataset with different categories reflecting the type of snow cover. We explicitly define the potential labels and set performance metrics for the annotation task. Finally, we suggest other applications that can be enabled by using our dataset to train deep learning models and create real-world scenario simulations.
