
Topology Agnostic Bounds on Minimum Requirements for Network Failure Identification

Lower-bounds to the minimum number of paths to deploy in order to identify a desired number of nodes, in comparison with the actual number of paths placed by the sub-opti...

Abstract:


In Boolean Network Tomography (BNT), node identifiability is a crucial property that reflects the possibility of unambiguously classifying the state of the nodes of a network as 'working' or 'failed' through end-to-end measurement paths. Designing a monitoring scheme satisfying network identifiability is an NP-hard problem. In this article, we provide theoretical bounds on the minimum number of necessary measurement paths to guarantee identifiability of a given number of nodes. The bounds take into consideration two different classes of routing schemes (arbitrary and consistent routing) as well as quality of service (QoS) requirements. We formally prove the tightness of such bounds for the arbitrary routing scheme, and provide an algorithmic approach to the design of network topologies and path deployment that meet the discussed limits. Due to the computational complexity of the optimal solution, we evaluate the tightness of our lower bounds by comparing their values with an upper bound, obtained by a state-of-the-art heuristic for node identifiability. For our experiments we run extensive simulations on both synthetic and real network topologies, for which we show that the two bounds are close to each other, despite the fact that the provided lower bounds are topology agnostic.


Published in: IEEE Access ( Volume: 9)

Page(s): 6076 - 6086

Date of Publication: 01 January 2021

Electronic ISSN: 2169-3536

DOI: 10.1109/ACCESS.2020.3048876


CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

With the massive growth of the Internet, localizing node failures has become a crucial task. Single organizations have direct access only to limited portions of the internal nodes of the network, and they rarely share internal performance observations because of commercial conflicts. Boolean Network Tomography (BNT) provides tools for assessing the state of a network through end-to-end monitoring paths, which do not rely on administrative access privileges [1]. BNT thus overcomes the limitations that the complexity and heterogeneity of modern computer communication networks impose on traditional monitoring approaches based on pervasively deployed monitoring agents (e.g., SNMP) or pervasively supported network protocols (e.g., traceroute). As a matter of fact, bugs and configuration errors in various customer software and network functions often induce “silent failures” that are only detectable from end-to-end connection states [2].

Each monitoring path can be modelled as the sequence of nodes that it traverses. When a packet correctly reaches its final destination, we say that its path is in a working state, and so are its traversed nodes. On the other hand, when packet losses occur along a path, we say that the path failed. The latter situation arises when at least one of the nodes of the path has failed. Observations of the outcome of monitoring paths (working/failed) induce a system of Boolean equations where the unknowns are the Boolean states of the nodes in the network. The challenge related to this approach is that such Boolean systems are commonly under-determined, hence allowing multiple solutions, i.e., multiple failure scenarios that lead to the same observations on the path probe outcome [3]. When the state of a failed node can be uniquely determined from the Boolean system induced by the outcome of monitoring paths, the node is said to be identifiable. Therefore, identifiability is a desirable property that allows unambiguous classification of node states.
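The ambiguity of the Boolean system can be illustrated with a small brute-force sketch. It is not part of the original article: the function name `consistent_failure_sets` and the three-node toy network are hypothetical, and the exhaustive enumeration is only practical for very small node sets.

```python
from itertools import combinations

def consistent_failure_sets(nodes, paths, observed_failed):
    """Enumerate every set of failed nodes that explains the observed path outcomes.

    A path is observed as failed iff it traverses at least one failed node.
    """
    explanations = []
    for size in range(len(nodes) + 1):
        for candidate in combinations(nodes, size):
            failed = set(candidate)
            outcome = [any(v in failed for v in path) for path in paths]
            if outcome == observed_failed:
                explanations.append(failed)
    return explanations

# Toy network (assumption): path p1 traverses a, b and path p2 traverses b, c.
nodes = ["a", "b", "c"]
paths = [["a", "b"], ["b", "c"]]

# p1 failed, p2 working: only one explanation exists.
unique = consistent_failure_sets(nodes, paths, [True, False])
# Both paths failed: several failure sets lead to the same observation.
ambiguous = consistent_failure_sets(nodes, paths, [True, True])
```

In the second call the system is under-determined: the single failure of `b` and the joint failure of `a` and `c`, among others, produce identical observations.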

In this work, we provide topology-agnostic lower bounds on the minimum number of measurement paths that are necessary to guarantee identifiability of a desired number of nodes. Such bounds represent the dual solution to the optimization problem studied in [4], where we introduced upper bounds on the maximum number of identifiable nodes given a number of monitoring paths. In contrast with existing literature, we propose theoretical lower bounds that cannot be violated, independently of the specific characteristics of the topology. The bound formulations are based only on the number of nodes to identify, on high-level routing consistency properties (arbitrary and consistent routing), and on QoS requirements, expressed in terms of maximum allowed path length. Motivated by the need to complement the analysis of [4], our bounds are a useful tool to measure the capability of a monitored topology to efficiently identify the status of its components. Implementing a monitoring system comes with the cost of installing monitors on the nodes of a network and of the traffic caused by path probing; with this work, we aim at providing fundamental guidelines and minimal requirements for achieving the desired level of network identifiability (e.g., number of identifiable nodes). In addition, we formalize the Incremental Crossing Arrangement (ICA) procedure to generate monitoring schemes and underlying topologies that meet the bounds tightly, giving insights on which topology is the most suitable for failure localization. With ICA we formalize a network engineering approach which is at the basis of the theoretical analysis started in [4] and completed in this article.

We hereby list the major contributions of this work.

We study theoretical bounds on the minimum number of paths to deploy in a network for identifying a desired number of nodes. The bounds do not depend on specific network topologies (i.e., they are topology agnostic bounds).
We provide the Incremental Crossing Arrangement (ICA) algorithm, which allows topology design meeting the proposed bounds.
We evaluate the tightness of our bounds on both synthetic and real network topologies. For this purpose we compare the bounds with the results of a state-of-the-art greedy algorithm, hereby referred to as Greedy for Identifiability (GI), for maximizing network identifiability by means of client-to-server probing paths [5].

SECTION II.

Related Work

Boolean Network Tomography studies nonlinear relationships existing between the paths and their components, as in the case of congestion or failure localization. The early works on this topic focused on best-effort inference. For example, Duffield [6], [7] and Kompella et al. [2] aimed at finding the minimum set of failures that can explain the observed measurements, and Nguyen and Thiran [3] aimed at finding the most likely failure set that explains the observations by a probabilistic analysis of a set of experiments. More recently and with a similar goal, the authors of [8] build a Markov Decision Process in light of passive traffic data, and solve the tomography problem with a $Q$-learning technique.

Lately, Ma et al. [9] give characterizations of maximum identifiability of node failure under different end-to-end monitoring systems, and extend this work in [10], where they outline topology-specific properties on the number of nodes whose states can be identified under a given number of failures. In contrast, as specified in [4], we provide general, topology-agnostic bounds. An optimal monitor placement for ensuring node $k$-identifiability under different routing schemes is described in [11]. This optimization problem was introduced by Bejerano et al. in [12], where they prove its NP-hardness. The work by Cheraghchi et al. [13] studied graph-constrained group testing with the goal of minimizing the number of monitoring paths needed to identify the state (defective or normal) of all network nodes, under the assumption that the maximum number of defective nodes is given. Differently from this work, we do not investigate the identification of failed nodes in specific failure scenarios, but rather we focus on the identifiability property.

With this article, we complete the analysis of the fundamental bounds on node identifiability in Boolean Network Tomography introduced in [4].

SECTION III.

Problem Formulation

We represent a network as an undirected graph $G=(V,E)$ , where $V$ is the set of the nodes of $G$ and $E$ is the set of its edges. Each node $v$ is either in working or failed state. In Table 1 we sum up the notation that will be used throughout this article. The state of the nodes is assessed indirectly by a set of monitoring paths, $P:=\{p_{1},\ldots, p_{m} \}$ , each being represented as the ordered sequence of nodes it traverses. Node failures cause path disruption: when a path traverses a failed node, its communication is interrupted. On the other hand, paths traversing only working nodes are working. Each node $v$ may be labeled with a binary encoding of length $m$ , $b(v)\in \{0,1\}^{m}\setminus 0^{m}$ , where $b(v)|_{i}=1$ if $v$ is traversed by path $p_{i}$ , $b(v)|_{i}=0$ otherwise. We call crossing number of a node $v$ , $\chi (v)$ , the number of paths that traverse $v$ , i.e., the number of 1s in its binary encoding ( $\chi (v) = \sum _{i = 1}^{m} b(v)|_{i}$ ). For each path $p_{i}$ we define a path matrix as a binary matrix $M(p_{i})$ , in which each row is the binary encoding of a node on the path, and rows are sorted according to the sequence $p_{i}$ . Notice that by definition $M(p_{i})|_{*,i}$ has only ones, i.e., $M(p_{i})|_{r,i} =1, \,\,\forall r$ . We call the incident set of $v$ the set of paths traversing $v$ and denote it with $P_{v}\subseteq P$ .
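The notation above can be made concrete with a short sketch. The helper names (`binary_encodings`, `crossing_number`, `path_matrix`) and the toy paths are hypothetical and chosen only for illustration; encodings are stored as strings of `0`/`1` characters, one character per path.

```python
def binary_encodings(nodes, paths):
    """Map each node v to b(v): character i is '1' iff path p_i traverses v."""
    return {v: "".join("1" if v in p else "0" for p in paths) for v in nodes}

def crossing_number(encoding):
    """chi(v): the number of paths traversing the node, i.e. the number of 1s."""
    return encoding.count("1")

def path_matrix(path, encodings):
    """M(p_i): encodings of the nodes on the path, in traversal order."""
    return [encodings[v] for v in path]

# Toy example (assumption): two paths over three nodes.
paths = [["a", "b"], ["b", "c"]]
enc = binary_encodings(["a", "b", "c"], paths)
rows = path_matrix(paths[0], enc)
```

Here `enc` is `{"a": "10", "b": "11", "c": "01"}`, node `b` has crossing number 2, and the first column of `rows` contains only ones, as noted for $M(p_{i})|_{*,i}$.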

TABLE 1 Notation Table

We call failure set of a network, $F$ , the set of all failed nodes. Here, we assume that nodes fail one at a time, and therefore that $|F|=1$ . In such a context, we focus on the property of 1-identifiability. With reference to [14], we give the following definition:

Definition 1:

A node $v_{i}$ is 1-identifiable with respect to $P=\{p_{1},\ldots, p_{m} \}$ if $b(v_{i})\neq 0^{m}$ and if for all $v_{j}\neq v_{i}$ , $b(v_{i})\neq b(v_{j})$ , i.e., its binary encoding is not null and not identical to that of any other node.

Node identifiability allows unambiguous node state assessment by means of end-to-end measurement paths. We highlight that in order to be identifiable, a node must be monitored by at least one path. For this reason, a node whose binary encoding is null cannot be identifiable. In this article, we give bounds on the minimum number of paths that are needed for letting $n\le |V|$ nodes be 1-identifiable under arbitrary and consistent routing schemes.
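Definition 1 translates directly into a check on the encodings. The following is a minimal sketch; the function name `one_identifiable` and the sample encodings are hypothetical, and unmonitored nodes are represented here by an all-zero string.

```python
from collections import Counter

def one_identifiable(encodings):
    """Return the nodes whose encoding is non-null and shared with no other node."""
    multiplicity = Counter(encodings.values())
    return {v for v, b in encodings.items()
            if "1" in b and multiplicity[b] == 1}

# Hypothetical encodings with respect to m = 2 paths.
sample = {"a": "10", "b": "11", "c": "01", "d": "00", "e": "01"}
identifiable = one_identifiable(sample)
```

In this sample, `c` and `e` share the encoding `01` and `d` is traversed by no path, so only `a` and `b` are 1-identifiable.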

SECTION IV.

Arbitrary Routing

In this section we study the minimum number of monitoring paths that can be employed to identify $n$ nodes in a network under deterministic, single-path arbitrary routing. We say that paths follow an arbitrary routing scheme if they do not traverse a node more than once, but they can cross each other non-restrictively.

In Section III, we explained that nodes can be represented with binary encodings depending on what paths traverse them. In addition, we noticed that, in order for nodes in a network to be identifiable, they must all have different encodings. Since the number of different binary encodings of length $m$ , excluding the string $0^{m}$ , is $2^{m}-1$ , the following holds:

Proposition 1:

The minimum number of monitoring paths to place in order to identify $n$ nodes under arbitrary routing is $m^{AR}_{\texttt {min}}=\lceil \log _{2}(n + 1) \rceil$ .
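As a small numerical aid, the bound of Proposition 1 can be evaluated with integer arithmetic only. The function name `min_paths_arbitrary` is hypothetical; the sketch relies on the fact that, for a positive integer $n$, $\lceil \log _{2}(n+1) \rceil$ equals the number of bits needed to write $n$ in binary.

```python
def min_paths_arbitrary(n):
    """Evaluate ceil(log2(n + 1)) for a positive integer n without floating point."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # 2**(k-1) <= n < 2**k  implies  2**(k-1) < n + 1 <= 2**k, so the ceiling is k.
    return n.bit_length()
```

For instance, 7 nodes require at least 3 paths, while 8 nodes already require 4.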

The bound represented by $m^{AR}_{\texttt {min}}$ does not take into consideration the length of the paths involved. The length of a path $p_{i}$ is the number of nodes it traverses, $d_{i}$ ( $d_{i}=|\{v\in V:b(v)|_{i} = 1\}|$ ). When constraints on the path length are given, for instance by defining an upper bound on the maximum length, $d_{i}\leq d_{\texttt {max}}$ , or on the average path length, $\frac {1}{m}\sum _{i} d_{i} \leq \bar {d}$ , the bound of Proposition 1 may change. In order to discuss the bound on the minimal number of paths under path length constraints, we observe the following facts:

Observation 1:

The number of distinct binary strings in $\{0,1\}^{m}$ with $k$ 1s and $m-k$ 0s (with $0 < k\le m$ ) is $\binom {m}{k}$ . Out of them, there are $\binom {m-1}{k-1}$ strings where the $i$-th digit is 1. In our context, this means that a path can traverse at most $\binom {m-1}{k-1}$ nodes having crossing number $k$ in order to guarantee identifiability, that is when all encodings are distinct.

Observation 2:

The maximum length $d_{i}$ of a path $p_{i}$ is $d_{i} = \sum \limits _{j = 0}^{m-1}\binom {m-1}{j}=2^{m-1}$ . In fact, the maximum number of identifiable nodes using $m$ paths under arbitrary routing is $n = 2^{m}-1$ (see Proposition 1). The statement follows from Observation 1.

These observations are illustrated in Figure 1. In this simple example we consider $n = 7$ nodes. Assuming arbitrary routing, $m= \log _{2}(8)=3$ monitoring paths are enough to identify all nodes. Each path $p_{1},p_{2},p_{3}$ traverses $\binom {m-1}{k-1}=\binom {2}{k-1}$ nodes $v$ with crossing number $\chi (v) = k \in \{1,\ldots,m\}$ , and the length of each path is $2^{m-1} = 4$ .

FIGURE 1.

Arbitrary routing example.
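The counts in Observations 1 and 2 can be checked by enumeration for small $m$. The sketch below is illustrative only; the function name `nodes_per_path` is hypothetical, and the enumeration grows as $2^{m}$, so it is meant for toy values such as the $m=3$ case discussed above.

```python
from itertools import product
from math import comb

def nodes_per_path(m):
    """For each path i, return (path length, {crossing number k: nodes traversed})
    when all 2**m - 1 non-null encodings are assigned to distinct nodes."""
    encodings = ["".join(bits) for bits in product("01", repeat=m) if "1" in bits]
    table = []
    for i in range(m):
        on_path = [b for b in encodings if b[i] == "1"]
        by_crossing = {k: sum(b.count("1") == k for b in on_path)
                       for k in range(1, m + 1)}
        table.append((len(on_path), by_crossing))
    return table

table = nodes_per_path(3)
```

For $m=3$, every path has length $2^{m-1}=4$ and traverses 1, 2 and 1 nodes of crossing number 1, 2 and 3 respectively, i.e., $\binom {2}{k-1}$ nodes for each $k$.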


In Theorem 2 we provide a lower bound to the number of paths to place in order to identify $n$ nodes in a network, in the case a path length constraint is given.

Theorem 2:

The minimum number of paths $m^{AR, d_{\texttt {max}}}_{\texttt {min}}$ of maximum path length $d_{\texttt {max}}$ to identify $n$ nodes under arbitrary routing is the solution of the following problem: $\begin{equation*} \min m\quad \text{s.t.}\quad \left \lfloor{ \frac {l_{\texttt {max}_{ \tilde {k}}} \cdot m}{ \tilde {k}}}\right \rfloor +\sum \limits _{i=1}^{ \tilde {k}-1}\binom {m}{i}\ge n,\tag{1a}\end{equation*}$ where $\begin{align*} \tilde {k}=&\max \left \{{ j: \sum \limits _{i=1}^{j} i \cdot \binom {m}{i} \le m\cdot D}\right \}+1,\tag{1b}\\ D=&\min \left \{{ d_{\texttt {max}}, 2^{m-1} }\right \},\tag{1c}\end{align*}$ and $\begin{equation*} l_{\texttt {max}_{ \tilde {k}}} = D- \sum \limits _{i=0}^{ \tilde {k}-2}\binom {m-1}{i}.\tag{1d}\end{equation*}$

Proof:

In order to minimize the number of paths, we want to have as many distinct encodings as possible with the minimum number of 1s. This fact translates into a strategy that consists in incrementally increasing the crossing number of the monitored nodes as long as the fixed maximum path length $d_{\texttt {max}}$ allows it and paths can still traverse more nodes without violating identifiability, Equation (1c) (see Observation 2).

The quantity $\tilde {k}$ in Equation (1b) says that paths can be placed in such a way that the nodes they traverse are all distinct nodes with crossing number $0 < \chi (v) < \tilde {k}$ . The quantity $m\cdot D$ is a loose upper bound on the maximum number of nodes traversed by $m$ paths, where nodes with crossing number $j$ are counted $j$ times, as $j$ paths traverse them. Depending on the value $l_{\texttt {max}_{ \tilde {k}}}$ in Equation (1d), some more nodes with crossing number $\tilde {k}$ may be traversed. Notice that $l_{\texttt {max}_{ \tilde {k}}}$ represents the number of nodes of crossing number $\tilde {k}$ that each path $p$ can traverse considering that it traversed all distinct nodes with crossing number $< \tilde {k}$ (see Observation 1). Extended to all $m$ paths, $\left \lfloor{ \frac {l_{\texttt {max}_{ \tilde {k}}} \cdot m}{ \tilde {k}}}\right \rfloor$ in Equation (1a) is the number of distinct nodes with crossing number $\tilde {k}$ that can be traversed by $m$ paths of length $l_{\texttt {max}_{ \tilde {k}}}+\sum \limits _{i=0}^{ \tilde {k}-2}\binom {m-1}{i}$ .
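The optimization problem of Theorem 2 can be evaluated numerically by trying increasing values of $m$. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `identifiable_nodes` and `min_paths` are hypothetical, the length constraint `d` is assumed to be at least 1, the index $j$ in Equation (1b) is capped at $m$ (the binomials vanish beyond it), and exact rational arithmetic is used so that fractional lengths do not introduce rounding errors.

```python
from fractions import Fraction
from math import comb, floor

def identifiable_nodes(m, d):
    """Left-hand side of (1a): nodes identifiable by m paths under length limit d."""
    D = Fraction(min(Fraction(d), 2 ** (m - 1)))              # Equation (1c)
    j_max = max(j for j in range(1, m + 1)                    # Equation (1b)
                if sum(i * comb(m, i) for i in range(1, j + 1)) <= m * D)
    k_tilde = j_max + 1
    # Equation (1d): budget left on each path for nodes of crossing number k_tilde
    l_max = D - sum(comb(m - 1, i) for i in range(k_tilde - 1))
    return floor(l_max * m / k_tilde) + sum(comb(m, i) for i in range(1, k_tilde))

def min_paths(n, d):
    """Smallest m satisfying (1a) for n nodes and length limit d >= 1."""
    if n < 1 or d < 1:
        raise ValueError("expected n >= 1 and d >= 1")
    m = 1
    while identifiable_nodes(m, d) < n:
        m += 1
    return m
```

With a length limit of 3, for example, three paths identify at most 6 nodes, so `min_paths(7, 3)` returns 4, whereas `min_paths(7, 4)` returns 3, the value of Proposition 1.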

Constraints on path length are usually imposed by QoS requirements and substantially influence the minimum number of paths needed to identify a certain number of nodes. In a network where shortest path routing schemes are applied, the value of $d_{\texttt {max}}$ is the diameter of the network. In contrast, multi-service networks host more than one service, each accessed by a number of clients. Each service is characterised by a different service level agreement (SLA) that regulates the routing scheme to adopt as well as the reserved portion of the network. QoS requirements may also vary for each service. In this scenario, path lengths may differ from one another depending on what service a path belongs to. Information about path lengths for different services in multi-service networks justifies the introduction of the notion of average path length of a network, $\bar {d}$ . Furthermore, the average path length can be easily computed when the network topology and the routing scheme implemented on it are known.

Corollary 1:

The bound of Theorem 2 also holds when we consider the average path length, $\bar {d}$ , or an upper bound on it, instead of $d_{\texttt {max}}$ . The statement of Theorem 2 only changes in Equation (1c), where the value of $\bar {d}$ is substituted for $d_{\texttt {max}}$ . We call such bound $m^{AR,\bar {d}}_{\texttt {min}}$ .

The bound of Theorem 2 suggests that the number of nodes that $m$ monitoring paths can identify grows with the path length, (Equation (1b)). Nevertheless, we can show that the growth stops for $d_{\texttt {max}}>2^{m-1}$ .

Corollary 2:

The number of nodes that $m$ paths can identify grows with $d_{\texttt {max}}$ as long as $d_{\texttt {max}}\le 2^{m-1}$ .

Proof:

In Observation 1 we point out that, given a set of $m$ paths, each of them can traverse at most $\binom {m - 1}{k - 1}$ nodes with crossing number $\chi (v) = k$ . The maximum value for $\chi (v)$ is $m$ , and therefore the maximum path length for a path is $\sum \limits _{i = 0}^{m - 1}\binom {m-1}{i} = 2^{m-1}$ . This fact motivates the expression in Equation (1c).
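As a worked illustration of this saturation (an added example, computed directly from Equations (1a) to (1d) for $m=3$ paths), the left-hand side of (1a) takes the following values as $d_{\texttt {max}}$ grows:

```latex
\begin{array}{c|c|c|c|l}
d_{\texttt{max}} & D & \tilde{k} & l_{\texttt{max}_{\tilde{k}}} & \text{identifiable nodes} \\ \hline
1 & 1 & 2 & 0 & 0 + \binom{3}{1} = 3 \\
2 & 2 & 2 & 1 & \lfloor 1\cdot 3/2 \rfloor + \binom{3}{1} = 4 \\
3 & 3 & 3 & 0 & 0 + \binom{3}{1} + \binom{3}{2} = 6 \\
4 & 4 & 4 & 0 & 0 + \binom{3}{1} + \binom{3}{2} + \binom{3}{3} = 7 \\
5 & 4 & 4 & 0 & 0 + \binom{3}{1} + \binom{3}{2} + \binom{3}{3} = 7
\end{array}
```

The count increases up to $d_{\texttt {max}} = 2^{m-1} = 4$ and remains at $2^{m}-1 = 7$ afterwards, since $D$ no longer changes.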

We highlight that knowledge of path length does not necessarily imply explicit knowledge of the paths, in terms of which nodes they traverse.

A. Design Via Incremental Crossing Arrangement (ICA)

The proof of Theorem 2 suggests a technique to build a network topology $G=(V,E)$ and related monitoring paths $P$ with maximum identifiability, where $|P|=m$ . We call this technique Incremental Crossing Arrangement (ICA). We shall describe this strategy in detail when the average path length $\bar {d}$ is given, as the case with $d_{\texttt {max}}$ is more general.

ICA, the idea. The technique works by generating $n$ node encodings in increasing order of crossing number with respect to the $m$ monitoring paths in use, where $m$ is the optimal solution of the problem in Theorem 2. The set of encodings defines a design for the monitoring paths deployment, that must traverse nodes according to the generated encodings: path $p_{i}$ traverses any node $v$ for which $b(v)|_{i}=1$ , $\forall i \in \{1, \ldots, m \}$ . The network topology is then constructed by considering a node for each of the generated Boolean encodings, and adding links between any pair of nodes appearing sequentially in any path.

ICA in detail. Algorithm 1 formalizes the incremental crossing arrangement design, used to determine the binary encodings of the identifiable nodes.

Algorithm 1:

Incremental Crossing Arrangement


As we consider $m$ paths, the node encodings will be sequences of $m$ bits in $\mathcal {B} \triangleq \{0,1 \}^{m}$ . We also denote with $\mathcal {B}|_{i} \subset \mathcal {B}$ the set of $m$-digit binary encodings having a 1 in the $i$-th position, i.e., $\mathcal {B}|_{i}= \{ b \in \mathcal {B}\,\,s.t.\,\,b|_{i}=1\}$ . The nodes corresponding to encodings of $\mathcal {B}|_{i}$ will be monitored (at least) by path $p_{i}$ . Moreover, we denote with $\mathcal {B}(k) \subset \mathcal {B}$ the set of all binary encodings having exactly $k$ digits equal to 1, therefore $\mathcal {B}(k) \triangleq \left\{{ b \in \mathcal {B}\,\,s.t.\,\,\sum _{i=1}^{m} b|_{i}= k}\right\}$ . The nodes corresponding to encodings in $\mathcal {B}(k)$ have crossing number equal to $k$ .

Finally, given a generic set of binary encodings $B \subseteq \mathcal {B}$ , we denote with $\ell _{i}(B)$ the number of encodings of $B$ having a one in the $i$ -th position: $\ell _{i}(B) \triangleq |B\cap \mathcal {B}|_{i}|$ . The value of $\ell _{i}(B)$ represents the length of a path $p_{i}$ traversing all the nodes in $B \cap \mathcal {B}|_{i}$ , exactly once.

Without loss of generality, we consider paths of balanced length, i.e. we set the length $d_{i}$ of path $p_{i}$ to a value $d_{i} \in \{ \lfloor \bar {d} \rfloor, \lfloor \bar {d} \rfloor +1 \}$ (lines 2 - 4).

The incremental crossing arrangement approach incrementally generates the solution set $B_{V}$ by including all the encodings of $\mathcal {B}(i)$ , $i=1, \ldots, \tilde {k}-1$ corresponding to nodes with crossing number lower than or equal to $\tilde {k}- 1$ . It then considers some encodings with $\tilde {k}$ digits equal to one. For this purpose it generates a family $\mathcal {F}$ of subsets in $\mathcal {B}({ \tilde {k}})$ , i.e., $\mathcal {F} \subseteq 2^{\mathcal {B}({ \tilde {k}})}$ (line 1) whose elements $B$ are such that $\ell _{k}(B\cup B_{V}) \leq d_{k}$ for every path $k=1, \ldots,m$ . The algorithm then looks for a maximal cardinality set $B^{*}$ in the family $\mathcal {F}$ and adds it to the solution $B_{V}$ , s.t. $B_{V}= \cup _{k=1}^{ \tilde {k}-1} \mathcal {B}(k) \cup B^{*}$ . Notice that the maximality of the cardinality of $B^{*}$ implies that no encoding with $\tilde {k}$ digits equal to one can be added to the set $B_{V}$ without violating the path length constraint $\ell _{k}(B_{V}) \leq d_{k}$ for some path $k=1, \ldots,m$ , or without removing at least one encoding already in $B_{V}$ .

The procedure described so far is sufficient to produce a network topology where all nodes are identifiable and where the number of paths $m$ is the solution of the problem in Theorem 2 given the average path length, $\bar {d}$ . In the produced topology, there can be values of $k \in \{1, \ldots, m \}$ for which $\ell _{k}(B_{V}) < d_{k}$ and, more precisely, given the balanced path length, $\ell _{k}(B_{V}) = d_{k} -1$ , corresponding to paths longer than strictly necessary to identify $n$ nodes, i.e., overlength paths. Overlength paths cannot traverse nodes with the same encoding without compromising the achievement of maximum identifiability. Therefore, in order for the average path length to meet the value $\bar {d}$ , we proceed as follows, with a procedure that we call Path Completion. First, we observe that under ICA, the bound on the minimum number of monitoring paths can sometimes be met tightly even when the average path length is slightly lower than the given $\bar {d}$ . This condition is verified when the ratio inside the floor operator of the bound expression of Theorem 2 is not an integer. Nevertheless, the same bound can still be met tightly with the exact average length provided as input, by operating as follows: let $S \subset \{1, \ldots, m \}$ be the set of overlength path indexes, namely $S \triangleq \{k\,\,s.t.\,\,\ell _{k}(B_{V}) = d_{k} -1 \}$ . It holds $|S|=\left ({m\cdot D - \sum _{i=1}^{ \tilde {k}- 1} i\cdot \binom {m}{i}}\right) \bmod \tilde {k}$ , hence the number of overlength paths is lower than or equal to $\tilde {k}-1$ .

We choose an encoding $b' \in B_{V} \cap \mathcal {B}(\tilde {k}-|S|)$ such that $b'|_{k}=0, \forall k \in S$ , and such that $\left ({\bigvee _{k \in S} \mathbf {e_{k}} \vee b' }\right)\notin B_{V}$ , where $\mathbf {e}_{k}$ is an $m$ -dimensional identity vector with all zeroes but a one in the $k$ -th position.¹ Then we remove $b'$ from the solution set $B_{V}$ and replace it with $b'' \triangleq \bigvee _{k \in S} \mathbf {e_{k}} \vee b'$ , i.e., with a new encoding $b''$ such that $b''|_{k}=1, \forall k \in S$ , and $b''|_{k}=b'|_{k}$ otherwise.

ICA: Example A (where path completion is not necessary). Figure 2 shows an example of a topology generated by means of incremental crossing arrangement. We are given $n=11$ nodes and $\bar {d}=4.75$ . The minimum number of paths satisfying eqs. (1a) to (1d) is $m=4$ . We set $d_{i}=5, \forall i=1,2,3$ and $d_{4} = 4$ (lines 2 - 4). According to ICA, we first generate all the encodings of $\mathcal {B}(1)$ and $\mathcal {B}(2)$ and set $B_{V}= \{ 1000,\,\,0100,\,\,0010,\,\,0001,\,\,1100,\,\,1010,\,\,1001,\,\,0110,\,\,0101,\,\,0011\}$ (line 6). Then we start generating some encodings in $\mathcal {B}(3)=\mathcal {B}(\tilde {k})$ until no other encoding can be added without violating the path length constraint (lines 7 - 9), obtaining $B_{V}= \{ 1000,\,\,0100,\,\,0010,\,\,0001,\,\,1100,\,\,1010,\,\,1001,\,\,0110,\,\,0101,\,\,0011,\,\,1110\}$ , where each encoding corresponds to a node of the graph $G$ . Then we define the corresponding monitoring paths, by letting path $p_{i}$ traverse all the nodes whose encoding has a 1 in the $i$-th position, in arbitrary order, $\forall i \in \{1, \ldots, m \}$ . Finally, we design the underlying topology by connecting each pair of nodes appearing in a sequence in any of the paths, as shown in Figure 2.

FIGURE 2.

ICA execution on Example A. Solid lines represent graph edges.
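The construction of Example A can be reproduced with a greedy sketch of the encoding generation. This is a simplification and not Algorithm 1 itself: instead of searching the family $\mathcal {F}$ for a maximum-cardinality set $B^{*}$, it adds encodings of the last layer one by one in lexicographic order while the per-path length budgets allow it, and it does not stop at a target $n$. The names `ica_encodings` and `paths_and_edges` are hypothetical.

```python
from itertools import combinations
from math import comb

def ica_encodings(lengths):
    """Greedy sketch: fill layers B(1), B(2), ... while per-path budgets allow."""
    m = len(lengths)
    budget = list(lengths)          # remaining nodes each path may still traverse
    chosen = []
    for k in range(1, m + 1):
        # A full layer B(k) adds comb(m-1, k-1) nodes to every path.
        layer_fits = min(budget) >= comb(m - 1, k - 1)
        for ones in combinations(range(m), k):
            if all(budget[i] > 0 for i in ones):
                for i in ones:
                    budget[i] -= 1
                chosen.append("".join("1" if i in ones else "0" for i in range(m)))
        if not layer_fits:          # this was the partial layer: stop here
            break
    return chosen

def paths_and_edges(encodings):
    """Path p_i visits every node with a 1 in position i; consecutive nodes are linked."""
    m = len(encodings[0])
    paths = [[b for b in encodings if b[i] == "1"] for i in range(m)]
    edges = {frozenset(pair) for p in paths for pair in zip(p, p[1:])}
    return paths, edges

B_V = ica_encodings([5, 5, 5, 4])   # d_1 = d_2 = d_3 = 5, d_4 = 4 as in Example A
paths, edges = paths_and_edges(B_V)
```

With these lengths the sketch returns the eleven encodings listed in Example A, ending with `1110`, and the resulting path lengths are 5, 5, 5 and 4. The order in which each path visits its nodes is arbitrary, so the edge set is only one of several valid topologies.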


ICA: Example B (with path completion). Figure 3 shows another example of a topology generated by means of incremental crossing arrangement. We are given $n=11$ and $\bar {d}=5$ . The minimum number of paths satisfying eqs. (1a) to (1d) is again $m=4$ . To meet the requirement on average length, we set $d_{i}=5\, \forall i=1,\ldots,4$ (lines 2 - 4). According to ICA (line 6), we first generate all the encodings of $\mathcal {B}(1)$ and $\mathcal {B}(2)$ and set $B_{V}= \{1000,\,\,0100,\,\,0010,\,\,0001,\,\,{\underline {1100}},\,\,1010,\,\,1001,\,\,0101,\,\,0110,\,\,0011 \}$ . Then we choose one of the possible $B^{*}$ (lines 7 - 9), for instance $B^{*}= \{1110\}$ , obtaining $B_{V}= \{1000,\,\,0100,\,\,0010,\,\,0001,\,\,{\underline {1100}},\,\,1010,\,\,1001,\,\,0101,\,\,0110,\,\,0011,\,\,1110\}$ .

FIGURE 3.

ICA execution on Example B. Solid lines represent graph edges.


Finally, we observe that $\ell _{4} (B_{V}) =4 < d_{4}$ . We then perform the path completion procedure (line 10) and choose one of the encodings $b'$ in $B_{V} \cap \mathcal {B}(\tilde {k}-|S|)=\mathcal {B}(2)$ for which $b'|_{4}=0$ and $b' \vee \mathbf {e_{4}} \notin B_{V}$ . One encoding that satisfies this condition is $b'=1100$ . We replace $b'$ with $b''=1101$ and obtain the set of encodings $\{ 1000,~0100,~0010,~0001,~{\underline {1101}},~1010,~1001,~0101,~0110,~0011$ , $1110\}$ , each corresponding to a node of the graph $G$ . Then we define the corresponding monitoring paths, by letting path $p_{i}$ traverse all the nodes whose encoding has a 1 in the $i$ -th position, in arbitrary order, $\forall i \in \{1, \ldots, m \}$ . Finally, we design the underlying topology by connecting each pair of nodes appearing in a sequence in any of the paths, obtaining the topology of Figure 3.
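The path completion step of Example B can be sketched as follows. The function name `path_completion` is hypothetical, and the sketch assumes that $\tilde {k}$ equals the largest crossing number present in the input set, which holds when at least one encoding of $\mathcal {B}(\tilde {k})$ has already been added.

```python
def path_completion(encodings, lengths):
    """Replace one encoding b' by b'' = b' OR (unit vectors of the overlength paths)."""
    m = len(lengths)
    used = [sum(b[i] == "1" for b in encodings) for i in range(m)]
    S = [i for i in range(m) if used[i] == lengths[i] - 1]   # overlength paths
    if not S:
        return list(encodings)
    k_tilde = max(b.count("1") for b in encodings)           # assumption, see lead-in
    present = set(encodings)
    for idx, b in enumerate(encodings):
        if b.count("1") != k_tilde - len(S) or any(b[i] == "1" for i in S):
            continue
        completed = "".join("1" if i in S else b[i] for i in range(m))
        if completed not in present:
            return encodings[:idx] + [completed] + encodings[idx + 1:]
    return list(encodings)

# Encodings of B(1), B(2) and B* = {1110}, with d_i = 5 for all four paths.
before = ["1000", "0100", "0010", "0001", "1100", "1010", "1001",
          "0110", "0101", "0011", "1110"]
after = path_completion(before, [5, 5, 5, 5])
```

Only path $p_{4}$ is one node short, so $S=\{4\}$ and the first eligible encoding `1100` is replaced by `1101`, after which all four paths have length 5 and all eleven encodings remain distinct.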

It is worth observing the following.

Observation 3:

ICA produces a network topology and related monitoring paths such that all nodes have a crossing number lower than or equal to $\tilde {k}$ .

When knowledge of $\bar {d}$ is not available, and instead we limit the path length to be at most $d_{\texttt {max}}$ , Algorithm 1 can be simplified with the following changes: the number of paths $m$ is computed as in Theorem 2 (line 1). To all paths, we initially assign $d_{i}= d_{\texttt {max}}$ (lines 2-4). The algorithm continues as it is until the condition that some paths satisfy $\ell _{k}(B_{V})= d_{\texttt {max}}-1$ is met (line 10). When $d_{\texttt {max}}$ is used, path completion is not required, and for such paths we simply set $d_{k}=\ell _{k}(B_{V})$ . The returned set of encodings $B_{V}$ can be mapped onto a topology graph where all $n$ nodes are identifiable by using $m$ paths with maximum path length $d_{\texttt {max}}$ .

1) Tightness of the Bound Under Arbitrary Routing

In this section we show that the bound given by Theorem 2 can be achieved tightly for a specific family of topologies constructed via ICA.

Proposition 2 (Tightness of Theorem 2):

For any $n\in \mathbb {Z}^{+}$ (positive integer) and $\bar {d}>0$ , there exists a set $P$ of $m$ monitoring paths with average length $\bar {d}$ , such that $m$ is the solution of the problem in Equations (1a) to (1d).

Proof:

We recall that the ICA technique builds a topology by creating nodes with unique encodings, in increasing order of crossing number, up to $\tilde {k}$ .

To prove the proposition, we need to show that the minimum number of monitoring paths required to identify $n$ nodes is provided in Theorem 2. ICA initially generates all the encodings of $\mathcal {B}(i)$ , for $i=1, \ldots, \tilde {k}- 1$ . As a consequence, it follows from Observation 1 that each path will traverse at least $d(\tilde {k}- 1)\triangleq \sum _{i=0}^{ \tilde {k}-2}{\binom{m-1}{ i}}$ identifiable nodes. In fact, the encodings of the nodes of $\mathcal {I}(p_{i})$ (identifiable nodes traversed by path $p_{i}$ ), must have a “1” in the $i$ -th position. Therefore the number of distinct encodings corresponding to nodes of $\mathcal {I}(p_{i})$ is at least equal to the number of binary sequences of $(m-1)$ elements, with up to $(\tilde {k}-2)$ ones.

Under incremental crossing arrangement, each path also traverses other nodes with crossing number equal to $\tilde {k}$ . Each of these nodes will appear in exactly $\tilde {k}$ paths. The number of such nodes is therefore given by $\left \lfloor{ \frac {\sum _{k=1}^{m} (d_{k} - d(\tilde {k}-1))}{ \tilde {k}}}\right \rfloor$ .

In conclusion, with this construction, ICA generates the following number of node encodings:

$\binom{m }{ i}$ encodings corresponding to nodes with crossing number equal to $i$ , for $i=1,\ldots, \tilde {k}- 1$ , and
$\left \lfloor{ \frac {\sum _{k=1}^{m} (d_{k} - d(\tilde {k}- 1))}{ \tilde {k}}}\right \rfloor$ encodings corresponding to nodes with crossing number equal to $\tilde {k}$ .

The number of generated encodings does not change if ICA applies the path completion procedure, which consists in a replacement of an encoding $b' \in \cup _{i=1}^{ \tilde {k}- 1} \mathcal {B}(i)$ with an encoding $b'' \in \mathcal {B}(\tilde {k})$ . In both cases, ICA constructs the set $B_{V}$ in a way that each encoding corresponds to a unique node, and the nodes are traversed by paths of average length $\bar {d}$ , guaranteeing identifiability of all the nodes corresponding to the generated encodings.

In order to show that the number of paths provided by Theorem 2 is enough to identify at least $n$ nodes in the network, we need to prove that $\left \lfloor{ \frac {\sum _{k=1}^{m} (d_{k} - d(\tilde {k}-1))}{ \tilde {k}} }\right \rfloor = \left \lfloor{ \frac {l_{\max _{ \tilde {k}}}\cdot m}{ \tilde {k}}}\right \rfloor$ . This holds because $\sum _{k=1}^{m} (d_{k} - d(\tilde {k}-1))=\sum _{k=1}^{m} d_{k} - m\cdot d(\tilde {k}-1)$ , that is equal to $l_{\max _{ \tilde {k}}}\cdot m$ in Equation (1d), being $\bar {d}=\frac {1}{m}\sum _{k=1}^{m}d_{k}$ .

Notice that Proposition 2 requires $\bar {d}\leq 2^{m-1}$ as having longer paths would require at least a path to traverse different nodes with duplicate encodings, losing identifiability with respect to the bound value.

While Proposition 2 gives a characterization of sufficient conditions for building a network topology achieving the bound, we note that there exist topologies that do not meet the conditions, but still achieve the bound.

Observation 4:

The statement of Proposition 2 also holds when $d_{\texttt {max}}$ is given instead of $\bar {d}$ . This simply corresponds to the scenario where $d_{i}= d_{\texttt {max}}$ is initially assigned to all paths and path completion is not performed. Indeed, path completion does not increase identifiability; it only serves to meet the input condition on the average path length.

SECTION V.

Consistent Routing

As we have seen in Theorem 2, given a number of nodes to identify, the number of required paths can be logarithmic in the number of nodes. Nevertheless the bound of Theorem 2 is achieved only when the routing scheme allows paths to traverse arbitrary sequences of nodes.

If routing needs to meet additional requirements, the theoretical bound given by Theorem 2 can be increased.

We now consider the impact of the routing scheme on the identifiability of nodes via Boolean tomography. In the sequel, we assume that paths satisfy the following property of routing consistency.

Definition 3:

A set of paths $P$ is consistent if $\forall p,\: p'\in P$ and any two nodes $u$ and $v$ traversed by both paths (if any), $p$ and $p'$ follow the same sub-path between $u$ and $v$ .

We remark that routing consistency is satisfied by many practical routing protocols, including but not limited to shortest path routing (where ties are broken with a unique deterministic rule). Note that routing consistency implies that paths are cycle-free. In the following, we recall from [4] the definitions and lemmas necessary for the proof of Theorem 5, where we provide the bound on the minimum number of paths under consistent routing.
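Definition 3 can be checked mechanically on a given set of paths. The following is a minimal sketch under our own assumptions: paths are represented as lists of node identifiers, they are cycle-free, and the sub-path between two shared nodes is compared regardless of the direction in which each path traverses it. The names `subpath` and `is_consistent` are hypothetical and do not come from [4].

```python
from itertools import combinations


def subpath(path, u, v):
    """Nodes of `path` between u and v (inclusive), oriented from u to v.

    Assumes `path` is cycle-free, so each node occurs at most once.
    """
    i, j = path.index(u), path.index(v)
    if i <= j:
        return path[i:j + 1]
    return path[j:i + 1][::-1]


def is_consistent(paths):
    """Check the routing consistency property on a list of paths.

    For every pair of paths and every pair of nodes traversed by both,
    the two paths must follow the same sub-path between those nodes.
    Orientation is ignored (an assumption of this sketch).
    """
    for p, q in combinations(paths, 2):
        q_nodes = set(q)
        common = [x for x in p if x in q_nodes]
        for u, v in combinations(common, 2):
            if subpath(p, u, v) != subpath(q, u, v):
                return False
    return True
```

For instance, the paths `[1, 2, 3, 4]` and `[0, 2, 3, 5]` share the sub-path between nodes 2 and 3 and are consistent, whereas `[1, 2, 3, 4]` and `[1, 5, 4]` connect nodes 1 and 4 through different sub-paths and are not.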

Lemma 1:

Under the assumption of consistent routing, if any two different rows of the matrix $M(p_{i})$ are equal, then the corresponding nodes are not 1-identifiable.

Definition 4:

A column $M(p)|_{*,k}$ ( $k=1,\ldots,m$ ) of a path matrix $M(p)$ has consecutive ones if all the “1”s appear in consecutive rows, i.e., for any two rows $i$ and $j$ ( $i < j$ ), if $M(p)|_{i,k}=M(p)|_{j,k}=1$ , then $M(p)|_{h,k}=1$ for all $i\leq h \leq j$ .

Lemma 2:

Under the assumption of consistent routing, all the columns in all the path matrices have consecutive ones.

Lemma 3:

Given $m=|P|>1$ consistent routing paths, each path $p_{i}$ having length $d_{i}$ , the maximum number of different encodings in the rows of $M(p_{i})$ is upper-bounded by $\min \{d_{i}; 2 \cdot (m-1)\}$ .

Formal proofs of Lemmas 1 to 3 are given in [4]. To ease their comprehension, we illustrate them with the simple example of Figure 4, where all nodes are 1-identifiable under consistent routing. Routing consistency is verified as the $i$ -th column of all path matrices $M(p_{i})$ is $\boldsymbol {1}$ .
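The objects used in Lemmas 1 to 3 can be made concrete with a small sketch. We assume here that the path matrix $M(p_{i})$ has one row per node of $p_{i}$ (in traversal order) and one column per path, with entry 1 when the path of that column traverses the node of that row; this reading is inferred from the text above and the formal definition is in [4]. The function names are hypothetical.

```python
def path_matrix(paths, i):
    """Assumed form of M(p_i): rows are the nodes of paths[i] in traversal
    order, columns are all paths, entry 1 iff the path traverses the node."""
    node_sets = [set(p) for p in paths]
    return [[1 if v in s else 0 for s in node_sets] for v in paths[i]]


def has_consecutive_ones(matrix):
    """Definition 4, checked on every column of the matrix."""
    for k in range(len(matrix[0])):
        rows = [r for r, row in enumerate(matrix) if row[k] == 1]
        # The ones are consecutive iff they fill the span first..last.
        if rows and rows[-1] - rows[0] + 1 != len(rows):
            return False
    return True


def distinct_encodings(matrix):
    """Number of different rows (node encodings) in the matrix."""
    return len({tuple(row) for row in matrix})


# Three consistent paths: nodes 2 and 3 get equal rows in M(p_0),
# so by Lemma 1 they are not 1-identifiable.
paths = [[1, 2, 3, 4, 5], [0, 2, 3, 6], [7, 4, 5, 8]]
M0 = path_matrix(paths, 0)
```

In this example `M0` has three distinct rows, which is within the limit $\min \{5; 2\cdot (3-1)\}=4$ of Lemma 3, and every column has consecutive ones as stated by Lemma 2.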

FIGURE 4. Consistent routing example.

Theorem 5:

The minimum number of paths $m^{CR, d_{\texttt {max}}}_{\texttt {min}}$ of maximum path length $d_{\texttt {max}}$ to identify $n$ nodes under consistent routing is the solution of the problem of Theorem 2, where Equation (1c) reads: $\begin{equation*} D = \min \left \{{ d_{\texttt {max}}, 2\cdot (m-1) }\right \}.\tag{2}\end{equation*}$

Proof:

The proof is analogous to the one of Theorem 2. The only difference in this case is that $D = \min \left \{{ d_{\texttt {max}}, 2\cdot (m-1) }\right \}$ , because of Lemma 3.

The same considerations on the knowledge of the average path lengths for Theorem 2 hold for Theorem 5:

Corollary 3:

When $\bar {d}$ is known, the optimal solution $m^{CR, d_{\texttt {max}}}_{\texttt {min}}$ of the problem in Theorem 5 is a lower bound to the minimum number of paths to identify $n$ nodes, if we substitute $d_{\texttt {max}}$ with $\bar {d}$ in Equation (2). We call $m^{CR,\bar {d}}_{\texttt {min}}$ the bound computed with $\bar {d}$ .

Corollary 4:

The bound provided in Theorem 5 may be achieved by allowing the maximum value for the crossing number of a node to be 3.

Proof:

We need to prove that the maximum value of $\tilde {k}$ is 3 under the assumption of consistent routing. Let us assume that $\bar {d}\ge 2\cdot (m-1)$ , and therefore that $D = 2\cdot (m-1)$ . Recall that $\tilde {k}= \max \left \{{ j: \sum \limits _{i=1}^{j} i \cdot \binom {m}{i} \le m\cdot D}\right \}+1$ . For $j = 2$ and $\forall m \ge 2$ , it holds that $\begin{equation*} \sum \limits _{i=1}^{2}~i \cdot \binom {m}{i}=m + 2 \frac {m(m-1)}{2}=m^{2} \le 2m\cdot (m-1),\end{equation*}$ whereas for $j = 3$ : $\begin{align*} \sum \limits _{i=1}^{3}~i \cdot \binom {m}{i}=&m^{2} + 3 \frac {(m-2)(m-1)m}{6} \\=&\frac {m^{3} - m^{2}}{2}+ m>2m\cdot (m-1) \quad \forall m > 3.\end{align*}$ Since $\sum \limits _{i=1}^{N} i \cdot \binom {m}{i}$ is an increasing function of $N$ , $\tilde {k}$ is at most 3.
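The value of $\tilde {k}$ used in this proof can also be evaluated numerically. The sketch below implements the definition recalled above, under two assumptions of ours: $j$ ranges over $0,\ldots,m$ , and the maximum over an empty set is taken as 0. The function name `k_tilde` is hypothetical. Since the difference between the $j=3$ sum and $2m\cdot (m-1)$ equals $m(m-2)(m-3)/2$ , the comparison holds with equality for $m=2$ and $m=3$ , so the check is restricted to $m\ge 4$ .

```python
from math import comb


def k_tilde(m, D):
    """k~ = max{ j : sum_{i=1..j} i*C(m,i) <= m*D } + 1.

    j is scanned from 0 up to m; if no j >= 1 satisfies the inequality,
    the maximum is taken as 0 (assumption of this sketch).
    """
    total, j = 0, 0
    while j < m and total + (j + 1) * comb(m, j + 1) <= m * D:
        j += 1
        total += j * comb(m, j)
    return j + 1


# Under consistent routing with D = 2*(m-1), k~ equals 3 for every m >= 4.
values = {m: k_tilde(m, 2 * (m - 1)) for m in range(4, 50)}
```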

A. Case Study: Grid Networks

The bound provided in Theorem 5 is tight on square grid networks with $n^{2}$ nodes, using $2n - 1$ paths of maximum length $d_{\texttt {max}}=n$ . By contradiction, assume $d_{\texttt {max}}= n$ and $m^{CR, d_{\texttt {max}}}_{\texttt {min}} = 2n-2$ . It is easy to see that $\tilde {k}=2\,\,\forall n\in \mathbb {N}$ , $n > 2$ . Therefore $l_{\max _{ \tilde {k}}} = n - 1$ and the number of nodes that may be identified with $m = 2n-2$ paths is $(n-1)^{2} + 2n-2 = n^{2} -1 < n^{2}$ , so $2n-2$ paths are not sufficient. An example of such a topology and of the corresponding path placement is shown in Figure 5.
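The grid computation can be reproduced with a short script. This is a sketch reconstructed only from the formulas quoted in this article (the encoding count of ICA, the definition of $\tilde {k}$ , and the values of $D$ for the two routing schemes), not the authors' Matlab implementation. It assumes that each path traverses $d(j)=\sum _{i=1}^{j}\binom {m-1}{i-1}$ nodes of crossing number at most $j$ ; the names `max_identifiable` and `min_paths` are hypothetical.

```python
from math import comb


def max_identifiable(m, d_max, consistent=False):
    """Largest number of nodes identifiable with m paths of length <= d_max,
    according to the encoding count used in the bounds (reconstruction)."""
    cap = 2 * (m - 1) if consistent else 2 ** (m - 1)
    D = min(d_max, cap)
    covered = 0  # d(j): nodes of crossing number <= j on each path (assumed form)
    count = 0    # encodings with crossing number <= j
    j = 0
    while j < m and covered + comb(m - 1, j) <= D:
        covered += comb(m - 1, j)
        j += 1
        count += comb(m, j)
    k_t = j + 1            # crossing number of the remaining nodes
    l_max = D - covered    # residual length available on each path
    return count + (l_max * m) // k_t


def min_paths(n, d_max, consistent=False):
    """Smallest m whose encoding count reaches n (linear scan)."""
    m = 2 if consistent else 1
    while max_identifiable(m, d_max, consistent) < n:
        m += 1
    return m
```

For a $3\times 3$ grid, `max_identifiable(4, 3, consistent=True)` gives 8 nodes, one short of the 9 required, and `min_paths(9, 3, consistent=True)` gives 5, in agreement with the $2n-1$ paths stated above.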

FIGURE 5. $3\times 3$ grid network.

SECTION VI.

Experimental Results

We evaluate the tightness of the bounds proposed in the previous sections in comparison with the results obtained by a state-of-the-art heuristic [5]. For this purpose we run experiments on synthetic as well as real network topologies, implemented in Matlab. First, in Figures 6 and 7 we show the trend of the bounds on the minimum number of paths for the identification of $n$ nodes, for the two cases of arbitrary and consistent routing, respectively (i.e., bounds of Theorems 2 and 5), under varying $n$ and path length $d_{\texttt {max}}$ . Observe that the dependence on the path length of the values of $m^{AR, d_{\texttt {max}}}_{\texttt {min}}$ and $m^{CR, d_{\texttt {max}}}_{\texttt {min}}$ is stronger for smaller values of $d_{\texttt {max}}$ . This is an expected behaviour, as in Equation (1d), it holds that $D = 2^{m-1}$ for all values of $d_{\texttt {max}}\ge 2^{m-1}$ . As a result, for all such values of $d_{\texttt {max}}$ , the minimum number of paths needed to identify $n$ nodes is the same. This phenomenon is more evident in the case of consistent routing (Figure 7) because $D = \min \{ d_{\texttt {max}}, 2\cdot (m -1)\}$ (see Equation (2)), and therefore $D = 2\cdot (m-1)$ for all $d_{\texttt {max}}\ge 2\cdot (m-1)$ .

FIGURE 6. Bound of Theorem 2 $m^{AR, d_{\texttt {max}}}_{\texttt {min}}$ , varying path lengths.

FIGURE 7. Bound of Theorem 5 $m^{CR, d_{\texttt {max}}}_{\texttt {min}}$ , varying path lengths.

A. Topologies

We hereafter list the topologies (synthetic and real) used in our evaluation:

Random Geometric (RG) graphs. RG graphs are synthetic topologies [15] built as follows: $n$ nodes are placed in a unit square and a link is added between any pair of nodes whose distance is lower than or equal to a threshold parameter $\rho >0$ . In our experiments, we generate random coordinates $(x_{i},y_{i})$ for each node $v_{i}$ and we vary the value of $\rho$ . This model is well suited to simulating ad-hoc wireless networks.
Jellyfish topology. Introduced in [16], Jellyfish are emerging data centre topologies which offer high throughput and capacity, high scalability and failure resiliency. The internal nodes of the Jellyfish (nodes with degree strictly greater than one) are network switches, whereas leaf nodes are servers.
US Signal. This is the real topology of a fiber optical network in the USA. This topology was made available in the Topology Zoo archive [17], and is composed of 63 nodes and 133 edges.
Uninett. This is an existing Internet topology located in Norway. It has 69 nodes and 98 edges. It was also taken from the Topology Zoo archive [17].
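The construction of the RG graphs listed above can be sketched as follows. This is an illustrative Python version, not the Matlab code used for the experiments; it assumes coordinates drawn uniformly at random in the unit square, and the function name and the `seed` parameter are our own.

```python
import math
import random


def random_geometric_graph(n, rho, seed=None):
    """Place n nodes uniformly at random in the unit square and link every
    pair of nodes whose Euclidean distance is at most rho.

    Returns (pos, adj): the list of coordinates and an adjacency dictionary.
    """
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pos[i], pos[j]) <= rho:
                adj[i].add(j)
                adj[j].add(i)
    return pos, adj


pos, adj = random_geometric_graph(100, 0.2, seed=0)
average_degree = sum(len(neigh) for neigh in adj.values()) / len(adj)
```

The graph returned for a given $\rho$ is not necessarily connected, so a connectivity check is still needed before placing paths.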

B. Benchmark Heuristic

In order to evaluate the tightness of our bound, we use a state-of-the-art greedy algorithm for identifiability (GI), proposed in [5], as a benchmark for comparisons, and we adapt it to our scenario. GI was originally proposed as an algorithm to place servers for addressing Quality of Service (QoS) and failure identifiability requirements in a joint manner. Given multiple services and related client sets, the algorithm finds the most suitable server location, among those satisfying QoS requirements, to optimize failure identifiability. The selected server locations are such that the client-to-server paths form several intersecting trees, one for each service, where servers are located at the roots and clients are located at the leaves. GI uses a greedy approach that iteratively selects a number of server positions, one for each service, such that the identifiability obtained by the adopted client-to-server paths is maximized at each iterative step. The authors prove that the number of paths placed by this heuristic is a constant approximation of the optimal solution.

In our experiments we modify the GI approach to obtain an upper bound to the number of paths that are necessary to uniquely identify the state of a given number of nodes $n$ . In particular, in order to ensure maximum flexibility in the choice of the set of monitoring paths, we consider only one client for each server, and a number of paths that is equal to the number of deployed servers. Moreover, we relax the quality of service requirements to obtain all the server locations which are at a distance lower than or equal to $d_{\texttt {max}}$ from the client. The algorithm ends as soon as the selected paths are sufficient to identify the desired number of nodes $n$ .
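The stopping condition can be illustrated with a simplified greedy loop. The sketch below is not the GI algorithm of [5]: it has no servers, clients or QoS filtering, and simply picks, from a given list of candidate paths, the one that maximizes the number of identifiable nodes at each step. A node is counted as identifiable when the set of selected paths crossing it (its encoding) is non-empty and unique. All names are hypothetical.

```python
from collections import Counter


def identifiable_nodes(paths):
    """Nodes whose encoding (indices of the paths crossing them) is
    non-empty and not shared with any other node."""
    enc = {}
    for k, p in enumerate(paths):
        for v in p:
            enc.setdefault(v, set()).add(k)
    freq = Counter(frozenset(s) for s in enc.values())
    return {v for v, s in enc.items() if freq[frozenset(s)] == 1}


def greedy_paths(candidates, n_target):
    """Simplified greedy selection (illustration only, not GI from [5]).

    Adds one candidate path at a time, choosing the one that yields the
    most identifiable nodes, and stops as soon as n_target is reached.
    Ties are broken by the order of the candidate list.
    """
    chosen, remaining = [], list(candidates)
    while remaining and len(identifiable_nodes(chosen)) < n_target:
        best = max(remaining, key=lambda p: len(identifiable_nodes(chosen + [p])))
        chosen.append(best)
        remaining.remove(best)
    return chosen


selected = greedy_paths([[1, 2], [2, 3], [1, 3]], n_target=3)
```

In this toy instance a single path identifies no node, since its two nodes share the same encoding, while two intersecting paths identify all three nodes.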

In our implementation of GI, communication between any two endpoints is obtained through a shortest path routing algorithm. The adoption of a deterministic tie breaking rule ensures that the obtained routing scheme is deterministic. In order to prevent the use of degenerate paths (i.e., paths traversing only one node), servers are never located on the same node as the related client.
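One possible way to obtain deterministic shortest paths is a breadth-first search that visits neighbours in a fixed order. The sketch below is an assumption of ours: the tie breaking rule actually used in the experiments is not specified in the text, and here ties are resolved by exploring neighbours in increasing node identifier. The function name is hypothetical.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Hop-count shortest path from src to dst by breadth-first search.

    Neighbours are explored in sorted order, so repeated calls on the same
    graph always return the same path. Returns None if dst is unreachable.
    """
    if src == dst:
        raise ValueError("degenerate path: endpoints must differ")
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for w in sorted(adj[u]):
            if w not in parent:
                parent[w] = u
                queue.append(w)
    if dst not in parent:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Two equally short routes exist from 0 to 3; the rule selects the one via 1.
square = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
route = shortest_path(square, 0, 3)
```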

In order to boost the identifiability achieved by the greedy approach, we consider a preliminary phase where a set of paths is deployed according to a greedy node-coverage approach. This coverage phase is also common to [18]. It is motivated by the observation that greedy approaches for identifiability select short paths in the early iterations to obtain maximum identifiability, which prevents further identification in the later steps of the algorithm due to insufficient node coverage. This issue is easily solved by letting the algorithm use longer paths in the early execution steps.

C. Tests

The bounds of Theorems 2 and 5 are compared with the results obtained by GI, which provides an upper bound to the minimum number of paths that are necessary to uniquely identify the state of a given number of nodes $n$ . We carry out two different sets of experiments. In the first set, the maximum path length is fixed, $d_{\texttt {max}}= 4$ , whereas the number of nodes to be identified is variable. Figure 8 shows the curves of the number of paths necessary to identify an increasing number of nodes for GI with respect to the bounds. The bounds are also evaluated with the average path length resulting from the path choices of GI. Each figure represents the experiment on a different topology. For random topologies (random geometric graphs and jellyfish topologies), we generate graphs of 100 nodes. GI is then run on all such topologies with an increasing number of nodes to identify (from 10 to 80). For the fiber and the internet topologies, instead, the number of nodes to identify goes from 10 to 60, since the networks have 63 and 69 nodes in total, respectively.

FIGURE 8. Number of paths to identify variable numbers of nodes on different topologies. $d_{\texttt {max}}=4$ .

A similar setting was also implemented in the second set of experiments, depicted in Figure 9. For these experiments, curves represent how the number of paths necessary to identify a fixed number of nodes $n$ changes depending on different values of $d_{\texttt {max}}$ . For random topologies, $n=80$ , whereas for US Signal and Uninett, $n = 60$ . Also in this case, we generated random topologies having 100 nodes.

FIGURE 9. Number of paths to identify $n$ nodes with variable values of $d_{\texttt {max}}$ on different topologies.

Tests on random topologies have been executed by generating 20 different graphs of each type. Shades and bars in the curves of random topologies represent the standard deviation of the mean number of paths used by GI and of the bounds with variable average path length, $\bar {d}$ (Figures 8a–8d and Figures 9a–9d). In contrast, shades and bars in the curves related to real topologies (US Signal and Uninett, Figures 8e and 8f and Figures 9e and 9f) represent the standard deviation of the mean number of paths used by a randomized version of GI where routing consistency is still satisfied.

We tested on random geometric graphs with different values of $\rho$ (0.1, 0.2, 0.3) in order to assess the quality of our bounds on graphs with different properties. When generating random geometric graphs, there is no guarantee that the resulting graph is connected, unless $\rho = \sqrt {2}$ (in this case, the RG graph is a clique, since $\sqrt {2}$ is the maximum distance between two nodes in a unit square). In our experiments, we encountered non-connected graphs only for $\rho =0.1$ . When this event occurs, we link together the least number of nodes belonging to different connected components that are the closest (in terms of Euclidean distance), until the graph is connected. Variations of $\rho$ have a great impact on the structure of the resulting graph. The characteristics of the set of 20 random geometric graphs generated with 100 nodes, with respect to different values of $\rho$ , are summed up in Table 2. Figure 10 shows an example of how different values of $\rho$ change the structure of a random geometric graph with 100 nodes.
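The repair step for disconnected graphs admits more than one reading. The sketch below implements one plausible interpretation, which is an assumption of ours: while the graph has more than one connected component, the closest pair of nodes lying in different components is linked, so that exactly one link is added per merge. The function name is hypothetical, and `pos` maps each node to its coordinates.

```python
import math


def connect_components(pos, adj):
    """Link the closest pair of nodes in different connected components,
    repeating until the graph is connected (one reading of the repair step).

    Modifies adj in place and returns the list of added links.
    """
    def components():
        seen, comps = set(), []
        for s in adj:
            if s in seen:
                continue
            comp, stack = {s}, [s]
            while stack:
                u = stack.pop()
                for w in adj[u]:
                    if w not in comp:
                        comp.add(w)
                        stack.append(w)
            seen |= comp
            comps.append(comp)
        return comps

    added = []
    comps = components()
    while len(comps) > 1:
        label = {v: c for c, comp in enumerate(comps) for v in comp}
        u, v = min(
            ((a, b) for a in adj for b in adj if a < b and label[a] != label[b]),
            key=lambda e: math.dist(pos[e[0]], pos[e[1]]),
        )
        adj[u].add(v)
        adj[v].add(u)
        added.append((u, v))
        comps = components()
    return added


# Three components on a line: {0, 1}, {2, 3} and the isolated node 4.
toy_pos = {0: (0.0, 0.0), 1: (0.05, 0.0), 2: (0.5, 0.0), 3: (0.55, 0.0), 4: (0.9, 0.0)}
toy_adj = {0: {1}, 1: {0}, 2: {3}, 3: {2}, 4: set()}
new_links = connect_components(toy_pos, toy_adj)
```

With this reading, a graph with $c$ connected components receives exactly $c-1$ additional links.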

TABLE 2. Average properties of 20 random geometric graphs of 100 nodes for different values of $\rho$ . Here $\partial_{min/max/avg}$ are the mean minimum, maximum and average degrees of the nodes, and $\delta$ is the average diameter of the graph.

FIGURE 10. Random geometric graphs with 100 nodes having the same geometric coordinates, built with different values of $\rho$ .

Notice that, when $\rho$ increases, so does the average node degree, whereas the network diameter decreases correspondingly. When the maximum path length is fixed to 4 (Figures 8a–8c), different values of $\rho$ do not imply appreciable differences in the performance of GI, whose trend corresponds to the one of our bounds. This is consistent with the fact that our bounds do not depend on the topology structure, even when structures are extremely different. On the other hand, when the maximum path length is variable (Figures 9a–9c), we can observe the following facts. First, in order to test for $d_{\texttt {max}}=7$ , the graph diameter $\delta$ must be greater than or equal to 7. In our experiments, this condition is always met for $\rho =0.1$ and 0.2, whereas the same does not hold for $\rho =0.3$ (see Table 2, column $\rho = 0.3$ , where $\delta < 7$ ). In Table 2, the column $\rho =0.3^{*}$ corresponds to random geometric graphs with $\rho = 0.3$ that satisfy the condition $\delta \ge 7$ . We generated graphs satisfying this condition for the experiments in Figure 9c. Figures 9a–9c show that our bounds are closer to the curves representing the performance of GI when $\rho = 0.2$ . As a matter of fact, when $\rho = 0.3$ , the average distance between nodes (in number of hops) is smaller, and the graph diameter is never greater than 7, our maximum path length. As a consequence, nodes are highly connected on average. For this reason, despite the greater path length availability, only very few paths reach the maximum path length. Moreover, after the coverage phase, a larger number of shorter paths is needed in order to guarantee that $n=80$ nodes are identifiable, implying that the average path length decreases, as we can observe from the growing trend of the curves $m^{AR,\bar {d}}_{\texttt {min}}$ and $m^{CR,\bar {d}}_{\texttt {min}}$ .
For $\rho = 0.1$ , instead, a few nodes have a high centrality, whereas most of the nodes have degree 2, meaning that the majority of the network nodes are distributed in chains. As node identifiability holds when nodes have unique encodings, i.e., different sets of paths crossing them, chains are structures that are hard to identify by means of monitoring paths, and large path lengths do not represent an advantage for this specific structure. When $\rho =0.2$ , the resulting graphs are not star-shaped, but at the same time long paths are available, showing their stronger identification power.

On jellyfish topologies (Figures 8d and 9d), our bounds are very close to the results obtained by GI, with negligible differences between the curves. One of the properties of jellyfish topologies is that servers can reach one another with shorter paths with respect to other topologies for data centres (e.g., fat trees) [16]. For this reason, the curves of our bounds are tighter for smaller values of $d_{\texttt {max}}$ in Figure 9d. Despite this, in both configurations (Figures 8d and 9d) the curves representing the results of GI scale analogously with our bounds.

The results shown in this section highlight that our bounds are very close to the number of monitoring paths that a greedy algorithm would use in order to guarantee node identifiability. We designed our experiments to evaluate the bounds on synthetic networks with very different structures and connectivity properties. Experiments show that the bounds presented in this work give a very reliable estimate of the number of paths for node identifiability on Jellyfish topologies, as well as on real internet and fiber optical networks. In addition, we can also observe that knowledge of $\bar {d}$ can be used to provide tighter bounds, especially when there are few paths of length $d_{\texttt {max}}$ in the network.

SECTION VII.

Conclusion

In this article we provide theoretical lower bounds on the minimum number of monitoring end-to-end paths necessary to achieve the desired level of identifiability in a network in terms of identifiable nodes. We study how the routing scheme affects the bound values by giving two different formulations, for arbitrary and consistent routing, respectively. We also study how requirements on the maximum and average path length affect the bound formulation, highlighting the dependence of the minimal number of required paths on QoS constraints. We propose a polynomial-time algorithm, called ICA, that takes into consideration such constraints to design a network that meets the bounds for the case of arbitrary routing. We carry out an extensive set of experiments on synthetic and real topologies to evaluate the tightness of the proposed lower bounds. The synthetic topologies that we use are commonly employed for modelling ad-hoc wireless networks and data centers, whereas the real networks are an internet and an optical fiber network located in Norway and in the USA, respectively. We use a state-of-the-art algorithm for network identifiability maximization via path deployment to obtain feasible solutions as upper bounds of the optimum and evaluate the tightness of the proposed lower bounds. We show that the provided bounds are a close approximation in all the performed experiments.
