Introduction
With the massive growth of the Internet, localizing node failures has become a crucial task. Single organizations have direct access only to limited portions of the internal nodes of the network, and they hardly collaborate in sharing internal performance observations because of commercial conflicts. Boolean Network Tomography (BNT) provides tools for assessing the state of a network through end-to-end monitoring paths, as they do not rely on administrative access privileges, [1]. Boolean Network Tomography overcomes the limitations faced by traditional network monitoring approaches based on pervasively deployed monitoring agents (e.g., SNMP) or pervasively supported network protocols (e.g., traceroute) caused by the complexity and the heterogeneity of modern computer communication networks. As a matter of facts, bugs and configuration errors in various customer software and network functions often induce “silent failures” that are only detectable from end-to-end connection states, [2].
Each monitoring path can be modelled as the sequence of nodes that it traverses. When a packet correctly reaches its final destination, we say that its path is in a working state, and so are its traversed nodes. On the other hand, when packet losses occur along a path, we say that the path failed. The latter situation arises when at least one of the nodes of the path is failed. Observations of the outcome of monitoring paths (working/failed) induce a system of Boolean equations where the unknowns are the Boolean states of the nodes in the network. The challenge related to this approach is that such Boolean systems are commonly under-determined, hence allowing multiple solutions, i.e., multiple failing scenarios that lead to the same observations on the path probe outcome, [3]. When the state of a failed node can be uniquely determined from the Boolean system induced by the outcome of monitoring paths, the node is said to be identifiable. Therefore, identifiability is a desirable property that allows unambiguous nodes’ state classification.
In this work, we provide topology-agnostic lower-bounds to the minimum number of measurement paths which are necessary to guarantee identifiability to a desired number of nodes. Such bounds represent the dual solution to the optimization problem studied in [4], where we introduced upper-bounds to the maximum number of identifiable nodes given a number of monitoring paths. In contrast with existing literature, we propose theoretical lower-bounds that cannot be violated, independently of the specific characteristics of the topology. The bounds formulations are only based on the number of nodes to identify, on high level routing consistency properties (arbitrary and consistent routing), and on QoS requirements, expressed in terms of maximum allowed path length. Motivated by the need to complement the analysis of [4], our bounds are a useful tool to measure the capability of a monitored topology to efficiently identify the status of its components. Implementing a monitoring system comes with the cost of installing monitors on the nodes of a network and of traffic caused by path probing; with this work, we aim at providing fundamental guidelines and minimal requirements for achieving the desired level of network identifiability (e.g., number of identifiable nodes). In addition, we formalize the Incremental Crossing Arrangement (ICA) procedure to generate monitoring schemes and underlying topologies that meet the bounds tightly, giving insights on which topology is the most suitable for failure localization. With ICA we formalize a network engineering approach which is at the basis of the theoretical analysis started in [4] and completed in this article.
We hereby list the major contribution of this work.
We study theoretical bounds on the minimum number of paths to deploy in a network for identifying a desired number of nodes. The bounds do not depend on specific network topologies (i.e., they are topology agnostic bounds).
We provide the Incremental Crossing Arrangement (ICA) algorithm, which allows topology design meeting the proposed bounds.
We evaluate the tightness of our bounds on both synthetic and real network topologies. For this purpose we compare the bounds with the results of a state-of-the-art greedy algorithm, hereby referred to as Greedy for Identifiability (GI), for maximizing network identifiability by means of client-to-server probing paths [5].
Related Work
Boolean Network Tomography studies non linear relationships existing between the paths and their components, as in the case of congestion or failure localization. The early works on this topic focused on best-effort inference. For example, Duffield [6], [7] and Kompella et al. [2] aimed at finding the minimum set of failures that can explain the observed measurements, and Nguyen and Thiran [3] aimed at finding the most likely failure set that explains the observations by a probabilistic analysis of a set of experiments. More recently and with a similar goal, the authors of [8] build a Markov Decision Process in light of passive traffic data, and solve the tomography problem with a
Lately, Ma et al. [9] give characterizations of maximum identifiability of node failure under different end-to-end monitoring systems, and extend this work in [10], where they outline topology-specific properties on the number of nodes whose states can be identified under a given number of failures. In contrast, as specified in [4], we provide general, topology-agnostic bounds. An optimal monitor placement for ensuring node
With this article, we complete the analysis of the fundamental bounds on node identifiability in Boolean Network Tomography introduced in [4].
Problem Formulation
We represent a network as a undirect graph
We call failure set of a network,
Definition 1:
A node
Node identifiability allows non ambiguous node state assessment by means of end-to-end measurement paths. We highlight that in order to be identifiable, a node must be monitored at least by one path. For this reason, a node whose binary encoding is null cannot be identifiable. In this article, we give bounds on the minimum number of paths that are needed for letting
Arbitrary Routing
In this section we study the minimum number of monitoring paths that can be employed to identify
In Section III, we explained that nodes can be represented with binary encodings depending on what paths traverse them. In addition, we noticed that, in order for nodes in a network to be identifiable, they must have all different encodings. Since the number of different binary encodings of length
Proposition 1:
The minimum number of monitoring paths to place in order to identify
The bound represented by
Observation 1:
The number of distinct binary strings in
Observation 2:
The maximum length
These observations are illustrated in Figure 1. In this simple example we consider
In Theorem 2 we provide a lower bound to the number of paths to place in order to identify
Theorem 2:
The minimum number of paths \begin{equation*} \min m\quad s.t. \left \lfloor{ \frac {l_{max_{ \tilde {k}}} \cdot m}{ \tilde {k}}}\right \rfloor +\sum \limits _{i=1}^{ \tilde {k}-1}\binom {m}{i}\ge n,\tag{1a}\end{equation*}
\begin{align*} \tilde {k}=&\max \left \{{ j: \sum \limits _{i=1}^{j} i \cdot \binom {m}{i} \le m\cdot D}\right \}+1,\tag{1b}\\ D=&\min \left \{{ d_{\texttt {max}}, 2^{m-1} }\right \},\tag{1c}\end{align*}
\begin{equation*} l_{max_{ \tilde {k}}} = D- \sum \limits _{i=0}^{ \tilde {k}-2}\binom {m-1}{i}.\tag{1d}\end{equation*}
Proof:
In order to minimize the number of paths, we want to have as many distinct encodings as possible with the minimum number of 1s. This fact translates into a strategy that consists in incrementally increasing the crossing number of the monitored nodes until the fixed average path length
The quantity
Constraints on paths length are usually imposed by QoS requirements and influence substantially the minimum amount of paths needed to identify a certain number of nodes. In a network where shortest path routing schemes are applied, the value of
Corollary 1:
The bound of Theorem 2 holds also when we consider the average path length,
The bound of Theorem 2 suggests that the number of nodes that
Corollary 2:
The number of nodes that
Proof:
In Observation 1 we point out that, given a set of
We highlight that knowledge of path length does not necessarily imply explicit knowledge of the paths - in terms of what nodes they traverse.
A. Design Via Incremental Crossing Arrangement (ICA)
The proof of Theorem 2 suggests a technique to build a network topology
ICA, the idea. The technique works by generating
ICA in details. Algorithm 1 formalizes the incremental crossing arrangement design, used to determine the binary encodings of the identifiable nodes.
As we consider
Finally, given a generic set of binary encodings
Without loss of generality, we consider paths of balanced length, i.e. we set the length
The incremental crossing arrangement approach incrementally generates the solution set
The procedure described so far is sufficient to produce a network topology where all nodes are identifiable and where the number of paths
We choose an encoding
ICA: Example A (where path completion is not necessary). Figure 2 shows an example of a topology generated by means of incremental crossing arrangement. We are given
ICA: Example B (with path completion). Figure 3 shows another example of a topology generated by means of incremental crossing arrangement. We are given
Finally, we observe that
It is worth observing the following.
Observation 3:
ICA produces a network topology and related monitoring paths such that all nodes have a crossing number lower than or equal to
When knowledge of
1) Tightness of the Bound Under Arbitrary Routing
In this section we show that the bound given by Theorem 2 can be achieved tightly for a specific family of topologies constructed via ICA.
Proposition 2 (Tightness ofTheorem 2):
For any
Proof:
We recall that the ICA technique builds a topology by creating nodes with unique encodings, in increasing order of crossing number, up to
To prove the proposition, we need to show that the minimum number of monitoring paths required to identify
Under incremental crossing arrangement, each path also traverses other nodes with crossing number equal to
In conclusion, with this construction, ICA generates the following number of node encodings:
encodings corresponding to nodes with crossing number equal to\binom{m }{ i} , fori , andi=1,\ldots, \tilde {k}- 1 encodings corresponding to nodes with crossing number equal to\left \lfloor{ \frac {\sum _{k=1}^{m} (d_{k} - d(\tilde {k}- 1))}{ \tilde {k}}}\right \rfloor .\tilde {k}
The number of generated encodings does not change if ICA applies the path completion procedure, which consists in a replacement of an encoding
In order to show that the number of paths provided by Theorem 2 is enough to identify at least
Notice that Proposition 2 requires
While Proposition 2 gives a characterization of sufficient conditions for building a network topology achieving the bound, we note that there exist topologies that do not meet the conditions, but still achieve the bound.
Observation 4:
The statement of Proposition 2 holds also when
Consistent Routing
As we have seen in Theorem 2, given a number of nodes to identify, the number of required paths can be logarithmic in the number of nodes. Nevertheless the bound of Theorem 2 is achieved only when the routing scheme allows paths to traverse arbitrary sequences of nodes.
If routing needs to meet additional requirements, the theoretical bound given by Theorem 2 can be increased.
We now consider the impact of the routing scheme on the identifiability of nodes via Boolean tomography. In the sequel, we assume that paths satisfy the following property of routing consistency.
Definition 3:
A set of paths
We remark that routing consistency is satisfied by many practical routing protocols, including but not limited to shortest path routing (where ties are broken with a unique deterministic rule). Note that routing consistency implies that paths are cycle-free. In the following, we recall from [4] necessary for the proof of Theorem 15, where we provide the bound on the minimum number of paths under consistent routing.
Lemma 1:
Under the assumption of consistent routing, if any two different rows of the matrix
Definition 4:
A column
Lemma 2:
Under the assumption of consistent routing, all the columns in all the path matrices have consecutive ones.
Lemma 3:
Given
Formal proofs to Lemma 1 to 3 are given in [4]. In order to ease their comprehension, we show them in the simple example of Figure 4, where all nodes are 1-identifiable under consistent routing. Routing consistency is verified as the
Theorem 5:
The minimum number of paths \begin{equation*} D = \min \left \{{ d_{\texttt {max}}, 2\cdot (m-1) }\right \}.\tag{2}\end{equation*}
The same considerations on the knowledge of the average path lengths for Theorem 2 hold for Theorem 5:
Corollary 3:
When
Corollary 4:
The bound provided in Theorem 5 may be achieved by allowing the maximum value for the crossing number of a node to be 3.
Proof:
We need to prove that the maximum value of \begin{equation*} \sum \limits _{i=1}^{2}~i \cdot \binom {m}{i}=m + 2 \frac {m(m-1)}{2}=m^{2} < 2m\cdot (m-1),\end{equation*}
\begin{align*} \sum \limits _{i=1}^{3}~i \cdot \binom {m}{i}=&m^{2} + 3 \frac {(m-2)(m-1)m}{6} \\=&\frac {m^{3} - m^{2}}{2}+ m>2m\cdot (m-1) \quad \forall m.\end{align*}
A. Case of Study: Grid Networks
The bound provided in Theorem 5 is tight on square grid networks with
Experimental Results
We evaluate the tightness of the bounds proposed in the previous sections in comparison to with the results obtained by a state-of-the-art heuristic ([5]). For this purpose we run experiments on synthetic as well as real network topologies, implemented in Matlab. First, in Figures 6 and 7 we show the trend of the bounds on the minimum number of paths for the identification of
Bound of Theorem 2
Bound of Theorem 5
A. Topologies
We hereafter list the topologies (synthetic and real) used in our evaluation:
Random Geometric (RG) graphs. RG graphs are synthetic topologies [15] built as follows:
nodes are placed in a unit square and a link is added between any pair of nodes whose distance is lower than or equal to a threshold parametern . In our experiments, we generate random coordinates\rho >0 for each node(x_{i},y_{i}) and we vary the value ofv_{i} . This model well simulates ad-hoc wireless networks.\rho Jellyfish topology. Introduced in [16], Jellyfish are emerging data centre topologies which offer high throughput and capacity, high scalability and failure resiliency. The internal nodes of the Jellyfish (nodes with degree strictly greater than one) are network switches, whereas leaf nodes are servers.
US Signal. This is the real topology of a fiber optical network in the USA. This topology was made available in the Topology Zoo archive [17], and is composed of 63 nodes and 133 edges.
Uninett. This is an existing Internet topology located in Norway. It has 69 nodes and 98 edges. It was also taken from the Topology Zoo archive [17].
B. Benchmark Heuristic
In order to evaluate the tightness of our bound, we use a state-of-the-art greedy for identifiability (GI) proposed in [5], as a benchmark for comparisons, and we adapt it to our scenario. GI was originally proposed as an algorithm to place servers for addressing Quality of Service (QoS) and failure identifiability requirements in a joint manner. Given multiple services, and related client sets, the algorithm finds the most suitable server location, among those satisfying QoS requirements, to optimize failure identifiability. The selected server locations are such that the client-to-server paths form several intersecting trees, one for each service, where servers are located at the roots and clients are located at the leaves. GI uses a greedy approach that iteratively selects a number of server positions, one for each service, such that the identifiability obtained by the adopted client-to-server paths, is maximized at each iterative step. The authors prove that the number of paths placed by this heuristic is a constant approximation of the optimal solution.
In our experiments we modify the GI approach to obtain an upper bound to the number of paths that are necessary to uniquely identify the state of a given number of nodes
In our implementation of GI communication between any two endpoints is obtained through a shortest path routing algorithm. The adoption of a deterministic tie breaking rule ensures that the obtained routing scheme is deterministic. In order to prevent the use of degenerate paths (i.e., paths traversing only one node), servers are never located on the same node as the related client.
In order to boost identifiability of the greedy approach, we consider a preliminary phase where a set of paths is deployed according to a greedy for node coverage approach. This coverage phase is also common to [18], and is motivated by the observation that greedy for identifiability approaches select short paths in the early iterations, to obtain maximum identifiability, which prevents further identification in the later steps of the algorithm, due to insufficient node coverage, an issue that is easily solved by letting the algorithm use longer paths in the early execution steps.
C. Tests
The bounds of Theorems 2 and 5 are compared with the results obtained by GI, which provides an upper bound to the minimum number of paths that are necessary to uniquely identify the state of a given number of nodes
Number of paths to identify variable numbers of nodes on different topologies.
A similar setting was also implemented in the second set of experiments, depicted in Figure 9. For these experiments, curves represent how the number of paths necessary to identify a fixed number of nodes
Number of paths to identify
Tests on random topologies have been executed by generating 20 different graphs of each type. Shades and bars in the curves of random topologies represent the standard deviation of the mean number of paths used by GI and of the bounds with variable average path length,
We tested on random geometric graphs with different values of
Random geometric graphs with 100 nodes having the same geometric coordinates, built with different values of
Notice that, when
On jellyfish topologies (Figures 8d and 9d), our bounds are very close to the results obtained by GI, with negligible differences between the curves. One of the properties of jellyfish topologies is that servers can reach one another with shorter paths with respect to other topologies for data centres (e.g., fat trees), [16], for this reason the curves of our bounds are tighter for smaller values of
The results shown in this section highlight that our bounds are very close to the number of monitoring paths that a greedy algorithm would use in order to guarantee nodes identifiability. We stressed our experiments to evaluate our bounds on synthetic networks with very different structures and connectivity properties. Experiments show that the bounds presented in this work represent a very trustful estimation of the number of paths for node identifiability on Jellyfish topologies, as well as on real internet and fiber optical networks. In addition, we can also observe that knowledge of
Conclusion
In this article we provide theoretical lower bounds on the minimum number of monitoring end-to-end paths necessary to achieve the desired level of identifiability in a network in terms of identifiable nodes. We study how the routing scheme affects the bound values by giving two different formulations, for arbitrary and consistent routing, respectively. We also study how requirements on the maximum and average path length affect the bound formulation, highlighting the dependence of the minimal number of required paths on QoS constraints. We proposed a polynomial-time algorithm, called ICA, that takes into consideration such constraints to design a network that meets the bounds for the case of arbitrary routing. We carried out an extensive set of experiments on synthetic and real topologies to evaluate the tightness of the proposed lower-bounds. The synthetic topologies that we used are commonly employed for modelling ad-hoc wireless networks and data centers, whereas the real networks are an internet and a optical fiber network located in Norway and in the USA. We used a state-of-the-art algorithm for network identifiability maximization via path deployment to obtain feasible solutions as upper bounds of the optimum and evaluate the tightness of the proposed lower bound. We show that the provided bounds provide a high approximation in all the performed experiments.