Introduction
The growing requirement of cloud-based services has been bringing large amounts of traffic into data centers (DCs). Annual global DC IP traffic has reached 3.1 zettabytes in 2013 and is growing by 23% annually. More than three quarters of this traffic account for the Intra-DC traffic portion [1]. This has fueled rapid emergence of the large-scale data center networks (DCNs). The current intra-DCN communication, which is still based on high-capacity electrical switches, is facing technical challenges of high power consumption and huge latency [2]. To cope with the huge intra-DC traffic in an energy efficient way, hybrid optical/electrical switching or all-optical switching have attracted much attention [3]– [5].
Optical packet switching (OPS) is a potential candidate for future DCN [6]– [8] to provide flexible, fine-granularity, and high-link-utilization communications due to the statistical multiplexing. Whereas optical circuit switching (OCS) is preferred by huge-volume traffic communication where high reliability is required. By taking the advantage of the approximate bimodal flow length distribution in data centers [9], we have proposed an OPS-based DCN supporting agile OCS with a unified platform by introducing an instantaneous dedicated OCS path [10]– [13]. A shortest path called Express path (ExP) is instantaneously reserved from source to destination before the start of the transmission. Then packets of the flow will be routed along the established ExP and incur no contention or loss. After the transmission is done, the ExP is torn down immediately. ExPs are managed by a central controller which is aware of resource usage in the whole network. Ordinary packets are supported by non-reserved links with deflection routing for contention resolution. However, the coexistence of OCS in OPS-based networks could cause severe performance degradation if the ExPs are not set up properly. This is because the dedicated ExPs may break the original network topology and cause OPS packet circulations, as observed in [13].
In this paper, we propose to preserve bypassing routes for the ordinary packets when setting up the ExPs. For each reserved link on the ExP, at least one bypassing route is preserved. The bypassing routes are chosen from the routes that the ordinary OPS packets are more likely to be deflected to. The bypassing routes cannot overlap with existing ExPs. Otherwise, the selected path for the ExP should be discarded and a new path should be re-selected. In this way, the ExPs are set up in a sparse way and the OPS packets are more likely to bypass the ExPs via the bypassing routes, instead of being trapped by undergoing circulations in the network. The adverse effect of ExPs on the ordinary packets are mitigated, and the coexistence of the OCS and OPS communications are facilitated.
Preliminaries: OPS-Based DCN With OCS Capability
2.1. Architecture and Topology
The OPS-based DCN is as given in [13]. The router, illustrated in Fig. 1, is called a Hybrid Optoelectronic Router (HOPR) [10], combining the optical switch with the electronic buffer. Key notations in Fig. 1 and in the paper are listed in Table 1. HOPR supports 100 Gb/s link rate. Arriving optical packets firstly go through the label processors which extract the label information for forwarding decisions. Then the optical switch is configured accordingly and forwards the packets to their intended output links. Fiber delay lines (FDLs) are utilized for contention resolution. The shared electronic buffer serves as a bridge between the 10 Gb/s electronic domain Top-of-Rack (ToR) switches and the 100 Gb/s optical domain transponders. Each ToR switch connects to a rack of 1 Gb/s servers (not shown in Fig. 1). The shared electronic buffer provides packet buffering during injection and reception. It can also be used for contention resolution.
The network topology is torus.
\begin{equation*}\mathbf{a}\, \boxplus\, \mathbf{e}_{i}:= (\mathbf{a} + \mathbf{e}_{i})\ \bmod K,\quad \mathbf{a}\, \boxminus\, \mathbf{e}_{i} := (\mathbf{a} - \mathbf{e}_{i})\ \bmod K.\tag{1}\end{equation*}
2.2. Routing and Contention Resolution
For the high-speed OPS network, label processing should be done extremely fast. Therefore, complicated and time-consuming routing algorithms are not appropriate. We choose simple routing and contention resolution methodologies as detailed in [13]. Briefly speaking, the basic rules are as follows. Packet closer to its destination has higher priority. For packet routing, we consider the following points in turn.
Dimension with longer distance has higher priority.
Lower dimension has higher priority.
direction has higher priority than$\mathbf{e}_{i}+$ direction.$\mathbf{e}_{i}-$
If more than one packet is forwarded to the same output port, we apply contention resolution strategies. Note that, packet delays are highly affected by different contention resolution strategies. With the delay values shown in Fig. 1, additional delays introduced to the shortest deflection routing (0 ns) by contention resolution methods are given in Table 2. To reduce the packet delays, contention resolution methods are chosen following the priority in Table 2, where shortest deflection routing is given the highest priority.
Following the above rules, the packet is routed via a deterministic shortest path without contention. An example of packet routing is illustrated in Fig. 2. In a 3-D 7-radix torus-topology DCN, for a packet that has the source S7 : (4, 4, 6) and destination S3 : (6, 6, 6), the path in red, i.e., S7–S8–S5–S6–S3, is chosen originally. If contention happens at the link S7-S8, the packet is deflected to an alternative path, (dashed line in green) S7–S4–S5–S6–S3.
The extinction ratio of the switch is around 40 dB to allow for a large number of hop counts without suffering from crosstalk effects. Besides, packet can be regenerated in the electronic buffer with forward error correction (FEC) after a number of hops. It was demonstrated on the last generation prototype that the BERs are maintained below the FEC limit after five hops [14]. For the current prototype, we expected that eight hop counts or even more can be easily achieved before reaching the FEC limit.
To maintain the quality of the signal, a time-to-live (TTL) field is set in the label of each packet before the transmission and is deducted by one each time the packet passes a label processor. If the packet follows the shortest path without being buffered (optical buffering or electronic buffering), the number of times that it passes label processors is called the minimum TTL required by this packet. Here, we define an additional TTL, denoted by
\begin{equation*}\textrm{TTL}=\text{additional TTL}+\text{minimum TTL}.\tag{2}\end{equation*}
Recall that ExPs are established by exclusively reserving paths for a long period of time, compared to the transmission duration of a single ordinary packet. As discussed in
[13], based on the above OPS routing and contention resolution rules, ordinary packets can be trapped in path loops. Examples are shown in
Fig. 3. This figure is the same as
Fig. 4 in
[13]. In
Fig. 3(a), one packet is contended with an ExP at the last router just before the destination router, the shortest path deflection routing is not applicable for contention resolution. Then the packet keeps circulating in the FDL until the ExP is torn down. This quickly decreases the TTL and hence easily causes the packet dropping.
Fig. 3(b) illustrates the case when an ordinary packet is circulating between two neighboring routers. Consider that in a 2-D torus-topology DCN, an ordinary packet whose destination is router
Examples of ordinary packet circulation (a) in FDLs of the same router and (b) among several routers.
To break these path loops and mitigate the performance degradation, additional contention resolution strategies have briefly been proposed in [13] as follows:
Intelligent FDL last: For an ordinary packet contenting with an ExP, FDLs can be used but with the lowest priority.
Back-last deflection routing: In the deflection routing, the port to the router where the packet came from in the last hop is given the lowest priority, despite the fact that this port may lead to the shortest path.
Path Loops
Even with the additional packet-ExP contention resolution methods, the existence of the OCS path still affect the OPS performance in a quite considerable degree. Examples for inappropriate ExPs that cause path-loops are demonstrated in Fig. 4, where Fig. 4(a)–(f) shows parallel ExPs and Fig. 4(g)–(j) shows orthogonal ExPs. Take Fig. 4(a) as an example. The blue links are on ExPs and the red lines are the path of the OPS packets. Consider an ordinary OPS packet arriving at node 1 with destination marked with red dot in the figure, ignoring FDL and buffer as they will not move the packet to other nodes of the network, supposing for very lightly loaded cases which means all ports not reserved by ExPs are always idle. The packet is forwarded as follows:
node 1
node 2: farther path deflection routing to the$\rightarrow$ direction$+\mathbf{e}_{1}$ node 2
node 1: shortest path routing (only one path)$\rightarrow$ node 1
node 3: farther path deflection routing; the back direction is given the lowest priority$\rightarrow$ node 3
node 1: shortest path routing$\rightarrow$ go to 1), packet circulates
Other cases in
Fig. 4(b)–(j) can be analyzed similarly according to the routing algorithms and contention resolution strategies. From the figures we can see that orthogonal ExPs and parallel ExPs can cause OPS packet path loops. OPS packets are trapped in the loops until their TTL values become zero and then they are dropped. This kind of packet dropping is not caused by traffic congestion but related to the relative positions of the packets' destinations and the ExPs. The packet dropping probability (PDP) can be high even for very lightly loaded cases. As seen from
Figs. 7 and
10, in the lightly loaded region, the PDP (for the case that
Bypassing Routes Strategy
To break the path loops of OPS packets, we propose to preserve bypassing routes for the OPS packets when an OCS path is established. Each link along the ExP is associated with at least one bypassing route, which is prohibited to be reserved by other ExPs. These bypassing routes are dedicated for the OPS packets to bypass the ExP without been trapped in path loops.
We denote a link by
via
:$\mathbf{e}_{j}$ $\mathbf{a} \rightarrow \mathbf{a}\, \boxplus\, \mathbf{e}_{j} \rightarrow \mathbf{a}\, \boxplus\, \mathbf{e}_{j}\, \boxplus\, \mathbf{e}_{i} \rightarrow \mathbf{a}\, \boxplus\, \mathbf{e}_{i}$ via
:$\mathbf{-e}_{j}$ $\mathbf{a} \rightarrow \mathbf{a}\, \boxminus\, \mathbf{e}_{j} \rightarrow \mathbf{a}\, \boxminus\, \mathbf{e}_{j}\, \boxplus\, \mathbf{e}_{i} \rightarrow \mathbf{a}\, \boxplus\, \mathbf{e}_{i}$
Similarly, for link
via
:$\mathbf{e}_{j}$ $\mathbf{a} \rightarrow \mathbf{a}\, \boxplus\, \mathbf{e}_{j} \rightarrow \mathbf{a}\, \boxplus\, \mathbf{e}_{j}\, \boxminus\, \mathbf{e}_{i} \rightarrow \mathbf{a}\, \boxminus\, \mathbf{e}_{i}$ via
:$\mathbf{-e}_{j}$ $\mathbf{a} \rightarrow \mathbf{a}\, \boxminus\, \mathbf{e}_{j} \rightarrow \mathbf{a}\, \boxminus\, \mathbf{e}_{j}\, \boxminus\, \mathbf{e}_{i} \rightarrow \mathbf{a}\, \boxminus\, \mathbf{e}_{i}$
For any link on the ExP, there are
Applying the bypassing route method, when an ExP is requested from a source router, the central controller will find an available shortest path for this ExP. This path has to pass the bypassing test before it can be set up. For each link on the path, the bypassing routes should be found not be reserved by other existing ExPs. Otherwise, the path for the ExP is re-selected. The bypassing routes can be shared by different links of the same ExP or by links from different ExPs.
Notice that bypassing routing strategy is substantially different with deflection routing. Deflection routing is applied to the OPS packets to resolve packet contentions, in both pure OPS scenario and OPS/ExP scenario. Bypassing routing is applied in the OPS/ExP scenario when setting up the ExPs, to avoid potential path loops of the OPS packets. However, bypassing routing strategy highly depends on the deflection routing strategy adopted in the network. Essentially, bypassing routing strategy intends to preserve appropriate links, such that the OPS packets escape from path loops through deflection routing. Therefore, the selection of these preserved links depends on the deflection routing strategy. In this paper, we consider a specific network topology, i.e., torus, with a given deflection routing strategy (as described in Section 2.2), while the concept of the bypassing routing can be applied to other network topologies with other contention resolution methods. Bear in mind that when setting up the ExPs, not only the ExP itself is reserved for the ExP packets but some links around the ExPs should be preserved for ordinary OPS packets as well.
Performance Evaluation
The DCN is simulated by a cycle-accurate
For each scenario, six times the simulations are done and averaged. In each run of simulation, a number of ExPs are set up before the OPS traffic are injected. The ExPs last for the whole simulation run, which corresponding to setting up ExPs for OCS traffic with extremely large flow length. This is practical, as according to [2], although 80% of the flows are smaller than 10 KB, most of the bytes are in the top 10% of large flows.
The simulation results with different degree values for 205 ExPs are shown in
Figs. 7
–9. The PDP for the case of 820 ExPs are shown in
Fig. 10. From
Figs. 7 and
10, we observe that the PDP is significantly reduced in the bypassing scenario. The PDP floor due to packet circulations for the lightly loaded cases without applying the bypassing strategies
The bypassing scenarios with different number of ExPs are also investigated with the PDP shown in Fig. 11. The number of ExPs are 0, 205, 410, and 820; correspondingly, 0%, 5%, 10%, and 20% routers are ExP sources, respectively. With the constraint that the PDP < 0.1%, the throughput are approximately 0.80, 0.70, 0.64, and 0.58, respectively. This reflects the relationship between the maximum number of simultaneous ExPs and the maximum carried OPS traffic.
In terms of the average delay, we find from
Fig. 9 that the average delay for the case with ExPs are slightly larger than the case with pure OPS. This is because the introduction of the ExPs will result in a longer path for ordinary packets due to deflection routing. There is not much difference in the average latency for cases with different degree values. For the case with
Conclusion
Setting up OCS paths (i.e., ExPs) inappropriately in OPS-based networks could cause severe performance degradation, because the ExPs may break the original topology and cause OPS packet circulations. We propose to provide bypassing routes for the ordinary packets when setting up ExPs. This method tries to set up the ExPs as sparsely as possible so that performance of the ordinary packets will not be significantly affected. The proposal facilitates the existence of OCS communications in OPS-based networks.