Journal articles

Local Fast Failover Routing With Low Stretch

Abstract : Network failures are frequent and disruptive, and can significantly reduce throughput even in highly connected and regular networks such as datacenters. Many modern networks support some form of local fast failover to quickly reroute flows that encounter link failures onto new paths, but employing such mechanisms is known to be non-trivial, because conditional failover rules can depend only on local failure information. Although important insights have been gained in recent years into how to design failover schemes that provide high resiliency, existing approaches have the shortcoming that the resulting failover routes may be unnecessarily long, i.e., they have a large stretch compared to the original route length. This is a serious drawback, as long routes entail higher latencies and introduce additional load, which may cause the rerouted flows to interfere with existing flows and harm throughput. This paper presents the first deterministic local fast failover algorithms that provide provable guarantees on both resiliency and failover route length, even in the presence of many concurrent failures. We present stretch-optimal failover algorithms for different network topologies, including multi-dimensional grids, hypercubes and Clos networks, as they are frequently deployed in HPC clusters and datacenters. We show that the computed failover routes are optimal in the sense that no failover algorithm can provide shorter paths for a given number of link failures.


https://hal.laas.fr/hal-02043783
Contributor : Gilles Tredan
Submitted on : Thursday, February 21, 2019 - 11:18:49 AM
Last modification on : Thursday, June 10, 2021 - 3:02:59 AM
Long-term archiving on : Wednesday, May 22, 2019 - 2:05:18 PM

File

ccr18failover.pdf
Files produced by the author(s)

Citation

Klaus-Tycho Foerster, Yvonne-Anne Pignolet, Stefan Schmid, Gilles Trédan. Local Fast Failover Routing With Low Stretch. Computer Communication Review, Association for Computing Machinery, 2018, 48 (1), pp.35-41. ⟨10.1145/3211852.3211858⟩. ⟨hal-02043783⟩
