Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks - IRT Saint Exupéry - Institut de Recherche Technologique
Preprint, Working Paper Year: 2022

Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks

Abstract

Imposing orthogonality on the layers of neural networks is known to facilitate learning by limiting exploding and vanishing gradients, to decorrelate the features, and to improve robustness. This paper studies theoretical properties of orthogonal convolutional layers.

We establish necessary and sufficient conditions on the layer architecture guaranteeing the existence of an orthogonal convolutional transform. These conditions show that orthogonal convolutional transforms exist for almost all architectures used in practice with 'circular' padding. We also exhibit limitations with 'valid' boundary conditions and with 'same' boundary conditions with zero padding.

Recently, a regularization term imposing the orthogonality of convolutional layers has been proposed, and impressive empirical results have been obtained in different applications (Wang et al. 2020). The second motivation of the present paper is to specify the theory behind this regularization. We make the link between this regularization term and orthogonality measures. In doing so, we show that this regularization strategy is stable with respect to numerical and optimization errors and that, in the presence of small errors and when the size of the signal/image is large, the convolutional layers remain close to isometric. The theoretical results are confirmed by experiments, the landscape of the regularization term is studied, and the regularization strategy is validated on real datasets.

Altogether, the study guarantees that the regularization with L_{orth} (Wang et al. 2020) is an efficient, flexible and stable numerical strategy to learn orthogonal convolutional layers.
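
To make the regularization term concrete, here is a minimal sketch of a 1D analogue of L_{orth}. It assumes the form given by Wang et al. (2020), L_{orth}(K) = ||conv(K, K, padding=P, stride=S) - I_{r0}||_F^2 with P = floor((k-1)/S)*S, where I_{r0} is zero everywhere except for an identity matrix at the central position. The function name l_orth_1d and the nested-list layout kernel[m][c][i] are illustrative choices, not taken from the paper, and the paper itself treats the general setting rather than this reduced 1D case.

```python
def l_orth_1d(kernel, stride=1):
    """1D analogue of the L_orth regularizer (illustrative sketch).

    kernel[m][c][i] holds M filters, each with C input channels and length k.
    Returns the squared Frobenius norm of conv(K, K) - I_r0.
    """
    M = len(kernel)
    C = len(kernel[0])
    k = len(kernel[0][0])

    # Padding assumed from Wang et al. (2020): P = floor((k - 1) / S) * S.
    pad = ((k - 1) // stride) * stride

    # Zero-pad every filter on both sides; the padded filters act as the "input".
    padded = [
        [[0.0] * pad + list(kernel[m][c]) + [0.0] * pad for c in range(C)]
        for m in range(M)
    ]

    out_len = (2 * pad) // stride + 1  # number of spatial positions in the output
    centre = pad // stride             # position where the identity sits in I_r0

    loss = 0.0
    for m in range(M):
        for n in range(M):
            for j in range(out_len):
                # Cross-correlation of padded filter m with filter n at shift j.
                corr = sum(
                    padded[m][c][j * stride + i] * kernel[n][c][i]
                    for c in range(C)
                    for i in range(k)
                )
                target = 1.0 if (m == n and j == centre) else 0.0
                loss += (corr - target) ** 2
    return loss


# The identity filter is orthogonal, so its penalty vanishes.
print(l_orth_1d([[[0.0, 1.0, 0.0]]]))

# A two-tap averaging filter with stride 1 is not orthogonal.
print(l_orth_1d([[[0.5, 0.5, 0.0]]]))

# The two Haar filters with stride 2 give a penalty that is numerically zero.
h = 2 ** -0.5
print(l_orth_1d([[[h, h]], [[h, -h]]], stride=2))
```

In this sketch the penalty is zero exactly when the strided cross-correlations between filters match the identity pattern, which is the condition the regularizer drives the kernel towards during training.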
Main file
main.pdf (632.05 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03315801, version 1 (09-08-2021)
hal-03315801, version 2 (08-02-2022)
hal-03315801, version 3 (10-01-2023)

Cite

El Mehdi Achour, François Malgouyres, Franck Mamalet. Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks. 2022. ⟨hal-03315801v2⟩
