Creative Commons license. Spherical Perspective on Learning with Batch Normalization by Simon Roburin (LIGM) [June 28, 2021]

 Description

GdR ISIS Théorie du deep learning - June 28, 2021

Spherical Perspective on Learning with Batch Normalization

By Simon Roburin (LIGM)

Batch Normalization (BN) is a prominent deep learning technique. In spite of its apparent simplicity, its implications for optimization are not yet fully understood. In this paper, we introduce a spherical framework to study the optimization of neural networks with BN layers from a geometric perspective. More precisely, we leverage the radial invariance of groups of parameters, such as filters for convolutional neural networks, to translate the optimization steps onto the L2 unit hypersphere. This formulation and the associated geometric interpretation shed new light on the training dynamics. First, we use it to derive the first effective learning rate expression of Adam. Then we show that, in the presence of BN layers, performing SGD alone is actually equivalent to a variant of Adam constrained to the unit hypersphere. Finally, our analysis outlines phenomena that previous variants of Adam act on, and we experimentally validate their importance in the optimization process.
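
The radial invariance mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a single linear unit followed by batch normalization without affine parameters, a mean squared error loss, and illustrative names (`bn_linear`, `loss`, `numerical_grad`) that do not come from the talk. It checks that scaling the weights by a positive factor `rho` leaves the loss unchanged, that the gradient is orthogonal to the weights, and that the gradient at `rho * w` is the gradient at `w` divided by `rho`.

```python
import numpy as np

def bn_linear(w, x):
    """Linear unit followed by batch normalization (no affine terms, no epsilon)."""
    z = x @ w                                  # pre-activations, shape (batch,)
    return (z - z.mean()) / np.sqrt(z.var())

def loss(w, x, y):
    """Mean squared error on the normalized output (an assumed toy loss)."""
    return np.mean((bn_linear(w, x) - y) ** 2)

def numerical_grad(f, w, h=1e-6):
    """Central finite-difference gradient of f at w."""
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = h
        g[i] = (f(w + e) - f(w - e)) / (2 * h)
    return g

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 5))   # toy batch of 64 inputs with 5 features
y = rng.normal(size=64)        # toy targets
w = rng.normal(size=5)         # weights of the normalized unit
rho = 3.0                      # arbitrary positive rescaling factor

f = lambda v: loss(v, x, y)
g = numerical_grad(f, w)
g_scaled = numerical_grad(f, rho * w)

invariance_gap = abs(f(rho * w) - f(w))                    # loss is scale invariant
radial_component = abs(g @ w)                              # gradient is orthogonal to w
gradient_scaling_gap = np.max(np.abs(g_scaled - g / rho))  # gradient scales as 1/rho

print(invariance_gap, radial_component, gradient_scaling_gap)
```

All three printed quantities should be close to zero, up to floating-point and finite-difference error. Since the gradient has no radial component, only the direction of `w` affects the loss, which is why the optimization steps can be studied on the unit hypersphere.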

This is joint work by Simon Roburin, Yann de Mont-Marin, Andrei Bursuc, Renaud Marlet, Patrick Pérez, and Mathieu Aubry.

 Information

  • Added by:

  • Contributor(s):

    • Simon Roburin (LIGM) (author)
  • Updated on:

    June 29, 2021 16:47
  • Duration:

    00:22:08
  • Number of views:

    27
  • Type:

  • Main language:

    French
  • Audience:

    Other
  • Discipline(s):
