Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization

Haoran Wang; Chong Li; Thibaut Tachon; Hongxing Wang; Sheng Yang; Sébastien Limet; Sophie Robert

doi:10.1007/978-3-030-85665-6_13

Conference paper, Year: 2021

Haoran Wang

Role: Author
PersonId: 1252258
ORCID: 0000-0001-5828-5545
IdRef: 269627553

Laboratoire d'Informatique Fondamentale d'Orléans

Huawei Technologies France [Boulogne-Billancourt]

Chong Li

Role: Author

Huawei Technologies France [Boulogne-Billancourt]

Thibaut Tachon

Role: Author

Huawei Technologies France [Boulogne-Billancourt]

Hongxing Wang

Role: Author

Huawei Technologies France [Boulogne-Billancourt]

Sheng Yang

Role: Author

Huawei Technologies France [Boulogne-Billancourt]

Sébastien Limet

Role: Author

Laboratoire d'Informatique Fondamentale d'Orléans

Sophie Robert

Role: Author
PersonId: 6783
IdHAL: sophie-robert
IdRef: 194671062

Laboratoire d'Informatique Fondamentale d'Orléans

Abstract

Deep neural networks (DNNs) are playing an increasingly important role in our daily life. Since the size of DNNs is continuously growing, it is highly important to train them effectively by distributing computation across multiple connected devices. The efficiency of training depends on the quality of the chosen parallelization strategy. Finding a good parallelization strategy for a DNN in a reasonable amount of time is not trivial. Previous research demonstrated that good parallelization strategies can be generated systematically. However, systematic partitioning still suffers from either heavy preprocessing or poor parallelization quality. In this paper, we take a purely symbolic analysis approach by leveraging features of DNNs such as dense tensor balanced computation. We propose the Flex-Edge Recursive Graph and the Double Recursive Algorithm, which limit parallelization strategy generation to linear complexity while still yielding good-quality parallelization strategies. The experiments show that our solution significantly reduces the parallelization strategy generation time from hours to seconds while maintaining the parallelization quality.

Keywords

Distributed algorithm; Distributed machine learning; Neural network partitioning

Domains

Logic in Computer Science [cs.LO]

Contributor: Sébastien Limet

https://univ-orleans.hal.science/hal-03526611

Submitted on: Friday, 14 January 2022, 15:54:29

Last modified on: Friday, 5 May 2023, 12:00:49

Dates and versions

hal-03526611, version 1 (14-01-2022)

Identifiers

HAL Id: hal-03526611, version 1
DOI: 10.1007/978-3-030-85665-6_13

Cite

Haoran Wang, Chong Li, Thibaut Tachon, Hongxing Wang, Sheng Yang, et al. Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization. European Conference on Parallel Processing EuroPar 2021, 2021, Lisbon, Portugal. pp.201-216, ⟨10.1007/978-3-030-85665-6_13⟩. ⟨hal-03526611⟩

Collections

UNIV-ORLEANS INSA-GROUPE INSA-CVL
