A Distributed Training Method for Deep Neural Networks - AET - Application of Electronic Technique

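Application of Electronic Technique, 2023, Issue 3

Yuan Ye1, Tian Yuan1, Jiang Qibing2,3

(1. Information Center, Yunnan Power Grid Co., Ltd., Kunming 650214, China; 2. Yunnan Yundian Tongfang Technology Co., Ltd., Kunming 650214, China; 3. School of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming 650224, China)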

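Abstract: Deep neural networks have achieved great success in the classification and prediction of high-dimensional data. Training a deep neural network is a data-intensive task that requires collecting large-scale data from multiple data sources. These data often contain sensitive information, so the training process can easily leak private data. To address data privacy and communication cost during training, this paper proposes a distributed training method for deep neural networks that allows a network to be trained jointly on multiple data sources. First, a distributed training architecture is proposed, consisting of one computing center and multiple agents. Second, a multi-agent distributed training algorithm is proposed that splits the deep neural network so that the agents can jointly train the model in a distributed manner while keeping their data local and reducing communication cost. Third, the correctness of the algorithm is analyzed. Finally, experimental results show that the method is effective.

Keywords: deep neural network; distributed training; supervised learning; privacy protection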

CLC number: TP311  Document code: A  DOI: 10.16157/j.issn.0258-7998.223244
Chinese citation format: 原野，田园，蒋七兵. 一种深度神经网络的分布式训练方法[J]. 电子技术应用，2023，49(3)：48-53.
English citation format: Yuan Ye, Tian Yuan, Jiang Qibing. Distributed training method for deep neural networks[J]. Application of Electronic Technique, 2023, 49(3): 48-53.

0 Introduction

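Deep neural networks have achieved great success in the classification and prediction of high-dimensional data such as images, video, and audio. However, training a deep neural network is a data-intensive task that often requires collecting large-scale data from multiple data sources. A deep neural network model typically contains millions of parameters, and training them requires large amounts of data and computing power.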

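When the training data contain sensitive information, the training process of a deep neural network tends to leak private information. If the training data include highly sensitive information such as user information or management information, data owners are usually unwilling to disclose them. As a result, data privacy concerns limit the use of deep neural networks in real-world applications.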

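To limit privacy leakage during the training of deep neural networks, one feasible solution is a centralized privacy-preserving deep learning method. This approach relies on a trusted centralized computing environment, and the training process uses a global differential privacy algorithm to perturb the training data, thereby protecting data privacy. In this setting, the data sources must trust a cloud server and upload their data to it, and the cloud server trains the deep neural network centrally on the uploaded data. However, because it requires every agent to share its data, this approach is of limited use in practice.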

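Compared with centralized privacy-preserving deep learning, distributed learning methods are better suited to practical use, because they require neither that the data sources (agents) share or upload their data nor a centralized trusted computing environment. Federated learning is one such distributed learning method. In federated learning, the data sources jointly train a convolutional neural network through a parameter aggregation mechanism without sharing their data. However, this approach incurs a large communication overhead during training.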
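The communication cost of parameter aggregation can be made concrete with a minimal sketch. The weighted averaging below is the FedAvg-style rule commonly used in federated learning, chosen here only for illustration; the text does not say which aggregation rule it has in mind, and the names `federated_average` and `round_traffic` are hypothetical.

```python
from typing import List


def federated_average(client_params: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Average per-client parameter vectors, weighted by local dataset size.

    Assumption: every client holds a full copy of the model, so all
    parameter vectors have the same length.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def round_traffic(num_params: int, num_clients: int) -> int:
    """Parameter values moved in one round under this sketch.

    Each client uploads its full parameter vector and downloads the
    aggregated one, so traffic grows with model size times client count.
    """
    return 2 * num_params * num_clients


# Two clients with 1 and 3 local samples respectively.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Under these assumptions, a model with one million parameters shared among 10 data sources moves 20 million parameter values per round, which illustrates why the overhead becomes significant for large models.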

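To address these problems, this paper proposes a distributed training method for deep neural networks. The method allows multiple data sources to jointly train a deep neural network by splitting the network, without sharing their data, while reducing the communication overhead of training. Specifically, the main contributions of this paper are as follows: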

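(1) A distributed training method for deep neural networks is proposed that allows agents to jointly train a model in a distributed manner by splitting the network, with the data never leaving the local site;

(2) The correctness of the method is analyzed;

(3) The effectiveness of the method is verified through experiments.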
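The idea of splitting the network between an agent and a computing center can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's algorithm: a two-layer scalar network is cut after the first layer, a single agent keeps its raw input and label, and only the cut-layer activation, the prediction, and gradients cross the boundary. The classes `Agent` and `Center`, the function `train_step`, and the squared-error loss are all hypothetical choices.

```python
class Agent:
    """Data owner: keeps raw inputs, labels, and the first part of the network."""

    def __init__(self, weight: float) -> None:
        self.weight = weight
        self._x = 0.0  # last raw input, never sent to the center

    def forward(self, x: float) -> float:
        self._x = x
        return self.weight * x  # cut-layer activation, sent to the center

    def backward(self, grad_activation: float, lr: float) -> None:
        # Chain rule: dL/dweight = dL/dh * dh/dweight = grad_activation * x
        self.weight -= lr * grad_activation * self._x


class Center:
    """Computing center: holds the remaining part of the network."""

    def __init__(self, weight: float) -> None:
        self.weight = weight
        self._h = 0.0  # last activation received from the agent

    def forward(self, activation: float) -> float:
        self._h = activation
        return self.weight * activation  # prediction, returned to the agent

    def backward(self, grad_output: float, lr: float) -> float:
        # Gradient w.r.t. the activation uses the weight before its update.
        grad_activation = grad_output * self.weight
        self.weight -= lr * grad_output * self._h
        return grad_activation  # sent back to the agent


def train_step(agent: Agent, center: Center, x: float, y: float,
               lr: float = 0.1) -> float:
    """One training step; returns the loss measured before the update."""
    h = agent.forward(x)                       # agent -> center: activation
    y_hat = center.forward(h)                  # center -> agent: prediction
    grad_output = y_hat - y                    # agent: d/dy_hat of 0.5*(y_hat - y)^2
    grad_h = center.backward(grad_output, lr)  # center -> agent: gradient at the cut
    agent.backward(grad_h, lr)
    return 0.5 * (y_hat - y) ** 2


agent, center = Agent(0.5), Center(0.5)
losses = [train_step(agent, center, x=1.0, y=2.0) for _ in range(50)]
```

In this sketch the messages exchanged per sample have the size of the cut layer rather than the size of the whole model, which is the intuition behind reducing communication by splitting; how the actual method chooses the cut and coordinates multiple agents is described in the full paper.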

For the full text of this article, please download: http://forexkbc.com/resource/share/2000005228
