Research on Laser Point Cloud Classification Based on Light-BotNet - AET - Application of Electronic Technique


Application of Electronic Technique, 2022, No. 6

Lei Genhua1, Wang Lei1,2, Zhang Zhiyong1

1. School of Information Engineering, East China University of Technology, Nanchang 330013, China; 2. Jiangxi Engineering Technology Research Center of Nuclear Geoscience Data Science and System, Nanchang 330013, China

Abstract: Three-dimensional point clouds are widely used in robotics and autonomous driving. Deep learning has achieved remarkable results on two-dimensional images, but how to use it to recognize irregular three-dimensional point clouds remains an open problem. Large-scene point clouds are still challenging because of the complexity of the data itself, the uneven distribution of points caused by changes in scanning distance, and noise and outliers. To address the low classification efficiency and low classification accuracy of existing deep learning network frameworks on laser point cloud data, this paper proposes a CNN-Transformer framework that combines laser point cloud feature images with Light-BotNet. The framework extracts features from the point cloud data, constructs point cloud feature images from adjacent feature points as the network input, and then trains Light-BotNet as the network model for point cloud classification. Experimental results show that, compared with most existing point cloud classification methods, the proposed method improves both the classification efficiency and the classification accuracy for laser point clouds.

Keywords: point cloud feature image; BotNet; Transformer; CNN; laser point cloud classification

CLC number: TP391
Document code: A
DOI: 10.16157/j.issn.0258-7998.222725
Citation: Lei Genhua, Wang Lei, Zhang Zhiyong. Research on laser point cloud classification based on Light-BotNet[J]. Application of Electronic Technique, 2022, 48(6): 84-88, 97.

0 Introduction

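Most deep learning methods for point cloud classification alternate convolutional and pooling layers. A neuron in a convolutional layer connects only to a local region of the previous layer and therefore learns local features, so some features are easily lost when extracting features from point cloud data, which lowers classification accuracy. The Transformer offers a different approach: it mainly uses the self-attention mechanism to extract intrinsic features^[1-3]. The Transformer was first applied in natural language processing (NLP), where it achieved great success. Inspired by its capabilities in NLP, researchers began applying the Transformer to computer vision (CV) tasks. CNNs have been the basic building block of vision applications^[4-5], but the Transformer is showing its ability as an alternative to CNNs. Chen et al.^[6] trained a sequence Transformer to predict pixels autoregressively and achieved results competitive with CNNs on image classification tasks. Convolution is good at extracting details, but in large-scene 3D point cloud classification tasks with large data volumes, capturing the global information of the point cloud often requires stacking many convolutional layers. The attention in the Transformer is good at capturing global information, but it requires a large amount of training data.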
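The point about stacking many convolutional layers can be made concrete with a small receptive-field calculation. The following is a minimal sketch, not taken from the paper: the function names and the 64-pixel feature map width are illustrative assumptions, and pooling layers are ignored.

```python
def receptive_field(num_layers, kernel_size=3, stride=1):
    """Receptive field along one axis after stacking identical conv layers."""
    field, jump = 1, 1  # jump = spacing of this layer's outputs in input pixels
    for _ in range(num_layers):
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


def layers_to_cover(extent, kernel_size=3, stride=1):
    """Smallest number of stacked conv layers whose receptive field spans `extent` pixels."""
    layers = 0
    while receptive_field(layers, kernel_size, stride) < extent:
        layers += 1
    return layers


# Hypothetical 64-pixel-wide feature map (an assumption for illustration only).
print(layers_to_cover(64))            # stride-1 3x3 convolutions
print(layers_to_cover(64, stride=2))  # stride-2 3x3 convolutions
```

With stride 1, each extra 3×3 layer widens the receptive field by only two pixels, so depth grows linearly with the extent to be covered. A single global self-attention layer, by contrast, connects every position to every other position.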

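BotNet^[7] is an exploration of the Convolution+Transformer combination by researchers at Berkeley and Google. It takes a hybrid approach that uses both the feature extraction ability of CNNs and the content self-attention and position self-attention mechanisms of the Transformer, achieving better performance than pure CNN or pure self-attention models and reaching 84.7% accuracy on ImageNet. Combining CNN and Transformer lets each compensate for the other's weaknesses. BotNet differs from the ResNet^[8] framework as follows: ResNet uses 3×3 spatial convolutions in its last three bottleneck blocks, whereas BotNet replaces these spatial convolutions with global self-attention. A bottleneck block with a self-attention module can be viewed as a Transformer block.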
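To illustrate what replacing a 3×3 spatial convolution with global self-attention means, the sketch below computes single-head attention over every position of a small feature map, adding a content term (query against key) and a position term (query against a position embedding). It is a simplified illustration under stated assumptions, not the authors' Light-BotNet: the helper names, the random weights, the per-row and per-column position embeddings, and the 1/sqrt(D) scaling are all assumptions, and only the Python standard library is used so the block runs anywhere.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def global_self_attention(feature_map, w_q, w_k, w_v, pos_h, pos_w):
    """Single-head self-attention over all H*W positions of an H x W x C map.

    w_q, w_k, w_v are C x D projections (playing the role of 1x1 convolutions).
    pos_h (H x D) and pos_w (W x D) are position embeddings; position (i, j)
    uses pos_h[i] + pos_w[j], a simplification assumed for this sketch.
    """
    height, width = len(feature_map), len(feature_map[0])
    dim = len(w_q[0])
    tokens = [feature_map[i][j] for i in range(height) for j in range(width)]
    q, k, v = (matmul(tokens, w) for w in (w_q, w_k, w_v))
    r = [[a + b for a, b in zip(pos_h[i], pos_w[j])]
         for i in range(height) for j in range(width)]
    content = matmul(q, transpose(k))   # content self-attention logits, N x N
    position = matmul(q, transpose(r))  # position self-attention logits, N x N
    scale = 1.0 / math.sqrt(dim)        # standard scaling, assumed here
    weights = [softmax([(c + p) * scale for c, p in zip(c_row, p_row)])
               for c_row, p_row in zip(content, position)]
    flat = matmul(weights, v)           # N x D
    output = [flat[i * width:(i + 1) * width] for i in range(height)]
    return output, weights


rng = random.Random(0)
H, W, C, D = 3, 4, 5, 2
fmap = [[[rng.gauss(0.0, 1.0) for _ in range(C)] for _ in range(W)] for _ in range(H)]
out, attn = global_self_attention(
    fmap,
    random_matrix(C, D, rng), random_matrix(C, D, rng), random_matrix(C, D, rng),
    random_matrix(H, D, rng), random_matrix(W, D, rng),
)
print(len(out), len(out[0]), len(out[0][0]))  # spatial size kept, channels become D
print(len(attn), len(attn[0]))                # each position attends to all H*W positions
```

Unlike a 3×3 convolution, whose output at a position depends on nine neighbours, each row of `attn` here has one weight per position of the whole map. In a bottleneck block this layer would sit between the two 1×1 convolutions, in the place the 3×3 convolution occupies in ResNet.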

For the full text of this article, download it from: http://forexkbc.com/resource/share/2000004426


