Research on Regulatory Pathways for Training Data Risks in Generative Artificial Intelligence

Journal: Cyber Security and Data Governance (网络安全与数据治理)

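Xing Luyuan 1, Shen Xinyi 2, Wang Jiayi 3

1 School of Law, Nanjing University, Nanjing 210046, Jiangsu, China; 2 School of Law, London School of Economics and Political Science, London WC2A 2AE, UK; 3 School of Arts and Sciences, Northeast Agricultural University, Harbin 150030, Heilongjiang, China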

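Abstract: This paper examines the legal risks and regulatory issues surrounding the training data of generative artificial intelligence such as ChatGPT. It first analyzes the problems generative AI raises with respect to data sources, discriminatory tendencies, data quality, and security risks. Based on a comparative study of the Chinese and European legal systems, it recommends clearly defining governance principles and formulating improvement pathways for data compliance. Finally, at the level of concrete measures, it offers specific suggestions for improving China's current legal regulation, providing a useful reference for the healthy development and legal regulation of generative AI.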

Keywords: generative artificial intelligence; Artificial Intelligence Act; training data risks; data compliance

CLC number: DF9
Document code: A
DOI: 10.19358/j.issn.2097-1788.2024.01.002
Citation: Xing Luyuan, Shen Xinyi, Wang Jiayi. Research on regulatory pathways for training data risks in generative artificial intelligence [J]. Cyber Security and Data Governance, 2024, 43(1): 10-18.

Legal regulation and enhancement path for mitigating risks in training

Xing Luyuan 1, Shen Xinyi 2, Wang Jiayi 3

1 School of Law, Nanjing University, Nanjing 210046, China; 2 School of Law, London School of Economics and Political Science, London WC2A 2AE, England; 3 School of Arts and Sciences, Northeast Agricultural University, Harbin 150030, China

Abstract: This article discusses the legal risks and regulatory issues that arise from the training data of generative artificial intelligence such as ChatGPT. It begins by analyzing issues related to data sources, discriminatory tendencies, data quality, and security risks in generative AI. It then undertakes a comparative study of the Chinese and European legal systems, proposing that governance principles be clearly defined and that comprehensive pathways for data compliance be developed. Finally, the article offers specific, practical recommendations for improving the current legal regulations in China. These suggestions are intended to serve as useful references for the healthy development and legal regulation of generative artificial intelligence.

Keywords: generative AI; Artificial Intelligence Act; training data risks; data compliance

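Training data risks in generative artificial intelligence

Unlike earlier models that could only classify, predict, or perform a specific function, Large Generative AI Models (LGAIMs) are trained to generate new content such as text, images, or audio, and they exhibit strong emergent properties and generalization ability [1]. Because the training data are represented as a probability distribution, LGAIMs can learn the patterns and relationships in that data on their own and can generate content that lies outside the training dataset [2]. In addition, the data produced by human-computer interaction between LGAIMs and their users are fed back into the iterative training of the models. Developers of LGAIMs therefore often rely on publicly available internet data and on user interaction data as training data, and these data may carry a range of compliance risks, such as data source risks, discrimination risks, and quality risks.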

Article download: http://forexkbc.com/resource/share/2000005886

Original content notice: This content is original to the AET website; reproduction without authorization is prohibited.
