World Sci-Tech R&D (世界科技研究与发展) ›› 2023, Vol. 45 ›› Issue (1): 26-40. doi: 10.16507/j.issn.1006-6055.2022.07.001

• Frontiers and Progress in Science and Technology •

A Review of Semi-supervised Learning Methods Research

LI Yongguo1,2   XU Caiyin1,2   TANG Xuan1,2   LI Xiangyan1,2

  1. School of Engineering, Shanghai Ocean University, Shanghai 201306, China
  2. Shanghai Marine Renewable Energy Engineering Technology Research Center, Shanghai 201306, China
  • Online: 2023-02-25 Published: 2023-03-13
  • Supported by:
    General Program of the National Natural Science Foundation of China, "Research on Fundamental Problems in the Utilization of Marine Renewable Energy Based on the Principle of Magnetohydrodynamic Power Generation" (51876114); Science and Technology Commission of Shanghai Municipality project "Shanghai Marine Renewable Energy Engineering Technology Research Center" (19DZ2254800)

Abstract:

Semi-supervised learning arises in many real-world scenarios and can have a significant impact on scientific research in biochemistry. It also has concrete applications in other fields, such as virus toxicity prediction, network security detection, and soft sensors. Despite continuing breakthroughs in machine learning, a complete review of research on semi-supervised learning methods is still lacking. This paper first defines semi-supervised learning and analyzes the challenges that arise when it is applied. It then surveys four families of semi-supervised methods (clustering, dimensionality reduction, regression, and classification) and lists the more advanced algorithms in each family. Next, it introduces the evaluation metrics commonly used for these algorithms (such as precision, recall, and the ROC curve) and compares the performance of the various semi-supervised learning algorithms. The comparison finds that the semi-supervised methods all achieve higher accuracy than the fully supervised support vector machine, with the SSC-EKE algorithm leading this classic supervised learning algorithm by a decisive margin. Finally, the paper presents practical application scenarios of semi-supervised learning, outlines future research directions, and summarizes the full text.

Key words: semi-supervised learning, semi-supervised clustering, semi-supervised dimensionality reduction, semi-supervised regression, semi-supervised classification, evaluation metrics
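
The abstract names precision and recall among the evaluation metrics. As a minimal illustration of how these two quantities are computed, the sketch below uses made-up toy labels (not data or results from the paper), and `precision_recall` is a hypothetical helper name introduced only for this example.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually positive but predicted negative.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels, assumed purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision is the fraction of predicted positives that are correct, while recall is the fraction of actual positives that are found. Sweeping a classifier's decision threshold and plotting the true positive rate against the false positive rate at each threshold gives the ROC curve.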