Few-shot semantic segmentation method based on self-correlation enhancement and prototype supervision

刘佳鑫; 田成军; 黄丹丹; 刘智

doi:10.5768/JAO202647.0202006

Abstract

Few-shot semantic segmentation aims to perform pixel-level classification when annotations are scarce. To further improve the generalization of prototype-network-based few-shot semantic segmentation to unseen classes, and to address the appearance discrepancy between support samples and query images as well as poor prototype quality, we propose a few-shot semantic segmentation method based on self-correlation enhancement and prototype supervision. First, we design a self-correlation enhancement module that uses the self-correlation among pixels within the query image to drive the initial auxiliary prior toward the query data, generating high-level class prototypes that yield an enhanced prior with strong guiding power. Second, we introduce multiple progressive supervision losses that take the degree to which a prototype recovers the support mask as the supervision indicator of prototype quality. The prototypes are updated by self-regularization and the auxiliary prior is updated by self-matching, which improves the prototypes' ability to summarize the support information and encourages the auxiliary prior to retain more details related to the query features. The method is validated on the few-shot benchmark dataset PASCAL-5ⁱ. Under the 1-shot setting, it achieves an mIoU (mean intersection over union) of 64.4% and an FB-IoU (foreground-background intersection over union) of 73.5%, indicating that the method is effective and competitive.
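The abstract gives no formulas, so the following is only a minimal NumPy sketch of the general idea behind using mask recovery as a prototype-quality indicator. It is not the authors' implementation. It assumes the common prototype-network recipe: masked average pooling to build foreground and background prototypes, cosine similarity to recover the support mask, and IoU between the recovered and true masks as the quality score. All function names, the toy feature map, and the per-image FB-IoU simplification are illustrative assumptions.

```python
import numpy as np


def masked_average_pooling(features, mask):
    """Average the (C, H, W) features over pixels where the (H, W) mask is 1."""
    mask = mask.astype(features.dtype)
    area = mask.sum()
    if area == 0:
        # No pixels selected: return a zero prototype instead of dividing by 0.
        return np.zeros(features.shape[0], dtype=features.dtype)
    return (features * mask[None]).sum(axis=(1, 2)) / area


def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between every pixel feature and a (C,) prototype."""
    num = np.tensordot(prototype, features, axes=([0], [0]))  # shape (H, W)
    den = np.linalg.norm(features, axis=0) * np.linalg.norm(prototype) + eps
    return num / den


def recover_mask(fg_proto, bg_proto, features):
    """Label a pixel foreground when it is closer to the foreground prototype."""
    fg_sim = cosine_similarity_map(features, fg_proto)
    bg_sim = cosine_similarity_map(features, bg_proto)
    return fg_sim > bg_sim


def binary_iou(pred, target):
    """Intersection over union of two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, target).sum() / union)


def fb_iou(pred, target):
    """Per-image simplification (assumption): mean of foreground and background IoU."""
    pred, target = pred.astype(bool), target.astype(bool)
    return 0.5 * (binary_iou(pred, target) + binary_iou(~pred, ~target))


# Toy support example (hypothetical data): 2 channels on a 4x4 grid, where
# channel 0 fires on the object and channel 1 fires on the background.
support_mask = np.zeros((4, 4))
support_mask[1:3, 1:3] = 1
support_feat = np.stack([support_mask, 1 - support_mask])

fg_proto = masked_average_pooling(support_feat, support_mask)
bg_proto = masked_average_pooling(support_feat, 1 - support_mask)
recovered = recover_mask(fg_proto, bg_proto, support_feat)
quality = binary_iou(recovered, support_mask)  # higher means a better prototype
```

In a training setup, a differentiable loss on the recovered mask (for example cross-entropy against the support mask) would typically stand in for the hard IoU used here; that choice is likewise an assumption, not something the abstract specifies.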
