XBound-Former: Toward Cross-Scale Boundary Modeling in Transformers

Skin lesion segmentation from dermoscopy images is of great significance in the quantitative analysis of skin cancers, yet it remains challenging even for dermatologists because of inherent difficulties, i.e., considerable variation in size, shape, and color, and ambiguous boundaries. Recent vision transformers have shown promising performance in handling this variation through global context modeling. Still, they have not fully solved the problem of ambiguous boundaries, because they do not use boundary knowledge and global context in a complementary way. In this paper, we propose a novel cross-scale boundary-aware transformer, XBound-Former, to simultaneously address the variation and boundary problems of skin lesion segmentation. XBound-Former is a purely attention-based network and captures boundary knowledge via three specially designed learners. First, we propose an implicit boundary learner (im-Bound) to constrain the network attention to the points with noticeable boundary variation, enhancing local context modeling while maintaining the global context. Second, we propose an explicit boundary learner (ex-Bound) to extract boundary knowledge at multiple scales and convert it into embeddings explicitly. Third, based on the learned multi-scale boundary embeddings, we propose a cross-scale boundary learner (X-Bound) to simultaneously address the problem of ambiguous and multi-scale boundaries by using the learned boundary embedding from one scale to guide the boundary-aware att...
Source: IEEE Transactions on Medical Imaging - Category: Biomedical Engineering Source Type: research
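
The cross-scale idea in the abstract, where boundary embeddings learned at one scale guide attention at another, can be illustrated with a generic single-head cross-attention step. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `cross_scale_attention`, the projection matrices, the residual connection, and the token counts are all hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_scale_attention(features, boundary_emb, w_q, w_k, w_v):
    """Generic cross-attention (hypothetical helper, not from the paper).

    features:     (N, d) feature tokens at the scale being refined
    boundary_emb: (M, d) boundary embeddings taken from a different scale
    Returns the refined tokens (N, d) and the attention weights (N, M).
    """
    q = features @ w_q          # queries come from the target scale
    k = boundary_emb @ w_k      # keys come from the other scale's boundary embeddings
    v = boundary_emb @ w_v      # values come from the same boundary embeddings
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    # Residual connection (an assumption): boundary cues are added to the input tokens.
    return features + weights @ v, weights


rng = np.random.default_rng(0)
d = 16
features = rng.normal(size=(64, d))      # e.g. an 8x8 feature map, flattened
boundary_emb = rng.normal(size=(8, d))   # e.g. 8 boundary embeddings from a coarser scale
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

out, weights = cross_scale_attention(features, boundary_emb, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Each row of `weights` is a distribution over the boundary embeddings of the other scale, so every feature token is updated by a weighted mix of boundary cues; how the paper actually builds and applies these embeddings is not recoverable from the truncated abstract.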