H2Former: An Efficient Hierarchical Hybrid Transformer for Medical Image Segmentation

Title:
H2Former: An Efficient Hierarchical Hybrid Transformer for Medical Image Segmentation
Journal Title:
IEEE Transactions on Medical Imaging
Publication Date:
05 April 2023
Citation:
He, A., Wang, K., Li, T., Du, C., Xia, S., & Fu, H. (2023). H2Former: An Efficient Hierarchical Hybrid Transformer for Medical Image Segmentation. IEEE Transactions on Medical Imaging, 1–1. https://doi.org/10.1109/tmi.2023.3264513
Abstract:
Accurate medical image segmentation is of great significance for computer-aided diagnosis. Although methods based on convolutional neural networks (CNNs) have achieved good results, they are weak at modeling long-range dependencies, which are essential for segmentation tasks that must build global context. Transformers can establish long-range dependencies among pixels through self-attention, complementing local convolution. In addition, multi-scale feature fusion and feature selection are crucial for medical image segmentation but are ignored by Transformers. However, it is challenging to directly apply self-attention to CNNs due to its quadratic computational complexity for high-resolution feature maps. Therefore, to integrate the merits of CNNs, multi-scale channel attention and Transformers, we propose an efficient hierarchical hybrid vision Transformer (H2Former) for medical image segmentation. With these merits, the model remains data-efficient in the limited-data regime typical of medical imaging. Experimental results show that our approach exceeds previous Transformer, CNN and hybrid methods on three 2D and two 3D medical image segmentation tasks, while remaining computationally efficient in model parameters, FLOPs and inference time. For example, H2Former outperforms TransUNet by 2.29% in IoU score on the KVASIR-SEG dataset with 30.77% of its parameters and 59.23% of its FLOPs.
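The quadratic cost the abstract refers to can be made concrete with a small sketch. The function below counts the multiply-adds of the two large matrix products in naive self-attention over a feature map; the resolutions and channel dimension are hypothetical illustration values, not the paper's actual configuration.

```python
# Illustrative only: why naive self-attention is quadratic in the number of
# pixels, motivating hybrid CNN-Transformer designs. Sizes are hypothetical.

def attention_cost(h: int, w: int, dim: int) -> int:
    """Multiply-adds for the QK^T and attention*V products of naive
    self-attention over an h x w feature map with channel width dim."""
    n = h * w            # number of tokens = number of pixels
    qk = n * n * dim     # QK^T:              (n x dim) @ (dim x n)
    av = n * n * dim     # softmax(QK^T) @ V: (n x n)   @ (n x dim)
    return qk + av

# Doubling the spatial resolution quadruples the token count,
# so the attention cost grows by a factor of 16.
low = attention_cost(56, 56, 64)
high = attention_cost(112, 112, 64)
print(high // low)  # -> 16
```

This is why high-resolution stages of a segmentation network are expensive for plain Transformers, and why a hierarchical hybrid that keeps convolutions at high resolution can save parameters and FLOPs.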
License type:
Publisher Copyright
Funding Info:
This research / project is supported by the A*STAR - Career Development Fund
Grant Reference no. : C222812010

This research / project is supported by the AI Singapore - Tech Challenge Funding
Grant Reference no. : AISG2-TC-2021-003

This work is partially supported by the National Natural Science Foundation (62272248), CAAI-Huawei MindSpore Open Fund (CAAIXSJLJJ2021-025A)
Description:
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
1558-254X
0278-0062
Files uploaded:
h2former.pdf (PDF, 1.15 MB), available by request