Abstract:
We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11M licensed and privacy respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks. We evaluate its capabilities on numerous tasks and find that its zero-shot performance is impressive – often competitive with or even superior to prior fully supervised results. We are releasing the Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and 11M images at segment-anything.com to foster research into foundation models for computer vision. We recommend reading the full paper at: arxiv.org/abs/2304.02643.
Date of Conference: 01-06 October 2023
Date Added to IEEE Xplore: 15 January 2024