Segment Anything Model and Dataset (SAM and SA-1B)

{{see also|Papers|Models|Datasets}}
{{see also|Computer Vision Papers|Computer Vision Models|Computer Vision Datasets}}
{| class="wikitable"
|-
| [https://ai.facebook.com/research/publications/segment-anything/ Paper]
| [https://segment-anything.com/ Website]
| [https://segment-anything.com/demo Demo]
| [https://segment-anything.com/dataset/index.html Dataset]
| [https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/ Blog]
| [https://github.com/facebookresearch/segment-anything GitHub]
|-
|}
==Introduction==
[[File:segment anything model demo2.png|400px|right]]
===Model Introduction===
'''Segment Anything Model (SAM)''' is an [[artificial intelligence model]] developed by [[Meta AI]]. This model allows users to effortlessly "cut out" any object within an image using a single click. It is a [[prompt]]able [[segmentation system]] that can generalize to unfamiliar objects and images without additional training.


==Segment Anything Model (SAM) Structure and Implementation==
[[File:segment anything model1.png|400px|right]]
SAM's structure consists of three components:
* an image encoder (a [[Vision Transformer|ViT]]) that computes an embedding for the input image,
* a prompt encoder that embeds input prompts such as points, boxes, and masks, and
* a lightweight mask decoder that combines the image and prompt embeddings to predict segmentation masks.
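These three components are visible directly in the reference implementation from the GitHub repository linked above. The following is a minimal sketch; the checkpoint file name comes from the repository's model zoo and must be downloaded separately.

<syntaxhighlight lang="python">
from segment_anything import sam_model_registry

# Build SAM from a released checkpoint (download from the GitHub model zoo).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")

# The model object is composed of exactly the three components described above.
print(type(sam.image_encoder).__name__)   # ImageEncoderViT -- heavy, run once per image
print(type(sam.prompt_encoder).__name__)  # PromptEncoder   -- embeds points, boxes, masks
print(type(sam.mask_decoder).__name__)    # MaskDecoder     -- lightweight, runs per prompt
</syntaxhighlight>

Because the expensive image encoder runs only once per image while the prompt encoder and mask decoder are lightweight, many different prompts can be decoded against the same image embedding in near real time.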




==Segment Anything Model (SAM) Overview==
[[File:segment anything model demo1.png|400px|right]]
===Input Prompts===
SAM uses a variety of [[input prompt]]s to specify which object to segment in an image, allowing it to perform a wide range of segmentation tasks without further training. It can be prompted with interactive points and boxes, can automatically segment every object in an image, and can return multiple valid masks when a prompt is ambiguous.
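As an illustration, the sketch below uses the <code>SamPredictor</code> and <code>SamAutomaticMaskGenerator</code> classes from the GitHub repository linked above; the checkpoint file comes from the repository's model zoo, and the image file and pixel coordinates are placeholders.

<syntaxhighlight lang="python">
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)

predictor = SamPredictor(sam)
predictor.set_image(image)

# One (possibly ambiguous) foreground click: with multimask_output=True the
# model returns several valid masks (e.g. part / object / whole) with scores.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[450, 300]]),
    point_labels=np.array([1]),   # 1 = foreground, 0 = background
    multimask_output=True,
)

# A box prompt in XYXY pixel coordinates selects one specific object.
box_masks, _, _ = predictor.predict(
    box=np.array([100, 100, 400, 380]),
    multimask_output=False,
)

# "Segment everything": a grid of point prompts is applied over the whole image.
all_masks = SamAutomaticMaskGenerator(sam).generate(image)
</syntaxhighlight>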


===Promptable Segmentation===
[[File:segment anything model2.png|400px|right]]
SAM is designed to return a valid segmentation mask for any [[prompt]], whether it be foreground/background points, a rough box or mask, freeform text, or any other information indicating what to segment in an image. This model has been trained on the SA-1B dataset, which consists of over 1 billion masks, allowing it to generalize to new objects and images beyond its [[training data]]. As a result, practitioners no longer need to collect their own segmentation data and [[fine-tune]] a model for their use case.
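A minimal sketch of mask-style prompting with the reference <code>SamPredictor</code> API (the checkpoint name comes from the repository's model zoo; the image file and click coordinates are placeholders): a rough mask returned by a first click can be fed back as a prompt, together with the click, to refine the segmentation without any fine-tuning.

<syntaxhighlight lang="python">
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_l"](checkpoint="sam_vit_l_0b3195.pth")
predictor = SamPredictor(sam)
predictor.set_image(cv2.cvtColor(cv2.imread("street.jpg"), cv2.COLOR_BGR2RGB))

# First pass: a single foreground click; keep the low-resolution mask logits.
point = np.array([[640, 360]])
label = np.array([1])
masks, scores, logits = predictor.predict(
    point_coords=point, point_labels=label, multimask_output=True
)
best = int(np.argmax(scores))

# Second pass: the rough mask from the first pass becomes a mask prompt and,
# combined with the original click, yields a refined mask -- no extra training.
refined, _, _ = predictor.predict(
    point_coords=point,
    point_labels=label,
    mask_input=logits[best][None, :, :],
    multimask_output=False,
)
</syntaxhighlight>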


==Segmenting 1 Billion Masks: Building the SA-1B Dataset==
[[File:segment anything dataset1.png|400px|right]]
To train SAM, a massive and diverse dataset was needed. The SA-1B dataset was collected using the model itself; annotators used SAM to annotate images interactively, and the newly annotated data was then used to update SAM. This process was repeated multiple times to iteratively improve both the model and the [[dataset]].
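For reference, the released dataset pairs each image with a JSON annotation file. The sketch below assumes the per-image JSON layout with COCO run-length-encoded masks described on the dataset page linked above; the file name is a placeholder.

<syntaxhighlight lang="python">
import json
from pycocotools import mask as mask_utils  # pip install pycocotools

# Each SA-1B image ships with a JSON file listing its automatically generated masks.
with open("sa_000001.json") as f:   # placeholder file name
    record = json.load(f)

print(record["image"]["width"], record["image"]["height"])

for ann in record["annotations"]:
    # "segmentation" is a COCO run-length encoding; decode to an HxW uint8 array.
    binary_mask = mask_utils.decode(ann["segmentation"])
</syntaxhighlight>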

