Model | HF Name | Creator | Task | Library | Dataset | Language | Paper | Related to | License |
---|---|---|---|---|---|---|---|---|---|
All-In-One-Pixel model | PublicPrompts/All-In-One-Pixel-Model | PublicPrompts | | Diffusers | | | | | Creativeml-openrail-m |
BEiT (base-sized, fine-tuned on ImageNet-22k) model | microsoft/beit-base-patch16-224-pt22k-ft22k | Microsoft | Image Classification | PyTorch JAX Transformers | ImageNet ImageNet-21k | | 2106.08254 | Vision Beit | Apache-2.0 |
CLIP ViT base patch16 model | openai/clip-vit-base-patch16 | OpenAI | Zero-Shot Image Classification | PyTorch JAX Transformers | | | 2103.00020 1908.04913 | Clip Vision | |
CLIP ViT base patch32 model | openai/clip-vit-base-patch32 | OpenAI | Zero-Shot Image Classification | PyTorch TensorFlow JAX Transformers | | | 2103.00020 1908.04913 | Clip Vision | |
CLIP ViT-B/32 - LAION-2B model | laion/CLIP-ViT-B-32-laion2B-s34B-b79K | LAION | | PyTorch OpenCLIP | | | 1910.04867 | Clip | MIT |
CLIP ViT-H-14 - LAION-2B model | laion/CLIP-ViT-H-14-laion2B-s32B-b79K | LAION | | PyTorch OpenCLIP | | | 1910.04867 | Clip | MIT |
CLIP model | openai/clip-vit-large-patch14 | OpenAI | Zero-Shot Image Classification | PyTorch TensorFlow JAX Transformers | | | 2103.00020 1908.04913 | Clip Vision | |
CLIP-ViT-large-patch14-336 model | openai/clip-vit-large-patch14-336 | OpenAI | Zero-Shot Image Classification | PyTorch TensorFlow Transformers | | | | Clip Generated from keras callback | |
CLIPSeg model | CIDAS/clipseg-rd64-refined | CIDAS | Image Segmentation | PyTorch Transformers | | | 2112.10003 | Vision Clipseg | Apache-2.0 |
DETR (End-to-End Object Detection) with ResNet-101 backbone model | facebook/detr-resnet-101 | Meta | Object Detection | PyTorch Transformers | COCO | | 2005.12872 | Vision Detr | Apache-2.0 |
DETR (End-to-End Object Detection) with ResNet-50 backbone model | facebook/detr-resnet-50 | Meta | Object Detection | PyTorch Transformers | COCO | | 2005.12872 | Vision Detr | Apache-2.0 |
EimisAnimeDiffusion 1.0v model | eimiss/EimisAnimeDiffusion_1.0v | Eimiss | Text-to-Image Image-to-Image | Diffusers | | English | | Stable-diffusion | Creativeml-openrail-m |
Fantasy Card Diffusion model | volrath50/fantasy-card-diffusion | Volrath50 | Text-to-Image Image-to-Image | Diffusers | | English | | Stable-diffusion | Creativeml-openrail-m |
Ghibli Diffusion model | nitrosocke/Ghibli-Diffusion | Nitrosocke | Text-to-Image Image-to-Image | Diffusers | | English | | Stable-diffusion | Creativeml-openrail-m |
MaskFormer ADE20k model | facebook/maskformer-swin-large-ade | Meta | Image Segmentation | PyTorch Transformers | Scene parse 150 | | 2107.06278 | Vision Maskformer | Other |
MaskFormer COCO model | facebook/maskformer-swin-large-coco | Meta | Image Segmentation | PyTorch Transformers | COCO | | 2107.06278 | Vision Maskformer | Other |
Midjourney style on Stable Diffusion model | sd-concepts-library/midjourney-style | Sd-concepts-library | | | | | | | MIT |
Nitro Diffusion model | nitrosocke/Nitro-Diffusion | Nitrosocke | Text-to-Image Image-to-Image | Diffusers | | English | | Stable-diffusion | Creativeml-openrail-m |
OWL-ViT model | google/owlvit-base-patch32 | Google | Object Detection | PyTorch Transformers | | | 2205.06230 | Vision Owlvit Zero-shot-object-detection | Apache-2.0 |
Redshift Diffusion model | nitrosocke/redshift-diffusion | Nitrosocke | Text-to-Image Image-to-Image | Diffusers | | English | | Stable-diffusion | Creativeml-openrail-m |
RuCLIP-ViT-base-patch32-224 model | sberbank-ai/ruclip-vit-base-patch32-224 | Sberbank-ai | | PyTorch Transformers | | | | | |
Stable Diffusion Image Variations model | lambdalabs/sd-image-variations-diffusers | Lambdalabs | Image-to-Image | Diffusers | ChristophSchuhmann/improved aesthetics 6plus | | | Stable-diffusion Stable-diffusion-diffusers | Creativeml-openrail-m |
ViT For Age Classification model | nateraw/vit-age-classifier | Nateraw | Image Classification | PyTorch Transformers | Fairface | | | Vit | |
ViT large patch14 CLIP 224.openai ft in12k in1k model | timm/vit_large_patch14_clip_224.openai_ft_in12k_in1k | Timm | Image Classification | PyTorch Timm | | | | Vision | Apache-2.0 |
Vision Transformer (base-sized) 224x224 model | google/vit-base-patch16-224 | Google | Image Classification | PyTorch TensorFlow JAX Transformers | ImageNet-21k ImageNet-1k | | 2010.11929 2006.03677 | Vision Vit | Apache-2.0 |
Vision Transformer (base-sized) 384x384 model | google/vit-base-patch16-384 | Google | Image Classification | PyTorch TensorFlow JAX Transformers | ImageNet ImageNet-21k | | 2010.11929 2006.03677 | Vision Vit | Apache-2.0 |
Vision Transformer (base-sized) patch-16-384 model | google/vit-base-patch16-384 | Google | Image Classification | PyTorch TensorFlow JAX Transformers | ImageNet ImageNet-21k | | 2010.11929 2006.03677 | Vision Vit | Apache-2.0 |