towhee.models.multiscale_vision_transformers.create_mvit.create_mvit_model

towhee.models.multiscale_vision_transformers.create_mvit.create_mvit_model(model_name: str = 'imagenet_b_16_conv', checkpoint_path: Optional[str] = None, device: Optional[str] = None, change_model_keys: bool = True) → Module

Create a Multiscale Vision Transformers (MViT) model. https://arxiv.org/abs/2104.11227

Parameters:

model_name (str) – Multiscale Vision Transformers model name.
checkpoint_path (str) – Local checkpoint path, default is None. Checkpoint weights can be downloaded from https://github.com/facebookresearch/SlowFast/blob/main/MODEL_ZOO.md.
device (str) – Model device, cpu or cuda.
change_model_keys (bool) – The MViT structure here differs slightly from the Facebookresearch one, to support visualization. Set change_model_keys to True when loading a pretrained checkpoint downloaded from Facebookresearch.
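
The parameters above can be combined as in the following minimal sketch. The helper names build_mvit_kwargs and create_default_mvit are hypothetical and are not part of towhee. Only the import path, the parameter names, and the default model name are taken from the signature above. The device check assumes that 'cpu' and 'cuda' are the only values you intend to pass.

```python
import os
from typing import Any, Dict, Optional


def build_mvit_kwargs(
    checkpoint_path: Optional[str] = None,
    device: str = "cpu",
    from_facebookresearch: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for create_mvit_model."""
    # The parameter description lists cpu and cuda as the device choices.
    if device not in ("cpu", "cuda"):
        raise ValueError(f"device must be 'cpu' or 'cuda', got {device!r}")
    # Fail early if a checkpoint path was given but the file is missing.
    if checkpoint_path is not None and not os.path.isfile(checkpoint_path):
        raise FileNotFoundError(checkpoint_path)
    return {
        "model_name": "imagenet_b_16_conv",  # default from the signature
        "checkpoint_path": checkpoint_path,
        "device": device,
        # True for checkpoints downloaded from Facebookresearch.
        "change_model_keys": from_facebookresearch,
    }


def create_default_mvit(**overrides: Any):
    """Hypothetical wrapper: requires towhee to be installed."""
    # Imported lazily so the kwargs helper works without towhee.
    from towhee.models.multiscale_vision_transformers.create_mvit import (
        create_mvit_model,
    )

    return create_mvit_model(**build_mvit_kwargs(**overrides))
```

With towhee installed, create_default_mvit(checkpoint_path='/path/to/checkpoint', device='cuda') would pass a local checkpoint and keep change_model_keys=True; the path here is a placeholder. Pass from_facebookresearch=False if the checkpoint did not come from Facebookresearch.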