towhee.models.multiscale_vision_transformers.create_mvit.create_mvit_model

towhee.models.multiscale_vision_transformers.create_mvit.create_mvit_model(model_name: str = 'imagenet_b_16_conv', checkpoint_path: Optional[str] = None, device: Optional[str] = None, change_model_keys: bool = True) → torch.nn.modules.module.Module[source]

Create a Multiscale Vision Transformers (MViT) model. https://arxiv.org/abs/2104.11227

Parameters:
  • model_name (str) – Multiscale Vision Transformers model name.

  • checkpoint_path (str) – Local checkpoint path, default is None. Checkpoint weights can be downloaded from https://github.com/facebookresearch/SlowFast/blob/main/MODEL_ZOO.md.

  • device (str) – Device to load the model on, 'cpu' or 'cuda'.

  • change_model_keys (bool) – This MViT implementation uses a slightly different module structure from the Facebook Research one (for visualization purposes), so the state-dict keys differ. Set change_model_keys to True when loading a pretrained checkpoint downloaded from Facebook Research so that its keys are remapped to match this structure.