mmtrack.apis

mmtrack.core

anchor

evaluation

motion

optimizer

track

utils

mmtrack.datasets

datasets

parsers

pipelines

samplers

class mmtrack.datasets.samplers.DistributedVideoSampler(dataset, num_replicas=None, rank=None, shuffle=False)[source]

Distribute videos across multiple GPUs during testing.

Parameters
  • dataset (Dataset) – Test dataset that must have a data_infos attribute. Each data_info in data_infos records the information of one frame, and each video must have exactly one data_info with data_info['frame_id'] == 0.

  • num_replicas (int) – The number of GPUs. Defaults to None.

  • rank (int) – GPU rank id. Defaults to None.

  • shuffle (bool) – If True, shuffle the dataset. Defaults to False.
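
The chunking idea above can be sketched in plain Python (the function name and the mock data_infos below are invented for illustration; the real sampler integrates with PyTorch distributed testing):

```python
def split_frames_by_rank(data_infos, num_replicas):
    """Group frames into videos (a data_info with frame_id == 0 starts a
    new video) and assign contiguous chunks of whole videos to each rank,
    so no video is split across GPUs."""
    # Indices where a new video starts.
    starts = [i for i, info in enumerate(data_infos) if info['frame_id'] == 0]
    ends = starts[1:] + [len(data_infos)]
    videos = list(zip(starts, ends))
    # Ceil-divide the videos into num_replicas nearly equal chunks.
    per_rank = -(-len(videos) // num_replicas)
    indices = []
    for rank in range(num_replicas):
        chunk = videos[rank * per_rank:(rank + 1) * per_rank]
        indices.append([i for s, e in chunk for i in range(s, e)])
    return indices


# Three videos (frames 0-2, 3-4, 5-8) split over two "GPUs".
data_infos = [{'frame_id': f} for f in (0, 1, 2, 0, 1, 0, 1, 2, 3)]
print(split_frames_by_rank(data_infos, 2))
# → [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
```

Note that ranks receive whole videos, which keeps per-frame tracking state consistent on each GPU.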

mmtrack.models

mot

sot

vid

aggregators

class mmtrack.models.aggregators.EmbedAggregator(num_convs=1, channels=256, kernel_size=3, norm_cfg=None, act_cfg={'type': 'ReLU'}, init_cfg=None)[source]

Embedding convs to aggregate multiple feature maps.

This module is proposed in “Flow-Guided Feature Aggregation for Video Object Detection” (FGFA).

Parameters
  • num_convs (int) – Number of embedding convs.

  • channels (int) – Channels of embedding convs. Defaults to 256.

  • kernel_size (int) – Kernel size of embedding convs. Defaults to 3.

  • norm_cfg (dict) – Configuration of the normalization method after each conv. Defaults to None.

  • act_cfg (dict) – Configuration of the activation method after each conv. Defaults to dict(type='ReLU').

  • init_cfg (dict or list[dict], optional) – Initialization config dict. Defaults to None.

forward(x, ref_x)[source]

Aggregate reference feature maps ref_x.

The aggregation mainly contains two steps: 1. Compute the cosine similarity between x and ref_x. 2. Use the normalized (i.e. softmax) cosine similarity to compute a weighted sum of ref_x.

Parameters
  • x (Tensor) – of shape [1, C, H, W]

  • ref_x (Tensor) – of shape [N, C, H, W]. N is the number of reference feature maps.

Returns

The aggregated feature map with shape [1, C, H, W].

Return type

Tensor
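
The two aggregation steps can be sketched at a single spatial location, treating each feature as a C-dimensional vector (a dependency-free illustration only; in the module this runs per pixel over [N, C, H, W] tensors, with learned embedding convs producing the compared features):

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def aggregate(x, ref_xs):
    # Step 1: cosine similarity between x and each reference feature.
    sims = [cosine(x, r) for r in ref_xs]
    # Step 2: softmax-normalize the similarities into weights ...
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    # ... and take the weighted sum of the references.
    return [sum(w * r[i] for w, r in zip(weights, ref_xs))
            for i in range(len(x))]


out = aggregate([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(out)  # the identical reference gets the larger weight
```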

class mmtrack.models.aggregators.SelsaAggregator(in_channels, num_attention_blocks=16, init_cfg=None)[source]

SELSA aggregator module.

This module is proposed in “Sequence Level Semantics Aggregation for Video Object Detection” (SELSA).

Parameters
  • in_channels (int) – The number of channels of the proposal features.

  • num_attention_blocks (int) – The number of attention blocks used in the SELSA aggregator module. Defaults to 16.

  • init_cfg (dict or list[dict], optional) – Initialization config dict. Defaults to None.

forward(x, ref_x)[source]

Aggregate the features ref_x of reference proposals.

The aggregation mainly contains two steps: 1. Use multi-head attention to compute the weights between x and ref_x. 2. Use the normalized (i.e. softmax) weights to compute a weighted sum of ref_x.

Parameters
  • x (Tensor) – of shape [N, C]. N is the number of key frame proposals.

  • ref_x (Tensor) – of shape [M, C]. M is the number of reference frame proposals.

Returns

The aggregated features of key frame proposals with shape [N, C].

Return type

Tensor
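
A single-head, projection-free sketch of this weighting over proposal features (illustrative only; the real module splits the channels across num_attention_blocks heads and applies learned transforms before the dot products):

```python
import math

def selsa_like_aggregate(x, ref_x):
    """x: N x C key-frame proposal features; ref_x: M x C reference
    proposal features. Returns N x C aggregated features."""
    out = []
    for q in x:
        # Step 1: dot-product attention score against every reference proposal.
        scores = [sum(a * b for a, b in zip(q, r)) for r in ref_x]
        # Step 2: softmax over the M references, then weighted sum.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * r[c] for w, r in zip(weights, ref_x))
                    for c in range(len(q))])
    return out


# With a single reference proposal the softmax weight is 1, so the
# output is exactly that reference feature.
print(selsa_like_aggregate([[0.5, 0.5]], [[1.0, 0.0]]))
# → [[1.0, 0.0]]
```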

backbones

losses

motion

reid

roi_heads

track_heads

builder

mmtrack.utils

mmtrack.utils.collect_env()[source]

Collect information about the running environment.

mmtrack.utils.get_root_logger(log_file=None, log_level=20)[source]

Get root logger.

Parameters
  • log_file (str) – File path of the log. Defaults to None.

  • log_level (int) – The level of the logger. Defaults to logging.INFO.

Returns

The obtained logger.

Return type

logging.Logger
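
A rough approximation of this behavior using only the standard logging module (the function name below is invented, and the actual mmtrack implementation differs, e.g. in handler setup for distributed ranks; note that log_level=20 is simply logging.INFO):

```python
import logging

def get_root_logger_sketch(log_file=None, log_level=logging.INFO):
    """Fetch the package-level logger, attach handlers once, set level."""
    logger = logging.getLogger('mmtrack')
    if not logger.handlers:
        # Always log to the console; optionally also to a file.
        logger.addHandler(logging.StreamHandler())
        if log_file is not None:
            logger.addHandler(logging.FileHandler(log_file))
    logger.setLevel(log_level)
    return logger


logger = get_root_logger_sketch()
logger.info('logger ready at level %d', logger.level)
```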
