Deep-person-reid is a PyTorch-based framework for training and evaluating deep person re-identification (reid) models on reid benchmarks.
It has the following features:
- multi-GPU training.
- support for both image reid and video reid.
- straightforward preparation of reid datasets.
- standard split protocol used by most research papers.
- end-to-end training and evaluation.
- implementations of state-of-the-art reid models.
- access to pretrained reid models.
- multi-dataset training.
- visualization of ranked results.
- state-of-the-art training techniques.
- 22-01-2019: Added Market1501+500K.
- 06-01-2019: Released Awesome-ReID, a collection of ReID-related research with links to codes and papers.
- 26-11-2018: Released pretrained weights (imagenet & reid) for shufflenet.
- 23-11-2018: Released imagenet-pretrained weights for resnext50_32x4d.
- 11-11-2018: Added multi-dataset training; Added cython code for cuhk03-style evaluation; Wrapped dataloader construction to Image/Video-DataManager; Wrapped argparse to args.py; Added MLFN (CVPR'18).
- cd to your preferred directory and run: git clone https://github.com/KaiyangZhou/deep-person-reid
- Install dependencies by pip install -r requirements.txt (if necessary).
- To install the cython-based evaluation toolbox, run make. As a result, eval_metrics_cy.so is generated under the same folder. Run python test_cython.py to test whether the toolbox is installed successfully. (credit to luzai)
- Market1501 (
- CUHK03 (
- DukeMTMC-reID (
- MSMT17 (
- VIPeR (
- GRID (
- CUHK01 (
- PRID450S (
- SenseReID (
The keys to use these datasets are enclosed in the parentheses. See torchreid/datasets/__init__.py for details. The data managers of image-reid and video-reid are implemented in torchreid/data_manager.py.
Instructions regarding how to prepare and do evaluation on these datasets are provided in DATASETS.md.
ImageNet classification models
xent: cross entropy loss (the label smoothing regularizer can be enabled by --label-smooth).
htri: hard mining triplet loss.
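To make the effect of the label smoothing regularizer concrete, here is a minimal pure-Python sketch of cross entropy with smoothed targets. The function name `label_smooth_xent` and the value `epsilon=0.1` are assumptions chosen for illustration; they are not taken from this repository.

```python
import math


def label_smooth_xent(logits, target, epsilon=0.1):
    """Cross entropy for one sample with label smoothing (illustrative sketch).

    The one-hot target is replaced by (1 - epsilon) * one_hot + epsilon / K,
    where K is the number of classes. epsilon=0 gives plain cross entropy.
    """
    num_classes = len(logits)
    # Numerically stable log-softmax.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    loss = 0.0
    for k, log_p in enumerate(log_probs):
        smoothed = epsilon / num_classes + ((1.0 - epsilon) if k == target else 0.0)
        loss -= smoothed * log_p
    return loss


logits = [2.0, 0.5, -1.0]
plain = label_smooth_xent(logits, target=0, epsilon=0.0)
smoothed = label_smooth_xent(logits, target=0, epsilon=0.1)
print(f"plain={plain:.4f} smoothed={smoothed:.4f}")
```

With a confident, correct prediction the smoothed loss is larger than the plain one, which is how the regularizer discourages over-confident outputs.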
Training methods are implemented in:
- train_imgreid_xent.py: train image-reid models with cross entropy loss.
- train_imgreid_xent_htri.py: train image-reid models with hard mining triplet loss or the combination of hard mining triplet loss and cross entropy loss.
- train_vidreid_xent.py: train video-reid models with cross entropy loss.
- train_vidreid_xent_htri.py: train video-reid models with hard mining triplet loss or the combination of hard mining triplet loss and cross entropy loss.
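As a rough illustration of what hard mining means for the triplet loss, the sketch below picks, for every anchor in a batch, the farthest sample with the same identity and the closest sample with a different identity. It is a pure-Python approximation under stated assumptions: the function names, the Euclidean distance, and `margin=0.3` are illustrative and may differ from the repository's implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_mining_triplet_loss(features, pids, margin=0.3):
    """Batch-hard triplet loss averaged over all anchors (illustrative sketch).

    Every identity must appear at least twice in the batch and at least two
    identities must be present; otherwise no positive or negative exists.
    """
    losses = []
    for i, (anchor, pid) in enumerate(zip(features, pids)):
        pos = [euclidean(anchor, f) for j, f in enumerate(features)
               if pids[j] == pid and j != i]
        neg = [euclidean(anchor, f) for j, f in enumerate(features)
               if pids[j] != pid]
        hardest_pos = max(pos)  # farthest sample with the same identity
        hardest_neg = min(neg)  # closest sample with a different identity
        losses.append(max(0.0, margin + hardest_pos - hardest_neg))
    return sum(losses) / len(losses)


# Well-separated identities give zero loss; interleaved ones are penalized.
separated = hard_mining_triplet_loss([[0, 0], [0, 1], [3, 0], [3, 1]], [0, 0, 1, 1])
interleaved = hard_mining_triplet_loss([[0, 0], [2, 0], [1, 0], [3, 0]], [0, 0, 1, 1])
print(separated, interleaved)
```

The requirement that each identity appears several times per batch is why this loss is usually paired with an identity-aware batch sampler.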
Input arguments for the above training scripts are unified in args.py.
To train an image-reid model with cross entropy loss, you can do
    # -s: source dataset for training; -t: target dataset for test
    # --height / --width: image height and width
    # --optim: optimizer; --label-smooth: label smoothing regularizer
    # --lr: learning rate; --max-epoch: maximum epoch to run
    # --stepsize: stepsize for learning rate decay
    # -a: network architecture
    # --save-dir: where to save the log and models
    # --gpu-devices: gpu device index
    python train_imgreid_xent.py \
        -s market1501 \
        -t market1501 \
        --height 256 \
        --width 128 \
        --optim amsgrad \
        --label-smooth \
        --lr 0.0003 \
        --max-epoch 60 \
        --stepsize 20 40 \
        --train-batch-size 32 \
        --test-batch-size 100 \
        -a resnet50 \
        --save-dir log/resnet50-market-xent \
        --gpu-devices 0
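To see how `--lr 0.0003` and `--stepsize 20 40` might interact, the following sketch computes a step-decayed learning rate per epoch. It assumes that the two stepsize values are epoch milestones and that the rate is multiplied by `gamma=0.1` at each one; both the milestone interpretation and the decay factor are assumptions, not values stated in this document.

```python
def learning_rate(epoch, base_lr=0.0003, stepsize=(20, 40), gamma=0.1):
    """Learning rate at a 0-indexed epoch under step decay (illustrative sketch)."""
    # Count how many milestones have already been reached.
    num_decays = sum(1 for milestone in stepsize if epoch >= milestone)
    return base_lr * gamma ** num_decays


for epoch in (0, 19, 20, 39, 40, 59):
    print(f"epoch {epoch:2d}: lr = {learning_rate(epoch):.7f}")
```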
Both -s and -t accept one or more dataset keys delimited by spaces. For example, to train models on Market1501 + DukeMTMC-reID and test on both of them, use -s market1501 dukemtmcreid and -t market1501 dukemtmcreid. To test on a different dataset instead, e.g. MSMT17, use -t msmt17. Multi-dataset training is implemented for both image-reid and video-reid. Note that when -t takes multiple datasets, evaluation is performed on each specified dataset individually.
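The way space-delimited dataset keys can be collected and then evaluated one by one can be sketched with `argparse`. The long option names `--source-names` and `--target-names`, and the helper `evaluate_each`, are hypothetical names used only for this example; see args.py for the real definitions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="multi-dataset argument sketch")
    # nargs="+" collects one or more space-delimited dataset keys into a list.
    parser.add_argument("-s", "--source-names", nargs="+", required=True)
    parser.add_argument("-t", "--target-names", nargs="+", required=True)
    return parser


def evaluate_each(target_names, evaluate_fn):
    """Run evaluate_fn once per target dataset and collect the results by key."""
    return {name: evaluate_fn(name) for name in target_names}


args = build_parser().parse_args(
    ["-s", "market1501", "dukemtmcreid", "-t", "market1501", "dukemtmcreid"]
)
# The lambda stands in for a real evaluation routine.
results = evaluate_each(args.target_names, lambda name: f"evaluated on {name}")
for name, message in results.items():
    print(name, "->", message)
```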
Two-stepped transfer learning
To finetune models pretrained on external large-scale datasets, such as ImageNet, the two-stepped training strategy is recommended. This can be achieved with --fixbase-epoch and --open-layers. The pipeline goes as follows.
First, the base network is frozen and the randomly initialized layers (e.g. identity classification layer) are trained for --fixbase-epoch epochs. Specifically, the layers specified by --open-layers are set to the train mode and will be updated, while other layers are set to the eval mode and are frozen. See open_specified_layers(model, open_layers) in torchreid/utils/torchtools.py.
Second, after the new layers are adapted to the old (well-initialized) layers, all layers are set to the train mode (via open_all_layers(model)) and are trained.
For example, to train the randomly initialized classifier in resnet50 for 5 epochs before training all layers, use --fixbase-epoch 5 and --open-layers classifier. Note that the layer names must align with the attribute names in the model (in this case, self.classifier exists in the model).
In addition, there is an argument called --always-fixbase. Once activated, the base network stays frozen and only the layers specified with --open-layers are trained.
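The schedule described above can be summarized as a small function that reports which layers are updated at a given epoch. This is a simplified sketch: it places the fixbase epochs and the full-training epochs on a single 0-indexed timeline, and the function name `trainable_layers` and the example layer names other than `classifier` are hypothetical.

```python
def trainable_layers(epoch, all_layers, open_layers, fixbase_epoch, always_fixbase=False):
    """Return the set of layers updated at a given 0-indexed epoch (sketch).

    Epochs below fixbase_epoch belong to the first step, where only
    open_layers are trained. Afterwards every layer is trained, unless
    always_fixbase keeps the base network frozen throughout.
    """
    unknown = set(open_layers) - set(all_layers)
    if unknown:
        # Layer names must match attribute names that exist in the model.
        raise ValueError(f"layers not found in the model: {sorted(unknown)}")
    if always_fixbase or epoch < fixbase_epoch:
        return set(open_layers)
    return set(all_layers)


layers = ["conv1", "layer1", "layer2", "classifier"]
print(sorted(trainable_layers(0, layers, ["classifier"], fixbase_epoch=5)))
print(sorted(trainable_layers(5, layers, ["classifier"], fixbase_epoch=5)))
print(sorted(trainable_layers(50, layers, ["classifier"], fixbase_epoch=5, always_fixbase=True)))
```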
Using hard mining triplet loss
htri requires adding
Training video-reid models
For video reid, --test-batch-size refers to the number of tracklets, so the real image batch size is --test-batch-size * --seq-len. As the training follows the image-based paradigm, the semantic meaning of --train-batch-size does not change.
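A quick worked example of the batch size arithmetic for video reid follows. The sequence length of 15 is an arbitrary value chosen for illustration, and the helper name is hypothetical.

```python
def real_test_batch_size(test_batch_size, seq_len):
    """Number of images per test batch when each batch item is a tracklet."""
    return test_batch_size * seq_len


# 100 tracklets of 15 frames each are processed as 1500 images.
print(real_test_batch_size(test_batch_size=100, seq_len=15))
```

This is worth checking before evaluation, since GPU memory usage scales with the image count rather than the tracklet count.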
Use --evaluate to switch to the evaluation mode. In doing so, no model training is performed. For example, to load pretrained model weights at path_to/resnet50.pth.tar for resnet50 and evaluate on Market1501, you can do
    # -s does not matter in evaluation mode.
    # More datasets can be added to -t for the test list.
    python train_imgreid_xent.py \
        -s market1501 \
        -t market1501 \
        --height 256 \
        --width 128 \
        --test-batch-size 100 \
        --evaluate \
        -a resnet50 \
        --load-weights path_to/resnet50.pth.tar \
        --save-dir log/eval-resnet50 \
        --gpu-devices 0
Note that --load-weights will discard layer weights in path_to/resnet50.pth.tar that do not match the original model layers in size. If you encounter the UnicodeDecodeError problem when loading the checkpoints downloaded from the model zoo, please try this solution.
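The idea of discarding mismatched weights can be sketched without any deep learning library by comparing layer names and shapes. The function below is an illustrative assumption about how such filtering could work, not the repository's code; the layer names and shapes are example values.

```python
def filter_matching_weights(checkpoint_shapes, model_shapes):
    """Split checkpoint entries into loadable and discarded ones (sketch).

    Both arguments map a layer name to the shape of its weight tensor.
    An entry is kept only if the model has a layer with the same name and shape.
    """
    kept, discarded = {}, []
    for name, shape in checkpoint_shapes.items():
        if name in model_shapes and model_shapes[name] == shape:
            kept[name] = shape
        else:
            discarded.append(name)
    return kept, discarded


checkpoint = {"conv1.weight": (64, 3, 7, 7), "classifier.weight": (1000, 2048)}
model = {"conv1.weight": (64, 3, 7, 7), "classifier.weight": (751, 2048)}
kept, discarded = filter_matching_weights(checkpoint, model)
print("kept:", list(kept))
print("discarded:", discarded)
```

Here the classifier is dropped because the checkpoint was trained with a different number of classes, so that layer keeps its random initialization.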
Use --eval-freq to control the evaluation frequency and --start-eval to indicate when to start counting the evaluation frequency. This is useful when you want to test the model every --eval-freq epochs to diagnose the training (the cython evaluation code is fast, e.g. evaluation on Market1501 can be done in less than 10s).
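One plausible reading of how --eval-freq and --start-eval combine is sketched below. The exact boundary behaviour (whether the epoch equal to --start-eval itself triggers evaluation, and whether the final epoch is always evaluated) is an assumption made for this example.

```python
def should_evaluate(epoch, eval_freq, start_eval=0, max_epoch=None):
    """Decide whether to evaluate after finishing a 0-indexed epoch (sketch)."""
    finished = epoch + 1  # number of completed epochs
    if max_epoch is not None and finished == max_epoch:
        return True  # assumption: always evaluate after the final epoch
    return eval_freq > 0 and finished > start_eval and finished % eval_freq == 0


eval_epochs = [e + 1 for e in range(60) if should_evaluate(e, 10, start_eval=20, max_epoch=60)]
print(eval_epochs)
```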
Visualize ranked results
To visualize the ranked results, you can use --visualize-ranks, which works along with --evaluate. The ranked images will be saved under save_dir, the directory you specify with --save-dir. This function is implemented in torchreid/utils/reidtools.py.
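The ranking step behind such a visualization can be illustrated as follows: for each query, gallery items are sorted by ascending distance and the closest ones are kept. This is a generic sketch with a made-up distance matrix and a hypothetical function name, not the code in torchreid/utils/reidtools.py.

```python
def rank_gallery(distances, topk=3):
    """For each query row, return the indices of the topk closest gallery items."""
    ranked = []
    for row in distances:
        # Sort gallery indices by their distance to this query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        ranked.append(order[:topk])
    return ranked


# Rows are queries, columns are gallery items (illustrative values).
distmat = [
    [0.9, 0.1, 0.5, 0.3],
    [0.2, 0.8, 0.4, 0.6],
]
print(rank_gallery(distmat, topk=3))  # [[1, 3, 2], [0, 2, 3]]
```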
Please link this project in your paper.
This project is under the MIT License.