README.md

<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# SSD: Single Shot MultiBox Object Detector

SSD is a unified framework for object detection with a single network.

You can use this code to train, evaluate, and test models for object detection tasks.

-------------------

## Gluon Implementation

You can find a Gluon implementation on [gluon-cv](https://gluon-cv.mxnet.io/build/examples_detection/train_ssd_voc.html).

-------------------

### Disclaimer
This is a re-implementation of the original SSD, which is based on Caffe. The official
repository is available [here](https://github.com/weiliu89/caffe/tree/ssd).
The arXiv paper is available [here](http://arxiv.org/abs/1512.02325).

This example is intended to reproduce this excellent detector while fully utilizing the
strengths of MXNet.
* A model [converter](#convert-caffe-model) from Caffe is now available.
* Results are nearly identical to the original version; minor differences may arise from implementation details.

Due to permission issues, this example is maintained separately in this [repository](https://github.com/zhreshold/mxnet-ssd). Please report example-specific [issues](https://github.com/zhreshold/mxnet-ssd/issues) there.

### What's new
* Support training and inference on the COCO dataset. INT8 inference achieves 0.253 mAP on CPU with the MKL-DNN backend, comparable to FP32 accuracy (0.2552 mAP).
* Support uint8 inference on CPU with the MKL-DNN backend. Uint8 inference achieves 0.8364 mAP, comparable to FP32 accuracy (0.8366 mAP).
* Added live camera capture and detection display (run with the `--camera` flag). Example:
    `./demo.py --camera --cpu --frame-resize 0.5`
* Added multiple trained models.
* Added a much simpler way to compose networks from mainstream classification networks (ResNet, Inception, ...); see the [Guide](symbol/README.md).
* Updated to match the latest Caffe version, with a 5% mAP increase.
* Use a C++ record iterator backed by the multi-threaded engine to achieve a large speedup in multi-GPU environments.
* Monitor validation mAP during training.
* More network symbols are under development and testing.
* Extra operators are now in `mxnet/src/operator/contrib`.
* Old models are incompatible; use [e06c55d](https://github.com/apache/mxnet/commits/e06c55d6466a0c98c7def8f118a48060fb868901) or [e4f73f1](https://github.com/apache/mxnet/commits/e4f73f1f4e76397992c4b0a33c139d52b4b7af0e) for backward compatibility. Alternatively, you can update the symbols by editing the JSON file, since only layer names have changed while the weights and biases are unchanged.
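The JSON edit mentioned in the last bullet can be sketched as follows. This is a minimal illustration, assuming MXNet's symbol-JSON layout (a top-level `nodes` list of objects with a `name` field); the actual old-to-new name mapping depends on your network:

```python
import json

def rename_symbol_nodes(sym_json, name_map):
    """Return a symbol-JSON string with node names remapped.

    name_map is a hypothetical {old_name: new_name} dict; build it by
    diffing the old and new symbol files for your network.
    """
    sym = json.loads(sym_json)
    for node in sym.get("nodes", []):
        node["name"] = name_map.get(node["name"], node["name"])
    return json.dumps(sym, indent=2)

# Tiny illustrative fragment (not a real SSD symbol).
old_json = json.dumps({"nodes": [
    {"op": "null", "name": "conv1_weight"},
    {"op": "Convolution", "name": "conv1"},
]})
new_json = rename_symbol_nodes(old_json, {
    "conv1": "stage1_conv1",
    "conv1_weight": "stage1_conv1_weight",
})
```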

### Demo results
![demo1](https://cloud.githubusercontent.com/assets/3307514/19171057/8e1a0cc4-8be0-11e6-9d8f-088c25353b40.png)
![demo2](https://cloud.githubusercontent.com/assets/3307514/19171063/91ec2792-8be0-11e6-983c-773bd6868fa8.png)
![demo3](https://cloud.githubusercontent.com/assets/3307514/19171086/a9346842-8be0-11e6-8011-c17716b22ad3.png)

### mAP
| Model | Training data | Test data | mAP | Note |
|:-----:|:-------------:|:---------:|:---:|:-----|
| [VGG16_reduced 300x300](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_300_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test | 77.8 | fast |
| [VGG16_reduced 512x512](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test | 79.9 | slow |
| [Inception-v3 512x512](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/inceptionv3_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test | 78.9 | fastest |
| [Resnet-50 512x512](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/resnet50_ssd_512_voc0712_trainval.zip) | VOC07+12 trainval | VOC07 test | 78.9 | fast |

### Speed
| Model | GPU | CUDNN | Batch-size | FPS* |
|:-----:|:---:|:-----:|:----------:|:----:|
| VGG16_reduced 300x300 | TITAN X (Maxwell) | v5.1 | 16 | 95 |
| VGG16_reduced 300x300 | TITAN X (Maxwell) | v5.1 | 8  | 95 |
| VGG16_reduced 300x300 | TITAN X (Maxwell) | v5.1 | 1  | 64 |
| VGG16_reduced 300x300 | TITAN X (Maxwell) | N/A  | 8  | 36 |
| VGG16_reduced 300x300 | TITAN X (Maxwell) | N/A  | 1  | 28 |

*Forward time only; data loading and drawing excluded.*

### Getting started
* You will need the Python modules `cv2`, `matplotlib`, and `numpy`.
If you use the MXNet Python API, you probably already have them.
You can install them via pip or a package manager such as `apt-get`:
```
sudo apt-get install python-opencv python-matplotlib python-numpy
```
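A quick stdlib-only way to see which of these modules still need installing (an illustrative helper, not part of this example):

```python
import importlib.util

def missing_modules(names):
    """Return the subset of top-level module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# e.g. missing_modules(["cv2", "matplotlib", "numpy"]) lists what is absent
```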

* Build MXNet: follow the official instructions.
```
# for Ubuntu/Debian
cp make/config.mk ./config.mk
# enable cuda, cudnn if applicable
```
Remember to enable CUDA if you want to train, since CPU training is
extremely slow. Using cuDNN is optional but highly recommended.

### Try the demo
* Download the pretrained model [`ssd_resnet50_0712.zip`](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/resnet50_ssd_512_voc0712_trainval.zip) and extract it into the `model/` directory.

* Run
```
# cd /path/to/incubator-mxnet/example/ssd
# download the test images
python data/demo/download_demo_images.py
# run the demo
python demo.py --gpu 0
# play with examples:
python demo.py --epoch 0 --images ./data/demo/dog.jpg --thresh 0.5
python demo.py --cpu --network resnet50 --data-shape 512
# the library takes a while to load on the first run
```
* Check `python demo.py --help` for more options.

### Live Camera detection

Use `init.sh` to download the trained model.
Run `./demo.py --camera` to use an OpenCV-compatible video capture device such as a webcam. This
opens a window displaying the camera output together with the detections. You can adjust
the detection threshold to get more or fewer detections.

### Train the model on VOC
* Note: we recommend using gluon-cv to train the model; please refer to [gluon-cv ssd](https://gluon-cv.mxnet.io/build/examples_detection/train_ssd_voc.html).
This example only covers training on the Pascal VOC and MS COCO datasets. Other datasets can
be supported by adding a subclass derived from the `Imdb` class in `dataset/imdb.py`.
See `dataset/pascal_voc.py` for an example.
* Download the converted pretrained `vgg16_reduced` model [here](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.2-beta/vgg16_reduced.zip) and unzip the `.param` and `.json` files
into the `model/` directory (the default location).
* Download the PASCAL VOC dataset; skip this step if you already have it.
```
cd /path/to/where_you_store_datasets/
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
```
* We use the `trainval` sets of VOC2007/2012, a common strategy.
The suggested directory structure is to store the `VOC2007` and `VOC2012` directories
in the same `VOCdevkit` folder.
* Then link the `VOCdevkit` folder to `data/VOCdevkit` (the default location):
```
ln -s /path/to/VOCdevkit /path/to/incubator-mxnet/example/ssd/data/VOCdevkit
```
Using a symbolic link instead of copying saves a bit of disk space.
* Create a packed binary file for faster training:
```
# cd /path/to/incubator-mxnet/example/ssd
bash tools/prepare_pascal.sh
# or if you are using windows
python tools/prepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target ./data/train.lst
python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target ./data/val.lst --no-shuffle
```
* Start training:
```
# cd /path/to/incubator-mxnet/example/ssd
python train.py
```
* By default, this example uses `batch-size=32` and `learning_rate=0.002`.
You might need to adjust the parameters for different configurations.
Check `python train.py --help` for more training options. For example, if you have 4 GPUs, use:
```
# note that a well-tuned multi-GPU training parameter set is yet to be discovered
python train.py --gpus 0,1,2,3 --batch-size 32
```

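As noted at the top of this section, other datasets can be plugged in by deriving from the `Imdb` class in `dataset/imdb.py`. The sketch below is illustrative only: the stub base class and method names mirror the general shape of the interface, not its exact signatures (see `dataset/pascal_voc.py` for the real ones):

```python
class Imdb:
    """Minimal stand-in for dataset/imdb.py's Imdb base class (illustrative)."""
    def __init__(self, name):
        self.name = name
        self.classes = []
        self.num_images = 0

class MyDataset(Imdb):
    """Hypothetical dataset: a list of image paths plus a label lookup."""
    def __init__(self, image_list, annotations):
        super().__init__("my_dataset")
        self.classes = ["cat", "dog"]        # dataset-specific label names
        self._images = list(image_list)
        # annotations: {image_path: [(class_id, xmin, ymin, xmax, ymax), ...]}
        self._annotations = annotations
        self.num_images = len(self._images)

    def image_path_from_index(self, index):
        return self._images[index]

    def label_from_index(self, index):
        return self._annotations[self._images[index]]

demo_db = MyDataset(["img0.jpg"], {"img0.jpg": [(0, 0.1, 0.1, 0.5, 0.5)]})
```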
### Train the model on COCO
* Download the COCO2014 dataset; skip this step if you already have it.
```
cd /path/to/where_you_store_datasets/
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
# Extract the data.
unzip train2014.zip
unzip val2014.zip
unzip annotations_trainval2014.zip
```
* We use the `train2014,valminusminival2014` sets of COCO2014 for training and `minival2014` for evaluation, a common strategy.
* Then link the `COCO2014` folder to `data/coco` (the default location):
```
ln -s /path/to/COCO2014 /path/to/incubator-mxnet/example/ssd/data/coco
```
Using a symbolic link instead of copying saves a bit of disk space.
* Create a packed binary file for faster training:
```
# cd /path/to/incubator-mxnet/example/ssd
bash tools/prepare_coco.sh
# or if you are using windows
python tools/prepare_dataset.py --dataset coco --set train2014,valminusminival2014 --target ./data/train.lst --root ./data/coco
python tools/prepare_dataset.py --dataset coco --set minival2014 --target ./data/val.lst --root ./data/coco --no-shuffle
```
* Start training:
```
# cd /path/to/incubator-mxnet/example/ssd
python train.py --label-width=560 --num-class=80 --class-names=./dataset/names/coco_label --pretrained="" --num-example=117265 --batch-size=64
```

### Evaluate trained model
Make sure you have `val.rec` as the validation dataset; it is the same one used during training. Use:
```
# cd /path/to/incubator-mxnet/example/ssd
python evaluate.py --gpus 0,1 --batch-size 128 --epoch 0

# Evaluate on COCO dataset
python evaluate.py --gpus 0,1 --batch-size 128 --epoch 0 --num-class=80 --class-names=./dataset/names/mscoco.names
```

### Quantize model

To quantize a model on the VOC dataset, follow the [Train instructions](https://github.com/apache/incubator-mxnet/tree/master/example/ssd#train-the-model-on-voc) to train an FP32 `SSD-VGG16_reduced_300x300` model on the Pascal VOC dataset. You can also download our [SSD-VGG16 pre-trained model](http://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_vgg16_reduced_300-dd479559.zip) and [packed binary data](http://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/ssd-val-fc19a535.zip). Create the `model` and `data` directories if they don't exist, extract the zip files, then rename the uncompressed files as follows (e.g., rename `ssd-val-fc19a535.idx` to `val.idx`, `ssd-val-fc19a535.lst` to `val.lst`, `ssd-val-fc19a535.rec` to `val.rec`, `ssd_vgg16_reduced_300-dd479559.params` to `ssd_vgg16_reduced_300-0000.params`, and `ssd_vgg16_reduced_300-symbol-dd479559.json` to `ssd_vgg16_reduced_300-symbol.json`).

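The renames above can also be scripted. A stdlib-only sketch (file names exactly as listed for the VOC variant; paths that are absent or already renamed are skipped):

```python
import os

# old path -> new path, per the VOC instructions above
RENAMES = {
    "data/ssd-val-fc19a535.idx": "data/val.idx",
    "data/ssd-val-fc19a535.lst": "data/val.lst",
    "data/ssd-val-fc19a535.rec": "data/val.rec",
    "model/ssd_vgg16_reduced_300-dd479559.params":
        "model/ssd_vgg16_reduced_300-0000.params",
    "model/ssd_vgg16_reduced_300-symbol-dd479559.json":
        "model/ssd_vgg16_reduced_300-symbol.json",
}

def apply_renames(renames, root="."):
    """Rename each existing old path to its new path; return the pairs renamed."""
    done = []
    for old, new in renames.items():
        src, dst = os.path.join(root, old), os.path.join(root, new)
        if os.path.exists(src) and not os.path.exists(dst):
            os.rename(src, dst)
            done.append((old, new))
    return done
```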
To quantize a model on the COCO dataset, follow the [Train instructions](https://github.com/apache/incubator-mxnet/tree/master/example/ssd#train-the-model-on-coco) to train an FP32 `SSD-VGG16_reduced_300x300` model on the COCO dataset. You can also download our [SSD-VGG16 pre-trained model](http://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_vgg16_reduced_300-7fedd4ad.zip) and [packed binary data](http://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/ssd_coco-val-e91096e8.zip). Create the `model` and `data` directories if they don't exist, extract the zip files, then rename the uncompressed files as follows (e.g., rename `ssd_coco-val-e91096e8.idx` to `val.idx`, `ssd_coco-val-e91096e8.lst` to `val.lst`, `ssd_coco-val-e91096e8.rec` to `val.rec`, `ssd_vgg16_reduced_300-7fedd4ad.params` to `ssd_vgg16_reduced_300-0000.params`, and `ssd_vgg16_reduced_300-symbol-7fedd4ad.json` to `ssd_vgg16_reduced_300-symbol.json`).

```
data/
|---val.rec
|---val.lst
|---val.idx
model/
|---ssd_vgg16_reduced_300-0000.params
|---ssd_vgg16_reduced_300-symbol.json
```
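Before running the quantization script, it may help to verify that the layout above is in place. A minimal stdlib-only check (paths exactly as listed; an illustrative helper, not part of this example):

```python
import os

EXPECTED = [
    "data/val.rec",
    "data/val.lst",
    "data/val.idx",
    "model/ssd_vgg16_reduced_300-0000.params",
    "model/ssd_vgg16_reduced_300-symbol.json",
]

def missing_files(root="."):
    """Return the expected paths that are not present under root."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]
```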

Then use the following command for quantization. By default, this script uses 5 batches (32 samples per batch) for naive calibration:

```
python quantization.py
```

After quantization, the INT8 models will be saved in the `model/` directory. Use the following commands to launch inference.

```
# Launch FP32 Inference on VOC dataset
python evaluate.py --cpu --num-batch 10 --batch-size 224 --deploy --prefix=./model/ssd_

# Launch INT8 Inference on VOC dataset
python evaluate.py --cpu --num-batch 10 --batch-size 224 --deploy --prefix=./model/cqssd_

# Launch FP32 Inference on COCO dataset
python evaluate.py --cpu --num-batch 10 --batch-size 224 --deploy --prefix=./model/ssd_ --num-class=80 --class-names=./dataset/names/mscoco.names

# Launch INT8 Inference on COCO dataset
python evaluate.py --cpu --num-batch 10 --batch-size 224 --deploy --prefix=./model/cqssd_ --num-class=80 --class-names=./dataset/names/mscoco.names

# Launch dummy data Inference
python benchmark_score.py --deploy --prefix=./model/ssd_
python benchmark_score.py --deploy --prefix=./model/cqssd_
```
### Convert model to deploy mode
This simply removes all loss layers and attaches a layer for merging results and non-maximum suppression.
Useful when loading the Python symbol is not an option.
```
# cd /path/to/incubator-mxnet/example/ssd
python deploy.py --num-class 20
```

### Convert caffe model
A converter from Caffe is available at `/path/to/incubator-mxnet/example/ssd/tools/caffe_converter`.

It is specifically modified to handle the custom layers in caffe-ssd. Usage:
```
cd /path/to/incubator-mxnet/example/ssd/tools/caffe_converter
make
python convert_model.py deploy.prototxt name_of_pretrained_caffe_model.caffemodel ssd_converted
# use this model in deploy mode without loading from the python symbol (layer names are inconsistent)
python demo.py --prefix ssd_converted --epoch 1 --deploy
```
There is no guarantee that conversion will always work, but it covers the common cases.

### Legacy models
Since the new interface for composing networks was introduced, the old models have inconsistent weight names.
You can still load a previous model by renaming its symbol file to `legacy_xxx.py`
and calling `python train/demo.py --network legacy_xxx`.
For example:
```
python demo.py --network 'legacy_vgg16_ssd_300.py' --prefix model/ssd_300 --epoch 0
```