Summary

keras-yolo3-master/.gitattributes (vendored, executable file)
@@ -0,0 +1 @@
*.h5 filter=lfs diff=lfs merge=lfs -text
keras-yolo3-master/.gitignore (vendored, executable file)
@@ -0,0 +1,4 @@
*.jpg
*.jpeg
*.weights
*.h5
keras-yolo3-master/LICENSE (executable file)
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2017 Ngoc Anh Huynh

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
keras-yolo3-master/Link_Git (normal file)
@@ -0,0 +1 @@
https://github.com/experiencor/keras-yolo3
keras-yolo3-master/README.md (executable file)
@@ -0,0 +1,113 @@
# YOLO3 (Detection, Training, and Evaluation)

## Dataset and Model

Dataset | mAP | Demo | Config | Model
:---:|:---:|:---:|:---:|:---:
Kangaroo Detection (1 class) (https://github.com/experiencor/kangaroo) | 95% | https://youtu.be/URO3UDHvoLY | check zoo | http://bit.do/ekQFj
Raccoon Detection (1 class) (https://github.com/experiencor/raccoon_dataset) | 98% | https://youtu.be/lxLyLIL7OsU | check zoo | http://bit.do/ekQFf
Red Blood Cell Detection (3 classes) (https://github.com/experiencor/BCCD_Dataset) | 84% | https://imgur.com/a/uJl2lRI | check zoo | http://bit.do/ekQFc
VOC (20 classes) (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/) | 72% | https://youtu.be/0RmOI6hcfBI | check zoo | http://bit.do/ekQE5
## Todo list:
- [x] Yolo3 detection
- [x] Yolo3 training (warmup and multi-scale)
- [x] mAP Evaluation
- [x] Multi-GPU training
- [x] Evaluation on VOC
- [ ] Evaluation on COCO
- [ ] MobileNet, DenseNet, ResNet, and VGG backends

## Detection

Grab the pretrained weights of yolo3 from https://pjreddie.com/media/files/yolov3.weights.

```python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg```
## Training

### 1. Data preparation

Download the Raccoon dataset from https://github.com/experiencor/raccoon_dataset.

Organize the dataset into 4 folders:

+ train_image_folder <= the folder that contains the train images.

+ train_annot_folder <= the folder that contains the train annotations in VOC format.

+ valid_image_folder <= the folder that contains the validation images.

+ valid_annot_folder <= the folder that contains the validation annotations in VOC format.

There is a one-to-one correspondence by file name between images and annotations. If the validation set is empty, the training set will be automatically split into a training set and a validation set using a ratio of 0.8, as sketched below.
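A minimal sketch of this convention, assuming the folder names above and that each annotation shares its image's base name (the helper is illustrative, not part of the repository):

```python
import os
import random

def pair_and_split(image_dir, annot_dir, train_ratio=0.8, seed=0):
    """Pair images with VOC annotations by base file name, then split 80/20."""
    annots = {os.path.splitext(f)[0] for f in os.listdir(annot_dir)}
    paired = [f for f in os.listdir(image_dir)
              if os.path.splitext(f)[0] in annots]   # keep only images with an annotation
    random.Random(seed).shuffle(paired)
    cut = int(train_ratio * len(paired))
    return paired[:cut], paired[cut:]                # train files, validation files

train_files, valid_files = pair_and_split('train_image_folder', 'train_annot_folder')
```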
### 2. Edit the configuration file

The configuration file is a JSON file, which looks like this:

```python
{
    "model" : {
        "min_input_size":       352,
        "max_input_size":       448,
        "anchors":              [10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326],
        "labels":               ["raccoon"]
    },

    "train": {
        "train_image_folder":   "/home/andy/data/raccoon_dataset/images/",
        "train_annot_folder":   "/home/andy/data/raccoon_dataset/anns/",

        "train_times":          10,            # the number of times to cycle through the training set, useful for small datasets
        "pretrained_weights":   "",            # the path of the pretrained weights, but it's fine to start from scratch
        "batch_size":           16,            # the number of images to read in each batch
        "learning_rate":        1e-4,          # the base learning rate of the default Adam rate scheduler
        "nb_epochs":            50,            # number of epochs
        "warmup_epochs":        3,             # the number of initial epochs during which the sizes of the boxes in each cell are forced to match the sizes of the anchors; this trick seems to improve precision empirically
        "ignore_thresh":        0.5,
        "gpus":                 "0,1",

        "saved_weights_name":   "raccoon.h5",
        "debug":                true           # turn on/off the line that prints current confidence, position, size, class losses, and recall
    },

    "valid": {
        "valid_image_folder":   "",
        "valid_annot_folder":   "",

        "valid_times":          1
    }
}
```
The ```labels``` setting lists the labels to be trained on. Only images that contain at least one of the listed labels are fed to the network; the rest are simply ignored. This way, a dog detector can easily be trained on the VOC or COCO dataset by setting ```labels``` to ```['dog']```.

Download the pretrained weights for the backend at:

https://1drv.ms/u/s!ApLdDEW3ut5fgQXa7GzSlG-mdza6

**These weights must be put in the root folder of the repository. They are the pretrained weights for the backend only and will be loaded during model creation. The code does not work without these weights.**
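All of the scripts below read this file with json.loads, so the actual config.json must be valid JSON: the # comments shown above are documentation only and have to be stripped. Note also that train.py expects a few keys beyond this example (cache_name, backend, grid_scales, obj_scale, noobj_scale, xywh_scale, class_scale, tensorboard_dir); see the config_full_yolo*.json files later in this commit. A minimal sketch of loading it (file name config.json assumed):

```python
import json

with open('config.json') as config_buffer:
    config = json.loads(config_buffer.read())

# the settings used throughout train.py, evaluate.py, and predict.py
labels  = config['model']['labels']
anchors = config['model']['anchors']
print(len(anchors) // 2, 'anchors for', len(labels), 'labels:', labels)
```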
### 3. Generate anchors for your dataset (optional)

`python gen_anchors.py -c config.json`

Copy the generated anchors printed on the terminal to the ```anchors``` setting in ```config.json```.

### 4. Start the training process

`python train.py -c config.json`

By the end of this process, the code will write the weights of the best model to the file best_weights.h5 (or whatever name is specified in the "saved_weights_name" setting in the config.json file). The training process stops when the loss on the validation set does not improve for 3 consecutive epochs.
### 5. Perform detection using trained weights on an image, a set of images, a video, or the webcam

`python predict.py -c config.json -i /path/to/image/or/video`

It carries out detection on the image and writes the image with the detected bounding boxes to the same folder.

## Evaluation

`python evaluate.py -c config.json`

Computes the mAP performance of the model defined in `saved_weights_name` on the validation dataset defined in `valid_image_folder` and `valid_annot_folder`.
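The script reports two aggregates: a plain mean of the per-class APs and an instance-weighted mean (the exact arithmetic appears in evaluate.py below). A quick numeric sketch with two hypothetical classes:

```python
# hypothetical per-class results: (average precision, number of annotations)
average_precisions = {'raccoon': (0.98, 200), 'dog': (0.72, 50)}

aps       = [ap for ap, _ in average_precisions.values()]
instances = [n for _, n in average_precisions.values()]

weighted_map = sum(ap * n for ap, n in zip(aps, instances)) / sum(instances)
plain_map    = sum(aps) / sum(n > 0 for n in instances)

print('weighted mAP: {:.4f}'.format(weighted_map))  # 0.9280
print('mAP:          {:.4f}'.format(plain_map))     # 0.8500
```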
BIN keras-yolo3-master/__pycache__/callbacks.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/__pycache__/generator.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/__pycache__/voc.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/__pycache__/voc.cpython-37.pyc (normal file, binary file not shown)
BIN keras-yolo3-master/__pycache__/yolo.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/__pycache__/yolo.cpython-37.pyc (normal file, binary file not shown)
keras-yolo3-master/callbacks.py (executable file)
@@ -0,0 +1,70 @@
import warnings  # required by the warnings.warn call below; missing in the original

from keras.callbacks import TensorBoard, ModelCheckpoint
import tensorflow as tf
import numpy as np

class CustomTensorBoard(TensorBoard):
    """ to log the loss after each batch
    """
    def __init__(self, log_every=1, **kwargs):
        super(CustomTensorBoard, self).__init__(**kwargs)
        self.log_every = log_every
        self.counter = 0

    def on_batch_end(self, batch, logs=None):
        self.counter += 1
        if self.counter % self.log_every == 0:
            for name, value in logs.items():
                if name in ['batch', 'size']:
                    continue
                summary = tf.Summary()
                summary_value = summary.value.add()
                summary_value.simple_value = value.item()
                summary_value.tag = name
                self.writer.add_summary(summary, self.counter)
            self.writer.flush()

        super(CustomTensorBoard, self).on_batch_end(batch, logs)

class CustomModelCheckpoint(ModelCheckpoint):
    """ to save the template model, not the multi-GPU model
    """
    def __init__(self, model_to_save, **kwargs):
        super(CustomModelCheckpoint, self).__init__(**kwargs)
        self.model_to_save = model_to_save

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epochs_since_last_save += 1
        if self.epochs_since_last_save >= self.period:
            self.epochs_since_last_save = 0
            filepath = self.filepath.format(epoch=epoch + 1, **logs)
            if self.save_best_only:
                current = logs.get(self.monitor)
                if current is None:
                    warnings.warn('Can save best model only with %s available, '
                                  'skipping.' % (self.monitor), RuntimeWarning)
                else:
                    if self.monitor_op(current, self.best):
                        if self.verbose > 0:
                            print('\nEpoch %05d: %s improved from %0.5f to %0.5f,'
                                  ' saving model to %s'
                                  % (epoch + 1, self.monitor, self.best,
                                     current, filepath))
                        self.best = current
                        if self.save_weights_only:
                            self.model_to_save.save_weights(filepath, overwrite=True)
                        else:
                            self.model_to_save.save(filepath, overwrite=True)
                    else:
                        if self.verbose > 0:
                            print('\nEpoch %05d: %s did not improve from %0.5f' %
                                  (epoch + 1, self.monitor, self.best))
            else:
                if self.verbose > 0:
                    print('\nEpoch %05d: saving model to %s' % (epoch + 1, filepath))
                if self.save_weights_only:
                    self.model_to_save.save_weights(filepath, overwrite=True)
                else:
                    self.model_to_save.save(filepath, overwrite=True)

        # calling the base class's on_batch_end (a no-op) here avoids
        # ModelCheckpoint.on_epoch_end saving the multi-GPU model a second time
        super(CustomModelCheckpoint, self).on_batch_end(epoch, logs)
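A brief usage sketch, assuming a compiled template model and a training generator like the ones train.py builds below (names and paths here are hypothetical; create_callbacks in train.py wires these up the same way):

```python
from callbacks import CustomTensorBoard, CustomModelCheckpoint

tensorboard = CustomTensorBoard(log_dir='logs/', write_graph=True, write_images=True)
checkpoint  = CustomModelCheckpoint(
    model_to_save  = template_model,   # the single-GPU template, not the multi-GPU wrapper
    filepath       = 'best_weights.h5',
    monitor        = 'loss',
    save_best_only = True,
    mode           = 'min',
    period         = 1)

train_model.fit_generator(generator=train_generator, epochs=50,
                          callbacks=[checkpoint, tensorboard])
```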
BIN keras-yolo3-master/callbacks.pyc (executable file, binary file not shown)
keras-yolo3-master/config_full_yolo.json (executable file)
@@ -0,0 +1,49 @@
{
    "model" : {
        "min_input_size":       448,
        "max_input_size":       448,
        "anchors":              [26,32, 45,119, 54,18, 94,59, 109,183, 200,21, 203,91, 210,253, 249,157],
        "labels":               ["Gun", "Knife", "Razor", "Shuriken"],
        "backend":              "full_yolo_backend.h5"
    },

    "train": {
        "train_image_folder":   "../Experimento_6/Training/images/",
        "train_annot_folder":   "../Experimento_6/Training/anns/",
        "cache_name":           "experimento_6_gpu.pkl",

        "train_times":          1,

        "batch_size":           2,
        "learning_rate":        1e-4,
        "nb_epochs":            100,
        "warmup_epochs":        10,
        "ignore_thresh":        0.5,
        "gpus":                 "0,1",

        "grid_scales":          [1,1,1],
        "obj_scale":            5,
        "noobj_scale":          1,
        "xywh_scale":           1,
        "class_scale":          1,

        "tensorboard_dir":      "log_experimento_3_gpu",
        "saved_weights_name":   "../Experimento_5/Resultados_yolo3/full_yolo/experimento_5_yolo3_full_yolo.h5",
        "debug":                true
    },

    "valid": {
        "valid_image_folder":   "../Experimento_6/Training/images/",
        "valid_annot_folder":   "../Experimento_6/Training/anns/",
        "cache_name":           "val_6.pkl",

        "valid_times":          1
    },
    "test": {
        "test_image_folder":    "../Experimento_3/Baggages/Testing_678/images/",
        "test_annot_folder":    "../Experimento_3/Baggages/Testing_678/anns/",
        "cache_name":           "experimento_3_test678.pkl",

        "test_times":           1
    }
}
keras-yolo3-master/config_full_yolo_fault.json (executable file)
@@ -0,0 +1,49 @@
{
    "model" : {
        "min_input_size":       448,
        "max_input_size":       448,
        "anchors":              [5,7, 10,14, 26,32, 45,119, 54,18, 94,59, 109,183, 200,21, 203,91],
        "labels":               ["1", "2", "3", "4", "5", "6", "7", "8"],
        "backend":              "full_yolo_backend.h5"
    },

    "train": {
        "train_image_folder":   "../model-definition/Train&Test_B/images/",
        "train_annot_folder":   "../model-definition/Train&Test_B/anns/",
        "cache_name":           "experimento_fault_gpu.pkl",

        "train_times":          1,

        "batch_size":           2,
        "learning_rate":        1e-4,
        "nb_epochs":            100,
        "warmup_epochs":        10,
        "ignore_thresh":        0.5,
        "gpus":                 "0,1",

        "grid_scales":          [1,1,1],
        "obj_scale":            5,
        "noobj_scale":          1,
        "xywh_scale":           1,
        "class_scale":          1,

        "tensorboard_dir":      "log_experimento_fault_gpu",
        "saved_weights_name":   "../model-definition/experimento_yolo3_full_fault.h5",
        "debug":                true
    },

    "valid": {
        "valid_image_folder":   "../model-definition/Train&Test_B/images/",
        "valid_annot_folder":   "../model-definition/Train&Test_B/anns/",
        "cache_name":           "val_fault.pkl",

        "valid_times":          1
    },
    "test": {
        "test_image_folder":    "../model-definition/Train&Test_B/images/",
        "test_annot_folder":    "../model-definition/Train&Test_B/anns/",
        "cache_name":           "test_fault.pkl",

        "test_times":           1
    }
}
keras-yolo3-master/config_full_yolo_fault_1.json (executable file)
@@ -0,0 +1,49 @@
{
    "model" : {
        "min_input_size":       400,
        "max_input_size":       400,
        "anchors":              [5,7, 10,14, 15,15, 26,32, 45,119, 54,18, 94,59, 109,183, 200,21],
        "labels":               ["1"],
        "backend":              "full_yolo_backend.h5"
    },

    "train": {
        "train_image_folder":   "../Train&Test_S/Train/images/",
        "train_annot_folder":   "../Train&Test_S/Train/anns/",
        "cache_name":           "../Experimento_fault_1/Resultados_yolo3/full_yolo/experimento_fault_1_gpu.pkl",

        "train_times":          1,

        "batch_size":           2,
        "learning_rate":        1e-4,
        "nb_epochs":            200,
        "warmup_epochs":        15,
        "ignore_thresh":        0.5,
        "gpus":                 "0,1",

        "grid_scales":          [1,1,1],
        "obj_scale":            5,
        "noobj_scale":          1,
        "xywh_scale":           1,
        "class_scale":          1,

        "tensorboard_dir":      "log_experimento_fault_gpu",
        "saved_weights_name":   "../Experimento_fault_1/Resultados_yolo3/full_yolo/experimento_yolo3_full_fault.h5",
        "debug":                true
    },

    "valid": {
        "valid_image_folder":   "../Train&Test_S/Test/images/",
        "valid_annot_folder":   "../Train&Test_S/Test/anns/",
        "cache_name":           "../Experimento_fault_1/Resultados_yolo3/full_yolo/val_fault_1.pkl",

        "valid_times":          1
    },
    "test": {
        "test_image_folder":    "../Train&Test_S/Test/images/",
        "test_annot_folder":    "../Train&Test_S/Test/anns/",
        "cache_name":           "../Experimento_fault_1/Resultados_yolo3/full_yolo/test_fault_1.pkl",

        "test_times":           1
    }
}
keras-yolo3-master/config_full_yolo_panel.json (executable file)
@@ -0,0 +1,49 @@
{
    "model" : {
        "min_input_size":       400,
        "max_input_size":       400,
        "anchors":              [5,7, 10,14, 15,15, 26,32, 45,119, 54,18, 94,59, 109,183, 200,21],
        "labels":               ["panel"],
        "backend":              "full_yolo_backend.h5"
    },

    "train": {
        "train_image_folder":   "../Train&Test_A/Train/images/",
        "train_annot_folder":   "../Train&Test_A/Train/anns/",
        "cache_name":           "../Resultados_yolo3_panel/experimento_fault_1_gpu.pkl",

        "train_times":          1,

        "batch_size":           2,
        "learning_rate":        1e-4,
        "nb_epochs":            200,
        "warmup_epochs":        15,
        "ignore_thresh":        0.5,
        "gpus":                 "0,1",

        "grid_scales":          [1,1,1],
        "obj_scale":            5,
        "noobj_scale":          1,
        "xywh_scale":           1,
        "class_scale":          1,

        "tensorboard_dir":      "log_experimento_fault_gpu",
        "saved_weights_name":   "../Resultados_yolo3_panel/experimento_yolo3_full_panel.h5",
        "debug":                true
    },

    "valid": {
        "valid_image_folder":   "../Train&Test_A/Test/images/",
        "valid_annot_folder":   "../Train&Test_A/Test/anns/",
        "cache_name":           "../Resultados_yolo3_panel/val_fault_1.pkl",

        "valid_times":          1
    },
    "test": {
        "test_image_folder":    "../Train&Test_A/Test/images/",
        "test_annot_folder":    "../Train&Test_A/Test/anns/",
        "cache_name":           "../Resultados_yolo3_panel/test_fault_1.pkl",

        "test_times":           1
    }
}
keras-yolo3-master/evaluate.py (normal file)
@@ -0,0 +1,80 @@
#! /usr/bin/env python

import argparse
import os
import numpy as np
import json
from voc import parse_voc_annotation
from yolo import create_yolov3_model
from generator import BatchGenerator
from utils.utils import normalize, evaluate
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.optimizers import Adam
from keras.models import load_model

def _main_(args):
    config_path = args.conf

    with open(config_path) as config_buffer:
        config = json.loads(config_buffer.read())

    ###############################
    #   Create the validation generator
    ###############################
    valid_ints, labels = parse_voc_annotation(
        config['test']['test_annot_folder'],
        config['test']['test_image_folder'],
        config['test']['cache_name'],
        config['model']['labels']
    )

    labels = labels.keys() if len(config['model']['labels']) == 0 else config['model']['labels']
    labels = sorted(labels)

    valid_generator = BatchGenerator(
        instances         = valid_ints,
        anchors           = config['model']['anchors'],
        labels            = labels,
        downsample        = 32,  # ratio between network input's size and network output's size, 32 for YOLOv3
        max_box_per_image = 0,
        batch_size        = config['train']['batch_size'],
        min_net_size      = config['model']['min_input_size'],
        max_net_size      = config['model']['max_input_size'],
        shuffle           = True,
        jitter            = 0.0,
        norm              = normalize
    )

    ###############################
    #   Load the model and do evaluation
    ###############################
    os.environ['CUDA_VISIBLE_DEVICES'] = config['train']['gpus']

    infer_model = load_model(config['train']['saved_weights_name'])

    # compute mAP for all the classes
    average_precisions = evaluate(infer_model, valid_generator)

    # print the score
    total_instances = []
    precisions = []
    print(average_precisions.items())
    for label, (average_precision, num_annotations) in average_precisions.items():
        print('{:.0f} instances of class'.format(num_annotations),
              labels[label], 'with average precision: {:.4f}'.format(average_precision))
        total_instances.append(num_annotations)
        precisions.append(average_precision)

    if sum(total_instances) == 0:
        print('No test instances found.')
        return

    print('mAP using the weighted average of precisions among classes: {:.4f}'.format(sum([a * b for a, b in zip(total_instances, precisions)]) / sum(total_instances)))
    print('mAP: {:.4f}'.format(sum(precisions) / sum(x > 0 for x in total_instances)))

if __name__ == '__main__':
    argparser = argparse.ArgumentParser(description='Evaluate YOLO_v3 model on any dataset')
    argparser.add_argument('-c', '--conf', help='path to configuration file')

    args = argparser.parse_args()
    _main_(args)
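Note that evaluate.py reads its dataset from the `test` block of the config rather than the `valid` block, so the README example above is not sufficient for it. A minimal block with placeholder paths, matching the keys used in the code and in the experiment configs earlier in this commit:

```json
{
    "test": {
        "test_image_folder": "../dataset/test/images/",
        "test_annot_folder": "../dataset/test/anns/",
        "cache_name":        "test.pkl",
        "test_times":        1
    }
}
```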
keras-yolo3-master/gen_anchors.py (executable file)
@@ -0,0 +1,132 @@
import random
import argparse
import numpy as np

from voc import parse_voc_annotation
import json

def IOU(ann, centroids):
    w, h = ann
    similarities = []

    for centroid in centroids:
        c_w, c_h = centroid

        if c_w >= w and c_h >= h:
            similarity = w*h/(c_w*c_h)
        elif c_w >= w and c_h <= h:
            similarity = w*c_h/(w*h + (c_w-w)*c_h)
        elif c_w <= w and c_h >= h:
            similarity = c_w*h/(w*h + c_w*(c_h-h))
        else:  # means both w,h are bigger than c_w and c_h respectively
            similarity = (c_w*c_h)/(w*h)
        similarities.append(similarity)  # will become (k,) shape

    return np.array(similarities)

def avg_IOU(anns, centroids):
    n, d = anns.shape
    sum = 0.

    for i in range(anns.shape[0]):
        sum += max(IOU(anns[i], centroids))

    return sum/n

def print_anchors(centroids):
    out_string = ''

    anchors = centroids.copy()

    widths = anchors[:, 0]
    sorted_indices = np.argsort(widths)

    for i in sorted_indices:
        out_string += str(int(anchors[i,0]*416)) + ',' + str(int(anchors[i,1]*416)) + ', '

    print(out_string[:-2])

def run_kmeans(ann_dims, anchor_num):
    ann_num = ann_dims.shape[0]
    prev_assignments = np.ones(ann_num)*(-1)
    iteration = 0
    old_distances = np.zeros((ann_num, anchor_num))

    indices = [random.randrange(ann_dims.shape[0]) for i in range(anchor_num)]
    centroids = ann_dims[indices]
    anchor_dim = ann_dims.shape[1]

    while True:
        distances = []
        iteration += 1
        for i in range(ann_num):
            d = 1 - IOU(ann_dims[i], centroids)
            distances.append(d)
        distances = np.array(distances)  # distances.shape = (ann_num, anchor_num)

        print("iteration {}: dists = {}".format(iteration, np.sum(np.abs(old_distances-distances))))

        # assign samples to centroids
        assignments = np.argmin(distances, axis=1)

        if (assignments == prev_assignments).all():
            return centroids

        # calculate new centroids
        centroid_sums = np.zeros((anchor_num, anchor_dim), float)  # np.float is deprecated
        for i in range(ann_num):
            centroid_sums[assignments[i]] += ann_dims[i]
        for j in range(anchor_num):
            centroids[j] = centroid_sums[j]/(np.sum(assignments==j) + 1e-6)

        prev_assignments = assignments.copy()
        old_distances = distances.copy()

def _main_(args):  # was _main_(argv), which shadowed the args it actually uses
    config_path = args.conf
    num_anchors = args.anchors

    with open(config_path) as config_buffer:
        config = json.loads(config_buffer.read())

    train_imgs, train_labels = parse_voc_annotation(
        config['train']['train_annot_folder'],
        config['train']['train_image_folder'],
        config['train']['cache_name'],
        config['model']['labels']
    )

    # run k-means to find the anchors
    annotation_dims = []
    for image in train_imgs:
        print(image['filename'])
        for obj in image['object']:
            relative_w = (float(obj['xmax']) - float(obj['xmin']))/image['width']
            relative_h = (float(obj['ymax']) - float(obj['ymin']))/image['height']
            annotation_dims.append(tuple(map(float, (relative_w, relative_h))))

    annotation_dims = np.array(annotation_dims)
    centroids = run_kmeans(annotation_dims, num_anchors)

    # write anchors to file
    print('\naverage IOU for', num_anchors, 'anchors:', '%0.2f' % avg_IOU(annotation_dims, centroids))
    print_anchors(centroids)

if __name__ == '__main__':
    argparser = argparse.ArgumentParser()

    argparser.add_argument(
        '-c',
        '--conf',
        default='config.json',
        help='path to configuration file')
    argparser.add_argument(
        '-a',
        '--anchors',
        type=int,  # without this, a command-line value would arrive as a string
        default=9,
        help='number of anchors to use')

    args = argparser.parse_args()
    _main_(args)
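The clustering distance here is 1 - IOU between a box and a centroid, with both treated as width/height pairs anchored at the origin, so boxes cluster by shape regardless of position. A quick numeric check, assuming the IOU function above:

```python
import numpy as np
from gen_anchors import IOU  # safe to import; the script is guarded by __name__ == '__main__'

# a 0.10 x 0.20 box against two centroids: one similarly shaped, one square
print(IOU((0.10, 0.20), np.array([[0.10, 0.25], [0.20, 0.20]])))
# [0.8 0.5]: area 0.02/0.025 = 0.8 for the similar box, 0.02/0.04 = 0.5 for the square
```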
keras-yolo3-master/generator.py (executable file)
@@ -0,0 +1,228 @@
import cv2
import copy
import numpy as np
from keras.utils import Sequence
from utils.bbox import BoundBox, bbox_iou
from utils.image import apply_random_scale_and_crop, random_distort_image, random_flip, correct_bounding_boxes

class BatchGenerator(Sequence):
    def __init__(self,
        instances,
        anchors,
        labels,
        downsample=32,  # ratio between network input's size and network output's size, 32 for YOLOv3
        max_box_per_image=30,
        batch_size=1,
        min_net_size=320,
        max_net_size=608,
        shuffle=True,
        jitter=True,
        norm=None
    ):
        self.instances = instances
        self.batch_size = batch_size
        self.labels = labels
        self.downsample = downsample
        self.max_box_per_image = max_box_per_image
        self.min_net_size = (min_net_size//self.downsample)*self.downsample
        self.max_net_size = (max_net_size//self.downsample)*self.downsample
        self.shuffle = shuffle
        self.jitter = jitter
        self.norm = norm
        self.anchors = [BoundBox(0, 0, anchors[2*i], anchors[2*i+1]) for i in range(len(anchors)//2)]
        self.net_h = 416
        self.net_w = 416

        if shuffle: np.random.shuffle(self.instances)

    def __len__(self):
        return int(np.ceil(float(len(self.instances))/self.batch_size))

    def __getitem__(self, idx):
        # get image input size, change every 10 batches
        net_h, net_w = self._get_net_size(idx)
        base_grid_h, base_grid_w = net_h//self.downsample, net_w//self.downsample

        # determine the first and the last indices of the batch
        l_bound = idx*self.batch_size
        r_bound = (idx+1)*self.batch_size

        if r_bound > len(self.instances):
            r_bound = len(self.instances)
            l_bound = r_bound - self.batch_size

        x_batch = np.zeros((r_bound - l_bound, net_h, net_w, 3))                       # input images
        t_batch = np.zeros((r_bound - l_bound, 1, 1, 1, self.max_box_per_image, 4))    # list of groundtruth boxes

        # initialize the inputs and the outputs
        yolo_1 = np.zeros((r_bound - l_bound, 1*base_grid_h, 1*base_grid_w, len(self.anchors)//3, 4+1+len(self.labels)))  # desired network output 1
        yolo_2 = np.zeros((r_bound - l_bound, 2*base_grid_h, 2*base_grid_w, len(self.anchors)//3, 4+1+len(self.labels)))  # desired network output 2
        yolo_3 = np.zeros((r_bound - l_bound, 4*base_grid_h, 4*base_grid_w, len(self.anchors)//3, 4+1+len(self.labels)))  # desired network output 3
        yolos = [yolo_3, yolo_2, yolo_1]

        dummy_yolo_1 = np.zeros((r_bound - l_bound, 1))
        dummy_yolo_2 = np.zeros((r_bound - l_bound, 1))
        dummy_yolo_3 = np.zeros((r_bound - l_bound, 1))

        instance_count = 0
        true_box_index = 0

        # do the logic to fill in the inputs and the outputs
        for train_instance in self.instances[l_bound:r_bound]:
            # augment input image and fix object's position and size
            img, all_objs = self._aug_image(train_instance, net_h, net_w)

            for obj in all_objs:
                # find the best anchor box for this object
                max_anchor = None
                max_index = -1
                max_iou = -1

                shifted_box = BoundBox(0,
                                       0,
                                       obj['xmax']-obj['xmin'],
                                       obj['ymax']-obj['ymin'])

                for i in range(len(self.anchors)):
                    anchor = self.anchors[i]
                    iou = bbox_iou(shifted_box, anchor)

                    if max_iou < iou:
                        max_anchor = anchor
                        max_index = i
                        max_iou = iou

                # determine the yolo to be responsible for this bounding box
                yolo = yolos[max_index//3]
                grid_h, grid_w = yolo.shape[1:3]

                # determine the position of the bounding box on the grid
                center_x = .5*(obj['xmin'] + obj['xmax'])
                center_x = center_x / float(net_w) * grid_w  # sigma(t_x) + c_x
                center_y = .5*(obj['ymin'] + obj['ymax'])
                center_y = center_y / float(net_h) * grid_h  # sigma(t_y) + c_y

                # determine the sizes of the bounding box
                w = np.log((obj['xmax'] - obj['xmin']) / float(max_anchor.xmax))  # t_w
                h = np.log((obj['ymax'] - obj['ymin']) / float(max_anchor.ymax))  # t_h

                box = [center_x, center_y, w, h]

                # determine the index of the label
                obj_indx = self.labels.index(obj['name'])

                # determine the location of the cell responsible for this object
                grid_x = int(np.floor(center_x))
                grid_y = int(np.floor(center_y))

                # assign ground truth x, y, w, h, confidence and class probs to y_batch
                yolo[instance_count, grid_y, grid_x, max_index%3] = 0
                yolo[instance_count, grid_y, grid_x, max_index%3, 0:4] = box
                yolo[instance_count, grid_y, grid_x, max_index%3, 4] = 1.
                yolo[instance_count, grid_y, grid_x, max_index%3, 5+obj_indx] = 1

                # assign the true box to t_batch
                true_box = [center_x, center_y, obj['xmax'] - obj['xmin'], obj['ymax'] - obj['ymin']]
                t_batch[instance_count, 0, 0, 0, true_box_index] = true_box

                true_box_index += 1
                true_box_index = true_box_index % self.max_box_per_image

            # assign input image to x_batch
            if self.norm is not None:
                x_batch[instance_count] = self.norm(img)
            else:
                # plot image and bounding boxes for sanity check
                for obj in all_objs:
                    cv2.rectangle(img, (obj['xmin'],obj['ymin']), (obj['xmax'],obj['ymax']), (255,0,0), 3)
                    cv2.putText(img, obj['name'],
                                (obj['xmin']+2, obj['ymin']+12),
                                0, 1.2e-3 * img.shape[0],
                                (0,255,0), 2)

                x_batch[instance_count] = img

            # increase instance counter in the current batch
            instance_count += 1

        return [x_batch, t_batch, yolo_1, yolo_2, yolo_3], [dummy_yolo_1, dummy_yolo_2, dummy_yolo_3]

    def _get_net_size(self, idx):
        if idx % 10 == 0:
            net_size = self.downsample*np.random.randint(self.min_net_size//self.downsample,
                                                         self.max_net_size//self.downsample + 1)
            self.net_h, self.net_w = net_size, net_size
        return self.net_h, self.net_w

    def _aug_image(self, instance, net_h, net_w):
        image_name = instance['filename']
        image = cv2.imread(image_name)  # BGR image

        if image is None: print('Cannot find ', image_name)
        image = image[:,:,::-1]  # convert BGR to RGB

        image_h, image_w, _ = image.shape

        # determine the amount of scaling and cropping
        dw = self.jitter * image_w
        dh = self.jitter * image_h

        new_ar = (image_w + np.random.uniform(-dw, dw)) / (image_h + np.random.uniform(-dh, dh))
        scale = np.random.uniform(0.25, 2)

        if new_ar < 1:
            new_h = int(scale * net_h)
            new_w = int(net_h * new_ar)
        else:
            new_w = int(scale * net_w)
            new_h = int(net_w / new_ar)

        dx = int(np.random.uniform(0, net_w - new_w))
        dy = int(np.random.uniform(0, net_h - new_h))

        # apply scaling and cropping
        im_sized = apply_random_scale_and_crop(image, new_w, new_h, net_w, net_h, dx, dy)

        # randomly distort hsv space
        im_sized = random_distort_image(im_sized)

        # randomly flip
        flip = np.random.randint(2)
        im_sized = random_flip(im_sized, flip)

        # correct the size and pos of bounding boxes
        all_objs = correct_bounding_boxes(instance['object'], new_w, new_h, net_w, net_h, dx, dy, flip, image_w, image_h)

        return im_sized, all_objs

    def on_epoch_end(self):
        if self.shuffle: np.random.shuffle(self.instances)

    def num_classes(self):
        return len(self.labels)

    def size(self):
        return len(self.instances)

    def get_anchors(self):
        anchors = []

        for anchor in self.anchors:
            anchors += [anchor.xmax, anchor.ymax]

        return anchors

    def load_annotation(self, i):
        annots = []

        for obj in self.instances[i]['object']:
            annot = [obj['xmin'], obj['ymin'], obj['xmax'], obj['ymax'], self.labels.index(obj['name'])]
            annots += [annot]

        if len(annots) == 0: annots = [[]]

        return np.array(annots)

    def load_image(self, i):
        return cv2.imread(self.instances[i]['filename'])
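A minimal sketch of driving the generator by hand, assuming annotations parsed with parse_voc_annotation and the anchors from the README config (all paths hypothetical):

```python
from voc import parse_voc_annotation
from generator import BatchGenerator
from utils.utils import normalize

train_ints, seen_labels = parse_voc_annotation('anns/', 'images/', 'cache.pkl', ['raccoon'])

gen = BatchGenerator(
    instances  = train_ints,
    anchors    = [10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326],
    labels     = ['raccoon'],
    batch_size = 4,
    norm       = normalize)

(x_batch, t_batch, y1, y2, y3), dummies = gen[0]  # one training batch
print(x_batch.shape)  # (4, net_h, net_w, 3); the net size is re-drawn every 10 batches
```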
BIN keras-yolo3-master/generator.pyc (executable file, binary file not shown)
BIN (seven additional binary files, names not shown)
BIN keras-yolo3-master/output/prueba.mp4 (normal file, binary file not shown)
BIN keras-yolo3-master/output/prueba_0.mp4 (normal file, binary file not shown)
keras-yolo3-master/predict.py (executable file)
@@ -0,0 +1,144 @@
#! /usr/bin/env python

import time
import os
import argparse
import json
import cv2
from utils.utils import get_yolo_boxes, makedirs
from utils.bbox import draw_boxes
from keras.models import load_model
from tqdm import tqdm
import numpy as np

def _main_(args):
    config_path = args.conf
    input_path = args.input
    output_path = args.output

    with open(config_path) as config_buffer:
        config = json.load(config_buffer)

    makedirs(output_path)

    ###############################
    #   Set some parameters
    ###############################
    net_h, net_w = 416, 416  # a multiple of 32; the smaller, the faster
    obj_thresh, nms_thresh = 0.5, 0.45

    ###############################
    #   Load the model
    ###############################
    os.environ['CUDA_VISIBLE_DEVICES'] = config['train']['gpus']
    infer_model = load_model(config['train']['saved_weights_name'])

    ###############################
    #   Predict bounding boxes
    ###############################
    if 'webcam' in input_path:  # do detection on the first webcam
        video_reader = cv2.VideoCapture(0)

        # the main loop
        batch_size = 1
        images = []
        while True:
            ret_val, image = video_reader.read()
            if ret_val: images += [image]

            if (len(images) == batch_size) or (ret_val == False and len(images) > 0):
                batch_boxes = get_yolo_boxes(infer_model, images, net_h, net_w, config['model']['anchors'], obj_thresh, nms_thresh)

                for i in range(len(images)):
                    draw_boxes(images[i], batch_boxes[i], config['model']['labels'], obj_thresh)
                    cv2.imshow('video with bboxes', images[i])
                images = []
            if cv2.waitKey(1) == 27:
                break  # esc to quit
        cv2.destroyAllWindows()
    elif input_path[-4:] == '.mp4':  # do detection on a video
        video_out = output_path + input_path.split('/')[-1]
        video_reader = cv2.VideoCapture(input_path)

        nb_frames = int(video_reader.get(cv2.CAP_PROP_FRAME_COUNT))
        frame_h = int(video_reader.get(cv2.CAP_PROP_FRAME_HEIGHT))
        frame_w = int(video_reader.get(cv2.CAP_PROP_FRAME_WIDTH))

        video_writer = cv2.VideoWriter(video_out,
                                       cv2.VideoWriter_fourcc(*'MPEG'),
                                       50.0,
                                       (frame_w, frame_h))
        # the main loop
        batch_size = 1
        images = []
        start_point = 0  # %
        show_window = False
        for i in tqdm(range(nb_frames)):
            _, image = video_reader.read()

            if (float(i+1)/nb_frames) > start_point/100.:
                images += [image]

                if (i % batch_size == 0) or (i == (nb_frames-1) and len(images) > 0):
                    # predict the bounding boxes
                    batch_boxes = get_yolo_boxes(infer_model, images, net_h, net_w, config['model']['anchors'], obj_thresh, nms_thresh)

                    for i in range(len(images)):
                        # draw bounding boxes on the image using labels
                        draw_boxes(images[i], batch_boxes[i], config['model']['labels'], obj_thresh)

                        # show the video with detection bounding boxes
                        if show_window: cv2.imshow('video with bboxes', images[i])

                        # write result to the output video
                        video_writer.write(images[i])
                    images = []
                if show_window and cv2.waitKey(1) == 27: break  # esc to quit

        if show_window: cv2.destroyAllWindows()
        video_reader.release()
        video_writer.release()
    else:  # do detection on an image or a set of images
        image_paths = []

        if os.path.isdir(input_path):
            for inp_file in os.listdir(input_path):
                image_paths += [input_path + inp_file]
        else:
            image_paths += [input_path]

        image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] in ['.jpg', '.png', 'JPEG'])]

        # the main loop
        times = []

        for image_path in image_paths:
            image = cv2.imread(image_path)
            print(image_path)
            start = time.time()
            # predict the bounding boxes
            boxes = get_yolo_boxes(infer_model, [image], net_h, net_w, config['model']['anchors'], obj_thresh, nms_thresh)[0]
            print('Elapsed time = {}'.format(time.time() - start))
            times.append(time.time() - start)
            # draw bounding boxes on the image using labels
            draw_boxes(image, boxes, config['model']['labels'], obj_thresh)

            # write the image with bounding boxes to file
            cv2.imwrite(output_path + image_path.split('/')[-1], np.uint8(image))

        file = open(args.output + '/time.txt', 'w')
        file.write('Average time: ' + str(np.mean(times)))
        file.close()

if __name__ == '__main__':
    argparser = argparse.ArgumentParser(description='Predict with a trained yolo model')
    argparser.add_argument('-c', '--conf', help='path to configuration file')
    argparser.add_argument('-i', '--input', help='path to an image, a directory of images, a video, or webcam')
    argparser.add_argument('-o', '--output', default='output/', help='path to output directory')

    args = argparser.parse_args()
    _main_(args)
keras-yolo3-master/train.py (executable file)
@@ -0,0 +1,294 @@
#! /usr/bin/env python

import argparse
import os
import numpy as np
import json
from voc import parse_voc_annotation
from yolo import create_yolov3_model, dummy_loss
from generator import BatchGenerator
from utils.utils import normalize, evaluate, makedirs
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
from keras.optimizers import Adam
from callbacks import CustomModelCheckpoint, CustomTensorBoard
from utils.multi_gpu_model import multi_gpu_model
import tensorflow as tf
import keras
from keras.models import load_model

def create_training_instances(
    train_annot_folder,
    train_image_folder,
    train_cache,
    valid_annot_folder,
    valid_image_folder,
    valid_cache,
    labels,
):
    # parse annotations of the training set
    train_ints, train_labels = parse_voc_annotation(train_annot_folder, train_image_folder, train_cache, labels)

    # parse annotations of the validation set, if any, otherwise split the training set
    if os.path.exists(valid_annot_folder):
        valid_ints, valid_labels = parse_voc_annotation(valid_annot_folder, valid_image_folder, valid_cache, labels)
    else:
        print("valid_annot_folder does not exist. Splitting the training set.")

        train_valid_split = int(0.8*len(train_ints))
        np.random.seed(0)
        np.random.shuffle(train_ints)
        np.random.seed()

        valid_ints = train_ints[train_valid_split:]
        train_ints = train_ints[:train_valid_split]

    # compare the seen labels with the given labels in config.json
    if len(labels) > 0:
        overlap_labels = set(labels).intersection(set(train_labels.keys()))

        print('Seen labels: \t' + str(train_labels) + '\n')
        print('Given labels: \t' + str(labels))

        # return None, None, None if some given label is not in the dataset
        if len(overlap_labels) < len(labels):
            print('Some labels have no annotations! Please revise the list of labels in the config.json.')
            return None, None, None
    else:
        print('No labels are provided. Train on all seen labels.')
        print(train_labels)
        labels = train_labels.keys()

    max_box_per_image = max([len(inst['object']) for inst in (train_ints + valid_ints)])

    return train_ints, valid_ints, sorted(labels), max_box_per_image

def create_callbacks(saved_weights_name, tensorboard_logs, model_to_save):
    makedirs(tensorboard_logs)

    early_stop = EarlyStopping(
        monitor   = 'loss',
        min_delta = 0.01,
        patience  = 25,
        mode      = 'min',
        verbose   = 1
    )
    checkpoint = CustomModelCheckpoint(
        model_to_save  = model_to_save,
        filepath       = saved_weights_name,  # + '{epoch:02d}.h5',
        monitor        = 'loss',
        verbose        = 1,
        save_best_only = True,
        mode           = 'min',
        period         = 1
    )
    reduce_on_plateau = ReduceLROnPlateau(
        monitor  = 'loss',
        factor   = 0.5,
        patience = 15,
        verbose  = 1,
        mode     = 'min',
        epsilon  = 0.01,
        cooldown = 0,
        min_lr   = 0
    )
    tensorboard = CustomTensorBoard(
        log_dir      = tensorboard_logs,
        write_graph  = True,
        write_images = True,
    )
    return [early_stop, checkpoint, reduce_on_plateau, tensorboard]

def create_model(
    nb_class,
    anchors,
    max_box_per_image,
    max_grid, batch_size,
    warmup_batches,
    ignore_thresh,
    multi_gpu,
    saved_weights_name,
    lr,
    grid_scales,
    obj_scale,
    noobj_scale,
    xywh_scale,
    class_scale,
    backend
):
    if multi_gpu > 1:
        with tf.device('/cpu:0'):
            template_model, infer_model = create_yolov3_model(
                nb_class          = nb_class,
                anchors           = anchors,
                max_box_per_image = max_box_per_image,
                max_grid          = max_grid,
                batch_size        = batch_size//multi_gpu,
                warmup_batches    = warmup_batches,
                ignore_thresh     = ignore_thresh,
                grid_scales       = grid_scales,
                obj_scale         = obj_scale,
                noobj_scale       = noobj_scale,
                xywh_scale        = xywh_scale,
                class_scale       = class_scale
            )
    else:
        template_model, infer_model = create_yolov3_model(
            nb_class          = nb_class,
            anchors           = anchors,
            max_box_per_image = max_box_per_image,
            max_grid          = max_grid,
            batch_size        = batch_size,
            warmup_batches    = warmup_batches,
            ignore_thresh     = ignore_thresh,
            grid_scales       = grid_scales,
            obj_scale         = obj_scale,
            noobj_scale       = noobj_scale,
            xywh_scale        = xywh_scale,
            class_scale       = class_scale
        )

    # load the pretrained weights if they exist, otherwise load the backend weights only
    if os.path.exists(saved_weights_name):
        print("\nLoading pretrained weights.\n")
        template_model.load_weights(saved_weights_name)
    else:
        template_model.load_weights(backend, by_name=True)

    if multi_gpu > 1:
        train_model = multi_gpu_model(template_model, gpus=multi_gpu)
    else:
        train_model = template_model

    optimizer = Adam(lr=lr, clipnorm=0.001)
    train_model.compile(loss=dummy_loss, optimizer=optimizer)

    return train_model, infer_model

def _main_(args):
    config_path = args.conf

    with open(config_path) as config_buffer:
        config = json.loads(config_buffer.read())

    ###############################
    #   Parse the annotations
    ###############################
    train_ints, valid_ints, labels, max_box_per_image = create_training_instances(
        config['train']['train_annot_folder'],
        config['train']['train_image_folder'],
        config['train']['cache_name'],
        config['valid']['valid_annot_folder'],
        config['valid']['valid_image_folder'],
        config['valid']['cache_name'],
        config['model']['labels']
    )
    print('\nTraining on: \t' + str(labels) + '\n')

    ###############################
    #   Create the generators
    ###############################
    train_generator = BatchGenerator(
        instances         = train_ints,
        anchors           = config['model']['anchors'],
        labels            = labels,
        downsample        = 32,  # ratio between network input's size and network output's size, 32 for YOLOv3
        max_box_per_image = max_box_per_image,
        batch_size        = config['train']['batch_size'],
        min_net_size      = config['model']['min_input_size'],
        max_net_size      = config['model']['max_input_size'],
        shuffle           = True,
        jitter            = 0.3,
        norm              = normalize
    )

    valid_generator = BatchGenerator(
        instances         = valid_ints,
        anchors           = config['model']['anchors'],
        labels            = labels,
        downsample        = 32,  # ratio between network input's size and network output's size, 32 for YOLOv3
        max_box_per_image = max_box_per_image,
        batch_size        = config['train']['batch_size'],
        min_net_size      = config['model']['min_input_size'],
        max_net_size      = config['model']['max_input_size'],
        shuffle           = True,
        jitter            = 0.0,
        norm              = normalize
    )

    ###############################
    #   Create the model
    ###############################
    if os.path.exists(config['train']['saved_weights_name']):
        config['train']['warmup_epochs'] = 0
    warmup_batches = config['train']['warmup_epochs'] * (config['train']['train_times']*len(train_generator))

    os.environ['CUDA_VISIBLE_DEVICES'] = config['train']['gpus']
    multi_gpu = len(config['train']['gpus'].split(','))
    print('multi_gpu: ' + str(multi_gpu))

    train_model, infer_model = create_model(
        nb_class           = len(labels),
        anchors            = config['model']['anchors'],
        max_box_per_image  = max_box_per_image,
        max_grid           = [config['model']['max_input_size'], config['model']['max_input_size']],
        batch_size         = config['train']['batch_size'],
        warmup_batches     = warmup_batches,
        ignore_thresh      = config['train']['ignore_thresh'],
        multi_gpu          = multi_gpu,
        saved_weights_name = config['train']['saved_weights_name'],
        lr                 = config['train']['learning_rate'],
        grid_scales        = config['train']['grid_scales'],
        obj_scale          = config['train']['obj_scale'],
        noobj_scale        = config['train']['noobj_scale'],
        xywh_scale         = config['train']['xywh_scale'],
        class_scale        = config['train']['class_scale'],
        backend            = config['model']['backend']
    )

    ###############################
    #   Kick off the training
    ###############################
    callbacks = create_callbacks(config['train']['saved_weights_name'], config['train']['tensorboard_dir'], infer_model)

    train_model.fit_generator(
        generator        = train_generator,
        steps_per_epoch  = len(train_generator) * config['train']['train_times'],
        epochs           = config['train']['nb_epochs'] + config['train']['warmup_epochs'],
        verbose          = 2 if config['train']['debug'] else 1,
        callbacks        = callbacks,
        workers          = 4,
        max_queue_size   = 8
    )

    # make a GPU version of infer_model for evaluation
    if multi_gpu > 1:
        infer_model = load_model(config['train']['saved_weights_name'])

    ###############################
    #   Run the evaluation
    ###############################
    # compute mAP for all the classes
    average_precisions = evaluate(infer_model, valid_generator)

    # print the score
    total_instances = []
    precisions = []
    for label, (average_precision, num_annotations) in average_precisions.items():
        print('{:.0f} instances of class'.format(num_annotations),
              labels[label], 'with average precision: {:.4f}'.format(average_precision))
        total_instances.append(num_annotations)
        precisions.append(average_precision)

    if sum(total_instances) == 0:
        print('No test instances found.')
        return

    print('mAP using the weighted average of precisions among classes: {:.4f}'.format(sum([a * b for a, b in zip(total_instances, precisions)]) / sum(total_instances)))
    print('mAP: {:.4f}'.format(sum(precisions) / sum(x > 0 for x in total_instances)))

if __name__ == '__main__':
    argparser = argparse.ArgumentParser(description='train and evaluate YOLO_v3 model on any dataset')
    argparser.add_argument('-c', '--conf', help='path to configuration file')

    args = argparser.parse_args()
    _main_(args)
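Warmup is measured in batches, not epochs: warmup_batches = warmup_epochs * train_times * len(train_generator), and it is skipped entirely when saved weights already exist. A quick check with the README numbers and a hypothetical generator length:

```python
warmup_epochs     = 3      # from the README config
train_times       = 10
batches_per_epoch = 120    # hypothetical: len(train_generator)

warmup_batches = warmup_epochs * (train_times * batches_per_epoch)
print(warmup_batches)      # 3600 batches before the warmup trick switches off
```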
keras-yolo3-master/utils/__init__.py (executable file, empty)
BIN keras-yolo3-master/utils/__init__.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/utils/__pycache__/__init__.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/utils/__pycache__/bbox.cpython-36.pyc (normal file, binary file not shown)
BIN keras-yolo3-master/utils/__pycache__/colors.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/utils/__pycache__/image.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/utils/__pycache__/multi_gpu_model.cpython-36.pyc (executable file, binary file not shown)
BIN keras-yolo3-master/utils/__pycache__/utils.cpython-36.pyc (normal file, binary file not shown)
keras-yolo3-master/utils/bbox.py (executable file)
@@ -0,0 +1,89 @@
import numpy as np
import os
import cv2
from .colors import get_color

class BoundBox:
    def __init__(self, xmin, ymin, xmax, ymax, c=None, classes=None):
        self.xmin = xmin
        self.ymin = ymin
        self.xmax = xmax
        self.ymax = ymax

        self.c = c
        self.classes = classes

        self.label = -1
        self.score = -1

    def get_label(self):
        if self.label == -1:
            self.label = np.argmax(self.classes)

        return self.label

    def get_score(self):
        if self.score == -1:
            self.score = self.classes[self.get_label()]

        return self.score

def _interval_overlap(interval_a, interval_b):
    x1, x2 = interval_a
    x3, x4 = interval_b

    if x3 < x1:
        if x4 < x1:
            return 0
        else:
            return min(x2, x4) - x1
    else:
        if x2 < x3:
            return 0
        else:
            return min(x2, x4) - x3

def bbox_iou(box1, box2):
    intersect_w = _interval_overlap([box1.xmin, box1.xmax], [box2.xmin, box2.xmax])
    intersect_h = _interval_overlap([box1.ymin, box1.ymax], [box2.ymin, box2.ymax])

    intersect = intersect_w * intersect_h

    w1, h1 = box1.xmax-box1.xmin, box1.ymax-box1.ymin
    w2, h2 = box2.xmax-box2.xmin, box2.ymax-box2.ymin

    union = w1*h1 + w2*h2 - intersect

    return float(intersect) / union

def draw_boxes(image, boxes, labels, obj_thresh, quiet=True):
    for box in boxes:
        label_str = ''
        label = -1

        for i in range(len(labels)):
            if box.classes[i] > obj_thresh:
                if label_str != '': label_str += ', '
                label_str += (labels[i] + ' ' + str(round(box.get_score()*100, 0)) + '%')
                label = i
            if not quiet: print(label_str)

        if label >= 0:
            text_size = cv2.getTextSize(label_str, cv2.FONT_HERSHEY_SIMPLEX, 1.1e-4 * image.shape[0], 2)
            width, height = text_size[0][0], text_size[0][1]
            region = np.array([[box.xmin-3, box.ymin],
                               [box.xmin-3, box.ymin-height-16],
                               [box.xmin+width+6, box.ymin-height-16],
                               [box.xmin+width+6, box.ymin]], dtype='int32')

            cv2.rectangle(img=image, pt1=(box.xmin, box.ymin), pt2=(box.xmax, box.ymax), color=get_color(label), thickness=1)
            cv2.fillPoly(img=image, pts=[region], color=get_color(label))
            cv2.putText(img=image,
                        text=label_str,
                        org=(box.xmin+6, box.ymin - 6),
                        fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                        fontScale=0.7e-3 * image.shape[0],
                        color=(0, 0, 0),
                        thickness=2)

    return image
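A quick worked example of bbox_iou using the classes above: two 2x2 boxes offset by (1, 1) overlap in a 1x1 square:

```python
from utils.bbox import BoundBox, bbox_iou

a = BoundBox(0, 0, 2, 2)
b = BoundBox(1, 1, 3, 3)

# intersection = 1*1 = 1; union = 4 + 4 - 1 = 7
print(bbox_iou(a, b))  # 0.14285714285714285
```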
BIN keras-yolo3-master/utils/bbox.pyc (executable file, binary file not shown)
96
keras-yolo3-master/utils/colors.py
Executable file
96
keras-yolo3-master/utils/colors.py
Executable file
@@ -0,0 +1,96 @@
def get_color(label):
    """ Return a color from a set of predefined colors. Contains 80 colors in total.

    code originally from https://github.com/fizyr/keras-retinanet/

    Args
        label: The label to get the color for.

    Returns
        A list of three values representing an RGB color.
    """
    if label < len(colors):
        return colors[label]
    else:
        print('Label {} has no color, returning default.'.format(label))
        return (0, 255, 0)

colors = [
    [31, 0, 255], [0, 159, 255], [255, 95, 0], [255, 19, 0], [255, 0, 0],
    [255, 38, 0], [0, 255, 25], [255, 0, 133], [255, 172, 0], [108, 0, 255],
    [0, 82, 255], [0, 255, 6], [255, 0, 152], [223, 0, 255], [12, 0, 255],
    [0, 255, 178], [108, 255, 0], [184, 0, 255], [255, 0, 76], [146, 255, 0],
    [51, 0, 255], [0, 197, 255], [255, 248, 0], [255, 0, 19], [255, 0, 38],
    [89, 255, 0], [127, 255, 0], [255, 153, 0], [0, 255, 255], [0, 255, 216],
    [0, 255, 121], [255, 0, 248], [70, 0, 255], [0, 255, 159], [0, 216, 255],
    [0, 6, 255], [0, 63, 255], [31, 255, 0], [255, 57, 0], [255, 0, 210],
    [0, 255, 102], [242, 255, 0], [255, 191, 0], [0, 255, 63], [255, 0, 95],
    [146, 0, 255], [184, 255, 0], [255, 114, 0], [0, 255, 235], [255, 229, 0],
    [0, 178, 255], [255, 0, 114], [255, 0, 57], [0, 140, 255], [0, 121, 255],
    [12, 255, 0], [255, 210, 0], [0, 255, 44], [165, 255, 0], [0, 25, 255],
    [0, 255, 140], [0, 101, 255], [0, 255, 82], [223, 255, 0], [242, 0, 255],
    [89, 0, 255], [165, 0, 255], [70, 255, 0], [255, 0, 172], [255, 76, 0],
    [203, 255, 0], [204, 0, 255], [255, 0, 229], [255, 133, 0], [127, 0, 255],
    [0, 235, 255], [0, 255, 197], [255, 0, 191], [0, 44, 255], [50, 255, 0]
]
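Usage sketch (not part of the commit): class indices map to stable drawing colors, and out-of-range labels fall back to green.

from utils.colors import get_color

print(get_color(0))    # [31, 0, 255], the first predefined color
print(get_color(79))   # [50, 255, 0], the last one
print(get_color(99))   # prints a warning and returns the default (0, 255, 0)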
BIN keras-yolo3-master/utils/colors.pyc (Executable file): Binary file not shown.
86 keras-yolo3-master/utils/image.py (Executable file)
@@ -0,0 +1,86 @@
import cv2
import numpy as np
import copy

def _rand_scale(scale):
    scale = np.random.uniform(1, scale)
    return scale if (np.random.randint(2) == 0) else 1./scale

def _constrain(min_v, max_v, value):
    if value < min_v: return min_v
    if value > max_v: return max_v
    return value

def random_flip(image, flip):
    if flip == 1: return cv2.flip(image, 1)
    return image

def correct_bounding_boxes(boxes, new_w, new_h, net_w, net_h, dx, dy, flip, image_w, image_h):
    boxes = copy.deepcopy(boxes)

    # randomize boxes' order
    np.random.shuffle(boxes)

    # correct sizes and positions
    sx, sy = float(new_w)/image_w, float(new_h)/image_h
    zero_boxes = []

    for i in range(len(boxes)):
        boxes[i]['xmin'] = int(_constrain(0, net_w, boxes[i]['xmin']*sx + dx))
        boxes[i]['xmax'] = int(_constrain(0, net_w, boxes[i]['xmax']*sx + dx))
        boxes[i]['ymin'] = int(_constrain(0, net_h, boxes[i]['ymin']*sy + dy))
        boxes[i]['ymax'] = int(_constrain(0, net_h, boxes[i]['ymax']*sy + dy))

        if boxes[i]['xmax'] <= boxes[i]['xmin'] or boxes[i]['ymax'] <= boxes[i]['ymin']:
            zero_boxes += [i]
            continue

        if flip == 1:
            swap = boxes[i]['xmin']
            boxes[i]['xmin'] = net_w - boxes[i]['xmax']
            boxes[i]['xmax'] = net_w - swap

    boxes = [boxes[i] for i in range(len(boxes)) if i not in zero_boxes]

    return boxes

def random_distort_image(image, hue=18, saturation=1.5, exposure=1.5):
    # determine scale factors
    dhue = np.random.uniform(-hue, hue)
    dsat = _rand_scale(saturation)
    dexp = _rand_scale(exposure)

    # convert RGB space to HSV space
    image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV).astype('float')

    # change saturation and exposure
    image[:,:,1] *= dsat
    image[:,:,2] *= dexp

    # change hue
    image[:,:,0] += dhue
    image[:,:,0] -= (image[:,:,0] > 180)*180
    image[:,:,0] += (image[:,:,0] < 0) *180

    # convert back to RGB from HSV
    return cv2.cvtColor(image.astype('uint8'), cv2.COLOR_HSV2RGB)

def apply_random_scale_and_crop(image, new_w, new_h, net_w, net_h, dx, dy):
    im_sized = cv2.resize(image, (new_w, new_h))

    if dx > 0:
        im_sized = np.pad(im_sized, ((0,0), (dx,0), (0,0)), mode='constant', constant_values=127)
    else:
        im_sized = im_sized[:,-dx:,:]
    if (new_w + dx) < net_w:
        im_sized = np.pad(im_sized, ((0,0), (0, net_w - (new_w+dx)), (0,0)), mode='constant', constant_values=127)

    if dy > 0:
        im_sized = np.pad(im_sized, ((dy,0), (0,0), (0,0)), mode='constant', constant_values=127)
    else:
        im_sized = im_sized[-dy:,:,:]

    if (new_h + dy) < net_h:
        im_sized = np.pad(im_sized, ((0, net_h - (new_h+dy)), (0,0), (0,0)), mode='constant', constant_values=127)

    return im_sized[:net_h, :net_w,:]
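A hedged sketch of how these helpers chain together during augmentation (the file name and the dx/dy offsets are illustrative; the real training generator samples them from the configured jitter):

import cv2
from utils.image import apply_random_scale_and_crop, random_distort_image, random_flip

image = cv2.cvtColor(cv2.imread('raccoon-1.jpg'), cv2.COLOR_BGR2RGB)
sized = apply_random_scale_and_crop(image, new_w=448, new_h=416, net_w=416, net_h=416, dx=-16, dy=0)
sized = random_distort_image(sized)   # random HSV jitter
sized = random_flip(sized, flip=1)    # horizontal flip; boxes are mirrored by correct_bounding_boxes
print(sized.shape)                    # (416, 416, 3)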
BIN keras-yolo3-master/utils/image.pyc (Executable file): Binary file not shown.
62 keras-yolo3-master/utils/multi_gpu_model.py (Executable file)
@@ -0,0 +1,62 @@
from keras.layers import Lambda, concatenate
from keras.models import Model
import tensorflow as tf

def multi_gpu_model(model, gpus):
    if isinstance(gpus, (list, tuple)):
        num_gpus = len(gpus)
        target_gpu_ids = gpus
    else:
        num_gpus = gpus
        target_gpu_ids = range(num_gpus)

    def get_slice(data, i, parts):
        shape = tf.shape(data)
        batch_size = shape[:1]
        input_shape = shape[1:]
        step = batch_size // parts
        if i == num_gpus - 1:
            size = batch_size - step * i
        else:
            size = step
        size = tf.concat([size, input_shape], axis=0)
        stride = tf.concat([step, input_shape * 0], axis=0)
        start = stride * i
        return tf.slice(data, start, size)

    all_outputs = []
    for i in range(len(model.outputs)):
        all_outputs.append([])

    # Place a copy of the model on each GPU,
    # each getting a slice of the inputs.
    for i, gpu_id in enumerate(target_gpu_ids):
        with tf.device('/gpu:%d' % gpu_id):
            with tf.name_scope('replica_%d' % gpu_id):
                inputs = []
                # Retrieve a slice of the input.
                for x in model.inputs:
                    input_shape = tuple(x.get_shape().as_list())[1:]
                    slice_i = Lambda(get_slice,
                                     output_shape=input_shape,
                                     arguments={'i': i,
                                                'parts': num_gpus})(x)
                    inputs.append(slice_i)

                # Apply model on slice
                # (creating a model replica on the target device).
                outputs = model(inputs)
                if not isinstance(outputs, list):
                    outputs = [outputs]

                # Save the outputs for merging back together later.
                for o in range(len(outputs)):
                    all_outputs[o].append(outputs[o])

    # Merge outputs on CPU.
    with tf.device('/cpu:0'):
        merged = []
        for name, outputs in zip(model.output_names, all_outputs):
            merged.append(concatenate(outputs,
                                      axis=0, name=name))
        return Model(model.inputs, merged)
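Usage sketch (assumptions: a train_model built by create_yolov3_model in yolo.py and dummy_loss from the same file; inputs are split along axis 0, so the batch size should be divisible by the number of GPUs):

from utils.multi_gpu_model import multi_gpu_model
from yolo import dummy_loss

parallel_model = multi_gpu_model(train_model, gpus=2)   # or gpus=[0, 1]
parallel_model.compile(loss=dummy_loss, optimizer='adam')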
BIN keras-yolo3-master/utils/multi_gpu_model.pyc (Executable file): Binary file not shown.
323 keras-yolo3-master/utils/utils.py (Normal file)
@@ -0,0 +1,323 @@
import cv2
import numpy as np
import os
from .bbox import BoundBox, bbox_iou
from scipy.special import expit

def _sigmoid(x):
    return expit(x)

def makedirs(path):
    try:
        os.makedirs(path)
    except OSError:
        if not os.path.isdir(path):
            raise

def evaluate(model,
             generator,
             iou_threshold=0.5,
             obj_thresh=0.5,
             nms_thresh=0.45,
             net_h=416,
             net_w=416,
             save_path=None):
    """ Evaluate a given dataset using a given model.
    code originally from https://github.com/fizyr/keras-retinanet

    # Arguments
        model         : The model to evaluate.
        generator     : The generator that represents the dataset to evaluate.
        iou_threshold : The IoU threshold above which a detection counts as a true positive.
        obj_thresh    : The threshold used to distinguish between object and non-object.
        nms_thresh    : The threshold used to determine whether two detections are duplicates.
        net_h         : The height of the input image to the model; a higher value results in better accuracy.
        net_w         : The width of the input image to the model.
        save_path     : The path to save images with visualized detections to.
    # Returns
        A dict mapping class names to mAP scores.
    """
    # gather all detections and annotations
    all_detections  = [[None for i in range(generator.num_classes())] for j in range(generator.size())]
    all_annotations = [[None for i in range(generator.num_classes())] for j in range(generator.size())]

    for i in range(generator.size()):
        raw_image = [generator.load_image(i)]

        # make the boxes and the labels
        pred_boxes = get_yolo_boxes(model, raw_image, net_h, net_w, generator.get_anchors(), obj_thresh, nms_thresh)[0]

        score = np.array([box.get_score() for box in pred_boxes])
        pred_labels = np.array([box.label for box in pred_boxes])

        if len(pred_boxes) > 0:
            pred_boxes = np.array([[box.xmin, box.ymin, box.xmax, box.ymax, box.get_score()] for box in pred_boxes])
        else:
            pred_boxes = np.array([[]])

        # sort the boxes and the labels according to scores
        score_sort = np.argsort(-score)
        pred_labels = pred_labels[score_sort]
        pred_boxes = pred_boxes[score_sort]

        # copy detections to all_detections
        for label in range(generator.num_classes()):
            all_detections[i][label] = pred_boxes[pred_labels == label, :]

        annotations = generator.load_annotation(i)

        # copy ground-truth annotations to all_annotations
        for label in range(generator.num_classes()):
            all_annotations[i][label] = annotations[annotations[:, 4] == label, :4].copy()

    # compute mAP by comparing all detections and all annotations
    average_precisions = {}

    for label in range(generator.num_classes()):
        false_positives = np.zeros((0,))
        true_positives  = np.zeros((0,))
        scores          = np.zeros((0,))
        num_annotations = 0.0

        for i in range(generator.size()):
            detections           = all_detections[i][label]
            annotations          = all_annotations[i][label]
            num_annotations     += annotations.shape[0]
            detected_annotations = []

            for d in detections:
                scores = np.append(scores, d[4])

                if annotations.shape[0] == 0:
                    false_positives = np.append(false_positives, 1)
                    true_positives  = np.append(true_positives, 0)
                    continue

                overlaps            = compute_overlap(np.expand_dims(d, axis=0), annotations)
                assigned_annotation = np.argmax(overlaps, axis=1)
                max_overlap         = overlaps[0, assigned_annotation]

                if max_overlap >= iou_threshold and assigned_annotation not in detected_annotations:
                    false_positives = np.append(false_positives, 0)
                    true_positives  = np.append(true_positives, 1)
                    detected_annotations.append(assigned_annotation)
                else:
                    false_positives = np.append(false_positives, 1)
                    true_positives  = np.append(true_positives, 0)

        # no annotations -> AP for this class is 0 (is this correct?)
        if num_annotations == 0:
            average_precisions[label] = 0
            continue

        # sort by score
        indices         = np.argsort(-scores)
        false_positives = false_positives[indices]
        true_positives  = true_positives[indices]

        # compute cumulative false positives and true positives
        false_positives = np.cumsum(false_positives)
        true_positives  = np.cumsum(true_positives)

        # compute recall and precision
        recall    = true_positives / num_annotations
        precision = true_positives / np.maximum(true_positives + false_positives, np.finfo(np.float64).eps)

        # compute average precision
        average_precision = compute_ap(recall, precision)
        average_precisions[label] = average_precision, num_annotations

    return average_precisions

def correct_yolo_boxes(boxes, image_h, image_w, net_h, net_w):
    if (float(net_w)/image_w) < (float(net_h)/image_h):
        new_w = net_w
        new_h = (image_h*net_w)/image_w
    else:
        new_h = net_w
        new_w = (image_w*net_h)/image_h

    for i in range(len(boxes)):
        x_offset, x_scale = (net_w - new_w)/2./net_w, float(new_w)/net_w
        y_offset, y_scale = (net_h - new_h)/2./net_h, float(new_h)/net_h

        boxes[i].xmin = int((boxes[i].xmin - x_offset) / x_scale * image_w)
        boxes[i].xmax = int((boxes[i].xmax - x_offset) / x_scale * image_w)
        boxes[i].ymin = int((boxes[i].ymin - y_offset) / y_scale * image_h)
        boxes[i].ymax = int((boxes[i].ymax - y_offset) / y_scale * image_h)

def do_nms(boxes, nms_thresh):
    if len(boxes) > 0:
        nb_class = len(boxes[0].classes)
    else:
        return

    for c in range(nb_class):
        sorted_indices = np.argsort([-box.classes[c] for box in boxes])

        for i in range(len(sorted_indices)):
            index_i = sorted_indices[i]

            if boxes[index_i].classes[c] == 0: continue

            for j in range(i+1, len(sorted_indices)):
                index_j = sorted_indices[j]

                if bbox_iou(boxes[index_i], boxes[index_j]) >= nms_thresh:
                    boxes[index_j].classes[c] = 0

def decode_netout(netout, anchors, obj_thresh, net_h, net_w):
    grid_h, grid_w = netout.shape[:2]
    nb_box = 3
    netout = netout.reshape((grid_h, grid_w, nb_box, -1))
    nb_class = netout.shape[-1] - 5

    boxes = []

    netout[..., :2]  = _sigmoid(netout[..., :2])
    netout[..., 4]   = _sigmoid(netout[..., 4])
    netout[..., 5:]  = netout[..., 4][..., np.newaxis] * _softmax(netout[..., 5:])
    netout[..., 5:] *= netout[..., 5:] > obj_thresh

    for i in range(grid_h*grid_w):
        row = i // grid_w
        col = i % grid_w

        for b in range(nb_box):
            # 4th element is objectness score
            objectness = netout[row, col, b, 4]

            if objectness <= obj_thresh: continue

            # first 4 elements are x, y, w, and h
            x, y, w, h = netout[row, col, b, :4]

            x = (col + x) / grid_w                      # center position, unit: image width
            y = (row + y) / grid_h                      # center position, unit: image height
            w = anchors[2 * b + 0] * np.exp(w) / net_w  # unit: image width
            h = anchors[2 * b + 1] * np.exp(h) / net_h  # unit: image height

            # last elements are class probabilities
            classes = netout[row, col, b, 5:]

            box = BoundBox(x-w/2, y-h/2, x+w/2, y+h/2, objectness, classes)

            boxes.append(box)

    return boxes

def preprocess_input(image, net_h, net_w):
    new_h, new_w, _ = image.shape

    # determine the new size of the image
    if (float(net_w)/new_w) < (float(net_h)/new_h):
        new_h = (new_h * net_w)//new_w
        new_w = net_w
    else:
        new_w = (new_w * net_h)//new_h
        new_h = net_h

    # resize the image to the new size
    resized = cv2.resize(image[:,:,::-1]/255., (new_w, new_h))

    # embed the image into the standard letter box
    new_image = np.ones((net_h, net_w, 3)) * 0.5
    new_image[(net_h-new_h)//2:(net_h+new_h)//2, (net_w-new_w)//2:(net_w+new_w)//2, :] = resized
    new_image = np.expand_dims(new_image, 0)

    return new_image

def normalize(image):
    return image/255.

def get_yolo_boxes(model, images, net_h, net_w, anchors, obj_thresh, nms_thresh):
    image_h, image_w, _ = images[0].shape
    nb_images = len(images)
    batch_input = np.zeros((nb_images, net_h, net_w, 3))

    # preprocess the input
    for i in range(nb_images):
        batch_input[i] = preprocess_input(images[i], net_h, net_w)

    # run the prediction
    batch_output = model.predict_on_batch(batch_input)
    batch_boxes = [None]*nb_images

    for i in range(nb_images):
        yolos = [batch_output[0][i], batch_output[1][i], batch_output[2][i]]
        boxes = []

        # decode the output of the network
        for j in range(len(yolos)):
            yolo_anchors = anchors[(2-j)*6:(3-j)*6]  # config['model']['anchors']
            boxes += decode_netout(yolos[j], yolo_anchors, obj_thresh, net_h, net_w)

        # correct the sizes of the bounding boxes
        correct_yolo_boxes(boxes, image_h, image_w, net_h, net_w)

        # suppress non-maximal boxes
        do_nms(boxes, nms_thresh)

        batch_boxes[i] = boxes

    return batch_boxes

def compute_overlap(a, b):
    """
    Code originally from https://github.com/rbgirshick/py-faster-rcnn.
    Parameters
    ----------
    a: (N, 4) ndarray of float
    b: (K, 4) ndarray of float
    Returns
    -------
    overlaps: (N, K) ndarray of overlap between boxes and query_boxes
    """
    area = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])

    iw = np.minimum(np.expand_dims(a[:, 2], axis=1), b[:, 2]) - np.maximum(np.expand_dims(a[:, 0], 1), b[:, 0])
    ih = np.minimum(np.expand_dims(a[:, 3], axis=1), b[:, 3]) - np.maximum(np.expand_dims(a[:, 1], 1), b[:, 1])

    iw = np.maximum(iw, 0)
    ih = np.maximum(ih, 0)

    ua = np.expand_dims((a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1]), axis=1) + area - iw * ih

    ua = np.maximum(ua, np.finfo(float).eps)

    intersection = iw * ih

    return intersection / ua

def compute_ap(recall, precision):
    """ Compute the average precision, given the recall and precision curves.
    Code originally from https://github.com/rbgirshick/py-faster-rcnn.

    # Arguments
        recall:    The recall curve (list).
        precision: The precision curve (list).
    # Returns
        The average precision as computed in py-faster-rcnn.
    """
    # correct AP calculation
    # first append sentinel values at the end
    mrec = np.concatenate(([0.], recall, [1.]))
    mpre = np.concatenate(([0.], precision, [0.]))

    # compute the precision envelope
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])

    # to calculate area under PR curve, look for points
    # where X axis (recall) changes value
    i = np.where(mrec[1:] != mrec[:-1])[0]

    # and sum (\Delta recall) * prec
    ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap

def _softmax(x, axis=-1):
    x = x - np.amax(x, axis, keepdims=True)
    e_x = np.exp(x)

    return e_x / e_x.sum(axis, keepdims=True)
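compute_ap can be checked in isolation (a sketch, not part of the commit): with full precision up to recall 0.5 and precision 0.5 afterwards, the interpolated area under the curve is 0.5*1.0 + 0.5*0.5 = 0.75.

import numpy as np
from utils.utils import compute_ap

recall    = np.array([0.0, 0.5, 1.0])
precision = np.array([1.0, 1.0, 0.5])
print(compute_ap(recall, precision))   # 0.75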
BIN keras-yolo3-master/utils/utils.pyc (Executable file): Binary file not shown.
67 keras-yolo3-master/voc.py (Executable file)
@@ -0,0 +1,67 @@
import numpy as np
import os
import xml.etree.ElementTree as ET
import pickle

def parse_voc_annotation(ann_dir, img_dir, cache_name, labels=[]):
    if os.path.exists(cache_name):
        with open(cache_name, 'rb') as handle:
            cache = pickle.load(handle)
        all_insts, seen_labels = cache['all_insts'], cache['seen_labels']
    else:
        all_insts = []
        seen_labels = {}

        for ann in sorted(os.listdir(ann_dir)):
            img = {'object':[]}

            try:
                tree = ET.parse(ann_dir + ann)
            except Exception as e:
                print(e)
                print('Ignore this bad annotation: ' + ann_dir + ann)
                continue

            for elem in tree.iter():
                if 'filename' in elem.tag:
                    img['filename'] = img_dir + elem.text
                if 'width' in elem.tag:
                    img['width'] = int(elem.text)
                if 'height' in elem.tag:
                    img['height'] = int(elem.text)
                if 'object' in elem.tag or 'part' in elem.tag:
                    obj = {}

                    for attr in list(elem):
                        if 'name' in attr.tag:
                            obj['name'] = attr.text

                            if obj['name'] in seen_labels:
                                seen_labels[obj['name']] += 1
                            else:
                                seen_labels[obj['name']] = 1

                            if len(labels) > 0 and obj['name'] not in labels:
                                break
                            else:
                                img['object'] += [obj]

                        if 'bndbox' in attr.tag:
                            for dim in list(attr):
                                if 'xmin' in dim.tag:
                                    obj['xmin'] = int(round(float(dim.text)))
                                if 'ymin' in dim.tag:
                                    obj['ymin'] = int(round(float(dim.text)))
                                if 'xmax' in dim.tag:
                                    obj['xmax'] = int(round(float(dim.text)))
                                if 'ymax' in dim.tag:
                                    obj['ymax'] = int(round(float(dim.text)))

            if len(img['object']) > 0:
                all_insts += [img]

        cache = {'all_insts': all_insts, 'seen_labels': seen_labels}
        with open(cache_name, 'wb') as handle:
            pickle.dump(cache, handle, protocol=pickle.HIGHEST_PROTOCOL)

    return all_insts, seen_labels
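Usage sketch (paths are illustrative): parse a VOC-style annotation folder once, cache the result, and inspect the label histogram.

from voc import parse_voc_annotation

train_insts, seen_labels = parse_voc_annotation(
    '/data/kangaroo/annots/', '/data/kangaroo/images/', 'kangaroo_train.pkl', ['kangaroo'])
print(len(train_insts), seen_labels)   # number of usable images, counts per label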
BIN keras-yolo3-master/voc.pyc (Executable file): Binary file not shown.
364 keras-yolo3-master/yolo.py (Executable file)
@@ -0,0 +1,364 @@
from keras.layers import Conv2D, Input, BatchNormalization, LeakyReLU, ZeroPadding2D, UpSampling2D, Lambda
from keras.layers.merge import add, concatenate
from keras.models import Model
from keras.engine.topology import Layer
import tensorflow as tf

class YoloLayer(Layer):
    def __init__(self, anchors, max_grid, batch_size, warmup_batches, ignore_thresh,
                 grid_scale, obj_scale, noobj_scale, xywh_scale, class_scale,
                 **kwargs):
        # make the model settings persistent
        self.ignore_thresh  = ignore_thresh
        self.warmup_batches = warmup_batches
        self.anchors        = tf.constant(anchors, dtype='float', shape=[1,1,1,3,2])
        self.grid_scale     = grid_scale
        self.obj_scale      = obj_scale
        self.noobj_scale    = noobj_scale
        self.xywh_scale     = xywh_scale
        self.class_scale    = class_scale

        # make a persistent mesh grid
        max_grid_h, max_grid_w = max_grid

        cell_x = tf.to_float(tf.reshape(tf.tile(tf.range(max_grid_w), [max_grid_h]), (1, max_grid_h, max_grid_w, 1, 1)))
        cell_y = tf.transpose(cell_x, (0,2,1,3,4))
        self.cell_grid = tf.tile(tf.concat([cell_x, cell_y], -1), [batch_size, 1, 1, 3, 1])

        super(YoloLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        super(YoloLayer, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self, x):
        input_image, y_pred, y_true, true_boxes = x

        # adjust the shape of the y_predict [batch, grid_h, grid_w, 3, 4+1+nb_class]
        y_pred = tf.reshape(y_pred, tf.concat([tf.shape(y_pred)[:3], tf.constant([3, -1])], axis=0))

        # initialize the masks
        object_mask = tf.expand_dims(y_true[..., 4], 4)

        # the variable to keep track of number of batches processed
        batch_seen = tf.Variable(0.)

        # compute grid factor and net factor
        grid_h = tf.shape(y_true)[1]
        grid_w = tf.shape(y_true)[2]
        grid_factor = tf.reshape(tf.cast([grid_w, grid_h], tf.float32), [1,1,1,1,2])

        net_h = tf.shape(input_image)[1]
        net_w = tf.shape(input_image)[2]
        net_factor = tf.reshape(tf.cast([net_w, net_h], tf.float32), [1,1,1,1,2])

        """
        Adjust prediction
        """
        pred_box_xy    = (self.cell_grid[:,:grid_h,:grid_w,:,:] + tf.sigmoid(y_pred[..., :2]))  # sigma(t_xy) + c_xy
        pred_box_wh    = y_pred[..., 2:4]                                                       # t_wh
        pred_box_conf  = tf.expand_dims(tf.sigmoid(y_pred[..., 4]), 4)                          # adjust confidence
        pred_box_class = y_pred[..., 5:]                                                        # adjust class probabilities

        """
        Adjust ground truth
        """
        true_box_xy    = y_true[..., 0:2]  # (sigma(t_xy) + c_xy)
        true_box_wh    = y_true[..., 2:4]  # t_wh
        true_box_conf  = tf.expand_dims(y_true[..., 4], 4)
        true_box_class = tf.argmax(y_true[..., 5:], -1)

        """
        Compare each predicted box to all true boxes
        """
        # initially, drag all objectness of all boxes to 0
        conf_delta = pred_box_conf - 0

        # then, ignore the boxes which have good overlap with some true box
        true_xy = true_boxes[..., 0:2] / grid_factor
        true_wh = true_boxes[..., 2:4] / net_factor

        true_wh_half = true_wh / 2.
        true_mins    = true_xy - true_wh_half
        true_maxes   = true_xy + true_wh_half

        pred_xy = tf.expand_dims(pred_box_xy / grid_factor, 4)
        pred_wh = tf.expand_dims(tf.exp(pred_box_wh) * self.anchors / net_factor, 4)

        pred_wh_half = pred_wh / 2.
        pred_mins    = pred_xy - pred_wh_half
        pred_maxes   = pred_xy + pred_wh_half

        intersect_mins  = tf.maximum(pred_mins, true_mins)
        intersect_maxes = tf.minimum(pred_maxes, true_maxes)

        intersect_wh    = tf.maximum(intersect_maxes - intersect_mins, 0.)
        intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]

        true_areas = true_wh[..., 0] * true_wh[..., 1]
        pred_areas = pred_wh[..., 0] * pred_wh[..., 1]

        union_areas = pred_areas + true_areas - intersect_areas
        iou_scores  = tf.truediv(intersect_areas, union_areas)

        best_ious   = tf.reduce_max(iou_scores, axis=4)
        conf_delta *= tf.expand_dims(tf.to_float(best_ious < self.ignore_thresh), 4)

        """
        Compute some online statistics
        """
        true_xy = true_box_xy / grid_factor
        true_wh = tf.exp(true_box_wh) * self.anchors / net_factor

        true_wh_half = true_wh / 2.
        true_mins    = true_xy - true_wh_half
        true_maxes   = true_xy + true_wh_half

        pred_xy = pred_box_xy / grid_factor
        pred_wh = tf.exp(pred_box_wh) * self.anchors / net_factor

        pred_wh_half = pred_wh / 2.
        pred_mins    = pred_xy - pred_wh_half
        pred_maxes   = pred_xy + pred_wh_half

        intersect_mins  = tf.maximum(pred_mins, true_mins)
        intersect_maxes = tf.minimum(pred_maxes, true_maxes)
        intersect_wh    = tf.maximum(intersect_maxes - intersect_mins, 0.)
        intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]

        true_areas = true_wh[..., 0] * true_wh[..., 1]
        pred_areas = pred_wh[..., 0] * pred_wh[..., 1]

        union_areas = pred_areas + true_areas - intersect_areas
        iou_scores  = tf.truediv(intersect_areas, union_areas)
        iou_scores  = object_mask * tf.expand_dims(iou_scores, 4)

        count       = tf.reduce_sum(object_mask)
        count_noobj = tf.reduce_sum(1 - object_mask)
        detect_mask = tf.to_float((pred_box_conf*object_mask) >= 0.5)
        class_mask  = tf.expand_dims(tf.to_float(tf.equal(tf.argmax(pred_box_class, -1), true_box_class)), 4)
        recall50    = tf.reduce_sum(tf.to_float(iou_scores >= 0.5 ) * detect_mask * class_mask) / (count + 1e-3)
        recall75    = tf.reduce_sum(tf.to_float(iou_scores >= 0.75) * detect_mask * class_mask) / (count + 1e-3)
        avg_iou     = tf.reduce_sum(iou_scores) / (count + 1e-3)
        avg_obj     = tf.reduce_sum(pred_box_conf * object_mask) / (count + 1e-3)
        avg_noobj   = tf.reduce_sum(pred_box_conf * (1-object_mask)) / (count_noobj + 1e-3)
        avg_cat     = tf.reduce_sum(object_mask * class_mask) / (count + 1e-3)

        """
        Warm-up training
        """
        batch_seen = tf.assign_add(batch_seen, 1.)

        true_box_xy, true_box_wh, xywh_mask = tf.cond(
            tf.less(batch_seen, self.warmup_batches+1),
            lambda: [true_box_xy + (0.5 + self.cell_grid[:,:grid_h,:grid_w,:,:]) * (1-object_mask),
                     true_box_wh + tf.zeros_like(true_box_wh) * (1-object_mask),
                     tf.ones_like(object_mask)],
            lambda: [true_box_xy,
                     true_box_wh,
                     object_mask])

        """
        Compare each true box to all anchor boxes
        """
        wh_scale = tf.exp(true_box_wh) * self.anchors / net_factor
        wh_scale = tf.expand_dims(2 - wh_scale[..., 0] * wh_scale[..., 1], axis=4)  # the smaller the box, the bigger the scale

        xy_delta    = xywh_mask   * (pred_box_xy-true_box_xy) * wh_scale * self.xywh_scale
        wh_delta    = xywh_mask   * (pred_box_wh-true_box_wh) * wh_scale * self.xywh_scale
        conf_delta  = object_mask * (pred_box_conf-true_box_conf) * self.obj_scale + (1-object_mask) * conf_delta * self.noobj_scale
        class_delta = object_mask * \
                      tf.expand_dims(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=true_box_class, logits=pred_box_class), 4) * \
                      self.class_scale

        loss_xy    = tf.reduce_sum(tf.square(xy_delta),   list(range(1,5)))
        loss_wh    = tf.reduce_sum(tf.square(wh_delta),   list(range(1,5)))
        loss_conf  = tf.reduce_sum(tf.square(conf_delta), list(range(1,5)))
        loss_class = tf.reduce_sum(class_delta,           list(range(1,5)))

        loss = loss_xy + loss_wh + loss_conf + loss_class

        #loss = tf.Print(loss, [grid_h, avg_obj], message='avg_obj \t\t', summarize=1000)
        #loss = tf.Print(loss, [grid_h, avg_noobj], message='avg_noobj \t\t', summarize=1000)
        #loss = tf.Print(loss, [grid_h, avg_iou], message='avg_iou \t\t', summarize=1000)
        #loss = tf.Print(loss, [grid_h, avg_cat], message='avg_cat \t\t', summarize=1000)
        #loss = tf.Print(loss, [grid_h, recall50], message='recall50 \t', summarize=1000)
        #loss = tf.Print(loss, [grid_h, recall75], message='recall75 \t', summarize=1000)
        #loss = tf.Print(loss, [grid_h, count], message='count \t', summarize=1000)
        #loss = tf.Print(loss, [grid_h, tf.reduce_sum(loss_xy),
        #                       tf.reduce_sum(loss_wh),
        #                       tf.reduce_sum(loss_conf),
        #                       tf.reduce_sum(loss_class)], message='loss xy, wh, conf, class: \t', summarize=1000)

        return loss*self.grid_scale

    def compute_output_shape(self, input_shape):
        return [(None, 1)]

def _conv_block(inp, convs, do_skip=True):
    x = inp
    count = 0

    for conv in convs:
        if count == (len(convs) - 2) and do_skip:
            skip_connection = x
        count += 1

        if conv['stride'] > 1: x = ZeroPadding2D(((1,0),(1,0)))(x)  # unlike tensorflow, darknet prefers left and top padding
        x = Conv2D(conv['filter'],
                   conv['kernel'],
                   strides=conv['stride'],
                   padding='valid' if conv['stride'] > 1 else 'same',  # unlike tensorflow, darknet prefers left and top padding
                   name='conv_' + str(conv['layer_idx']),
                   use_bias=False if conv['bnorm'] else True)(x)
        if conv['bnorm']: x = BatchNormalization(epsilon=0.001, name='bnorm_' + str(conv['layer_idx']))(x)
        if conv['leaky']: x = LeakyReLU(alpha=0.1, name='leaky_' + str(conv['layer_idx']))(x)

    return add([skip_connection, x]) if do_skip else x

def create_yolov3_model(
    nb_class,
    anchors,
    max_box_per_image,
    max_grid,
    batch_size,
    warmup_batches,
    ignore_thresh,
    grid_scales,
    obj_scale,
    noobj_scale,
    xywh_scale,
    class_scale
):
    input_image = Input(shape=(None, None, 3))  # net_h, net_w, 3
    true_boxes  = Input(shape=(1, 1, 1, max_box_per_image, 4))
    true_yolo_1 = Input(shape=(None, None, len(anchors)//6, 4+1+nb_class))  # grid_h, grid_w, nb_anchor, 5+nb_class
    true_yolo_2 = Input(shape=(None, None, len(anchors)//6, 4+1+nb_class))  # grid_h, grid_w, nb_anchor, 5+nb_class
    true_yolo_3 = Input(shape=(None, None, len(anchors)//6, 4+1+nb_class))  # grid_h, grid_w, nb_anchor, 5+nb_class

    # Layer 0 => 4
    x = _conv_block(input_image, [{'filter': 32, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 0},
                                  {'filter': 64, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 1},
                                  {'filter': 32, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 2},
                                  {'filter': 64, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 3}])

    # Layer 5 => 8
    x = _conv_block(x, [{'filter': 128, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 5},
                        {'filter': 64, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 6},
                        {'filter': 128, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 7}])

    # Layer 9 => 11
    x = _conv_block(x, [{'filter': 64, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 9},
                        {'filter': 128, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 10}])

    # Layer 12 => 15
    x = _conv_block(x, [{'filter': 256, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 12},
                        {'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 13},
                        {'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 14}])

    # Layer 16 => 36
    for i in range(7):
        x = _conv_block(x, [{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 16+i*3},
                            {'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 17+i*3}])

    skip_36 = x

    # Layer 37 => 40
    x = _conv_block(x, [{'filter': 512, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 37},
                        {'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 38},
                        {'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 39}])

    # Layer 41 => 61
    for i in range(7):
        x = _conv_block(x, [{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 41+i*3},
                            {'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 42+i*3}])

    skip_61 = x

    # Layer 62 => 65
    x = _conv_block(x, [{'filter': 1024, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 62},
                        {'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 63},
                        {'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 64}])

    # Layer 66 => 74
    for i in range(3):
        x = _conv_block(x, [{'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 66+i*3},
                            {'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 67+i*3}])

    # Layer 75 => 79
    x = _conv_block(x, [{'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 75},
                        {'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 76},
                        {'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 77},
                        {'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 78},
                        {'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 79}], do_skip=False)

    # Layer 80 => 82
    pred_yolo_1 = _conv_block(x, [{'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 80},
                                  {'filter': (3*(5+nb_class)), 'kernel': 1, 'stride': 1, 'bnorm': False, 'leaky': False, 'layer_idx': 81}], do_skip=False)
    loss_yolo_1 = YoloLayer(anchors[12:],
                            [1*num for num in max_grid],
                            batch_size,
                            warmup_batches,
                            ignore_thresh,
                            grid_scales[0],
                            obj_scale,
                            noobj_scale,
                            xywh_scale,
                            class_scale)([input_image, pred_yolo_1, true_yolo_1, true_boxes])

    # Layer 83 => 86
    x = _conv_block(x, [{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 84}], do_skip=False)
    x = UpSampling2D(2)(x)
    x = concatenate([x, skip_61])

    # Layer 87 => 91
    x = _conv_block(x, [{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 87},
                        {'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 88},
                        {'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 89},
                        {'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 90},
                        {'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 91}], do_skip=False)

    # Layer 92 => 94
    pred_yolo_2 = _conv_block(x, [{'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 92},
                                  {'filter': (3*(5+nb_class)), 'kernel': 1, 'stride': 1, 'bnorm': False, 'leaky': False, 'layer_idx': 93}], do_skip=False)
    loss_yolo_2 = YoloLayer(anchors[6:12],
                            [2*num for num in max_grid],
                            batch_size,
                            warmup_batches,
                            ignore_thresh,
                            grid_scales[1],
                            obj_scale,
                            noobj_scale,
                            xywh_scale,
                            class_scale)([input_image, pred_yolo_2, true_yolo_2, true_boxes])

    # Layer 95 => 98
    x = _conv_block(x, [{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 96}], do_skip=False)
    x = UpSampling2D(2)(x)
    x = concatenate([x, skip_36])

    # Layer 99 => 106
    pred_yolo_3 = _conv_block(x, [{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 99},
                                  {'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 100},
                                  {'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 101},
                                  {'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 102},
                                  {'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 103},
                                  {'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 104},
                                  {'filter': (3*(5+nb_class)), 'kernel': 1, 'stride': 1, 'bnorm': False, 'leaky': False, 'layer_idx': 105}], do_skip=False)
    loss_yolo_3 = YoloLayer(anchors[:6],
                            [4*num for num in max_grid],
                            batch_size,
                            warmup_batches,
                            ignore_thresh,
                            grid_scales[2],
                            obj_scale,
                            noobj_scale,
                            xywh_scale,
                            class_scale)([input_image, pred_yolo_3, true_yolo_3, true_boxes])

    train_model = Model([input_image, true_boxes, true_yolo_1, true_yolo_2, true_yolo_3], [loss_yolo_1, loss_yolo_2, loss_yolo_3])
    infer_model = Model(input_image, [pred_yolo_1, pred_yolo_2, pred_yolo_3])

    return [train_model, infer_model]

def dummy_loss(y_true, y_pred):
    return tf.sqrt(tf.reduce_sum(y_pred))
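A hedged construction sketch (hyperparameter values mirror the zoo configs below; using the coarsest 32-stride grid of the maximum input size for max_grid, and the cap of 30 boxes per image, are assumptions of this sketch):

from yolo import create_yolov3_model, dummy_loss

train_model, infer_model = create_yolov3_model(
    nb_class          = 1,
    anchors           = [17,18, 28,24, 36,34, 42,44, 56,51, 72,66, 90,95, 92,154, 139,281],
    max_box_per_image = 30,                   # assumed cap on ground-truth boxes per image
    max_grid          = [448//32, 448//32],   # assumed: max_input_size 448 at stride 32
    batch_size        = 16,
    warmup_batches    = 0,
    ignore_thresh     = 0.5,
    grid_scales       = [1, 1, 1],
    obj_scale         = 5,
    noobj_scale       = 1,
    xywh_scale        = 1,
    class_scale       = 1)
train_model.compile(loss=dummy_loss, optimizer='adam')   # the YoloLayer outputs already are the loss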
BIN keras-yolo3-master/yolo.pyc (Executable file): Binary file not shown.
434 keras-yolo3-master/yolo3_one_file_to_detect_them_all.py (Executable file)
@@ -0,0 +1,434 @@
|
||||
import argparse
|
||||
import os
|
||||
import numpy as np
|
||||
from keras.layers import Conv2D, Input, BatchNormalization, LeakyReLU, ZeroPadding2D, UpSampling2D
|
||||
from keras.layers.merge import add, concatenate
|
||||
from keras.models import Model
|
||||
import struct
|
||||
import cv2
|
||||
|
||||
np.set_printoptions(threshold=np.nan)
|
||||
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
|
||||
os.environ["CUDA_VISIBLE_DEVICES"]="0"
|
||||
|
||||
argparser = argparse.ArgumentParser(
|
||||
description='test yolov3 network with coco weights')
|
||||
|
||||
argparser.add_argument(
|
||||
'-w',
|
||||
'--weights',
|
||||
help='path to weights file')
|
||||
|
||||
argparser.add_argument(
|
||||
'-i',
|
||||
'--image',
|
||||
help='path to image file')
|
||||
|
||||
class WeightReader:
|
||||
def __init__(self, weight_file):
|
||||
with open(weight_file, 'rb') as w_f:
|
||||
major, = struct.unpack('i', w_f.read(4))
|
||||
minor, = struct.unpack('i', w_f.read(4))
|
||||
revision, = struct.unpack('i', w_f.read(4))
|
||||
|
||||
if (major*10 + minor) >= 2 and major < 1000 and minor < 1000:
|
||||
w_f.read(8)
|
||||
else:
|
||||
w_f.read(4)
|
||||
|
||||
transpose = (major > 1000) or (minor > 1000)
|
||||
|
||||
binary = w_f.read()
|
||||
|
||||
self.offset = 0
|
||||
self.all_weights = np.frombuffer(binary, dtype='float32')
|
||||
|
||||
def read_bytes(self, size):
|
||||
self.offset = self.offset + size
|
||||
return self.all_weights[self.offset-size:self.offset]
|
||||
|
||||
def load_weights(self, model):
|
||||
for i in range(106):
|
||||
try:
|
||||
conv_layer = model.get_layer('conv_' + str(i))
|
||||
print("loading weights of convolution #" + str(i))
|
||||
|
||||
if i not in [81, 93, 105]:
|
||||
norm_layer = model.get_layer('bnorm_' + str(i))
|
||||
|
||||
size = np.prod(norm_layer.get_weights()[0].shape)
|
||||
|
||||
beta = self.read_bytes(size) # bias
|
||||
gamma = self.read_bytes(size) # scale
|
||||
mean = self.read_bytes(size) # mean
|
||||
var = self.read_bytes(size) # variance
|
||||
|
||||
weights = norm_layer.set_weights([gamma, beta, mean, var])
|
||||
|
||||
if len(conv_layer.get_weights()) > 1:
|
||||
bias = self.read_bytes(np.prod(conv_layer.get_weights()[1].shape))
|
||||
kernel = self.read_bytes(np.prod(conv_layer.get_weights()[0].shape))
|
||||
|
||||
kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape)))
|
||||
kernel = kernel.transpose([2,3,1,0])
|
||||
conv_layer.set_weights([kernel, bias])
|
||||
else:
|
||||
kernel = self.read_bytes(np.prod(conv_layer.get_weights()[0].shape))
|
||||
kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape)))
|
||||
kernel = kernel.transpose([2,3,1,0])
|
||||
conv_layer.set_weights([kernel])
|
||||
except ValueError:
|
||||
print("no convolution #" + str(i))
|
||||
|
||||
def reset(self):
|
||||
self.offset = 0
|
||||
|
||||
class BoundBox:
|
||||
def __init__(self, xmin, ymin, xmax, ymax, objness = None, classes = None):
|
||||
self.xmin = xmin
|
||||
self.ymin = ymin
|
||||
self.xmax = xmax
|
||||
self.ymax = ymax
|
||||
|
||||
self.objness = objness
|
||||
self.classes = classes
|
||||
|
||||
self.label = -1
|
||||
self.score = -1
|
||||
|
||||
def get_label(self):
|
||||
if self.label == -1:
|
||||
self.label = np.argmax(self.classes)
|
||||
|
||||
return self.label
|
||||
|
||||
def get_score(self):
|
||||
if self.score == -1:
|
||||
self.score = self.classes[self.get_label()]
|
||||
|
||||
return self.score
|
||||
|
||||
def _conv_block(inp, convs, skip=True):
|
||||
x = inp
|
||||
count = 0
|
||||
|
||||
for conv in convs:
|
||||
if count == (len(convs) - 2) and skip:
|
||||
skip_connection = x
|
||||
count += 1
|
||||
|
||||
if conv['stride'] > 1: x = ZeroPadding2D(((1,0),(1,0)))(x) # peculiar padding as darknet prefer left and top
|
||||
x = Conv2D(conv['filter'],
|
||||
conv['kernel'],
|
||||
strides=conv['stride'],
|
||||
padding='valid' if conv['stride'] > 1 else 'same', # peculiar padding as darknet prefer left and top
|
||||
name='conv_' + str(conv['layer_idx']),
|
||||
use_bias=False if conv['bnorm'] else True)(x)
|
||||
if conv['bnorm']: x = BatchNormalization(epsilon=0.001, name='bnorm_' + str(conv['layer_idx']))(x)
|
||||
if conv['leaky']: x = LeakyReLU(alpha=0.1, name='leaky_' + str(conv['layer_idx']))(x)
|
||||
|
||||
return add([skip_connection, x]) if skip else x
|
||||
|
||||
def _interval_overlap(interval_a, interval_b):
|
||||
x1, x2 = interval_a
|
||||
x3, x4 = interval_b
|
||||
|
||||
if x3 < x1:
|
||||
if x4 < x1:
|
||||
return 0
|
||||
else:
|
||||
return min(x2,x4) - x1
|
||||
else:
|
||||
if x2 < x3:
|
||||
return 0
|
||||
else:
|
||||
return min(x2,x4) - x3
|
||||
|
||||
def _sigmoid(x):
|
||||
return 1. / (1. + np.exp(-x))
|
||||
|
||||
def bbox_iou(box1, box2):
|
||||
intersect_w = _interval_overlap([box1.xmin, box1.xmax], [box2.xmin, box2.xmax])
|
||||
intersect_h = _interval_overlap([box1.ymin, box1.ymax], [box2.ymin, box2.ymax])
|
||||
|
||||
intersect = intersect_w * intersect_h
|
||||
|
||||
w1, h1 = box1.xmax-box1.xmin, box1.ymax-box1.ymin
|
||||
w2, h2 = box2.xmax-box2.xmin, box2.ymax-box2.ymin
|
||||
|
||||
union = w1*h1 + w2*h2 - intersect
|
||||
|
||||
return float(intersect) / union
|
||||
|
||||
def make_yolov3_model():
|
||||
input_image = Input(shape=(None, None, 3))
|
||||
|
||||
# Layer 0 => 4
|
||||
x = _conv_block(input_image, [{'filter': 32, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 0},
|
||||
{'filter': 64, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 1},
|
||||
{'filter': 32, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 2},
|
||||
{'filter': 64, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 3}])
|
||||
|
||||
# Layer 5 => 8
|
||||
x = _conv_block(x, [{'filter': 128, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 5},
|
||||
{'filter': 64, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 6},
|
||||
{'filter': 128, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 7}])
|
||||
|
||||
# Layer 9 => 11
|
||||
x = _conv_block(x, [{'filter': 64, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 9},
|
||||
{'filter': 128, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 10}])
|
||||
|
||||
# Layer 12 => 15
|
||||
x = _conv_block(x, [{'filter': 256, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 12},
|
||||
{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 13},
|
||||
{'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 14}])
|
||||
|
||||
# Layer 16 => 36
|
||||
for i in range(7):
|
||||
x = _conv_block(x, [{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 16+i*3},
|
||||
{'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 17+i*3}])
|
||||
|
||||
skip_36 = x
|
||||
|
||||
# Layer 37 => 40
|
||||
x = _conv_block(x, [{'filter': 512, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 37},
|
||||
{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 38},
|
||||
{'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 39}])
|
||||
|
||||
# Layer 41 => 61
|
||||
for i in range(7):
|
||||
x = _conv_block(x, [{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 41+i*3},
|
||||
{'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 42+i*3}])
|
||||
|
||||
skip_61 = x
|
||||
|
||||
# Layer 62 => 65
|
||||
x = _conv_block(x, [{'filter': 1024, 'kernel': 3, 'stride': 2, 'bnorm': True, 'leaky': True, 'layer_idx': 62},
|
||||
{'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 63},
|
||||
{'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 64}])
|
||||
|
||||
# Layer 66 => 74
|
||||
for i in range(3):
|
||||
x = _conv_block(x, [{'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 66+i*3},
|
||||
{'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 67+i*3}])
|
||||
|
||||
# Layer 75 => 79
|
||||
x = _conv_block(x, [{'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 75},
|
||||
{'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 76},
|
||||
{'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 77},
|
||||
{'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 78},
|
||||
{'filter': 512, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 79}], skip=False)
|
||||
|
||||
# Layer 80 => 82
|
||||
yolo_82 = _conv_block(x, [{'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 80},
|
||||
{'filter': 255, 'kernel': 1, 'stride': 1, 'bnorm': False, 'leaky': False, 'layer_idx': 81}], skip=False)
|
||||
|
||||
# Layer 83 => 86
|
||||
x = _conv_block(x, [{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 84}], skip=False)
|
||||
x = UpSampling2D(2)(x)
|
||||
x = concatenate([x, skip_61])
|
||||
|
||||
# Layer 87 => 91
|
||||
x = _conv_block(x, [{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 87},
|
||||
{'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 88},
|
||||
{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 89},
|
||||
{'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 90},
|
||||
{'filter': 256, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 91}], skip=False)
|
||||
|
||||
# Layer 92 => 94
|
||||
yolo_94 = _conv_block(x, [{'filter': 512, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 92},
|
||||
{'filter': 255, 'kernel': 1, 'stride': 1, 'bnorm': False, 'leaky': False, 'layer_idx': 93}], skip=False)
|
||||
|
||||
# Layer 95 => 98
|
||||
x = _conv_block(x, [{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 96}], skip=False)
|
||||
x = UpSampling2D(2)(x)
|
||||
x = concatenate([x, skip_36])
|
||||
|
||||
# Layer 99 => 106
|
||||
yolo_106 = _conv_block(x, [{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 99},
|
||||
{'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 100},
|
||||
{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 101},
|
||||
{'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 102},
|
||||
{'filter': 128, 'kernel': 1, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 103},
|
||||
{'filter': 256, 'kernel': 3, 'stride': 1, 'bnorm': True, 'leaky': True, 'layer_idx': 104},
|
||||
{'filter': 255, 'kernel': 1, 'stride': 1, 'bnorm': False, 'leaky': False, 'layer_idx': 105}], skip=False)
|
||||
|
||||
model = Model(input_image, [yolo_82, yolo_94, yolo_106])
|
||||
return model
|
||||
|
||||
def preprocess_input(image, net_h, net_w):
|
||||
new_h, new_w, _ = image.shape
|
||||
|
||||
# determine the new size of the image
|
||||
if (float(net_w)/new_w) < (float(net_h)/new_h):
|
||||
new_h = (new_h * net_w)/new_w
|
||||
new_w = net_w
|
||||
else:
|
||||
new_w = (new_w * net_h)/new_h
|
||||
new_h = net_h
|
||||
|
||||
# resize the image to the new size
|
||||
resized = cv2.resize(image[:,:,::-1]/255., (int(new_w), int(new_h)))
|
||||
|
||||
# embed the image into the standard letter box
|
||||
new_image = np.ones((net_h, net_w, 3)) * 0.5
|
||||
new_image[int((net_h-new_h)//2):int((net_h+new_h)//2), int((net_w-new_w)//2):int((net_w+new_w)//2), :] = resized
|
||||
new_image = np.expand_dims(new_image, 0)
|
||||
|
||||
return new_image
|
||||
|
||||
def decode_netout(netout, anchors, obj_thresh, nms_thresh, net_h, net_w):
|
||||
grid_h, grid_w = netout.shape[:2]
|
||||
nb_box = 3
|
||||
netout = netout.reshape((grid_h, grid_w, nb_box, -1))
|
||||
nb_class = netout.shape[-1] - 5
|
||||
|
||||
boxes = []
|
||||
|
||||
netout[..., :2] = _sigmoid(netout[..., :2])
|
||||
netout[..., 4:] = _sigmoid(netout[..., 4:])
|
||||
netout[..., 5:] = netout[..., 4][..., np.newaxis] * netout[..., 5:]
|
||||
netout[..., 5:] *= netout[..., 5:] > obj_thresh
|
||||
|
||||
for i in range(grid_h*grid_w):
|
||||
row = i / grid_w
|
||||
col = i % grid_w
|
||||
|
||||
for b in range(nb_box):
|
||||
# 4th element is objectness score
|
||||
objectness = netout[int(row)][int(col)][b][4]
|
||||
#objectness = netout[..., :4]
|
||||
|
||||
if(objectness.all() <= obj_thresh): continue
|
||||
|
||||
# first 4 elements are x, y, w, and h
|
||||
x, y, w, h = netout[int(row)][int(col)][b][:4]
|
||||
|
||||
x = (col + x) / grid_w # center position, unit: image width
|
||||
y = (row + y) / grid_h # center position, unit: image height
|
||||
w = anchors[2 * b + 0] * np.exp(w) / net_w # unit: image width
|
||||
h = anchors[2 * b + 1] * np.exp(h) / net_h # unit: image height
|
||||
|
||||
# last elements are class probabilities
|
||||
classes = netout[int(row)][col][b][5:]
|
||||
|
||||
box = BoundBox(x-w/2, y-h/2, x+w/2, y+h/2, objectness, classes)
|
||||
#box = BoundBox(x-w/2, y-h/2, x+w/2, y+h/2, None, classes)
|
||||
|
||||
boxes.append(box)
|
||||
|
||||
return boxes
|
||||
|
||||
def correct_yolo_boxes(boxes, image_h, image_w, net_h, net_w):
|
||||
if (float(net_w)/image_w) < (float(net_h)/image_h):
|
||||
new_w = net_w
|
||||
new_h = (image_h*net_w)/image_w
|
||||
else:
|
||||
new_h = net_w
|
||||
new_w = (image_w*net_h)/image_h
|
||||
|
||||
for i in range(len(boxes)):
|
||||
x_offset, x_scale = (net_w - new_w)/2./net_w, float(new_w)/net_w
|
||||
y_offset, y_scale = (net_h - new_h)/2./net_h, float(new_h)/net_h
|
||||
|
||||
boxes[i].xmin = int((boxes[i].xmin - x_offset) / x_scale * image_w)
|
||||
boxes[i].xmax = int((boxes[i].xmax - x_offset) / x_scale * image_w)
|
||||
boxes[i].ymin = int((boxes[i].ymin - y_offset) / y_scale * image_h)
|
||||
boxes[i].ymax = int((boxes[i].ymax - y_offset) / y_scale * image_h)
|
||||
|
||||
def do_nms(boxes, nms_thresh):
|
||||
if len(boxes) > 0:
|
||||
nb_class = len(boxes[0].classes)
|
||||
else:
|
||||
return
|
||||
|
||||
for c in range(nb_class):
|
||||
sorted_indices = np.argsort([-box.classes[c] for box in boxes])
|
||||
|
||||
for i in range(len(sorted_indices)):
|
||||
index_i = sorted_indices[i]
|
||||
|
||||
if boxes[index_i].classes[c] == 0: continue
|
||||
|
||||
for j in range(i+1, len(sorted_indices)):
|
||||
index_j = sorted_indices[j]
|
||||
|
||||
if bbox_iou(boxes[index_i], boxes[index_j]) >= nms_thresh:
|
||||
boxes[index_j].classes[c] = 0
|
||||
|
||||
def draw_boxes(image, boxes, labels, obj_thresh):
|
||||
for box in boxes:
|
||||
label_str = ''
|
||||
label = -1
|
||||
|
||||
for i in range(len(labels)):
|
||||
if box.classes[i] > obj_thresh:
|
||||
label_str += labels[i]
|
||||
label = i
|
||||
print(labels[i] + ': ' + str(box.classes[i]*100) + '%')
|
||||
|
||||
if label >= 0:
|
||||
cv2.rectangle(image, (box.xmin,box.ymin), (box.xmax,box.ymax), (0,255,0), 3)
|
||||
cv2.putText(image,
|
||||
label_str + ' ' + str(box.get_score()),
|
||||
(box.xmin, box.ymin - 13),
|
||||
cv2.FONT_HERSHEY_SIMPLEX,
|
||||
1e-3 * image.shape[0],
|
||||
(0,255,0), 2)
|
||||
|
||||
return image
|
||||
|
||||
def _main_(args):
|
||||
weights_path = args.weights
|
||||
image_path = args.image
|
||||
|
||||
# set some parameters
|
||||
net_h, net_w = 416, 416
|
||||
obj_thresh, nms_thresh = 0.5, 0.45
|
||||
anchors = [[116,90, 156,198, 373,326], [30,61, 62,45, 59,119], [10,13, 16,30, 33,23]]
|
||||
labels = ["person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", \
|
||||
"boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", \
|
||||
"bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", \
|
||||
"backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", \
|
||||
"sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", \
|
||||
"tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", \
|
||||
"apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", \
|
||||
"chair", "sofa", "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse", \
|
||||
"remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", \
|
||||
"book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"]
|
||||
|
||||
    # build the YOLOv3 model that predicts the 80 COCO classes
    yolov3 = make_yolov3_model()

    # load the weights trained on COCO into the model
    weight_reader = WeightReader(weights_path)
    weight_reader.load_weights(yolov3)
    yolov3.save('yolo_infer_coco.h5')

    # preprocess the image
    image = cv2.imread(image_path)
    image_h, image_w, _ = image.shape
    new_image = preprocess_input(image, net_h, net_w)

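    # Note: because the model was saved to 'yolo_infer_coco.h5' above, later
    # runs could skip the darknet weight parsing entirely, e.g. (a sketch,
    # assuming the standard Keras API):
    #   from keras.models import load_model
    #   yolov3 = load_model('yolo_infer_coco.h5')
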
    # run the prediction
    yolos = yolov3.predict(new_image)
    boxes = []

    for i in range(len(yolos)):
        # decode the output of the network
        boxes += decode_netout(yolos[i][0], anchors[i], obj_thresh, nms_thresh, net_h, net_w)

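    # For a 416x416 input, yolov3.predict returns three arrays shaped
    # (1, 13, 13, 255), (1, 26, 26, 255) and (1, 52, 52, 255), where
    # 255 = 3 anchors * (4 box coordinates + 1 objectness + 80 class scores);
    # decode_netout flattens each grid into a list of candidate boxes.
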
    # correct the sizes of the bounding boxes
    correct_yolo_boxes(boxes, image_h, image_w, net_h, net_w)

    # suppress non-maximal boxes
    do_nms(boxes, nms_thresh)

    # draw bounding boxes on the image using labels
    draw_boxes(image, boxes, labels, obj_thresh)

    # write the image with bounding boxes to file
    cv2.imwrite(image_path[:-4] + '_detected' + image_path[-4:], image.astype('uint8'))

if __name__ == '__main__':
    args = argparser.parse_args()
    _main_(args)
40
keras-yolo3-master/zoo/config_kangaroo.json
Executable file
@@ -0,0 +1,40 @@
{
    "model": {
        "min_input_size": 288,
        "max_input_size": 448,
        "anchors": [55,69, 75,234, 133,240, 136,129, 142,363, 203,290, 228,184, 285,359, 341,260],
        "labels": ["kangaroo"]
    },

    "train": {
        "train_image_folder": "/home/andy/Desktop/github/kangaroo/images/",
        "train_annot_folder": "/home/andy/Desktop/github/kangaroo/annots/",
        "cache_name": "kangaroo_train.pkl",

        "train_times": 3,
        "batch_size": 16,
        "learning_rate": 1e-4,
        "nb_epochs": 100,
        "warmup_epochs": 3,
        "ignore_thresh": 0.5,
        "gpus": "0,1",

        "grid_scales": [1,1,1],
        "obj_scale": 5,
        "noobj_scale": 1,
        "xywh_scale": 1,
        "class_scale": 1,

        "tensorboard_dir": "log_kangaroo",
        "saved_weights_name": "kangaroo.h5",
        "debug": true
    },

    "valid": {
        "valid_image_folder": "",
        "valid_annot_folder": "",
        "cache_name": "",

        "valid_times": 1
    }
}
40
keras-yolo3-master/zoo/config_raccoon.json
Executable file
@@ -0,0 +1,40 @@
{
    "model": {
        "min_input_size": 288,
        "max_input_size": 448,
        "anchors": [17,18, 28,24, 36,34, 42,44, 56,51, 72,66, 90,95, 92,154, 139,281],
        "labels": ["raccoon"]
    },

    "train": {
        "train_image_folder": "/home/andy/Desktop/github/raccoon_dataset/images/",
        "train_annot_folder": "/home/andy/Desktop/github/raccoon_dataset/annotations/",
        "cache_name": "raccoon_train.pkl",

        "train_times": 3,
        "batch_size": 16,
        "learning_rate": 1e-4,
        "nb_epochs": 100,
        "warmup_epochs": 3,
        "ignore_thresh": 0.5,
        "gpus": "0,1",

        "grid_scales": [1,1,1],
        "obj_scale": 5,
        "noobj_scale": 1,
        "xywh_scale": 1,
        "class_scale": 1,

        "tensorboard_dir": "log_raccoon",
        "saved_weights_name": "raccoon.h5",
        "debug": true
    },

    "valid": {
        "valid_image_folder": "",
        "valid_annot_folder": "",
        "cache_name": "",

        "valid_times": 1
    }
}
40
keras-yolo3-master/zoo/config_rbc.json
Executable file
@@ -0,0 +1,40 @@
{
    "model": {
        "min_input_size": 224,
        "max_input_size": 480,
        "anchors": [25,33, 52,94, 56,71, 67,83, 68,98, 73,65, 81,96, 116,134, 147,182],
        "labels": ["Platelets", "RBC", "WBC"]
    },

    "train": {
        "train_image_folder": "/home/experiencor/data/BCCD_Dataset/BCCD/JPEGImages/",
        "train_annot_folder": "/home/experiencor/data/BCCD_Dataset/BCCD/Annotations/",
        "cache_name": "rbc_train.pkl",

        "train_times": 3,
        "batch_size": 16,
        "learning_rate": 1e-4,
        "nb_epochs": 100,
        "warmup_epochs": 3,
        "ignore_thresh": 0.5,
        "gpus": "0,1",

        "grid_scales": [1,1,1],
        "obj_scale": 5,
        "noobj_scale": 1,
        "xywh_scale": 1,
        "class_scale": 1,

        "tensorboard_dir": "log_rbc",
        "saved_weights_name": "rbc.h5",
        "debug": true
    },

    "valid": {
        "valid_image_folder": "",
        "valid_annot_folder": "",
        "cache_name": "",

        "valid_times": 1
    }
}
40
keras-yolo3-master/zoo/config_voc.json
Executable file
@@ -0,0 +1,40 @@
{
    "model": {
        "min_input_size": 224,
        "max_input_size": 480,
        "anchors": [24,34, 46,84, 68,185, 116,286, 122,97, 171,180, 214,327, 326,193, 359,359],
        "labels": ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
    },

    "train": {
        "train_image_folder": "/home/experiencor/data/pascal/train/images/",
        "train_annot_folder": "/home/experiencor/data/pascal/train/annots/",
        "cache_name": "voc_train.pkl",

        "train_times": 1,
        "batch_size": 8,
        "learning_rate": 1e-5,
        "nb_epochs": 100,
        "warmup_epochs": 3,
        "ignore_thresh": 0.5,
        "gpus": "0",

        "grid_scales": [1,1,1],
        "obj_scale": 5,
        "noobj_scale": 1,
        "xywh_scale": 1,
        "class_scale": 1,

        "tensorboard_dir": "log_voc",
        "saved_weights_name": "voc.h5",
        "debug": true
    },

    "valid": {
        "valid_image_folder": "/home/experiencor/data/pascal/valid/images/",
        "valid_annot_folder": "/home/experiencor/data/pascal/valid/annots/",
        "cache_name": "voc_valid.pkl",

        "valid_times": 1
    }
}
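All four zoo configs share the same schema, so the training script can load any of them uniformly. A minimal sketch of reading one (standard library only; the file name is just an example):

```python
import json

with open('zoo/config_kangaroo.json') as f:
    config = json.load(f)

# anchors come as a flat [w1,h1, w2,h2, ...] list, three pairs per output scale
print(config['model']['anchors'])
# training hyperparameters consumed by the training script
print(config['train']['batch_size'], config['train']['learning_rate'])
```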