Baltic Robbo Battles – training a model to recognize food and non-food

After fixing the chassis (fortunately it turned out that there were only some disconnected cables), the steering started working. Unfortunately, after the first detection tests I found out that food was not recognized at all. No matter whether it was a picture of food or real food on a plate, nothing was returned from the YOLO detector.

It was obvious that the model hadn't been trained on food pictures.

On Saturday I decided that the model needed retraining. I followed this tutorial: https://pylessons.com/YOLOv3-TF2-GoogleColab/ and downloaded food images from:

https://storage.googleapis.com/openimages/web/visualizer/index.html?set=valtest&type=detection&c=%2Fm%2F02wbm

It is quite a good source of images and bounding boxes. Using a converter, I converted the image annotations to XML and then to YOLO format. I also adjusted the config file and uploaded everything to Google Colab.
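For illustration, the XML-to-YOLO step could look roughly like the sketch below. It assumes Pascal VOC style XML and assumes that the training code expects one text line per image in the form `image_path x_min,y_min,x_max,y_max,class_id ...`. The function name `voc_xml_to_annotation_line`, the sample XML and the image path are made up for this example, and the converter from the tutorial may differ in details.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_annotation_line(xml_text, image_path, class_names):
    """Turn one Pascal VOC style XML annotation into a single text line.

    Assumed output format: image_path x_min,y_min,x_max,y_max,class_id ...
    """
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in class_names:
            continue  # skip classes we do not want to train on
        bndbox = obj.find("bndbox")
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        class_id = class_names.index(name)
        boxes.append(",".join(str(v) for v in coords + [class_id]))
    return " ".join([image_path] + boxes)


# hypothetical annotation with one food box and one box of another class
SAMPLE_XML = """
<annotation>
  <object><name>Food</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object><name>Person</name>
    <bndbox><xmin>0</xmin><ymin>0</ymin><xmax>5</xmax><ymax>5</ymax></bndbox>
  </object>
</annotation>
"""

line = voc_xml_to_annotation_line(SAMPLE_XML, "IMAGES/plate.jpg", ["Food"])
print(line)  # IMAGES/plate.jpg 10,20,110,220,0
```

Boxes of classes that are not in the class list are simply dropped, so an image without any food ends up as a line with only its path.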

However, as often happens when you use something for the first time, not everything went right. After the first fights with training that would not run and a set of strange errors, I decided to give up on Google Colab and run everything on my own computer with an NVIDIA graphics card.

Using Anaconda, I created an empty env and ran in the terminal:

conda update -y conda
conda install -y numpy=1.18.2
conda install -y scipy=1.4.1
conda install -y wget=3.2
conda install -y seaborn=0.10.0
conda install -y tensorflow=2.3.0
conda install -y tensorflow-gpu=2.3.0
conda install -y opencv-python=4.1.2.30
conda install -y tqdm=4.43.0
conda install -y pandas
conda install -y awscli
conda install -y urllib3
conda install -y mss
conda install -y opencv
conda install -y lxml
pip install opencv-python

I also installed Jupyter Lab separately.

With the environment prepared this way, I could test several things using the following notebook:

#check for a graphics card that TensorFlow GPU can use
!nvidia-smi

#import and check TensorFlow ver.
import tensorflow as tf
print(tf.__version__)
tf.test.gpu_device_name()

#it should print 2.3.0 and return /device:GPU:0

#now load initial YOLO model
import cv2
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline  
import tensorflow as tf
from yolov3.yolov4 import Create_Yolo
from yolov3.utils import load_yolo_weights, detect_image
from yolov3.configs import *

if YOLO_TYPE == "yolov4":
    Darknet_weights = YOLO_V4_TINY_WEIGHTS if TRAIN_YOLO_TINY else YOLO_V4_WEIGHTS
if YOLO_TYPE == "yolov3":
    Darknet_weights = YOLO_V3_TINY_WEIGHTS if TRAIN_YOLO_TINY else YOLO_V3_WEIGHTS

yolo = Create_Yolo(input_size=YOLO_INPUT_SIZE)
load_yolo_weights(yolo, Darknet_weights) # use Darknet weights

print(TRAIN_ANNOT_PATH)

#check whether it works on an example image:
image_path   = "./IMAGES/street.jpg"

image = detect_image(yolo, image_path, '', input_size=YOLO_INPUT_SIZE, show=False, rectangle_colors=(255,0,0))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(30,15))
plt.imshow(image)


#now train the model
from train import *
from yolov3.configs import *

tf.keras.backend.clear_session()

trainset = Dataset('train')
testset = Dataset('test')
    
main()

After debugging the train.py file I discovered that the CNN stopped on one of the last layers because of incompatible train, test and input image sizes.
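A small sanity check like the one below would have caught this before training started. It is only a sketch: the function name `check_input_sizes` is made up, and it assumes that the three sizes have to be equal (which is what fixed it in my case) and that each size should be divisible by every stride so the output grids have whole-number dimensions.

```python
def check_input_sizes(yolo_input_size, train_input_size, test_input_size, strides):
    """Return a list of problems found; an empty list means the sizes look consistent."""
    problems = []
    sizes = {
        "YOLO_INPUT_SIZE": yolo_input_size,
        "TRAIN_INPUT_SIZE": train_input_size,
        "TEST_INPUT_SIZE": test_input_size,
    }
    # assumption: all three sizes must be the same value
    if len(set(sizes.values())) != 1:
        problems.append(f"input sizes differ: {sizes}")
    # each size should split evenly into grid cells for every stride
    for name, size in sizes.items():
        for stride in strides:
            if size % stride != 0:
                problems.append(f"{name}={size} is not divisible by stride {stride}")
    return problems


print(check_input_sizes(416, 416, 416, [16, 32]))  # [] means nothing to fix
print(check_input_sizes(416, 512, 416, [16, 32]))  # reports the mismatch
```

The strides `[16, 32]` are the ones the config uses for the tiny model; for the full model they would be `[8, 16, 32]`.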

So the winning config should look like this:

#================================================================
#
#   File name   : configs.py
#   Author      : PyLessons
#   Created date: 2020-08-18
#   Website     : https://pylessons.com/
#   GitHub      : https://github.com/pythonlessons/TensorFlow-2.x-YOLOv3
#   Description : yolov3 configuration file
#
#================================================================

# YOLO options
YOLO_TYPE                   = "yolov3" # yolov4 or yolov3
YOLO_FRAMEWORK              = "tf" # "tf" or "trt"
YOLO_V3_WEIGHTS             = "model_data/yolov3.weights"
YOLO_V4_WEIGHTS             = "model_data/yolov4.weights"
YOLO_V3_TINY_WEIGHTS        = "model_data/yolov3-tiny.weights"
YOLO_V4_TINY_WEIGHTS        = "model_data/yolov4-tiny.weights"
YOLO_TRT_QUANTIZE_MODE      = "INT8" # INT8, FP16, FP32
YOLO_CUSTOM_WEIGHTS         = False # "checkpoints/yolov3_custom" # used in evaluate_mAP.py and custom model detection, if not using leave False
                            # YOLO_CUSTOM_WEIGHTS also used with TensorRT and custom model detection
YOLO_COCO_CLASSES           = "model_data/coco/coco.names"
YOLO_STRIDES                = [8, 16, 32]
YOLO_IOU_LOSS_THRESH        = 0.5
YOLO_ANCHOR_PER_SCALE       = 3
YOLO_MAX_BBOX_PER_SCALE     = 100
YOLO_INPUT_SIZE             = 416 
if YOLO_TYPE                == "yolov4":
    YOLO_ANCHORS            = [[[12,  16], [19,   36], [40,   28]],
                               [[36,  75], [76,   55], [72,  146]],
                               [[142,110], [192, 243], [459, 401]]]
if YOLO_TYPE                == "yolov3":
    YOLO_ANCHORS            = [[[10,  13], [16,   30], [33,   23]],
                               [[30,  61], [62,   45], [59,  119]],
                               [[116, 90], [156, 198], [373, 326]]]
# Train options
TRAIN_YOLO_TINY             = True 
TRAIN_SAVE_BEST_ONLY        = True # saves only best model according validation loss (True recommended)
TRAIN_SAVE_CHECKPOINT       = False # saves all best validated checkpoints in training process (may require a lot disk space) (False recommended)
TRAIN_CLASSES               = "model_data/license_plate_names.txt"
TRAIN_ANNOT_PATH            = "model_data/license_plate_train.txt"
TRAIN_LOGDIR                = "./log"
TRAIN_CHECKPOINTS_FOLDER    = "checkpoints"
TRAIN_MODEL_NAME            = f"{YOLO_TYPE}_custom"
TRAIN_LOAD_IMAGES_TO_RAM    = True # With True faster training, but need more RAM
TRAIN_BATCH_SIZE            = 4
TRAIN_INPUT_SIZE            = 416
TRAIN_DATA_AUG              = True
TRAIN_TRANSFER              = True
TRAIN_FROM_CHECKPOINT       = False # "checkpoints/yolov3_custom"
TRAIN_LR_INIT               = 1e-4
TRAIN_LR_END                = 1e-6
TRAIN_WARMUP_EPOCHS         = 2
TRAIN_EPOCHS                = 100

# TEST options
TEST_ANNOT_PATH             = "model_data/license_plate_test.txt"
TEST_BATCH_SIZE             = 4
TEST_INPUT_SIZE             = 416
TEST_DATA_AUG               = False
TEST_DECTECTED_IMAGE_PATH   = ""
TEST_SCORE_THRESHOLD        = 0.3
TEST_IOU_THRESHOLD          = 0.45

if TRAIN_YOLO_TINY:
    YOLO_STRIDES            = [16, 32]    
    # YOLO_ANCHORS            = [[[23, 27],  [37, 58],   [81,  82]], # this line can be uncommented for default coco weights
    YOLO_ANCHORS            = [[[10, 14],  [23, 27],   [37, 58]],
                               [[81,  82], [135, 169], [344, 319]]]

where the most important lines are:

TRAIN_INPUT_SIZE            = 416
TEST_INPUT_SIZE             = 416
TEST_ANNOT_PATH             = "model_data/license_plate_test.txt"
TRAIN_CLASSES               = "model_data/license_plate_names.txt"
TRAIN_ANNOT_PATH            = "model_data/license_plate_train.txt"
TRAIN_TRANSFER              = True

As you can see, this training uses transfer learning, which means the last layers of the CNN are dropped and replaced by new ones, trained specifically for food images. Simple as that, and magic!
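To make the idea a bit more concrete, here is a toy sketch of why the last layers cannot simply be reused. In YOLO each detection layer has `anchors_per_scale * (5 + number_of_classes)` output filters, so a model for 80 COCO classes and a model for a single food class differ exactly in those layers. The dict-based "models", the layer names and the function `transfer_weights` are invented for this illustration; the real training script works with Keras layers and is not reproduced here.

```python
ANCHORS_PER_SCALE = 3  # same value as YOLO_ANCHOR_PER_SCALE in the config


def head_filters(num_classes):
    # each anchor predicts 4 box coordinates, 1 objectness score and one score per class
    return ANCHORS_PER_SCALE * (5 + num_classes)


def transfer_weights(pretrained, new_model):
    """Copy weights layer by layer where shapes match; return names of skipped layers.

    Both arguments are toy stand-ins: dicts mapping layer name -> (shape, weights).
    """
    skipped = []
    for name, (shape, weights) in pretrained.items():
        if name in new_model and new_model[name][0] == shape:
            new_model[name] = (shape, list(weights))
        else:
            skipped.append(name)  # this layer keeps its fresh weights and gets trained
    return skipped


# toy models: a shared backbone layer and a detection head sized by the class count
# (the weight lists are placeholder values, not real tensors)
coco = {
    "backbone_conv": ((3, 3, 3, 16), [0.5] * 4),
    "head_conv": ((1, 1, 256, head_filters(80)), [0.1] * 4),
}
food = {
    "backbone_conv": ((3, 3, 3, 16), [0.0] * 4),
    "head_conv": ((1, 1, 256, head_filters(1)), [0.0] * 4),
}

skipped = transfer_weights(coco, food)
print(head_filters(80), head_filters(1))  # 255 18
print(skipped)                            # ['head_conv']
```

The backbone weights are taken over from the pretrained model, while the head with the different number of filters is left to be learned from the food images.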

And the last lines in the notebook:

#use the newly trained model:
yolo = Create_Yolo(input_size=YOLO_INPUT_SIZE, CLASSES=TRAIN_CLASSES)
yolo.load_weights("./checkpoints/yolov3_custom") # use keras weights

#take an image with an apple as our image for prediction:
image_path   = "./IMAGES/apple.jpg"
image = detect_image(yolo, image_path, "", input_size=YOLO_INPUT_SIZE, show=False, CLASSES=TRAIN_CLASSES, rectangle_colors=(255,0,0))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(30,15))
plt.imshow(image)

And bingo!

The model was trained and food images were recognized!
