core package

core.yolov4 module

core.yolov4.YOLO(input_layer, NUM_CLASS, model='yolov4')

This function calls either YOLOv4 or YOLOv3

Parameters
  • input_layer (object) – Tensor

  • NUM_CLASS (int) – How many classes there are (80 for coco)

  • model (str) – States if yolov3 or yolov4 should be run

Returns

List of conv layers at different stages in the CNN

Return type

list[objects]

core.yolov4.YOLOv3(input_layer, NUM_CLASS)

Builds the YOLOv3 network on top of the input layer

Parameters
  • input_layer (object) – Tensor

  • NUM_CLASS (int) – How many classes there are (80 for coco)

Returns

List of conv layers at different stages in the CNN

Return type

list[objects]

core.yolov4.YOLOv4(input_layer, NUM_CLASS)

Builds the YOLOv4 network on top of the input layer

Parameters
  • input_layer (object) – Tensor

  • NUM_CLASS (int) – How many classes there are (80 for coco)

Returns

List of conv layers at different stages in the CNN

Return type

list[objects]

core.yolov4.compute_loss(pred, conv, label, bboxes, STRIDES, NUM_CLASS, IOU_LOSS_THRESH, i=0)

Computes the losses of the prediction using the iou functions

Parameters
  • pred (object) – The prediction itself

  • conv (object) – The conv layer

  • label (object) – The labels for the corresponding prediction

  • bboxes (object) – The bounding boxes for use in calculating the iou

  • STRIDES (ndarray) – How many pixels to move the neural network’s filter

  • NUM_CLASS (int) – How many classes there are (80 for coco)

  • IOU_LOSS_THRESH (float) – The IOU threshold amount

  • i (int) – Keeps track of which stride to be using

Returns

  • int – giou_loss

  • float – conf_loss

  • object – prob_loss

core.yolov4.decode(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=0, XYSCALE=[1, 1, 1])

Function to decode the incoming Tensor

Parameters
  • conv_output (object) – Tensor in training

  • output_size (int) – What size the output should be

  • NUM_CLASS (int) – How many classes there are (80 for coco)

  • STRIDES (ndarray) – How many pixels to move the neural network’s filter

  • ANCHORS (List[int]) – List of all the anchors for the training of the NN

  • i (int) – Keeps track of which anchor, stride and XYSCALE to be using

  • XYSCALE (list[floats]) – How much to scale the x and y coords of the data.

Returns

  • object – Prediction of variables

  • object – Prediction probabilities
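
As an illustration of what decoding involves, the sketch below decodes a single grid cell with the commonly used YOLOv4 formulation. It is an assumption that decode follows this formulation; decode_cell and its arguments are hypothetical names, and the real function works on whole tensors.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def decode_cell(tx, ty, tw, th, col, row, stride, anchor, xyscale=1.0):
    """Decode one raw prediction into a box centre and size in pixels.

    Hypothetical helper: tx, ty, tw, th are the raw network outputs,
    (col, row) is the grid cell, anchor is (width, height) in pixels.
    """
    # Scaling the sigmoid lets the centre reach the borders of the cell.
    off_x = sigmoid(tx) * xyscale - 0.5 * (xyscale - 1.0)
    off_y = sigmoid(ty) * xyscale - 0.5 * (xyscale - 1.0)
    x = (off_x + col) * stride
    y = (off_y + row) * stride
    # Width and height rescale the anchor exponentially.
    w = math.exp(tw) * anchor[0]
    h = math.exp(th) * anchor[1]
    return x, y, w, h


box = decode_cell(0.0, 0.0, 0.0, 0.0, col=3, row=2, stride=8,
                  anchor=(12, 16), xyscale=1.2)
```

With all raw outputs at zero the centre falls in the middle of cell (3, 2) and the box takes the size of the anchor.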

core.yolov4.decode_train(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=0, XYSCALE=[1, 1, 1])

Function to decode the incoming Tensor for training new models

Parameters
  • conv_output (object) – Tensor in training

  • output_size (int) – What size the output should be

  • NUM_CLASS (int) – How many classes there are (80 for coco)

  • STRIDES (ndarray) – How many pixels to move the neural network’s filter

  • ANCHORS (List[int]) – List of all the anchors for the training of the NN

  • i (int) – Keeps track of which anchor, stride and XYSCALE to be using

  • XYSCALE (list) – How much to scale the x and y coords of the data.

Returns

Concatenated list of predictions and confidence scores

Return type

List[objects]

core.yolov4.filter_boxes(box_xywh, scores, score_threshold=0.4, input_shape=tensorflow.constant)

Filter out the bounding boxes that are below the score threshold

Parameters
  • box_xywh (object) – The box coords

  • scores (object) – The scores for the bounding boxes

  • score_threshold (float) – Minimum score a box needs in order to be kept

  • input_shape (list[int]) – list of dimensions of the input shape

Returns

tuple of the boxes along with the confidence level

Return type

tuple
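
A minimal sketch of score filtering in plain Python, assuming boxes are given as centre (x, y, w, h) and that a box is kept when its best class score reaches the threshold. filter_boxes_sketch is a hypothetical name, and the conversion to corner coordinates is an assumption about the output layout rather than the documented behaviour.

```python
def filter_boxes_sketch(box_xywh, scores, score_threshold=0.4):
    """Keep boxes whose highest class score is at least score_threshold."""
    kept_boxes, kept_scores = [], []
    for (x, y, w, h), class_scores in zip(box_xywh, scores):
        if max(class_scores) < score_threshold:
            continue  # below the threshold: drop the box
        # Assumed output layout: (xmin, ymin, xmax, ymax).
        kept_boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2))
        kept_scores.append(list(class_scores))
    return kept_boxes, kept_scores


boxes, confidences = filter_boxes_sketch(
    [(50, 50, 20, 10), (10, 10, 4, 4)],
    [[0.1, 0.9], [0.2, 0.3]],
)
```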

core.backbone module

core.backbone.cspdarknet53(input_data)

Builds the CSPDarknet53 backbone CNN used by YOLOv4

Parameters

input_data (object) – Tensor

Returns

  • object – Returns the tensor after it has run through some of the layers

  • object – Returns the tensor after it has run through most of the layers

  • object – Returns the final tensor after it has run through all the layers

core.backbone.darknet53(input_data)

Builds the Darknet53 backbone CNN used by YOLOv3

Parameters

input_data (object) – Tensor

Returns

  • object – Returns the tensor after it has run through some of the layers

  • object – Returns the tensor after it has run through most of the layers

  • object – Returns the final tensor after it has run through all the layers

core.common module

class core.common.BatchNormalization(*args: Any, **kwargs: Any)

Bases: tensorflow.keras.layers.BatchNormalization

“Frozen state” and “inference mode” are two separate concepts. Setting layer.trainable = False freezes the layer: it then uses the stored moving variance and mean, as in “inference mode”, and neither gamma nor beta is updated.
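
The rule described above can be sketched as a small truth table. effective_training is a hypothetical helper, not part of core.common; it only assumes that the layer combines the training argument of call with its own trainable attribute.

```python
def effective_training(training, trainable):
    """Return True only when batch statistics should be used and updated.

    A frozen layer (trainable=False) always behaves as in inference mode,
    whatever value of training the caller passes in.
    """
    return bool(training) and bool(trainable)


modes = {
    (training, trainable): effective_training(training, trainable)
    for training in (False, True)
    for trainable in (False, True)
}
```

Only the combination training=True, trainable=True leaves the layer in training mode.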

call(x, training=False)

core.common.convolutional(input_layer, filters_shape, downsample=False, activate=True, bn=True, activate_type='leaky')

Applies a convolution to the input layer

Parameters
  • input_layer (object) – A conv layer

  • filters_shape (tuple) – A tuple of the filter shapes

  • downsample (bool) – If True then downsamples the layer

  • activate (bool) – If True then apply an activation function to the layer

  • bn (bool) – If True then apply BatchNormalization to layer

  • activate_type (str) – What activation function should be applied

Returns

A convolution

Return type

object

core.common.mish(x)

Applies the mish activation function to x

Parameters

x (object) – A conv layer ready to be activated

Returns

Activated conv layer

Return type

object
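
For reference, mish is defined as x * tanh(softplus(x)). The scalar sketch below uses only the standard library; the documented function operates on tensors instead.

```python
import math


def softplus(x):
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation for a single float."""
    return x * math.tanh(softplus(x))


samples = [mish(v) for v in (-2.0, 0.0, 1.0, 20.0)]
```

Mish is zero at the origin, slightly negative for moderately negative inputs, and close to the identity for large positive inputs.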

core.common.residual_block(input_layer, input_channel, filter_num1, filter_num2, activate_type='leaky')

Adds the input to a copy of itself that has been passed through the convolutional function twice (a residual shortcut connection)

Parameters
  • input_layer (object) – A conv layer

  • input_channel (int) – How many input channels

  • filter_num1 (int) – Filter1 for conv2D within convolutional function

  • filter_num2 (int) – Filter2 for conv2D within convolutional function

  • activate_type (str) – What activation function should be applied

Returns

Conv layer

Return type

object
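
The shortcut pattern can be sketched with plain lists. conv1 and conv2 are hypothetical stand-ins for the two calls to the convolutional function, and the element-wise addition mirrors how the shortcut is assumed to be combined with the convolved copy.

```python
def residual_block_sketch(x, conv1, conv2):
    """Return x plus the result of applying conv1 and then conv2 to x."""
    shortcut = x
    y = conv2(conv1(x))
    return [s + v for s, v in zip(shortcut, y)]


def double(values):
    return [2 * v for v in values]


def plus_one(values):
    return [v + 1 for v in values]


out = residual_block_sketch([1, 2], double, plus_one)
```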

core.common.route_group(input_layer, groups, group_id)

core.common.upsample(input_layer)

core.dataset module

class core.dataset.Dataset(FLAGS, is_training: bool, dataset_type: str = 'converted_coco')

Bases: object

Loads the annotations of a dataset and prepares images and bounding boxes for training or evaluation

load_annotations()

Loads the annotations of the dataset

Returns

A list of the annotations within the dataset

Return type

list

parse_annotation(annotation)

Performs parsing of the annotations of the dataset

Parameters

annotation (str) – A single annotation

Returns

  • ndarray – The image itself

  • ndarray – The bbox of the annotation

Raises

KeyError – If the image path does not exist

preprocess_true_boxes(bboxes)

Processes the ground truth bounding boxes

Parameters

bboxes (ndarray) – The bounding boxes of the images

Returns

The processed bounding box split up into the needed parts

Return type

ndarrays

random_crop(image, bboxes)

Performs a random crop of the image for use in training

Parameters
  • image (ndarray) – The image itself ready to be cropped if needed

  • bboxes (ndarray) – Array of the bounding boxes

Returns

  • ndarray – The cropped image

  • ndarray – The bounding box of the image
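
A sketch of a crop that never cuts into a bounding box, using nested lists in place of an ndarray. It assumes integer (xmin, ymin, xmax, ymax) boxes and always crops, whereas the documented method may apply the crop only some of the time; random_crop_sketch and its rng argument are hypothetical.

```python
import random


def random_crop_sketch(image, bboxes, rng=random):
    """Crop image at random while keeping every bounding box inside."""
    height, width = len(image), len(image[0])
    # Smallest region containing all boxes: the crop must not cut into it.
    min_x = min(b[0] for b in bboxes)
    min_y = min(b[1] for b in bboxes)
    max_x = max(b[2] for b in bboxes)
    max_y = max(b[3] for b in bboxes)
    left = rng.randint(0, min_x)
    top = rng.randint(0, min_y)
    right = rng.randint(max_x, width)
    bottom = rng.randint(max_y, height)
    cropped = [row[left:right] for row in image[top:bottom]]
    # Boxes move with the new origin of the cropped image.
    shifted = [[x1 - left, y1 - top, x2 - left, y2 - top]
               for x1, y1, x2, y2 in bboxes]
    return cropped, shifted


image = [[0] * 8 for _ in range(6)]  # 6 rows, 8 columns
cropped, boxes = random_crop_sketch(image, [[2, 1, 5, 4]], rng=random.Random(0))
```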

random_horizontal_flip(image, bboxes)

Performs a random horizontal flip of the image for use in training

Parameters
  • image (ndarray) – The image itself ready to be flipped if needed

  • bboxes (ndarray) – Array of the bounding boxes

Returns

  • ndarray – The flipped image

  • ndarray – The bounding box of the image
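
When an image is mirrored, the x coordinates of its boxes have to be mirrored as well. The sketch below shows only the flip itself, without the random decision, on nested lists and with boxes assumed to be (xmin, ymin, xmax, ymax); horizontal_flip_sketch is a hypothetical name.

```python
def horizontal_flip_sketch(image, bboxes):
    """Mirror image left to right and update the boxes to match."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    # xmin and xmax swap roles after mirroring.
    mirrored = [[width - x2, y1, width - x1, y2] for x1, y1, x2, y2 in bboxes]
    return flipped, mirrored


flipped, mirrored = horizontal_flip_sketch(
    [[1, 2, 3], [4, 5, 6]],
    [[0, 0, 1, 2]],
)
```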

random_translate(image, bboxes)

Performs a random translation of the image for use in training

Parameters
  • image (ndarray) – The image itself ready to be translated if needed

  • bboxes (ndarray) – Array of the bounding boxes

Returns

  • ndarray – The translated image

  • ndarray – The bounding box of the image

core.utils_y module

core.utils_y.bbox_ciou(bboxes1, bboxes2)

Complete IoU

Parameters
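  • bboxes1 ((a, b, ..., 4)) – First set of boxes; the last dimension holds the four box coordinates

  • bboxes2 ((A, B, ..., 4)) – Second set of boxes; each pair of leading dimensions x:X must be 1:n, n:n or n:1

Returns

The Complete IoU between the boxes in bboxes1 and bboxes2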
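
For orientation, the usual definition of Complete IoU subtracts a centre-distance term and an aspect-ratio term from the plain IoU. The scalar sketch below assumes corner-format boxes (xmin, ymin, xmax, ymax) with non-zero height, which may differ from the layout bbox_ciou expects; ciou_sketch and eps are hypothetical.

```python
import math


def ciou_sketch(box_a, box_b, eps=1e-9):
    """Complete IoU of two corner-format boxes (assumed layout)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)
    # Squared distance between the two box centres.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2
    # Aspect-ratio consistency term and its weight.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1))
        - math.atan((ax2 - ax1) / (ay2 - ay1))
    ) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / (c2 + eps) - alpha * v


same = ciou_sketch((0, 0, 2, 2), (0, 0, 2, 2))
shifted = ciou_sketch((0, 0, 2, 2), (1, 1, 3, 3))
```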

core.utils_y.bbox_giou(bboxes1, bboxes2)

Generalized IoU

Parameters
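  • bboxes1 ((a, b, ..., 4)) – First set of boxes; the last dimension holds the four box coordinates

  • bboxes2 ((A, B, ..., 4)) – Second set of boxes; each pair of leading dimensions x:X must be 1:n, n:n or n:1

Returns

The Generalized IoU between the boxes in bboxes1 and bboxes2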
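
Generalized IoU is commonly defined as the IoU minus the share of the smallest enclosing box that is not covered by the union. A scalar sketch, assuming non-degenerate corner-format boxes (xmin, ymin, xmax, ymax); giou_sketch is a hypothetical name and the layout may differ from what bbox_giou expects.

```python
def giou_sketch(box_a, box_b):
    """Generalized IoU of two corner-format boxes (assumed layout)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Area of the smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (enclose - union) / enclose


overlapping = giou_sketch((0, 0, 2, 2), (1, 1, 3, 3))
disjoint = giou_sketch((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike plain IoU, the value becomes negative for boxes that do not overlap, so it still carries information about how far apart they are.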

core.utils_y.bbox_iou(bboxes1, bboxes2)

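Intersection over Union (IoU)

Parameters
  • bboxes1 ((a, b, ..., 4)) – First set of boxes; the last dimension holds the four box coordinates

  • bboxes2 ((A, B, ..., 4)) – Second set of boxes; each pair of leading dimensions x:X must be 1:n, n:n or n:1

Returns

The IoU between the boxes in bboxes1 and bboxes2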
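
IoU is the area of the intersection divided by the area of the union. A scalar sketch for two boxes, assuming corner format (xmin, ymin, xmax, ymax); iou_sketch is a hypothetical name, and the documented function works on whole arrays whose coordinate layout may differ.

```python
def iou_sketch(box_a, box_b):
    """Intersection over Union of two corner-format boxes (assumed layout)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


partial = iou_sketch((0, 0, 2, 2), (1, 1, 3, 3))
```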

core.utils_y.count_objects(data, by_class=False, allowed_classes=None)

core.utils_y.draw_bbox(image, bboxes, info=False, counted_classes=None, show_label=True, allowed_classes=None)

core.utils_y.format_boxes(bboxes, image_height, image_width)

Formats the bounding boxes

Parameters
  • bboxes (ndarray) – The bounding boxes

  • image_height (int) – The height of the image

  • image_width (int) – The width of the image

Returns

The now formatted bounding boxes

Return type

ndarray
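
One plausible reading of "formatting" is the conversion from normalized (ymin, xmin, ymax, xmax) boxes to pixel (xmin, ymin, xmax, ymax) boxes. Both layouts are assumptions here, not taken from the documentation above, and format_boxes_sketch is a hypothetical name.

```python
def format_boxes_sketch(bboxes, image_height, image_width):
    """Scale normalized boxes to pixels and reorder the coordinates.

    Assumed input layout:  (ymin, xmin, ymax, xmax), values in [0, 1].
    Assumed output layout: (xmin, ymin, xmax, ymax), values in pixels.
    """
    formatted = []
    for ymin, xmin, ymax, xmax in bboxes:
        formatted.append([
            int(xmin * image_width),
            int(ymin * image_height),
            int(xmax * image_width),
            int(ymax * image_height),
        ])
    return formatted


pixels = format_boxes_sketch([[0.25, 0.5, 0.75, 1.0]], image_height=100, image_width=200)
```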

core.utils_y.freeze_all(model, frozen=True)

core.utils_y.get_anchors(anchors)

Fetches the anchors list and converts it into a formatted numpy array

Parameters

anchors (list[int]) – The anchors ready to be reshaped and converted

Returns

The reshaped anchors

Return type

ndarray
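
The reshaping step can be pictured as grouping a flat list into (width, height) pairs per detection scale. The sketch uses nested lists instead of a numpy array, and the grouping of 3 scales with 3 anchors each is an assumption; reshape_anchors is a hypothetical name.

```python
def reshape_anchors(anchors, scales=3, per_scale=3):
    """Group a flat anchor list into scales x per_scale (width, height) pairs."""
    if len(anchors) != scales * per_scale * 2:
        raise ValueError("unexpected number of anchor values")
    pairs = [tuple(anchors[i:i + 2]) for i in range(0, len(anchors), 2)]
    return [pairs[s * per_scale:(s + 1) * per_scale] for s in range(scales)]


grouped = reshape_anchors(list(range(18)))
```

With numpy the same grouping is a single reshape of the array to (scales, per_scale, 2).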

core.utils_y.image_preprocess(image, target_size, gt_boxes=None)

core.utils_y.load_config(FLAGS)

Loads the config file

Parameters

FLAGS (flag) – The flags that determine which model is being used

Returns

  • ndarray – Array[int] of the strides

  • ndarray – Array[int] of the anchors

  • int – The number of classes to be classified (80 for coco)

  • ndarray – Array[float] of the XYscales

core.utils_y.load_freeze_layer(model='yolov4')

Loads the freeze layout to use for the given model

Parameters

model (str) – Which model is being used, yolov3 or yolov4

Returns

List of the layer IDs to use

Return type

list[str]

core.utils_y.load_weights(model, weights_file, model_name='yolov4')

Loads pretrained weights into the model

Parameters
  • model (object) – The model being used

  • weights_file (weights) – Weights for pretrained model

  • model_name (str) – Which model is being used, yolov3 or yolov4

Return type

None

core.utils_y.nms(bboxes, iou_threshold, sigma=0.3, method='nms')

Non-maximum suppression (NMS) over the bounding boxes

Parameters
  • bboxes – Boxes given as (xmin, ymin, xmax, ymax, score, class)

  • iou_threshold (float) – IoU threshold used to decide which overlapping boxes are suppressed

  • sigma (float, optional) – by default 0.3

  • method (str, optional) – Suppression method, by default ‘nms’

Returns

The bounding boxes that remain after suppression
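
A sketch of classic greedy NMS on boxes in the (xmin, ymin, xmax, ymax, score, class) layout given above. It covers only the plain 'nms' behaviour and ignores sigma; applying suppression per class and keeping boxes whose IoU does not exceed the threshold are assumptions, and nms_sketch and box_iou are hypothetical names.

```python
def box_iou(a, b):
    """IoU of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_sketch(bboxes, iou_threshold):
    """Greedy NMS, applied separately to each class."""
    kept = []
    for cls in sorted({b[5] for b in bboxes}):
        # Highest score first within the class.
        candidates = sorted((b for b in bboxes if b[5] == cls),
                            key=lambda b: b[4], reverse=True)
        while candidates:
            best = candidates.pop(0)
            kept.append(best)
            # Drop every remaining box that overlaps the winner too much.
            candidates = [b for b in candidates
                          if box_iou(best[:4], b[:4]) <= iou_threshold]
    return kept


detections = [
    (0, 0, 10, 10, 0.9, 0),
    (1, 1, 11, 11, 0.8, 0),    # overlaps the first box, same class
    (20, 20, 30, 30, 0.7, 0),
    (0, 0, 10, 10, 0.6, 1),    # same place, different class
]
kept = nms_sketch(detections, iou_threshold=0.5)
```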

core.utils_y.read_class_names(class_file_name)

Reads in the classname file and converts it into a dict

Parameters

class_file_name (names) – The .names file of all the class names, separated by a new line

Returns

Mapping of class IDs to class names

Return type

dict
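
A sketch of how a .names file maps onto a dict, assuming the ID of a class is its zero-based line number. parse_class_names and read_class_names_sketch are hypothetical names.

```python
def parse_class_names(lines):
    """Map the zero-based line number to the class name on that line."""
    return {class_id: line.rstrip("\n") for class_id, line in enumerate(lines)}


def read_class_names_sketch(class_file_name):
    """Read a .names file with one class name per line."""
    with open(class_file_name, encoding="utf-8") as handle:
        return parse_class_names(handle)


names = parse_class_names(["person\n", "bicycle\n", "car\n"])
```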

core.utils_y.unfreeze_all(model, frozen=False)

core.utils module

core.utils.bbox_ciou(bboxes1, bboxes2)

Complete IoU

Parameters
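  • bboxes1 ((a, b, ..., 4)) – First set of boxes; the last dimension holds the four box coordinates

  • bboxes2 ((A, B, ..., 4)) – Second set of boxes; each pair of leading dimensions x:X must be 1:n, n:n or n:1

Returns

The Complete IoU between the boxes in bboxes1 and bboxes2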

core.utils.bbox_giou(bboxes1, bboxes2)

Generalized IoU

Parameters
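  • bboxes1 ((a, b, ..., 4)) – First set of boxes; the last dimension holds the four box coordinates

  • bboxes2 ((A, B, ..., 4)) – Second set of boxes; each pair of leading dimensions x:X must be 1:n, n:n or n:1

Returns

The Generalized IoU between the boxes in bboxes1 and bboxes2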

core.utils.bbox_iou(bboxes1, bboxes2)

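Intersection over Union (IoU)

Parameters
  • bboxes1 ((a, b, ..., 4)) – First set of boxes; the last dimension holds the four box coordinates

  • bboxes2 ((A, B, ..., 4)) – Second set of boxes; each pair of leading dimensions x:X must be 1:n, n:n or n:1

Returns

The IoU between the boxes in bboxes1 and bboxes2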

core.utils.draw_bbox(image, bboxes, info=False, show_label=True, classes=None)

core.utils.format_boxes(bboxes, image_height, image_width)

Formats the bounding boxes

Parameters
  • bboxes (ndarray) – The bounding boxes

  • image_height (int) – The height of the image

  • image_width (int) – The width of the image

Returns

The now formatted bounding boxes

Return type

ndarray

core.utils.freeze_all(model, frozen=True)

core.utils.get_anchors(anchors)

Fetches the anchors list and converts it into a formatted numpy array

Parameters

anchors (list[int]) – The anchors ready to be reshaped and converted

Returns

The reshaped anchors

Return type

ndarray

core.utils.image_preprocess(image, target_size, gt_boxes=None)

core.utils.load_config(FLAGS)

Loads the config file

Parameters

FLAGS (flag) – The flags that determine which model is being used

Returns

  • ndarray – Array[int] of the strides

  • ndarray – Array[int] of the anchors

  • int – The number of classes to be classified (80 for coco)

  • ndarray – Array[float] of the XYscales

core.utils.load_freeze_layer(model='yolov4')

Loads the freeze layout to use for the given model

Parameters

model (str) – Which model is being used, yolov3 or yolov4

Returns

List of the layer IDs to use

Return type

list[str]

core.utils.load_weights(model, weights_file, model_name='yolov4')

Loads pretrained weights into the model

Parameters
  • model (object) – The model being used

  • weights_file (weights) – Weights for pretrained model

  • model_name (str) – Which model is being used, yolov3 or yolov4

Return type

None

core.utils.nms(bboxes, iou_threshold, sigma=0.3, method='nms')

Non-maximum suppression (NMS) over the bounding boxes

Parameters
  • bboxes – Boxes given as (xmin, ymin, xmax, ymax, score, class)

  • iou_threshold (float) – IoU threshold used to decide which overlapping boxes are suppressed

  • sigma (float, optional) – by default 0.3

  • method (str, optional) – Suppression method, by default ‘nms’

Returns

The bounding boxes that remain after suppression

core.utils.read_class_names(class_file_name)

Reads in the classname file and converts it into a dict

Parameters

class_file_name (names) – The .names file of all the class names, separated by a new line

Returns

Mapping of class IDs to class names

Return type

dict

core.utils.unfreeze_all(model, frozen=False)