APPLICATIONS

Object detection

Overview

What is object detection?

Object detection is a computer vision technique that aims to identify objects of interest in an image, such as vehicles, people, or buildings. During object detection, the model is applied to an input image, and the model outputs a set of bounding boxes and class labels, indicating the location and identity of the objects in the image. The model can also estimate the confidence score of each detection, indicating how confident it is in the detection being correct.
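
To make the shape of a detection result concrete, here is a minimal sketch of one detection as a data record. The `Detection` class, its field names, and the pixel-corner box convention are assumptions made for illustration; they are not Deep Block's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str          # class label, e.g. "vehicle"
    confidence: float   # confidence score between 0.0 and 1.0
    x_min: float        # left edge of the bounding box, in pixels
    y_min: float        # top edge
    x_max: float        # right edge
    y_max: float        # bottom edge

    @property
    def area(self) -> float:
        """Area of the bounding box in square pixels."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


# A model's output for one image is a list of such records.
detections = [
    Detection("vehicle", 0.92, 40, 60, 200, 180),
    Detection("person", 0.71, 220, 50, 260, 170),
]

for d in detections:
    print(f"{d.label}: confidence {d.confidence:.2f}, box area {d.area} px^2")
```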

There are several reasons why object detection is used in computer vision:

Automation: Object detection can be used to automate tasks that would otherwise require manual intervention, such as monitoring surveillance cameras, detecting and counting objects in an image, or tracking moving objects in a video.
Object Tracking: Object detection models can be used to follow objects across images or video frames. Typically, they are combined with tracking algorithms to do so.
Object Counting: Counting objects is very important in the medical and pathology fields. The number of objects found through object detection can be easily checked in Deep Block.
Context Awareness: To understand videos or complex images, we need to know which objects exist in them and where they are. For this purpose, the inference results of the object detection model can be passed to a multimodal ML model as input.
Object Classification: Sometimes, people need to classify images or objects. Object detection is much more useful than a simple image classification model because it detects an object and at the same time tells you what category the object belongs to.
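
As an illustration of the object counting use case above, the sketch below tallies detections per class. It assumes each detection has been reduced to a `(label, confidence)` pair and that low-confidence detections are ignored. The function name, the sample data, and the 0.5 cutoff are hypothetical, not part of Deep Block.

```python
from collections import Counter


def count_by_class(detections, min_confidence=0.5):
    """Count detections per class label, skipping low-confidence ones.

    `detections` is an iterable of (label, confidence) pairs.
    """
    return Counter(
        label for label, confidence in detections
        if confidence >= min_confidence
    )


# Hypothetical detections from a pathology image.
detections = [("cell", 0.91), ("cell", 0.84), ("nucleus", 0.77), ("cell", 0.32)]
counts = count_by_class(detections)
print(dict(counts))
```

Lowering `min_confidence` includes more detections in the count, at the risk of counting false positives.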

Workflow

The workflow to train your object detection model is divided into 3 stages:

Train: The training stage is the most crucial stage of the AI modeling pipeline. It is divided into 2 modules:
- Preprocess: Before running the training process, the data must be carefully selected and annotated with object labels and bounding boxes.
- Run: Running the model training passes the data through multiple training cycles until the model has had enough opportunities to learn the patterns in the data.
Evaluate: Evaluate the performance of the trained model on new, unseen data.
Predict: Apply your newly trained model to make predictions based on the patterns learned from the training data.

1 - Preprocess

The "1. Preprocess" module is displayed by default when clicking on the "TRAIN" stage.

If not, click on "1. Preprocess".

To get started with your object detection model training, you must first gather the raw data that will be annotated. This includes images from multiple sources that are related to the use case you are trying to solve.

Once the data is ready to be annotated, refer to Mastering the labeling tool to learn how to import images or datasets and how to label images.

If you already have annotation data, you can upload the annotation information in JSON format to Deep Block.

If you would like to leave the annotation or image data collection to us, or if you only have a small amount of training data, please contact us.

2 - Run

Now that your data is annotated, you can run the training module.

In the "TRAIN" stage of the workflow, click on the "2. Run" module to start the model training.
Enter an appropriate value in the Epochs field.
Click "TRAIN" in the top-right corner of the interface.
During this process, a few Processing and Loading pop-up windows will appear. You can click away to make them disappear.
A graph illustrating the progress of the Training Score will appear and evolve with each cycle. Wait until the end of all your training cycles.

How to select the right value for Epochs?

An epoch is a single pass through the entire training dataset. During each epoch, the model is trained on successive batches of training data, and the parameters of the model are updated based on the results of each batch. The goal of each epoch is to improve the model's performance on the training data. Typically, multiple epochs are run during the training process so that the model sees the entire training dataset multiple times and has enough opportunities to learn the patterns in the data.
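
The relationship between epochs, batches, and parameter updates can be sketched with simple arithmetic. The batch size used here is an assumed value chosen for illustration; the Deep Block interface described in this guide only exposes the Epochs field.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches (parameter updates) needed to see every sample once."""
    if num_samples < 0 or batch_size <= 0:
        raise ValueError("num_samples must be >= 0 and batch_size must be > 0")
    return math.ceil(num_samples / batch_size)


def total_updates(num_samples, batch_size, epochs):
    """Total number of parameter updates over the whole training run."""
    return steps_per_epoch(num_samples, batch_size) * epochs


# 1000 training images, an assumed batch size of 32, and 15 epochs.
print(steps_per_epoch(1000, 32))    # batches in one epoch
print(total_updates(1000, 32, 15))  # updates over 15 epochs
```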

More epochs are usually better for training a deep learning model, as this allows the model to see more training examples and to continue refining its weights and biases. However, there is a trade-off between the number of epochs and overfitting. If the model is trained for too many epochs, it may start to memorize the training data and become less capable of generalizing to new, unseen data.
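
One common way to reason about this trade-off is to watch the loss on held-out data and stop once it no longer improves (often called early stopping). The sketch below assumes you have one validation loss value per epoch. The `patience` parameter and the loss values are hypothetical, and this is not a description of how Deep Block chooses epochs.

```python
def best_epoch(val_losses, patience=3):
    """Return the 1-based epoch with the lowest validation loss.

    The scan stops once `patience` consecutive epochs fail to improve
    on the best loss seen so far.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_index = 0
    best_loss = val_losses[0]
    waited = 0
    for index, loss in enumerate(val_losses[1:], start=1):
        if loss < best_loss:
            best_index, best_loss, waited = index, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs in a row
    return best_index + 1


# Hypothetical validation losses: improvement stalls after epoch 4.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.45]
print(best_epoch(val_losses, patience=3))
```

A larger `patience` tolerates longer plateaus before stopping, which costs more training time.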

We recommend preparing at least 1000 annotations per class and training for 15 epochs for object detection model training. However, these values may vary depending on multiple factors, including the amount and quality of the data.

If you have a lot of annotations, for example 4000 per class, you can even train for fewer than 10 epochs.

Discover more about determining the optimal training epoch by clicking here.

What is the loss?

Click on "Loss" to display the loss score graph.

The goal of training a machine learning model is to minimize the loss value, so that the model makes predictions that are as close as possible to the target values (ground truth).

The loss score is calculated during each training iteration or epoch, and is used to update the model's weights.

The loss score mirrors the training score. If it goes down, it means that the model is learning properly.

The loss value is explained in our free deep learning course and in the following video.

3 - Configuration

The "Configuration" panel contains different options to change the image processing settings.

Dividing an Image

Deep Block's greatest strengths are its easy-to-use front-end interface and its ability to process very large-scale images effectively.

Users have the opportunity to annotate extremely high resolution images within Deep Block, allowing for precise and thorough analysis.

Additionally, Deep Block enables the training of ML models simply by annotating large-scale images without any alterations to the original image files.

You can upload large satellite or microscopic images to Deep Block and train a model based on these images.

Check out the following link to learn how to use Deep Block's image splitting function.

With image division, the Deep Block platform enhances its capability to effectively manage and analyze intricate images.

This approach is particularly advantageous when engaging in tasks like object detection or image segmentation.

Witness the power of Deep Block in this article.

The method involves dividing the larger image into a grid of rows and columns, effectively creating a mosaic of interconnected tiles.

After choosing the adequate number of rows and columns, click on "Divide" to start the tiling process. The status will change to "Divide" until the process is over.

To optimize the division process, it's recommended to maintain a balanced size for each tile, aiming for dimensions of approximately 1000 pixels by 1000 pixels.

Machine learning models must see the objects in an image at a sufficient magnification to learn and recognize them.
Please refer to our article on this.

This guideline ensures that each tile encapsulates a substantial amount of information while remaining manageable for processing. For instance, an image of dimensions 8k x 8k pixels can be segmented into an 8 x 8 grid, providing a cohesive framework for comprehensive analysis.
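
The tile-size guideline above can be turned into a small calculation. The sketch below picks a grid whose tiles are close to 1000 x 1000 pixels and lists the pixel box of each tile. The function names are hypothetical, and this is an illustration of the arithmetic, not Deep Block's own splitting algorithm.

```python
def grid_for(width, height, target_tile=1000):
    """Return (rows, columns) so that each tile is roughly target_tile pixels."""
    columns = max(1, round(width / target_tile))
    rows = max(1, round(height / target_tile))
    return rows, columns


def tile_boxes(width, height, rows, columns):
    """Return (left, top, right, bottom) pixel boxes that cover the image."""
    boxes = []
    for r in range(rows):
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(columns):
            left = c * width // columns
            right = (c + 1) * width // columns
            boxes.append((left, top, right, bottom))
    return boxes


# An 8000 x 8000 pixel image, as in the example above.
rows, columns = grid_for(8000, 8000)
boxes = tile_boxes(8000, 8000, rows, columns)
print(rows, columns, len(boxes))
```

For images whose sides are not multiples of the target size, the tiles come out slightly larger or smaller than 1000 pixels but still cover the whole image without gaps.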

2 - Import and label your evaluation dataset

The Evaluate panel resembles the Train panel in the "1. Preprocess" module and works the same way. That is where you can import your validation images or datasets for the evaluation stage.

The evaluation dataset should be different from the training dataset so that your model capabilities can be tested on new, unseen data.

To learn more about how to import images and datasets, refer to Mastering the labeling tool.

Once your data is imported, you must annotate it if it is not annotated already (or import your test annotation data), just like in the training phase.

This way, the evaluate module will be able to compare your annotations with the model predictions and establish a score.
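
A common way to compare an annotated box with a predicted box is Intersection over Union (IoU): the overlap area divided by the combined area. The sketch below is a simplified illustration that ignores class labels and uses a greedy matching rule with an assumed IoU cutoff of 0.5. It is not a description of how Deep Block computes its scores.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def match(predictions, annotations, iou_threshold=0.5):
    """Greedily match predictions to annotations.

    Returns (true_positives, false_positives, false_negatives).
    """
    unmatched = list(annotations)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each annotation is matched at most once
            true_positives += 1
    false_positives = len(predictions) - true_positives
    false_negatives = len(unmatched)
    return true_positives, false_positives, false_negatives


predictions = [(0, 0, 10, 10), (50, 50, 60, 60)]
annotations = [(1, 1, 10, 10), (100, 100, 110, 110)]
print(match(predictions, annotations))
```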


4 - Performance Scores

After evaluation, the performance scores are available.

Click on the "Score" tab in the bottom-left panel of the Project view.

Your model performance score is composed of three important metrics:

mAP: mAP, or mean Average Precision, summarizes how precise your model is across recall levels. Precision is defined as the number of true positive detections divided by the total number of detections, so a high precision score means that the model is producing few false positive detections. The average precision (AP) of a class is the average of its precision scores at different recall values, and mAP is the mean of the AP over all classes. By extension, if your mAP score is high, it means that the model is producing few false positive detections at multiple recall values.
Recall: Recall is defined as the number of true positives divided by the total number of true positive and false negative cases.
A high recall score means that the model is less likely to miss the ground-truth objects in the image.
F1 Score: used to evaluate the overall performance of the model in terms of its ability to correctly identify and locate objects in images.
The F1 score provides a single number by balancing precision and recall.
The F1 score is the harmonic mean of precision and recall.
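
The definitions above translate directly into a few lines of arithmetic. The sketch below computes precision, recall, and F1 from counts of true positives, false positives, and false negatives. The function name and the sample counts are illustrative, and returning 0.0 for empty denominators is a convention chosen here.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) from detection counts."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 6 correct detections, 2 false alarms, 4 missed objects.
precision, recall, f1 = precision_recall_f1(6, 2, 4)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```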

1 - Threshold value

The threshold score sets the detection sensitivity of the trained machine learning model.

If the threshold score is low, the model reports as many objects as possible, including detections it has little confidence in.

If the threshold score is high, the model only shows the detections it is most confident about.

After training the model, we recommend that you first analyze the image with a low threshold score and check the confidence score of each bounding box in the inference result. Then gradually raise the threshold score above the confidence scores of the false positive cases to improve the inference accuracy of the model.

This allows you to set the optimal threshold value to find objects while reducing false positive cases.
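
The tuning procedure described above can be sketched as follows. It assumes you have reviewed the detections from a low-threshold run and marked each one as correct or not. Confidence scores are expressed in percent, and a detection is assumed to be kept when its confidence is greater than or equal to the threshold. The function names and sample values are hypothetical.

```python
import math


def suggest_threshold(reviewed, default=0):
    """Smallest whole-percent threshold above every false positive.

    `reviewed` is a list of (confidence_percent, is_correct) pairs.
    A result above 100 means no threshold can exclude that false positive.
    """
    false_positive_scores = [conf for conf, is_correct in reviewed if not is_correct]
    if not false_positive_scores:
        return default  # nothing to filter out
    return math.floor(max(false_positive_scores)) + 1


def keep(reviewed, threshold):
    """Detections that survive the threshold (assumed rule: confidence >= threshold)."""
    return [(conf, ok) for conf, ok in reviewed if conf >= threshold]


# Reviewed detections from a low-threshold run: (confidence %, correct?).
reviewed = [(92, True), (81, True), (48, False), (40, True), (35, False)]
threshold = suggest_threshold(reviewed)
kept = keep(reviewed, threshold)
print(threshold, kept)
```

In this example the suggested threshold removes both false positives but also drops the correct detection at 40%, which is the trade-off to weigh when choosing the final value.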

Enter the appropriate value in the Threshold score (%) field.

The choice of threshold value in object detection is an important step in the process, as it can significantly impact the quality of the detection results.

When inference is performed with a well-trained machine learning model, you can set a high threshold score and still capture the target objects present in the image accurately.

Build the most accurate ML models with our team if you need a precise object detector.

2 - Prediction Dataset

Now that your model is trained and evaluated, it is time to put it to the test with a large-scale project. Upload your images to try out your brand-new model.

Click on " " to add an image via your webcam.
Click on " " to download the JSON file for the current project.
Click on " " to import images that you wish to use.
Click on " " to remove an image after selecting it.
If a prediction has already been made, click on "CLEAR BOXES" to remove all bounding boxes.

Supported image file formats are: png, apng, jpg, svg, tiff, bmp, gif, ico and jp2 (10GB max file size).
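
If you prepare uploads with a script, a quick local check against the list above can save a failed upload. The sketch below is a hypothetical pre-upload check, not part of Deep Block. It assumes the format is recognized by file extension and that "10GB" means 10 * 1024^3 bytes.

```python
from pathlib import Path

# Extensions taken from the list above. Other spellings (for example
# ".jpeg") are not listed, so this check treats them as unsupported.
SUPPORTED_EXTENSIONS = {"png", "apng", "jpg", "svg", "tiff", "bmp", "gif", "ico", "jp2"}
MAX_FILE_BYTES = 10 * 1024 ** 3  # assumption: 10GB interpreted as 10 GiB


def is_uploadable(filename, size_bytes):
    """Return True if the extension is listed and the size is within the limit."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS and 0 < size_bytes <= MAX_FILE_BYTES


print(is_uploadable("scene.TIFF", 5 * 1024 ** 3))
print(is_uploadable("huge.png", 11 * 1024 ** 3))
```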

3 - Detect

Once your dataset is uploaded, you are ready to launch the prediction.

Note that a large-scale image must be divided into sections before it is analyzed.
Please refer to the following articles regarding this:

Segment Aerial Photos with an AI Model Trained on Drone Photos

The Power of Deep Block's Patented Algorithm for Large-Scale Image Analysis

First, if the image resolution is large, divide the image as you did for training.
We recommend using the More Accurate option to analyze images more accurately.
Click on "PREDICT" at the top-right corner of the Project view.
The processing will start. Depending on the number of images uploaded, this process could take several minutes. You can stop it at any time by clicking on "STOP".

Wait until the processing status returns to "IDLE". By then, the model will have created bounding boxes around the objects of interest.
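
When an image is analyzed tile by tile, each bounding box is naturally expressed relative to its own tile. The sketch below shows the coordinate shift that places such a box back on the full image. It assumes equal-sized tiles and zero-based row and column indices, and it is only an illustration: the function is hypothetical, and how Deep Block handles this internally is not covered in this guide.

```python
def to_full_image(box, tile_row, tile_column, tile_width, tile_height):
    """Shift a tile-local (x_min, y_min, x_max, y_max) box to full-image pixels."""
    x_min, y_min, x_max, y_max = box
    offset_x = tile_column * tile_width   # pixels to the left of this tile
    offset_y = tile_row * tile_height     # pixels above this tile
    return (x_min + offset_x, y_min + offset_y, x_max + offset_x, y_max + offset_y)


# A box found in the tile at row 2, column 3 of a grid of 1000 x 1000 tiles.
full_box = to_full_image((100, 200, 300, 400), 2, 3, 1000, 1000)
print(full_box)
```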
