Multi-object tracking with dlib

In this tutorial, you will learn how to use the dlib library to efficiently track multiple objects in real-time video.


So far in this series on object tracking we have learned how to:



  1. Track single objects with OpenCV

  2. Track multiple objects utilizing OpenCV

  3. Perform single object tracking with dlib

  4. Track and count people entering a business/store


We can of course track multiple objects with dlib; however, to obtain the best performance possible, we need to utilize multiprocessing and distribute the object trackers across multiple cores of our processor.


Correctly utilizing multiprocessing allows us to improve our dlib multi-object tracking frames per second (FPS) throughput rate by over 45%!


To learn how to track multiple objects using dlib, just keep reading!








In the first part of this guide, I’ll demonstrate how you can implement a simple, naïve dlib multi-object tracking script. This program will track multiple objects in video; however, we’ll notice that the script runs a bit slowly.


To increase our FPS throughput rate I’ll show you a faster, more efficient dlib multi-object tracker implementation.


Finally, I’ll discuss some improvements and suggestions for enhancing our multi-object tracking implementations.


Project structure


To get started, make sure you use the “Downloads” section of this tutorial to download the source code and example video.


From there, you can use the tree command to view our project structure:

$ tree
.
├── mobilenet_ssd
│   ├── MobileNetSSD_deploy.caffemodel
│   └── MobileNetSSD_deploy.prototxt
├── multi_object_tracking_slow.py
├── multi_object_tracking_fast.py
├── race.mp4
├── race_output_slow.avi
└── race_output_fast.avi

1 directory, 7 files

The mobilenet_ssd/ directory contains our MobileNet + SSD Caffe model files which allow us to detect people (along with other objects).


We’ll review two Python scripts today:



  1. multi_object_tracking_slow.py: The simple “naïve” method of dlib multiple object tracking.

  2. multi_object_tracking_fast.py: The advanced, fast method that takes advantage of multiprocessing.


The remaining three files are videos. We have the original race.mp4 video and two processed output videos.


The “naïve” dlib multiple object tracking implementation


The first dlib multi-object tracking implementation we are going to cover today is “naïve” in the sense that it will:



  1. Utilize a simple list of tracker objects.

  2. Update each of the trackers sequentially, using only a single core of our processor.


For some object tracking tasks this implementation will be more than sufficient; however, to optimize our FPS throughput rate, we should distribute the object trackers across multiple processes.


We’ll start with our simple implementation in this section and then move on to the faster method in the next section.


To get started, open up the multi_object_tracking_slow.py script and insert the following code:

# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import dlib
import cv2

We begin by importing necessary packages and modules on Lines 2-7. Most importantly we’ll be using dlib and OpenCV. We’ll also use some features from my imutils package of convenience functions such as the frames per second counter.


To install dlib, follow this guide. I have a number of OpenCV installation tutorials available as well (even for the latest OpenCV 4!). You might even try the fastest way to install OpenCV on your system via pip.


To install imutils, simply use pip in your terminal:

$ pip install --upgrade imutils

Now that we (a) have the software installed, and (b) have placed the relevant import statements in our script, let’s parse our command line arguments:

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-v", "--video", required=True,
    help="path to input video file")
ap.add_argument("-o", "--output", type=str,
    help="path to optional output video file")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

If you aren’t familiar with the terminal and command line arguments, please give this post a read.


Our script processes the following command line arguments at runtime:



  • --prototxt: The path to the Caffe “deploy” prototxt file.

  • --model: The path to the model file which accompanies the prototxt.

  • --video: The path to the input video file. We’ll perform multi-object tracking with dlib on this video.

  • --output: An optional path to an output video file. If no path is specified then no video will be output to disk. I recommend outputting to an .avi or .mp4 file.

  • --confidence: An optional override for the object detection confidence threshold of 0.2. This value represents the minimum probability to filter weak detections from the object detector.


Let’s define our list of CLASSES that this model supports as well as load our model from disk:

# initialize the list of class labels MobileNet SSD was trained to
# detect
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
"sofa", "train", "tvmonitor"]

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

The MobileNet SSD pre-trained Caffe model supports 20 classes and 1 background class. The CLASSES are defined on Lines 25-28 in list form.


Note: Do not modify this list or the ordering of class objects if you’re using the Caffe model provided in the “Downloads”. Similarly, if you happen to be loading a different model, you’ll need to define the classes that the model supports here (order does matter). If you’re curious how our object detector works, be sure to refer to this post.
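
To make the ordering point concrete, here is a quick illustration (my own addition, not from the original post): the network returns only a numeric class index per detection, and the human-readable label is recovered purely by its position in CLASSES, so reordering the list silently mislabels everything:

# the SSD reports class 15 for a person; the label comes from position alone
idx = 15
print(CLASSES[idx])   # -> "person" with the stock MobileNet SSD ordering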


We’re only concerned with the "person" class for today’s foot race example, but you could easily modify Line 95 (covered later in this post) to track other classes.


On Line 32, we load our pre-trained object detector model. We will use our pre-trained SSD to detect the presence of objects in a video. From there we will create a dlib object tracker to track each of the detected objects.


We have a few more initializations to perform:

# initialize the video stream and output video writer
print("[INFO] starting video stream...")
vs = cv2.VideoCapture(args["video"])
writer = None

# initialize the list of object trackers and corresponding class
# labels
trackers = []
labels = []

# start the frames per second throughput estimator
fps = FPS().start()

On Line 36, we initialize our video stream — we’ll be reading frames from our input video one at a time.


Subsequently, on Line 37 our video writer is initialized to None. We’ll work more with the video writer in the upcoming while loop.


Now let’s initialize our trackers and labels lists on Lines 41 and 42.


And finally, we start our frames per second counter on Line 45.


We’re all set to begin processing our video:

# loop over frames from the video file stream
while True:
    # grab the next frame from the video file
    (grabbed, frame) = vs.read()

    # check to see if we have reached the end of the video file
    if frame is None:
        break

    # resize the frame for faster processing and then convert the
    # frame from BGR to RGB ordering (dlib needs RGB ordering)
    frame = imutils.resize(frame, width=600)
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # if we are supposed to be writing a video to disk, initialize
    # the writer
    if args["output"] is not None and writer is None:
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        writer = cv2.VideoWriter(args["output"], fourcc, 30,
            (frame.shape[1], frame.shape[0]), True)

On Line 48 we begin looping over frames, where Line 50 actually grabs the frame itself.


A quick check to see if we’ve reached the end of the video file and need to stop looping is made on Lines 53 and 54.


Preprocessing takes place on Lines 58 and 59. First, the frame is resized to 600 pixels wide, maintaining aspect ratio. Then, the frame is converted to the rgb color channel ordering for dlib compatibility (OpenCV’s default is BGR and dlib’s default is RGB).


From there we instantiate the video writer (if necessary) on Lines 63-66. To learn more about writing video to disk with OpenCV, check out my previous blog post.


Let’s begin the object detection phase:

    # if there are no object trackers we first need to detect objects
    # and then create a tracker for each object
    if len(trackers) == 0:
        # grab the frame dimensions and convert the frame to a blob
        (h, w) = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, 0.007843, (w, h), 127.5)

        # pass the blob through the network and obtain the detections
        # and predictions
        net.setInput(blob)
        detections = net.forward()

In order to perform object tracking we must first perform object detection, either:



  1. Manually, by stopping the video stream and hand-selecting the bounding box(es) of each object (a quick sketch of this approach follows the list).

  2. Programmatically, using an object detector trained to detect the presence of an object (which is what we are doing here).
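
For completeness, here is a minimal sketch of the manual route from option 1 (my own example; it uses OpenCV’s built-in cv2.selectROI helper, which is not part of the scripts in this post):

# hand-select a box with the mouse and seed a dlib correlation tracker;
# cv2.selectROI lets the user draw a box and returns (x, y, w, h)
import cv2
import dlib

vs = cv2.VideoCapture("race.mp4")
(grabbed, frame) = vs.read()
(x, y, w, h) = cv2.selectROI("Frame", frame, fromCenter=False)

# dlib expects corner coordinates, so convert width/height to (endX, endY)
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
t = dlib.correlation_tracker()
t.start_track(rgb, dlib.rectangle(int(x), int(y), int(x + w), int(y + h)))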


If there are no object trackers (Line 70), then we know we have yet to perform object detection.


We create and pass a blob through the SSD network to detect objects on Lines 72-78. To learn about the cv2.dnn.blobFromImage function, be sure to refer to my writeup in this article.
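
As a quick sanity check on the two magic numbers in that call (my own aside, not from the original article): the scale factor 0.007843 is simply 1 / 127.5, so combined with the mean value of 127.5, each pixel is mapped from [0, 255] into roughly [-1, 1]:

# blobFromImage subtracts the mean and then multiplies by the scale factor,
# i.e. (pixel - 127.5) * 0.007843, normalizing pixel values to about [-1, 1]
import numpy as np

pixels = np.array([0.0, 127.5, 255.0])
print((pixels - 127.5) * 0.007843)   # -> [-1.  0.  1.] (approximately)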


Next, we proceed to loop over the detections to find objects belonging to the "person" class, since our input video is a human foot race:

        # loop over the detections
        for i in np.arange(0, detections.shape[2]):
            # extract the confidence (i.e., probability) associated
            # with the prediction
            confidence = detections[0, 0, i, 2]

            # filter out weak detections by requiring a minimum
            # confidence
            if confidence > args["confidence"]:
                # extract the index of the class label from the
                # detections list
                idx = int(detections[0, 0, i, 1])
                label = CLASSES[idx]

                # if the class label is not a person, ignore it
                if CLASSES[idx] != "person":
                    continue

We begin looping over detections on Line 81 where we:



  1. Filter out weak detections (Line 88).

  2. Ensure each detection is a "person" (Lines 91-96). You can, of course, remove this line of code or customize it to your own filtering needs.


Now that we’ve located each "person" in the frame, let’s instantiate our trackers and draw our initial bounding box(es) + class label(s):

                # compute the (x, y)-coordinates of the bounding box
                # for the object
                box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                (startX, startY, endX, endY) = box.astype("int")

                # construct a dlib rectangle object from the bounding
                # box coordinates and start the correlation tracker
                t = dlib.correlation_tracker()
                rect = dlib.rectangle(startX, startY, endX, endY)
                t.start_track(rgb, rect)

                # update our set of trackers and corresponding class
                # labels
                labels.append(label)
                trackers.append(t)

                # grab the corresponding class label for the detection
                # and draw the bounding box
                cv2.rectangle(frame, (startX, startY), (endX, endY),
                    (0, 255, 0), 2)
                cv2.putText(frame, label, (startX, startY - 15),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)

To begin tracking objects we:



  • Compute the bounding box of each detected object (Lines 100 and 101).

  • Instantiate and pass the bounding box coordinates to the tracker (Lines 105-107). The bounding box is especially important here. We need to create a dlib.rectangle for the bounding box and pass it to the start_track method. From there, dlib can start to track the object.

  • Finally, we populate the trackers list with the individual tracker (Line 112).


As a result, in the next code block, we’ll handle the case where trackers have already been established and we just need to update positions.


There are two additional tasks we perform in the initial detection step:



  • Append the class label to the labels list (Line 111). In the event that you’re tracking multiple types of objects (such as "dog" + "person"), you may wish to know the type of each object.

  • Draw each bounding box rectangle around, and class label above, the object (Lines 116-119).


If our trackers list is non-empty, we know we are in the object tracking phase:

    # otherwise, we've already performed detection so let's track
    # multiple objects
    else:
        # loop over each of the trackers
        for (t, l) in zip(trackers, labels):
            # update the tracker and grab the position of the tracked
            # object
            t.update(rgb)
            pos = t.get_position()

            # unpack the position object
            startX = int(pos.left())
            startY = int(pos.top())
            endX = int(pos.right())
            endY = int(pos.bottom())

            # draw the bounding box from the correlation object tracker
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                (0, 255, 0), 2)
            cv2.putText(frame, l, (startX, startY - 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)

In the object tracking phase, we loop over all trackers and corresponding labels on Line 125.


Then we proceed to update each object position (Lines 128 and 129). In order to update the position, we simply pass the rgb image.
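
One detail the scripts do not use (my own note, though it becomes relevant in the “Improvements and suggestions” section below): update also returns a confidence value, dlib’s peak-to-sidelobe ratio, which you can inspect to judge how reliable the current track is:

# update() returns the peak-to-sidelobe ratio of the correlation response;
# larger values indicate a more confident track (this script discards it)
score = t.update(rgb)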


After extracting bounding box coordinates, we can draw a bounding box rectangle and label for each tracked object (Lines 138-141).


The remaining steps in the frame processing loop involve writing to the output video (if necessary) and displaying the results:

    # check to see if we should write the frame to disk
    if writer is not None:
        writer.write(frame)

    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

    # update the FPS counter
    fps.update()

Here we:



  • Write the frame to video if necessary (Lines 144 and 145).

  • Show the output frame and capture keypresses (Lines 148 and 149). If the "q" key is pressed (“quit”), we break out of the loop.

  • Finally, we update our frames per second information for benchmarking purposes (Line 156).


The remaining steps are to print FPS throughput information in the terminal and release pointers:

# stop the timer and display FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# check to see if we need to release the video writer pointer
if writer is not None:
    writer.release()

# do a bit of cleanup
cv2.destroyAllWindows()
vs.release()

To close out, our fps stats are collected and printed (Lines 159-161), the video writer is released (Lines 164 and 165), and we close all windows + release the video stream.


Let’s assess accuracy and performance.


To follow along and run this script, make sure you use the “Downloads” section of this blog post to download the source code + example video.


From there, open up a terminal and execute the following command:

$ python multi_object_tracking_slow.py --prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt \
--model mobilenet_ssd/MobileNetSSD_deploy.caffemodel \
--video race.mp4 --output race_output_slow.avi
[INFO] loading model...
[INFO] starting video stream...
[INFO] elapsed time: 24.51
[INFO] approx. FPS: 13.87

It appears that our multi-object tracker is working!


But as you can see, we are only obtaining ~13 FPS.


For some applications, this FPS throughput rate may be sufficient — however, if you need faster FPS, I would suggest taking a look at our more efficient dlib multi-object tracker below.


Secondly, understand that tracking accuracy isn’t perfect. Refer to the third suggestion in the “Improvements and Suggestions” section below as well as read my first post on dlib object tracking for more information.


The fast, efficient dlib multi-object tracking implementation


If you run the dlib multi-object tracking script from the previous section and open up your system’s activity monitor at the same time, you’ll notice that only one core of your processor is being utilized.


In order to speed up our object tracking pipeline we can leverage Python’s multiprocessing module, which is similar to the threading module but spawns processes rather than threads.


Utilizing processes enables our operating system to perform better process scheduling, mapping each process to a particular processor core on our machine (most modern operating systems are able to efficiently schedule CPU-intensive processes in parallel).


If you are new to Python’s multiprocessing module I would suggest you read this excellent introduction from Sebastian Raschka.


Otherwise, go ahead and open up multi_object_tracking_fast.py and insert the following code:

# import the necessary packages
from imutils.video import FPS
import multiprocessing
import numpy as np
import argparse
import imutils
import dlib
import cv2

Our packages are imported on Lines 2-8. We’re importing the multiprocessing library on Line 3.


We’ll be using the Python Process class to spawn a new process — each new process is independent from the original process.


To spawn this process we need to provide a function that Python can call; Python will then create a brand new process to execute it:

def start_tracker(box, label, rgb, inputQueue, outputQueue):
    # construct a dlib rectangle object from the bounding box
    # coordinates and then start the correlation tracker
    t = dlib.correlation_tracker()
    rect = dlib.rectangle(box[0], box[1], box[2], box[3])
    t.start_track(rgb, rect)

The first three parameters to start_tracker include:



  • box: Bounding box coordinates of the object we are going to track, presumably returned by some sort of object detector, whether manual or programmatic.

  • label: Human-readable label of the object.

  • rgb: An RGB-ordered image that we’ll be using to start the initial dlib object tracker.


Keep in mind how Python multiprocessing works — Python will call this function and then create a brand new interpreter to execute the code within. Therefore, each start_tracker spawned process will be independent from its parent. To communicate with the Python driver script we need to leverage either Pipes or Queues. Both types of objects are thread/process safe, accomplished using locks and semaphores.


In essence, we are creating a simple producer/consumer relationship:



  1. Our parent process will produce new frames and add them to the queue of a particular object tracker.

  2. The child process will then consume the frame, apply object tracking, and then return the updated bounding box coordinates.


I decided to use Queue objects for this post; however, keep in mind that you could use a Pipe if you wish — be sure to refer to the Python multiprocessing documentation for more details on these objects.
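
If you have not used these objects before, here is a minimal, self-contained sketch of the same producer/consumer pattern (the names are illustrative; this is not code from the tracking scripts):

# a toy producer/consumer: the parent puts work on an input queue and a
# daemon child consumes it, replying on an output queue
import multiprocessing

def worker(inputQueue, outputQueue):
    # consume items until the parent sends a None sentinel
    while True:
        item = inputQueue.get()       # blocks until an item is available
        if item is None:
            break
        outputQueue.put(item * 2)     # "process" the item and send it back

if __name__ == "__main__":
    iq = multiprocessing.Queue()
    oq = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(iq, oq))
    p.daemon = True
    p.start()

    iq.put(21)
    print(oq.get())   # -> 42
    iq.put(None)      # signal the worker to exit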


Now let’s begin an infinite loop which will run in the process:

    # loop indefinitely -- this function will be called as a daemon
    # process so we don't need to worry about joining it
    while True:
        # attempt to grab the next frame from the input queue
        rgb = inputQueue.get()

        # if there was an entry in our queue, process it
        if rgb is not None:
            # update the tracker and grab the position of the tracked
            # object
            t.update(rgb)
            pos = t.get_position()

            # unpack the position object
            startX = int(pos.left())
            startY = int(pos.top())
            endX = int(pos.right())
            endY = int(pos.bottom())

            # add the label + bounding box coordinates to the output
            # queue
            outputQueue.put((label, (startX, startY, endX, endY)))

We loop indefinitely here — this function will be called as a daemon process, so we don’t need to worry about joining it.


First, we’ll attempt to grab a new frame from the inputQueue on Line 21.


If the frame is not empty, we’ll grab the frame and then update the object tracker, allowing us to obtain the updated bounding box coordinates (Lines 24-34).


Finally, we write the label and bounding box to the outputQueue so the parent process can utilize them in the main loop of our script (Line 38).


Back to the parent process, we’ll parse our command line arguments:

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-v", "--video", required=True,
    help="path to input video file")
ap.add_argument("-o", "--output", type=str,
    help="path to optional output video file")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

The command line arguments for this script are exactly the same as our slower, non-multiprocessing script. If you need a refresher on the arguments, just click here. And furthermore, read my post about argparse and command line arguments if you aren’t familiar with them.


Let’s initialize our input and output queues:

# initialize our lists of queues -- both input queue and output queue
# for *every* object that we will be tracking
inputQueues = []
outputQueues = []

These queues will hold the objects we are tracking. Each spawned process will need two Queue objects:



  1. One to read input frames from

  2. And a second to write results to


This next block is identical to our previous script:

# initialize the list of class labels MobileNet SSD was trained to
# detect
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
"sofa", "train", "tvmonitor"]

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# initialize the video stream and output video writer
print("[INFO] starting video stream...")
vs = cv2.VideoCapture(args["video"])
writer = None

# start the frames per second throughput estimator
fps = FPS().start()

We define our model’s CLASSES and load the model itself (Lines 61-68). Remember, these CLASSES are static — our MobileNet SSD supports these classes and only these classes. If you want to detect + track other objects you’ll need to find another pretrained model or train one. Furthermore, the order of this list matters! Do not change the ordering of the list unless you enjoy being confused! I would also recommend reading this tutorial if you want to further understand how object detectors work.


We initialize our video stream object and set our video writer object to None (Lines 72 and 73).


Our frames per second calculator is instantiated and started on Line 76.


Now let’s begin looping over frames from the video stream:

# loop over frames from the video file stream
while True:
    # grab the next frame from the video file
    (grabbed, frame) = vs.read()

    # check to see if we have reached the end of the video file
    if frame is None:
        break

    # resize the frame for faster processing and then convert the
    # frame from BGR to RGB ordering (dlib needs RGB ordering)
    frame = imutils.resize(frame, width=600)
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # if we are supposed to be writing a video to disk, initialize
    # the writer
    if args["output"] is not None and writer is None:
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        writer = cv2.VideoWriter(args["output"], fourcc, 30,
            (frame.shape[1], frame.shape[0]), True)

The above code block is, yet again, identical to the one in the previous script. Be sure to refer above as needed.


Now let’s handle the case where we have no inputQueues:

    # if our list of queues is empty then we know we have yet to
    # create our first object tracker
    if len(inputQueues) == 0:
        # grab the frame dimensions and convert the frame to a blob
        (h, w) = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, 0.007843, (w, h), 127.5)

        # pass the blob through the network and obtain the detections
        # and predictions
        net.setInput(blob)
        detections = net.forward()

        # loop over the detections
        for i in np.arange(0, detections.shape[2]):
            # extract the confidence (i.e., probability) associated
            # with the prediction
            confidence = detections[0, 0, i, 2]

            # filter out weak detections by requiring a minimum
            # confidence
            if confidence > args["confidence"]:
                # extract the index of the class label from the
                # detections list
                idx = int(detections[0, 0, i, 1])
                label = CLASSES[idx]

                # if the class label is not a person, ignore it
                if CLASSES[idx] != "person":
                    continue

If there are no inputQueues (Line 101), then we know we need to apply object detection prior to object tracking.


We apply object detection on Lines 103-109 and then proceed to loop over the results on Line 112. We grab our confidence values and filter out weak detections on Lines 115-119.


If our confidence meets the threshold established by our command line arguments, we consider the detection, but we further filter it out by class label. In this case, we’re only looking for "person" objects (Lines 122-127).


Assuming we have found a "person", we’ll create queues and spawn tracking processes:

                # compute the (x, y)-coordinates of the bounding box
                # for the object
                box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                (startX, startY, endX, endY) = box.astype("int")
                bb = (startX, startY, endX, endY)

                # create two brand new input and output queues,
                # respectively
                iq = multiprocessing.Queue()
                oq = multiprocessing.Queue()
                inputQueues.append(iq)
                outputQueues.append(oq)

                # spawn a daemon process for a new object tracker
                p = multiprocessing.Process(
                    target=start_tracker,
                    args=(bb, label, rgb, iq, oq))
                p.daemon = True
                p.start()

                # grab the corresponding class label for the detection
                # and draw the bounding box
                cv2.rectangle(frame, (startX, startY), (endX, endY),
                    (0, 255, 0), 2)
                cv2.putText(frame, label, (startX, startY - 15),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)

We first compute the bounding box coordinates on Lines 131-133.


From there we create two new queues, iq and oq (Lines 137 and 138), appending them to inputQueues and outputQueues respectively (Lines 139 and 140).


From there we spawn a new start_tracker process, passing the bounding box, label, rgb image, and the iq + oq (Lines 143-147). Don’t forget to read more about multiprocessing here.


We also draw the detected object’s bounding box rectangle and class label (Lines 151-154).


Otherwise, we’ve already performed object detection so we need to apply each of the dlib object trackers to the frame:

    # otherwise, we've already performed detection so let's track
    # multiple objects
    else:
        # loop over each of our input queues and add the input RGB
        # frame to it, enabling us to update each of the respective
        # object trackers running in separate processes
        for iq in inputQueues:
            iq.put(rgb)

        # loop over each of the output queues
        for oq in outputQueues:
            # grab the updated bounding box coordinates for the
            # object -- the .get method is a blocking operation so
            # this will pause our execution until the respective
            # process finishes the tracking update
            (label, (startX, startY, endX, endY)) = oq.get()

            # draw the bounding box from the correlation object
            # tracker
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                (0, 255, 0), 2)
            cv2.putText(frame, label, (startX, startY - 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)

Looping over each of the inputQueues, we add the rgb image to them (Lines 162 and 163).


Then we loop over each of the outputQueues (Line 166), obtaining the bounding box coordinates from each independent object tracker (Line 171). Finally, we draw the bounding box + associated class label on Lines 175-178.


Let’s finish out the loop and script:

    # check to see if we should write the frame to disk
    if writer is not None:
        writer.write(frame)

    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

    # update the FPS counter
    fps.update()

# stop the timer and display FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# check to see if we need to release the video writer pointer
if writer is not None:
    writer.release()

# do a bit of cleanup
cv2.destroyAllWindows()
vs.release()

We write the frame to the output video if necessary, as well as show the frame to the screen (Lines 181-185).


If the "q" key is pressed, we “quit”, breaking out of the loop (Lines 186-190).


If we do continue processing frames, our fps calculator is updated on Line 193, and then we start processing again at the beginning of the while loop.


Otherwise, we’re done processing frames, and we display the FPS throughput info + release pointers and close windows.


To execute this script, make sure you use the “Downloads” section of the post to download the source code + example video.


From there, open up a terminal and execute the following command:

$ python multi_object_tracking_fast.py --prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt \
--model mobilenet_ssd/MobileNetSSD_deploy.caffemodel \
--video race.mp4 --output race_output_fast.avi
[INFO] loading model...
[INFO] starting video stream...
[INFO] elapsed time: 14.01
[INFO] approx. FPS: 24.26


As you can see, our faster, more efficient multi-object tracker is running at ~24 FPS, an improvement of more than 45% over our previous implementation!


Furthermore, if you open up your activity monitor while this script is running, you will see that more of your system’s overall CPU is being utilized.


This speedup is obtained by allowing each of the dlib object trackers to run in a separate process, which in turn enables your operating system to perform more efficient scheduling of CPU resources.


Improvements and suggestions


The dlib multi-object tracking Python scripts I’ve shared with you today will work just fine for processing shorter video streams; however, if you intend on utilizing this implementation in long-running production environments (on the order of many hours to days of video), there are three primary improvements I would suggest you make:


The first improvement would be to utilize processing pools rather than spawning a brand new process for each object to be tracked.


The implementation covered here today constructs a brand new Queue and Process for each object that we need to track.


For today’s purposes that’s fine, but consider if you wanted to track 50 objects in a video — this implies that you would spawn 50 processes, one for each object. At that point, the overhead of your system managing all those processes will destroy any increase in FPS throughput. Instead, you would want to utilize processing pools.


If your system has N processor cores, then you would want to create a pool with N – 1 processes, leaving one core free for your operating system to perform system operations. Each of these processes should perform multiple object tracking, maintaining a list of object trackers, similar to the first multi-object tracking script we covered today.


This improvement will allow you to utilize all cores of your processor without the overhead of having to spawn many independent processes.
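
As a rough structural sketch of that idea (my own, not from this post; the chunking and queue plumbing here are illustrative assumptions), each worker process would own a chunk of trackers rather than a single one:

# sketch: a fixed set of workers (one per core, minus one), each maintaining
# its own *list* of dlib trackers behind a single input/output queue pair
import multiprocessing
import dlib

def track_chunk(boxes, firstRgb, inputQueue, outputQueue):
    # seed one correlation tracker per assigned bounding box
    trackers = []
    for (startX, startY, endX, endY) in boxes:
        t = dlib.correlation_tracker()
        t.start_track(firstRgb, dlib.rectangle(startX, startY, endX, endY))
        trackers.append(t)

    # update every tracker in this chunk on each incoming frame
    while True:
        rgb = inputQueue.get()
        if rgb is None:
            break
        results = []
        for t in trackers:
            t.update(rgb)
            pos = t.get_position()
            results.append((int(pos.left()), int(pos.top()),
                int(pos.right()), int(pos.bottom())))
        outputQueue.put(results)

# the parent would split its detections into numWorkers chunks and spawn one
# such worker per chunk, reusing the queue pattern from the fast script
numWorkers = max(1, multiprocessing.cpu_count() - 1)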


The second improvement I would make is to clean up the processes and queues.


In the event that dlib reports an object as “lost” or “disappeared”, we are not returning from the start_tracker function, implying that the process will live for the life of the parent script and only be killed when the parent exits.


Again, that’s fine for our purposes here today, but if you intend on utilizing this code in production environments, you should:



  1. Update the start_tracker function to return once dlib reports the object as lost.

  2. Delete the inputQueue and outputQueue for the corresponding process as well.


Failing to perform this cleanup will lead to needless computational consumption and memory overhead for long-running jobs.
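
A minimal sketch of what that cleanup might look like inside start_tracker (my own; the threshold is an illustrative guess you would tune per video). dlib’s update method returns a peak-to-sidelobe ratio, and a persistently low value is a reasonable “lost” signal:

    # inside start_tracker's loop: use update()'s confidence score to detect
    # a lost object, notify the parent, and let the daemon process return
    while True:
        rgb = inputQueue.get()
        if rgb is not None:
            score = t.update(rgb)
            if score < 7.0:                     # illustrative threshold
                outputQueue.put((label, None))  # tell the parent it is lost
                return                          # ends this process
            pos = t.get_position()
            outputQueue.put((label, (int(pos.left()), int(pos.top()),
                int(pos.right()), int(pos.bottom()))))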


The third improvement is to improve tracking accuracy by running the object detector every N frames (rather than just once at the start).


I actually demonstrated this in my previous post on people counting with OpenCV. It requires more logic and thought, but yields a much more accurate tracker.


I elected to forego the implementation for this script so that I could teach you the multiprocessing method concisely.


Ideally, you would use this third improvement in addition to multiprocessing.
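
To make the idea concrete, one simple way to retrofit multi_object_tracking_slow.py is to clear the trackers list every N frames so that the existing detection branch fires again (a sketch, with N as an illustrative value to tune):

# re-run the detector every N frames by emptying the trackers list; the
# script's `if len(trackers) == 0:` branch then re-detects and re-seeds
N = 30              # illustrative re-detection interval
totalFrames = 0

# inside the while loop, before the detection/tracking branches:
if totalFrames % N == 0:
    trackers = []
    labels = []

# ... detection / tracking code exactly as before ...
totalFrames += 1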


Summary


In this tutorial, we learned how to utilize the dlib library to perform multi-object tracking.


We also learned how to leverage multiprocessing to:



  1. Distribute the actual object tracker instantiations to multiple cores of our processor,

  2. Thereby increasing our FPS throughput rate by over 45%.


I would encourage you to utilize the multiprocessing implementation of our dlib multi-object tracker for your own applications as it’s faster and more efficient; however, you should refer to the “Improvements and suggestions” section of this tutorial where I discuss how you can further enhance the multi-object tracking implementation.






The post Multi-object tracking with dlib appeared first on PyImageSearch.




