I am trying to track a person who appears for part of a 53-second video I found on YouTube as shown in frame #1 from the video:
Frame #1 from Video by Price (2020)
The tracking rectangle is supposed to be driven by OpenCV's meanShift function, based on the HSV profile of the person being tracked.
Unfortunately, the red rectangle in the frame doesn't really move with the person in the video; it wanders aimlessly and eventually stops moving. At first, I thought this was caused by the person walking in the opposite direction: that person is closer to the viewer than the person being tracked, so the tracking rectangle's 'view' is obstructed as they pass each other. I therefore rewrote the code so that the tracking rectangle appears from frame #120 (4 seconds into the video at 30 fps):
Frame #120 from Video by Price (2020)
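For clarity, the 4-second figure is just the frame index divided by the frame rate. Below is a minimal sketch of that arithmetic, assuming the video runs at a constant 30 fps; the helper names are hypothetical and are not part of my tracking code.

```python
FPS = 30.0  # assumed constant frame rate of the video


def frame_to_seconds(frame_index, fps=FPS):
    """Return the timestamp (in seconds) of a given frame index."""
    return frame_index / fps


def seconds_to_frame(seconds, fps=FPS):
    """Return the frame index closest to a given timestamp."""
    return int(round(seconds * fps))


print(frame_to_seconds(120))  # timestamp of frame #120
print(seconds_to_frame(27))   # frame index at the 27-second mark
```

Counting frames read from the capture, rather than wall-clock seconds, would tie the interval to the video itself instead of to how fast the loop happens to run.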
This, however, didn't improve the situation; in fact, the rectangle moved even less! After the subject moved, the rectangle stayed in place for some seconds, then drifted to part of the wall and stayed there. I then suspected that the HSV profile within the rectangle included the background, so I reduced the size of the rectangle and made sure the HSV profile within it excluded the background, as shown below:
Frame #120 from Video by Price (2020)
This still made no difference. Up to this point, I had been using OpenCV's bitwise and histogram functions to deduce the HSV range for the region within the rectangle. However, the histogram was 2D: I was only including H and S in the analysis and setting V's range to 0 to 255. I then switched to a 3D Matplotlib plot and obtained a more precise V range to go with the precise H and S ranges. The final code for the tracking attempt is below:
import cv2 as cv
import numpy as np
import time
source = 'A quiet Japanese street.mp4' # https://www.youtube.com/watch?v=gbsMx0cyM34
capture = cv.VideoCapture('bank_videos/' + source)
start = time.time()
RED = (0, 0, 255)
# Set up initial location of window
x, y, w, h = 180, 261, 10, 14
track_window = (x, y, w, h)
# Set up the ROI for tracking
roi = frame[y:y+h, x:x+w]
hsv_roi = cv.cvtColor(roi, cv.COLOR_BGR2HSV)
mask = cv.inRange(hsv_roi, np.array((105, 10, 40)), np.array((145, 80, 100)))
roi_hist = cv.calcHist([hsv_roi],[0],mask,[180],[0,180])
# I wondered if the 5th argument for cv.calcHist() should be [0, 180, 0, 256]
# but this raised an exception
cv.normalize(roi_hist,roi_hist,0,255,cv.NORM_MINMAX)
# Set up the termination criteria: stop after 10 iterations or when the window moves by less than 1 pt
term_crit = ( cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 1 )
while capture.isOpened():
    ret, frame = capture.read()
    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break
    else:
        # Convert BGR to HSV
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        dst = cv.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        if int(time.time() - start) in range(4, 27):
            # Draw tracking rectangle
            cv.rectangle(frame, (x,y), (x+w,y+h), RED,2)
            # Apply meanshift to get the new location
            ret, track_window = cv.meanShift(dst, track_window, term_crit)
        cv.imshow('Frame', frame)
        if cv.waitKey(27) == ord('q'):
            break
However, this has not made any difference whatsoever. So, where am I going wrong? Is it with the argument for cv.waitKey()? I don't think so. The rectangle is not responding to the movement of the subject and I can't figure out why.
(In case you're wondering, I'm using Jupyter Notebook, and capture.release() and cv.destroyAllWindows() appear in separate subsequent cells.)
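For reference, this is my understanding of what a mean shift step does with the back projection: it repeatedly recentres the window on the centroid of the weights inside it. The toy NumPy version below is an assumption-laden illustration, not OpenCV's implementation; the function name, the rounding, and the stopping rule are all hypothetical.

```python
import numpy as np


def toy_mean_shift(weights, window, max_iter=10, eps=1):
    """Recentre `window` (x, y, w, h) on the weight centroid until it stops moving."""
    x, y, w, h = window
    rows, cols = weights.shape
    for _ in range(max_iter):
        patch = weights[y:y + h, x:x + w].astype(float)
        total = patch.sum()
        if total == 0:
            break  # no weight inside the window, so there is nothing to follow
        ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
        cx = (xs * patch).sum() / total
        cy = (ys * patch).sum() / total
        # Shift the window so that its centre moves towards the centroid.
        dx = int(round(cx - (w - 1) / 2))
        dy = int(round(cy - (h - 1) / 2))
        new_x = min(max(x + dx, 0), cols - w)
        new_y = min(max(y + dy, 0), rows - h)
        moved = abs(new_x - x) + abs(new_y - y)
        x, y = new_x, new_y
        if moved < eps:
            break
    return x, y, w, h


# Synthetic back projection: a bright 10x10 blob whose top-left corner is at (60, 40).
weights = np.zeros((100, 100), dtype=np.uint8)
weights[40:50, 60:70] = 255

# Start with a window that only partly overlaps the blob.
new_window = toy_mean_shift(weights, (55, 35, 10, 10))
print(new_window)
```

The updated position exists only in the returned tuple, which corresponds to the second value returned by `cv.meanShift` (`track_window` in my code); the original `x, y, w, h` variables are not modified by the call.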
Reference
Price, A. (2020) A quiet Japanese street. Available from: https://www.youtube.com/watch?v=gbsMx0cyM34 [Accessed 25 February 2024].