Pedestrian Tracking in UAV Images With Kalman Filter Motion Estimator and Correlation Filter

Figure 1. Pedestrian tracking with motion estimation and detector (the subscript Tr represents the final tracking box after camera motion compensation).

5) Camera motion between the current and previous frames is compensated using the homography matrix ($H_k$).
6) An object is tracked until its track is lost for $l$ consecutive frames.
7) The CF tracker and motion estimator are reinitialized automatically when the detector reports an object.

VISUAL APPEARANCE MODEL FOR TRACKING

Image features such as the histogram of oriented gradients (HOG), local binary pattern, and color names represent the visual appearance of an image. In contrast to the motion estimation model, DCF-based tracking algorithms exploit image appearance to infer the target location. A discriminative correlation filter (DCF) learns its filter coefficients online from sample image patches collected during tracking. KCF [14] and CSRDCF [20] are taken as the baseline trackers.

KCF

Henriques et al. proposed kernelized correlation filter (KCF) based tracking [14], which used the FFT and the circulant matrix concept to generate a large number of shifted target patches and operated in the Fourier domain for higher speed. It used an $[M \times N]$ base patch to generate $n$ samples by cyclic shifts in the horizontal and vertical directions. The shifted sample set $B$ is given by (1), where $\vec{b} = (b_1, b_2, \ldots, b_n)$ is the vectorized form of the $[M \times N]$-dimensional base patch. Shifted versions of an image patch can be seen in Figure 2(b), where each image represents a row in the circulant matrix $C(\vec{b})$:

$$
B = C(\vec{b}) =
\begin{bmatrix}
b_1 & b_2 & \cdots & b_n \\
b_2 & b_3 & \cdots & b_1 \\
\vdots & \vdots & \ddots & \vdots \\
b_n & b_1 & \cdots & b_{n-1}
\end{bmatrix}
\tag{1}
$$

Preliminary work on KCF [14], presented in [13], demonstrated the connection between ridge regression with cyclically shifted samples and correlation filters.
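To make the link between cyclically shifted samples and correlation filters concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes a 1-D base sample in place of the vectorized 2-D patch, and the function names `circulant_shifts` and `responses_via_fft` are hypothetical names chosen for illustration. It builds the shifted sample set explicitly, with each row shifted one more position to the left, and then checks that applying a linear filter to every shifted sample gives the same result as a single circular cross-correlation computed in the Fourier domain.

```python
import numpy as np


def circulant_shifts(b):
    """Stack all n cyclic shifts of the 1-D base sample b as rows.

    Row i is b shifted left by i positions, so row 0 is b itself and
    row 1 is (b2, b3, ..., bn, b1).
    """
    b = np.asarray(b, dtype=float)
    return np.stack([np.roll(b, -i) for i in range(b.size)])


def responses_via_fft(b, w):
    """Return B @ w for B = circulant_shifts(b) without forming B.

    (B w)[i] = sum_j b[(i + j) mod n] * w[j] is a circular
    cross-correlation, which for real-valued inputs becomes an
    element-wise product in the Fourier domain.
    """
    b_hat = np.fft.fft(np.asarray(b, dtype=float))
    w_hat = np.fft.fft(np.asarray(w, dtype=float))
    return np.real(np.fft.ifft(b_hat * np.conj(w_hat)))


# Illustrative values only; they are not taken from the paper.
base = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.array([0.5, -1.0, 2.0, 0.25])

B = circulant_shifts(base)
explicit = B @ weights                      # O(n^2): one dot product per shift
fast = responses_via_fft(base, weights)     # O(n log n): two FFTs and one inverse
print(np.allclose(explicit, fast))
```

The explicit matrix costs quadratic memory and time in the number of samples, whereas the Fourier-domain version never materializes the shifted samples, which is the source of the speed advantage described above.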
The goal is to find a function $g(b) = w^T b$ that minimizes the squared error between $g(b_i)$, evaluated over the samples $b_i$, and the regression targets $g_i$, as shown in (2):

$$
\operatorname*{arg\,min}_{w} \sum_{i}^{n} \left( w^T b_i - g_i \right)^2 + \lambda \lVert w \rVert^2
\tag{2}
$$

Here, $n$ denotes the total number of training samples, and $\lambda$ is the regularization parameter that prevents overfitting [13]. The closed-form solution [33] of the regression

IEEE A&E SYSTEMS MAGAZINE JULY 2023