Automatic Pedestrian Detection And Tracking With A Multiple Cue Max Margin Framework


Automatic Pedestrian Detection and Tracking with a Multiple-cue Max-margin Framework



Author: Ferdinand Stefanus

Language: en

Publisher:

Release Date: 2010



Object tracking is the computer vision task of predicting an object's location throughout a video sequence. Estimating an object's trajectory is usually accomplished with a combination of cues, typically an appearance model that describes what the target object looks like and a motion model that describes its dynamics. In this thesis, we present MMTrack, a principled framework for integrating multiple cues in object tracking. The framework formulates object tracking as a structured prediction problem solved with a Structural Support Vector Machine. The formulation features joint learning of the appearance and motion model parameters, as well as the incorporation of both descriptive and discriminative appearance models. We also present a fully automatic pedestrian detection and tracking system based on MMTrack and report its performance on real-world data sets.
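
As a rough, illustrative sketch of the kind of inference a multiple-cue max-margin tracker performs (this is not the thesis's actual formulation), the Python snippet below scores candidate boxes with a linear combination of an appearance cue and a motion cue. The function names, the two-cue feature vector and the hand-set weights are assumptions made for the example, standing in for weights that a structural SVM would learn jointly from training sequences.

```python
import numpy as np

def mmtrack_step(candidates, appearance_score, motion_score, w):
    """Pick the candidate box maximising a learned linear combination of cues.

    candidates       -- iterable of candidate boxes, e.g. (x, y, width, height) tuples
    appearance_score -- callable box -> float (e.g. classifier/template response)
    motion_score     -- callable box -> float (e.g. agreement with predicted dynamics)
    w                -- cue weight vector; in a max-margin setting these weights
                        would be learned jointly, not set by hand
    """
    def joint_score(box):
        phi = np.array([appearance_score(box), motion_score(box)])  # joint cue features
        return float(w @ phi)

    return max(candidates, key=joint_score)

# Toy usage: three candidate boxes, hand-set cue functions and weights (assumed values).
boxes = [(10, 20, 32, 64), (12, 22, 32, 64), (40, 50, 32, 64)]
best = mmtrack_step(
    boxes,
    appearance_score=lambda b: -abs(b[0] - 12),   # prefers boxes near x = 12
    motion_score=lambda b: -abs(b[1] - 22),       # prefers boxes near y = 22
    w=np.array([0.7, 0.3]),
)
print(best)  # (12, 22, 32, 64)
```

The point of the sketch is only that, once the weights are learned, inference reduces to maximising a weighted sum of cue scores over the candidate set.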

Tackling Pedestrian Detection in Large Scenes with Multiple Views and Representations



Author: Nicola Pellicanò

Language: en

Publisher:

Release Date: 2018



Pedestrian detection and tracking have become important fields in computer vision research because of their implications for many applications, e.g. surveillance, autonomous cars and robotics. Pedestrian detection in high-density crowds is a natural extension of this body of research. The ability to track each pedestrian independently in a dense crowd has multiple applications: studying human social behaviour under high densities, detecting anomalies, and planning the infrastructure of large events. On the other hand, high-density crowds introduce new difficulties for the detection task. First, clutter and occlusion are taken to the extreme, so that only heads are visible, and they are not easily separable from the moving background. Second, heads are usually small (typically less than ten pixels in diameter) and have little or no texture. This follows from two independent constraints: the need for each camera to cover as large a field of view as possible, and the need for anonymisation, i.e. pedestrians must not be identifiable because of privacy concerns. In this work we develop a complete framework to handle pedestrian detection and tracking in the presence of these difficulties, using multiple cameras to implicitly cope with the heavy occlusion.

As a first contribution, we propose a robust method for camera pose estimation in surveillance environments. We handle problems such as large distances between cameras, strong perspective variations and scarcity of matching information by exploiting an entire video stream to perform the calibration, in such a way that it converges quickly to a good solution. Moreover, we are concerned not only with the global fitness of the solution but also with keeping local errors low.

As a second contribution, we propose an unsupervised multiple-camera detection method that exploits the visual consistency of pixels across views to estimate the presence of a pedestrian. After a fully automatic metric registration of the scene, the method jointly estimates the presence of a pedestrian and its height, allowing detections to be projected onto a common ground plane and thus enabling 3D tracking, which can be much more robust than tracking in image space.

In the third part, we study different methods for supervised pedestrian detection on single views. Specifically, we aim to build a dense pedestrian segmentation of the scene starting from spatially imprecise labels, i.e. head centres instead of full head contours, since extracting contours is infeasible in a dense crowd. Most notably, deep architectures for semantic segmentation are studied and adapted to the problem of small-head detection in cluttered environments.

As a last contribution, we propose a novel framework for efficient information fusion in 2D spaces. The final aim is to fuse multiple sensors (supervised detectors on each view and an unsupervised detector over multiple views) at ground-plane level, which is therefore our discernment frame. Since the space complexity of such a discernment frame is very large, we propose an efficient compound-hypothesis representation, which has been shown to be invariant to the scale of the search space. Through this representation we can define efficient basic operators and combination rules of Belief Function Theory. Furthermore, we propose a complementary graph-based description of the relationships between compound hypotheses (i.e. intersection and inclusion) that supports efficient algorithms for, e.g., high-level decision making. Finally, we demonstrate our information fusion approach both at the spatial level, i.e. between detectors of different natures, and at the temporal level, by performing evidential tracking of pedestrians on real large-scale scenes in sparse and dense conditions.
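
To make the Belief Function Theory fusion step more concrete, here is a minimal, naive Python sketch of Dempster's rule of combination for two detectors assigning mass over ground-plane cells. The frame, the mass values and the dictionary representation are toy assumptions; this explicit enumeration of focal elements is exactly what does not scale to large discernment frames, which is the problem the thesis's compound-hypothesis representation is meant to address.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Mass functions are given as {frozenset_of_hypotheses: mass} dictionaries,
    each summing to 1 over its focal elements.
    """
    fused, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb                      # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources are incompatible")
    return {k: v / (1.0 - conflict) for k, v in fused.items()}  # renormalise

# Toy discernment frame: two ground-plane cells. A supervised detector and an
# unsupervised multi-view detector each assign mass to "pedestrian in cell c1",
# "pedestrian in cell c2", or the whole frame (ignorance). All values are assumed.
frame = frozenset({"c1", "c2"})
m_supervised = {frozenset({"c1"}): 0.6, frame: 0.4}
m_unsupervised = {frozenset({"c1"}): 0.5, frozenset({"c2"}): 0.2, frame: 0.3}
print(dempster_combine(m_supervised, m_unsupervised))
```

The pairwise enumeration grows quadratically with the number of focal elements, and the frame itself grows combinatorially with the number of ground-plane cells, hence the need for a scale-invariant compound-hypothesis representation in the full framework.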

Tracking Multiple Pedestrians from Monocular Videos in an Interacting Multiple Model Framework



Author: Zhengqiang Jiang

Language: en

Publisher:

Release Date: 2013



[Truncated abstract] Detecting and tracking pedestrians in videos has many important computer vision applications, including visual surveillance, people recognition, smart environments and human-machine interaction. An automatic pedestrian tracking system comprises two main stages: (1) detecting pedestrians in the initial frame or in every frame; and (2) maintaining the identities of pedestrians across video sequences. Tracking pedestrians in videos is a challenging problem because of background clutter, lighting variation and occlusion. A reliable pedestrian tracking system should be capable of maintaining the identities of pedestrians in the presence of partial, or even occasionally full, occlusion.

The appearance model of a pedestrian is widely used in pedestrian tracking. For the appearance model, I compute colour histograms for the upper and lower bodies of each pedestrian detected by the Histogram of Oriented Gradients (HOG) human detector. Each of these colour histograms is a 3-dimensional entity in the L*a*b colour space, and concatenating them yields a 4-dimensional tensor. This incorporates spatial information into the appearance model while keeping the computational cost to a minimum. To obtain a better estimate of the colour histogram, kernel density estimation is used to smooth the histogram of each pedestrian's appearance. The Hellinger distance is used for histogram matching.

In this thesis, a multiple pedestrian tracking method for monocular videos captured by a fixed camera is presented within an Interacting Multiple Model (IMM) framework. The method involves multiple IMM trackers running in parallel, tied together by a robust data association component. It formulates data association as a bipartite graph problem and employs the Munkres algorithm to associate new observations with existing tracks. The data association algorithm combines information from the appearance and motion models to obtain the edge weights of the graph. Two data association strategies are designed and compared: thresholding on the total error (TTE), and validation gate and appearance error (VAE). The first strategy imposes a threshold on the total error from the appearance and motion models used as the edge weights of the bipartite graph. The second strategy uses the combined motion model as a validation gate and only the error from the appearance model as the edge weights. The experimental results show that the second strategy gives better tracking results. Short-term occlusions and false negatives from the detector are handled with a sliding window of video frames in which tracking persists in the absence of observations. The interacting multiple model framework incorporates three motion models: a stationary model, a constant velocity model and a constant acceleration model. The method has been evaluated and compared both qualitatively and quantitatively with three common tracking methods on various standard video databases...
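
As an illustrative sketch of the appearance-based data association described above (not the thesis's code), the Python snippet below matches detections to existing tracks by building a Hellinger-distance cost matrix over colour histograms and solving the resulting bipartite assignment with the Hungarian/Munkres solver from SciPy. Combining appearance with motion error (the TTE strategy) or gating by motion (the VAE strategy) is omitted, and the cost threshold and histogram bin counts are assumed values.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian / Munkres solver

def hellinger(p, q):
    """Hellinger distance between two colour histograms (any shape, non-negative)."""
    p = p / p.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))                    # Bhattacharyya coefficient
    return float(np.sqrt(max(0.0, 1.0 - bc)))

def associate(track_hists, det_hists, max_cost=0.6):
    """Match detections to existing tracks by minimising total appearance cost.

    track_hists, det_hists -- lists of histograms (numpy arrays)
    Returns (track_index, detection_index) pairs whose cost is below max_cost;
    unmatched tracks and detections are left to track management, e.g. the
    sliding-window handling of missed detections described above.
    """
    cost = np.array([[hellinger(t, d) for d in det_hists] for t in track_hists])
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] < max_cost]

# Toy usage with random 8x8x8 L*a*b histograms (assumed bin counts).
rng = np.random.default_rng(0)
tracks = [rng.random((8, 8, 8)) for _ in range(3)]
detections = [tracks[1] + 0.01 * rng.random((8, 8, 8)), rng.random((8, 8, 8))]
print(associate(tracks, detections))
```

In the thesis's second strategy (VAE), the motion model would first gate which track-detection pairs are admissible, and only the appearance error would populate the cost matrix; this sketch skips the gating step for brevity.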