Traffic Surveillance
The Problems
Given real-time images from traffic cameras (i.e., live traffic video) located throughout a metropolitan area, our initial interest is the automatic detection of vehicles traveling on the transportation network. This is a first step towards vehicle identification, where each “different” detected vehicle is associated with a unique identification number. Vehicle identification may in turn lead to tracking a vehicle within a single camera and across different cameras in the network.

Tracking allows the computation of live traffic statistics indicating the current conditions of roads: congestion and travel time. The estimation of car speed, particularly high speed, could be effective in discovering dangerous, aggressive driving or possible illegal activity. Another application is the detection of unusual events such as accidents, suspicious stopped cars, stalled vehicles needing assistance, and drunk driving.

Ultimately, tracking vehicles across different cameras corresponds to path computation: the path of any individual vehicle in the transportation network could be retrieved from the sequence of camera detections. This way, traffic patterns could be analyzed. Such real data could be used in traffic planning to identify congestion bottlenecks, improve traffic-light timing, and design detours due to roadwork, parades, etc.

The Input
Live traffic video is subject to various technical issues. The same link may be shared by different cameras. The streaming video may be corrupted by communication problems. The images are plagued by noise or unwanted text. The environment alters the image with raindrops and dust. In addition, the video quality is limited to a low resolution (356 x 244) and a very low frame rate (0.475-0.675 fps). In order to work on traffic surveillance problems, we assume that the cameras do not move and that the input videos are free of such technical defects.
Technical issues with live traffic video.

We captured five live traffic videos from two metropolitan areas: Washington D.C. (4 videos) and New York City (1 video). The videos present traffic under different conditions of weather (sunny, cloudy, and rainy), lighting (day and night), and vehicle population (sparse and dense). Next, we discuss each input, its conditions, and important issues.

Video A shows traffic at the intersection of 15th St NW and Pennsylvania Ave in Washington D.C. The conditions are a sunny day with a sparse population of vehicles (low traffic). The main challenge in this video is the tree shadows, which make it harder to distinguish moving cars from parts of the road without sunlight.

Video A: 15th St NW and Pennsylvania Ave (535Kb)

Video B presents the same intersection of downtown D.C., but at night with low traffic. In this case, an important issue is the effect that car headlights have in changing the visual appearance of the road image.

Video B: 15th St NW and Pennsylvania Ave (492Kb)

Video C portrays Times Square in New York City. This is the best example of independent visual activities such as pedestrians and luminous ads. The conditions are a cloudy day with average traffic.

Video C: Broadway and 46 St (1.07Mb)

Video D shows another intersection in Washington D.C., at 3rd St, H St, and Massachusetts Ave. The conditions are a cloudy day with a sparse population of vehicles.

Video D: 3rd St and H St and Massachusetts Ave NW (623Kb)

Video E presents a segment of Interstate 495 (Beltway) near Connecticut Ave. The conditions are a rainy day with high traffic (very dense). A dense population of vehicles makes it harder to construct a good background. Furthermore, the process of detecting vehicles as blobs (connected components) is subject to blob merging when the population of vehicles is dense.

Video E: I-495 and Connecticut Ave (MD 185) (666Kb)

A Basic Approach for Vehicle Detection
A basic approach for the detection of moving objects in images is background subtraction. Initially, a background image is constructed for each camera using the corresponding input video. For each pixel location in the background image, the color of the pixel is the median of the colors at that location over all frames of the video. This background construction technique is successful in most cases, generating a very low noise image of the static content in the video. However, as mentioned above, a very dense population of vehicles leads to artifacts in the background image. This is observed in Background E below as the dark tracks in the middle of each lane of I-495.
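The median construction described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists of intensities; for color video, the same function could be applied to each channel separately, which is an assumption and not something the text specifies. The function name `median_background` and the toy data are hypothetical.

```python
from statistics import median

def median_background(frames):
    """Per-pixel temporal median of equally sized grayscale frames."""
    if not frames:
        raise ValueError("at least one frame is required")
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]

# Toy 1x3 video (made-up values): a bright "vehicle" (200)
# crosses a dark road (50), one pixel per frame.
frames = [
    [[200, 50, 50]],
    [[50, 200, 50]],
    [[50, 50, 200]],
]
background = median_background(frames)  # the vehicle is removed at every pixel
```

The median recovers the road color only while a pixel shows the road in more than half of the frames. When vehicles cover a pixel most of the time, the median takes on the vehicle color instead, which is consistent with the artifacts reported for dense traffic.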
Once a background image is built, it is subtracted from each frame of the original video in order to detect moving objects. The assumption is that a moving object has a different visual appearance from the constructed static background.

Background Subtraction

The input videos contain a fair share of pedestrians, bikers, luminous ads, and other independent distractions. In order to eliminate the influence of such visually dynamic aspects, we define a region of interest for each camera. The background subtraction values are kept only in the region of interest. All other values are set to a zero difference.
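A minimal sketch of the subtraction and masking steps follows. It assumes grayscale images as nested lists and a region of interest given as a boolean mask of the same size; the text does not say how the region is represented, so the mask, the function name `masked_difference`, and the sample values are all hypothetical.

```python
def masked_difference(frame, background, roi):
    """Absolute frame/background difference, zeroed outside the region of interest."""
    return [
        [abs(f - b) if keep else 0 for f, b, keep in zip(f_row, b_row, r_row)]
        for f_row, b_row, r_row in zip(frame, background, roi)
    ]

# Made-up 1x4 example: a vehicle at x=1 and a luminous ad at x=3.
background = [[50, 50, 50, 50]]
frame = [[52, 180, 50, 220]]
roi = [[True, True, True, False]]  # x=3 lies outside the road

diff = masked_difference(frame, background, roi)
```

In this toy case the ad at x=3 differs strongly from the background but contributes a zero difference because it falls outside the mask, while the vehicle at x=1 keeps its large difference.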
Region of interest for each camera.

With the background subtraction restricted to the region of interest, motion pixels are identified as the locations where the absolute difference in appearance is greater than a threshold value. After that, blobs possibly representing cars are found as the connected components of the motion pixels in the image. Components with a very small area are filtered out.

Results
The results of this basic approach were evaluated as the percentages of false negatives (i.e., vehicles not detected) and false positives (i.e., artifacts detected as vehicles) relative to the true total number of vehicles. If the same car appears in two consecutive frames, it is counted twice, since vehicle identification is not considered so far.
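The thresholding and blob-extraction step of the basic approach, and the error percentages used to evaluate it, can be sketched as follows. This is an illustration under stated assumptions: 4-connectivity, the threshold of 40, the minimum area of 2, and the names `find_blobs` and `error_rates` are hypothetical choices, and the counts passed to `error_rates` are made-up numbers, not results from the videos.

```python
from collections import deque

def find_blobs(diff, threshold, min_area):
    """Return blobs (lists of (y, x)) of 4-connected pixels with diff > threshold."""
    height, width = len(diff), len(diff[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or diff[y0][x0] <= threshold:
                continue
            # Breadth-first flood fill collects one connected component.
            seen[y0][x0] = True
            queue, blob = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and diff[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) >= min_area:  # drop very small components
                blobs.append(blob)
    return blobs

def error_rates(true_vehicles, missed, spurious):
    """False negatives and false positives as percentages of the true vehicle count."""
    return 100.0 * missed / true_vehicles, 100.0 * spurious / true_vehicles

# Made-up difference image: one 2x2 vehicle and one isolated noisy pixel.
diff = [
    [0, 90, 90, 0, 0],
    [0, 90, 90, 0, 70],
    [0, 0, 0, 0, 0],
]
blobs = find_blobs(diff, threshold=40, min_area=2)

# Hypothetical per-frame counts, for illustration only.
fn_pct, fp_pct = error_rates(true_vehicles=20, missed=3, spurious=1)
```

Here the isolated pixel exceeds the threshold but is discarded by the area filter, leaving a single four-pixel blob. Because both percentages share the true vehicle count as denominator, the false-positive percentage can exceed 100 when artifacts outnumber real vehicles.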
Evaluation of basic approach

Video output