
An Image Based Approach for Vehicle Detection

Chapter 1



The main objective of this work is to detect vehicles in still images by exploiting the wheels of the vehicle.

The goal of this work is to improve the detection of vehicles from the side view in still images.

The main contributions of this work are:

  • To modify existing vehicle detection methods to achieve better-quality vehicle detection output.
  • To apply an edge detection algorithm and the Hough circle detection algorithm to side-view still images of vehicles.

Originality of this work

The originality of this work lies in the modification of the vehicle detection methodologies proposed in [2][6]. The improved output is achieved by using the Canny edge detection algorithm [11] and the Hough circle detection algorithm [12].

Literature Review

Vehicle detection [1] is an important problem in many related applications, such as self-guided vehicles, driver assistance systems, intelligent parking systems, and the measurement of traffic parameters like vehicle count, speed, and flow. The most common approach to vehicle detection is to use active sensors such as lasers or millimeter-wave radars. Prototype vehicles employing active sensors have shown promising results; however, active sensors have several drawbacks: they offer low resolution, may interfere with each other, and are rather expensive. Passive sensors, on the other hand, such as cameras, offer a more affordable solution and can more effectively track cars entering a curve or moving from one side of the road to another. Moreover, visual information can be very important in a number of related applications, such as lane detection, traffic sign recognition, and object identification (e.g., pedestrians, obstacles).

One of the most common approaches to vehicle detection is to use vision-based [10] techniques to analyze vehicles in images or videos. However, due to variations in vehicle colors, sizes, orientations, shapes, and poses, developing a robust and effective vision-based vehicle detection system is very challenging. To address these problems, different approaches using different features and learning algorithms [2-7] for locating vehicles have been investigated. For example, many techniques use background subtraction [17] to extract motion features for detecting moving vehicles in video sequences. However, this kind of motion feature is not available in still images.

Previous works have also attempted to use PCA for front-view classification of vehicles. Z. Sun et al. [7] use support vector machines after applying a Gabor filter bank and template matching to rear views of vehicles. M. Bertozzi et al. [9] use stereo correspondence matching to find vehicles. None of these works use a side view, and most do not single out one specific feature to detect.

As mentioned before, vehicles exhibit large appearance variations: their colors, sizes, and shapes change with viewing position, lighting conditions, and cluttered backgrounds. These variations make it difficult to select a single general feature that describes vehicles. In this thesis, a novel wheel-based method to detect vehicles in still images is proposed. The goal is to use one specific, ubiquitous feature common to all cars: wheels. All car wheels are round and have similar texture, and they are all framed similarly, with a fender on top and the roadbed on the bottom. This work takes advantage of this information to form a robust wheel detector that can be used as part of a vehicle detector. One application of this work is obstacle avoidance: the algorithm can detect a wheel in the blind spot, or determine whether a wheel is getting too close for comfort.

Organization of this thesis

Chapter one describes the background and objectives of this work; the originality of the work and a literature review of vehicle detection in still images are also presented here.

Chapter two is about some image processing basics (like histogram, color model, contrast enhancement).

Chapter three is focused on describing methodologies concerned in this work. The new method is also proposed in this chapter.

Experimental results and discussion are presented in chapter four.

Finally, chapter five draws the conclusion.

Chapter 2


Digital Imaging

Digital imaging or digital image acquisition is the creation of digital images, typically from a physical scene. The term is often assumed to imply or include the processing, compression, storage, printing, and display of such images. The most usual method is by digital photography with a digital camera but other methods are also employed.

Digital Imaging Methods

A digital photograph may be created directly from a physical scene by a camera or similar device. Alternatively, a digital image may be obtained from another image in an analog medium, such as photographs, photographic film, or printed paper, by an image scanner or similar device. Many technical images—such as those acquired with tomographic equipment, side-scan sonar, or radio telescopes—are actually obtained by complex processing of non-image data. Weather radar maps as seen on television news are a commonplace example. The digitalization of analog real-world data is known as digitizing, and involves sampling (discretization) and quantization. Finally, a digital image can also be computed from a geometric model or mathematical formula. In this case the name image synthesis is more appropriate, and it is more often known as rendering.

Image representation

An image can be represented as a 2D array: the indices represent spatial location, and the values represent light intensity.

Digital Image Processing

Any 2D mathematical function that bears information can be represented as an image. A digital image is an array of real or complex numbers represented by a finite number of bits. Digital image processing generally refers to processing of a 2D picture by a digital computer.

Digital image processing focuses on two major tasks

  • Improvement of pictorial information for human interpretation
  • Processing of image data for storage, transmission and representation for autonomous machine perception

Image Representation and modeling

The goal of image modeling or representation is to find proper ways to mathematically describe and analyze images; it is therefore the most fundamental step in image processing. An image could represent the luminance of objects in a scene (an image taken by a camera), the absorption characteristics of body tissue or material particles (X-ray imaging), the radar cross-section of a target (radar imaging), the temperature profile of a region (infrared imaging), or the gravitational field in an area (geophysical imaging).

 Image Model

An image can be represented by a matrix U whose elements u(i, j), for 0 ≤ i < N (row) and 0 ≤ j < M (column), are called the picture elements, or pixels.

The image resolution is the size MxN (width x height) in pixels. The spatial resolution is the size covered by a pixel in the real world.

Types of Digital Image

The images types we will consider are: 1) binary, 2) gray-scale, 3) color, and 4) multispectral.

  • Binary Images

                 Binary images are the simplest type of images and can take on two values, typically black and white, or 0 and 1. A binary image is referred to as a 1-bit image because it takes only 1 binary digit to represent each pixel. These types of images are frequently used in applications where the only information required is general shape or outline, for example optical character recognition (OCR).

                 Binary images are often created from the gray-scale images via a threshold operation, where every pixel above the threshold value is turned white (‘1’), and those below it are turned black (‘0’).
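As a quick illustration, the threshold operation just described can be sketched in a few lines of Python (a hedged sketch, not part of this thesis; the image is modeled as a plain 2D list of 0–255 values):

```python
def binarize(gray, threshold):
    """Turn a gray-scale image (2D list of 0-255 values) into a binary
    image: 1 (white) for pixels above the threshold, 0 (black) otherwise."""
    return [[1 if px > threshold else 0 for px in row] for row in gray]

gray = [[ 12, 200],
        [130,  90]]
print(binarize(gray, 128))  # [[0, 1], [1, 0]]
```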

  • Gray-scale images

                 Gray-scale images are referred to as monochrome (one-color) images.

They contain gray-level information and no color information. The number of bits per pixel determines the number of different gray levels available. A typical gray-scale image contains 8 bits/pixel, which allows 256 different gray levels. In applications like medical imaging and astronomy, 12- or 16-bit/pixel images are used. These extra gray levels become useful when a small section of the image is enlarged to discern details.

  • Color Images

Color images can be modeled as three-band monochrome image data, where each band of data corresponds to a different color. The actual information stored in the digital image data is the gray-level information in each spectral band.

Typical color images are represented as red, green, and blue (RGB images). Using the 8-bit monochrome standard as a model, the corresponding color image would have 24-bits/pixel (8-bits for each of the three color bands red, green, and blue).

  • Multispectral images

Multispectral images typically contain information outside the normal human perceptual range. This may include infrared, ultraviolet, X-ray, acoustic, or radar data. These are not images in the usual sense because the information represented is not directly visible by the human system. However, the information is often represented in visual form by mapping the different spectral bands to RGB components.

Image enhancement

The goal of image enhancement is to emphasize certain image features for subsequent analysis or for image display. It deals with contrast manipulation, edge detection, pseudo coloring, noise filtering, sharpening, and magnifying.

Image analysis

Image analysis is concerned with making quantitative measurements from an image to produce a description of it. Image analysis techniques deal with the extraction of features that aid in the identification of objects, and in navigation and tracking. It is one of the main tools used in computer vision.

Linear Contrast Stretching

Linear contrast stretching is a linear mapping that enhances the contrast of an image without removing any detail: it spreads the available visual information across a greater range of gray-scale intensities.

The left image appears washed-out (most of the intensities are in a narrow band due to poor contrast).  The right image maps those values to the full available dynamic range.
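A minimal sketch of such a linear stretch (illustrative Python, not the thesis implementation; it is assumed here that the observed intensity band [lo, hi] is mapped onto the full [0, 255] range):

```python
def stretch(gray, out_min=0, out_max=255):
    """Linearly map the input intensity range [lo, hi] onto
    [out_min, out_max], spreading the values over the full range."""
    flat = [px for row in gray for px in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                      # flat image: nothing to stretch
        return [row[:] for row in gray]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (px - lo) * scale) for px in row] for row in gray]

washed_out = [[100, 110], [120, 130]]   # all values in a narrow 30-level band
print(stretch(washed_out))              # [[0, 85], [170, 255]]
```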

Image Histogram

  • The histogram of an image is a table containing (for every gray level K) the probability of level K actually occurring in the image.
  • The histogram can also be viewed as a frequency distribution of gray levels within the image.

The histogram above is representative of an “under-exposed” image. It has very few “bright” pixels and doesn’t make good use of the full dynamic range available.

The histogram above is representative of an “over-exposed” image. It has very few “dark” pixels and doesn’t make good use of the full dynamic range available.

The histogram above is representative of a “poor-contrast” image. It has very few “dark” and very few “light” pixels, and doesn’t make good use of the full dynamic range available.

The histogram above is representative of an image with good contrast. It makes good use of the full dynamic range available.

Cumulative Distribution Function

The CDF of an image is a table containing (for every gray level K) the probability of a pixel of level K OR LESS actually occurring in the image.

The CDF can be computed from the histogram as

CDF(K) = h(0) + h(1) + … + h(K),

where h(k) is the histogram value (probability) of gray level k.

CDF is a monotonically increasing function.

The CDF of an image having uniformly distributed pixel levels is a straight-line with slope 1 (using normalized gray levels). The derivative of the CDF is constant.
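Under the definitions above, the histogram and its CDF can be computed as follows (an illustrative Python sketch using a tiny 4-level image as an example):

```python
def histogram(gray, levels=256):
    """Probability of each gray level K occurring in the image."""
    flat = [px for row in gray for px in row]
    n = len(flat)
    counts = [0] * levels
    for px in flat:
        counts[px] += 1
    return [c / n for c in counts]

def cdf(hist):
    """Running sum of the histogram: P(level <= K). Monotone, ends at 1."""
    out, total = [], 0.0
    for p in hist:
        total += p
        out.append(total)
    return out

img = [[0, 1], [1, 3]]
h = histogram(img, levels=4)
print(h)        # [0.25, 0.5, 0.0, 0.25]
print(cdf(h))   # [0.25, 0.75, 0.75, 1.0]
```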

 Histogram Equalization                  

Histogram equalization is a process that automatically distributes pixel values evenly throughout the image:

  • Each gray level should appear with identical probability in the image.
  • Often enhances an image, but not always.

Consider a discrete gray-scale image {x} and let n_i be the number of occurrences of gray level i. The probability of an occurrence of a pixel of level i in the image is

p_x(i) = n_i / n,   0 ≤ i < L,

with L being the total number of gray levels in the image and n being the total number of pixels in the image; p_x(i) is in fact the image’s histogram for pixel value i, normalized to [0,1].

Let us also define the cumulative distribution function corresponding to p_x as

cdf_x(i) = p_x(0) + p_x(1) + … + p_x(i).

This is also the image’s accumulated normalized histogram.

We would like to create a transformation of the form y = T(x) to produce a new image {y}, such that its CDF will be linearized across the value range, i.e.

cdf_y(i) = iK

for some constant K. The properties of the CDF allow us to perform such a transform. It is defined as

y = T(x) = cdf_x(x).

Notice that T maps the levels into the range [0,1]. In order to map the values back into their original range, the following simple transformation needs to be applied to the result:

y′ = y · (max − min) + min.
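Putting the pieces together, the equalization transform (map each level through the cumulative distribution and rescale back to the original level range) can be sketched as follows (illustrative Python, not the thesis code):

```python
def equalize(gray, levels=256):
    """Histogram equalization: map each level x to cdf(x), then rescale
    the [0,1] result back onto the original [0, levels-1] range."""
    flat = [px for row in gray for px in row]
    n = len(flat)
    counts = [0] * levels
    for px in flat:
        counts[px] += 1
    cum, total = [], 0
    for cnt in counts:            # cumulative pixel counts
        total += cnt
        cum.append(total)
    # T(x) = cdf(x) lies in [0,1]; multiply by (levels-1) to restore the range
    return [[round(cum[px] / n * (levels - 1)) for px in row] for row in gray]

print(equalize([[0, 1], [1, 3]], levels=4))  # [[1, 2], [2, 3]]
```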

Histogram Modeling

The histogram of an image represents the relative frequency of occurrence of the various gray levels in the image. The histogram modeling techniques modify an image globally so that its histogram has a desired shape.

Histogram Specification

Histogram specification is a way to transfer the lighting of one image to another; that is, it converts the histogram of one image to match that of another without changing the spatial arrangement of its pixels.

 Color and Color Model

Color is the visual perceptual property corresponding in humans to the categories called red, green, blue and others. Color derives from the spectrum of light interacting in the eye with the spectral sensitivities of the light receptors. A color model is an abstract mathematical model describing the way colors can be represented as tuples of numbers, typically as three or four values or color components.

Color representation

In the traditional artist’s model there are three primary colors: red, yellow, and blue. The secondary colors are green, orange, and purple, and the tertiary colors are combinations of the first two sets.

Colors and Electromagnetic Spectrum

Wavelength of visible light: approximately 380–780 nm

RED = 700 nm, GREEN = 546.1 nm and BLUE = 435.8 nm

Three Colors Theory

Thomas Young (1802) stated that any color can be reproduced by mixing an appropriate set of three primary colors. Light source uses additive color models. Light absorption uses subtractive color models. The HVS uses three kinds of cones with response peak in the yellow-green, the green and the blue regions with significant overlap. The human eye cannot resolve the components of a color mixture; therefore monochromatic colors are not unique for the HVS. The HVS is sensitive to dozens of grey levels and thousands of colors.

Color Models

  • Color models attempt to mathematically describe the way that humans perceive color
  • The human eye combines 3 primary colors (using the 3 different types of cones) to discern all possible colors.
  • Colors are just different light frequencies
    • red – 700nm wavelength
    • green – 546.1 nm wavelength
    • blue – 435.8 nm wavelength
  • Lower frequencies are cooler colors

Primary Colors

  • Primary colors of light are additive
  • Primary colors are red, green, and blue
  • Combining red + green + blue yields white
  • Primary colors of pigment are subtractive
  • Primary colors are cyan, magenta, and yellow
  • Combining cyan + magenta + yellow yields black

RGB color model

The RGB color model is an additive color model in which red, green, and blue light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors: red, green, and blue. The main purpose of the RGB color model is the sensing, representation, and display of images in electronic systems, such as televisions and computers, though it has also been used in conventional photography. Before the electronic age, the RGB color model already had a solid theory behind it, based on human perception of colors.

 CMY color model

It is possible to achieve a large range of the colors seen by humans by combining cyan, magenta, and yellow transparent dyes/inks on a white substrate; these are the subtractive primary colors. Often a fourth ink, black, is added to improve the reproduction of some dark colors. This is called the “CMY” or “CMYK” color space.

The cyan ink absorbs red light but transmits green and blue, the magenta ink absorbs green light but transmits red and blue, and the yellow ink absorbs blue light but transmits red and green. The white substrate reflects the transmitted light back to the viewer. Because in practice the CMY inks suitable for printing also reflect a little bit of color, making a deep and neutral black impossible, the K (black ink) component, usually printed last, is needed to compensate for their deficiencies. The dyes used in traditional color photographic prints and slides are much more perfectly transparent, so a K component is normally not needed or used in those media.
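The relationship between the additive and subtractive primaries can be illustrated with the usual conversion formulas (a sketch; channels are assumed normalized to [0, 1], and the K-extraction step shown is one common convention, not from this thesis):

```python
def rgb_to_cmy(r, g, b):
    """Subtractive primaries are the complements of the additive ones:
    C = 1-R, M = 1-G, Y = 1-B (channels normalized to [0, 1])."""
    return (1 - r, 1 - g, 1 - b)

def cmy_to_cmyk(c, m, y):
    """Pull the common gray component out into a separate black (K) ink."""
    k = min(c, m, y)
    if k == 1:                    # pure black: avoid division by zero
        return (0, 0, 0, 1)
    return ((c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k)

print(rgb_to_cmy(1, 0, 0))         # (0, 1, 1)  -- red absorbs no red ink (cyan = 0)
print(cmy_to_cmyk(0.5, 0.5, 0.5))  # (0.0, 0.0, 0.0, 0.5)  -- mid gray is all K
```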

 YIQ Color Model

  • Luminance (Y), In-phase (I), and Quadrature (Q)
  • Used for TV broadcasts – backward compatible with monochrome TV standards
  • Luminance is the black-and-white component
  • The human visual system is more sensitive to changes in intensity than in color.
  • In NTSC, the bandwidth allocation of Y, I, and Q is 4 MHz, 1.5 MHz, and 0.6 MHz respectively.

HSI Color Model

The HSI model is based on human perception of colour: colour is “decoupled” from intensity.

  • HUE

–     A subjective measure of colour.

–     Average human eye can perceive ~200 different colours

  • Saturation

–     Relative purity of the colour.  Mixing more “white” with a colour reduces its saturation.

–     Pink has the same hue as red but less saturation

  • Intensity

–     The brightness or darkness of an object

In color image processing, RGB images are often converted to HSI and then the I component is manipulated.  The image is then converted back to RGB.

Converting between RGB and HSI

The conversion from RGB to HSI can be written as

I = (R + G + B) / 3

S = 1 − 3 · min(R, G, B) / (R + G + B)

H = θ, where θ = cos⁻¹( ½[(R − G) + (R − B)] / √((R − G)² + (R − B)(G − B)) )

If B is greater than G, then H = 360° − θ.
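A sketch of this conversion in Python (the formulas follow the standard geometric derivation of HSI, with normalized RGB values in [0, 1] assumed; this is illustrative, not the thesis code):

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert normalized RGB (each in [0,1]) to HSI. H is in degrees;
    the branch follows the rule: if B > G then H = 360 - theta."""
    i = (r + g + b) / 3
    s = 0 if i == 0 else 1 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    theta = math.degrees(math.acos(num / den)) if den else 0.0
    h = 360 - theta if b > g else theta
    return h, s, i

print(rgb_to_hsi(1, 0, 0))   # pure red: hue 0, fully saturated, intensity 1/3
```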


Contrast

Contrast is the difference in luminance and/or color that makes an object (or its representation in an image or display) distinguishable. In visual perception of the real world, contrast is determined by the difference in the color and brightness of the object and other objects within the same field of view. Because the human visual system is more sensitive to contrast than to absolute luminance, we can perceive the world similarly regardless of the huge changes in illumination over the day or from place to place. The maximum contrast of an image is its contrast ratio or dynamic range.

Contrast is also the difference between the color or shading of the printed material on a document and the background on which it is printed, for example in optical character recognition.

Definitions of Image Contrast

There are many possible definitions of contrast. Some include color; others do not. Travnikova laments, “Such a multiplicity of notions of contrast is extremely inconvenient. It complicates the solution of many applied problems and makes it difficult to compare the results published by different authors.”

Various definitions of contrast are used in different situations. Here, luminance contrast is used as an example, but the formulas can also be applied to other physical quantities. In many cases, the definitions of contrast represent a ratio of the type

(luminance difference) / (average luminance).

The rationale behind this is that a small difference is negligible if the average luminance is high, while the same small difference matters if the average luminance is low (see Weber–Fechner law). Below, some common definitions are given.

The Weber contrast is defined as

C = (I − I_b) / I_b,

with I and I_b representing the luminance of the features and the background luminance, respectively. It is commonly used in cases where small features are present on a large uniform background, i.e. the average luminance is approximately equal to the background luminance.

The Michelson contrast (also known as the visibility) is commonly used for patterns where both bright and dark features are equivalent and take up similar fractions of the area. It is defined as

C = (I_max − I_min) / (I_max + I_min),

with I_max and I_min representing the highest and lowest luminance. The denominator represents twice the average luminance.
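Both definitions are one-line computations; the sketch below (illustrative Python with made-up sample luminance values) evaluates them:

```python
def weber(feature_lum, background_lum):
    """Weber contrast: (I - I_b) / I_b, for small features on a
    large uniform background."""
    return (feature_lum - background_lum) / background_lum

def michelson(lum_max, lum_min):
    """Michelson contrast (visibility): (Imax - Imin) / (Imax + Imin)."""
    return (lum_max - lum_min) / (lum_max + lum_min)

print(weber(150, 100))     # 0.5  -- feature 50% brighter than background
print(michelson(200, 50))  # 0.6
```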

Category of Contrast Enhancement Techniques

Category of contrast enhancement techniques includes –

1) Direct method and indirect method.

2) Global (global contrast enhancement) technique and local technique.

Direct and Indirect Method

Direct method defines a contrast measure and improves it. Various definitions of contrast are used in different situations.

Indirect methods are based on histogram analysis, which is used for controlling image contrast and brightness; histogram modification techniques fall into this category. They exploit the under-utilized regions of the dynamic range and do not define a specific contrast term. These techniques modify the image through some pixel mapping.

Global and Local Technique

The contrast can be enhanced either globally or locally.

In the global method (global contrast enhancement), a single mapping function derived from the entire image is used.

In the local method, the neighborhood of each pixel is used to obtain a local mapping function.

Chapter 3


The basic algorithm used for vehicle detection involves the following procedure:

First, still color images showing the side view of a vehicle are taken. Some restrictions are needed for a good result, such as a full side view of a single vehicle; the full side view is needed for an accurate and effective result. A full side-view image looks like the following one.

The first step of this algorithm is to transform the image from a color image to gray scale; namely, the RGB color image is converted to a gray-scale image using the standard luminance equation

Gray = 0.299 R + 0.587 G + 0.114 B.
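This conversion is straightforward; a sketch in Python, assuming the common 0.299/0.587/0.114 luminance weights (the weighting MATLAB's rgb2gray uses):

```python
def rgb_to_gray(r, g, b):
    """Weighted luminance conversion: Gray = 0.299 R + 0.587 G + 0.114 B,
    rounded back to an integer gray level."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(rgb_to_gray(255, 255, 255))  # 255 -- white stays white
print(rgb_to_gray(255, 0, 0))      # 76  -- pure red maps to a dark gray
```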

After that, an edge detection algorithm is applied to the gray image to find its edge map. In this method, the Canny edge detector is used to find the edges of the image.

The next step is to find the circular objects in the image. The Hough circle detection algorithm is used to find the circles. Circle detection is appropriate here because every vehicle has wheels on each side, and side-view images of vehicles are used. Several circles will be detected, depending on the environment and the vehicle. These circles are possible candidates for the vehicle's wheels; other circular objects and shapes in the environment are also included.

The next step is to find the vehicle's wheels among the candidate circles. If there are only two circles, they are taken as the possible wheels of the vehicle. If there are more than two circles, the others must be removed based on some criteria. For this, two circles are picked at a time and their radius difference, vertical position difference, and distance from each other are calculated. The radius difference of the two circles should be minimal, since both wheels have the same radius; a threshold value is used, and pairs of circles whose radius difference is greater than the threshold are rejected. The vertical positions of the circles are also compared: the difference between the Y-values of the two circles should be minimal, since both wheels rest on the road. A threshold value is used here as well, and pairs with a difference greater than the threshold are rejected. Finally, the distance between the circles is calculated: a pair of wheels must not overlap, so there is a minimum distance between them. The pair of candidate circles that satisfies all of the above criteria is selected as the most probable pair of wheels.
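The pairwise filtering step can be sketched as follows (illustrative Python, not the thesis MATLAB code; circles are assumed to arrive from the Hough transform as (x, y, r) tuples, and the default thresholds follow the values reported in chapter 4):

```python
import itertools

def pick_wheels(circles, dr_thresh=3, dy_thresh=5):
    """Given candidate circles as (x, y, r) tuples, return the pair most
    likely to be the two wheels: nearly equal radii, nearly equal height,
    and non-overlapping. Thresholds are in pixels. Returns None if no
    pair passes all criteria."""
    best, best_score = None, None
    for (x1, y1, r1), (x2, y2, r2) in itertools.combinations(circles, 2):
        dr = abs(r1 - r2)
        dy = abs(y1 - y2)
        if dr > dr_thresh or dy > dy_thresh:
            continue                          # radii or heights too different
        if abs(x1 - x2) <= r1 + r2:
            continue                          # wheels must not overlap horizontally
        score = dr + dy                       # prefer the most similar pair
        if best_score is None or score < best_score:
            best, best_score = ((x1, y1, r1), (x2, y2, r2)), score
    return best

circles = [(50, 120, 20), (210, 121, 21),    # the two wheels
           (130, 40, 22), (55, 118, 5)]      # headlight-height and hubcap noise
print(pick_wheels(circles))                  # ((50, 120, 20), (210, 121, 21))
```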

The above discussion specifies the vehicle detection algorithm. The algorithm was found to be highly accurate and efficient when a full side-view image (such as the input image) is available. It uses basic image processing and circle detection methods and is easy to implement.

Vehicle Detection Mechanism

The vehicle detection algorithm is implemented on top of wheel detection. Since vehicle wheels can be detected in side views of images, the vehicle detection mechanism is highly dependent on circle detection. In this section, the circle detection technique is described first, and then the vehicle detection mechanism.

Edge Detection

This algorithm relies heavily on the edge-mapped image, in which the circular objects can be found. The Canny edge detection algorithm is used here for better results. The edge image is computed from the gray-scale version of the original input image; in some cases the gray-scale image must first be histogram-equalized to get a better result. A good edge map relies on sharp edges in the original image; the shadows and edges of the vehicle help make vehicle detection successful.

Circle Detection

This algorithm is also based on circle detection in the input image. After obtaining the edge-mapped image, the Hough circle detection technique is applied to find the circles in the image. The wheels of a vehicle must be circular in shape, and since side-view images are used, this makes the algorithm highly accurate and efficient. The environment may, however, contain other circular objects, which must be eliminated. To eliminate them, some constraints based on the positions of wheels on vehicles are applied to the detected circles, yielding the most probable pair of wheels in the side-view image.

Chapter 4


 Experiment Result

The proposed method was implemented and tested in MATLAB R2012a on a computer with an Intel Core i3 2.4 GHz CPU, 6 GB of RAM, and Windows 7. The efficiency of the proposed method was evaluated by both visual and numerical inspection on the following test set:

  • 20 car images (various car images from side view)
  • 30 motorcycle images (various motorcycle images captured from side view)

Analysis methods

  • Visual inspection (by expert)
  • Numerical Inspection  (using the calculation time and success rate)

The 20 car and 30 motorcycle images were analyzed and evaluated by both visual and numerical inspection to assess the efficiency of the proposed method.

For visual inspection, the 20 car images and 30 motorcycle images were analyzed by an expert to find the most accurate result. The detected wheels are displayed over the original image. Results of the initial study are given in the following figures.

The visual detection results for the motorcycle images are displayed below.

The radius search range used here is 10–100 for cars and 20–100 for motorcycles. The threshold for the radius difference is 3, and the threshold for the difference in vertical position (Y-value) is 5.

Vehicle type | Number of vehicles | Other circular objects detected | Number of detected vehicles | Detection percentage

Table 1: Vehicle detection result


The proposed vehicle detection algorithm can be applied only to side-view vehicle images, as the wheels are exposed only in the side view. Although the environment contains many other circular objects, the method successfully removes these unwanted objects, and the success rate is quite impressive.

Chapter 5


Vehicle detection is a complex and challenging task due to the complex nature of images, and it is a preliminary step in traffic monitoring and control systems. In this work, the preliminary step of vehicle detection is carried out by detecting wheels in side-view vehicle images. Color images are converted to gray-scale, and the edges of the images are then computed. By applying the Hough circle detection algorithm, car wheels are detected from the edge maps of the vehicle images; the wheel candidates are chosen from the detected circles and then tracked. Initial findings show promising results; however, further work is required to evaluate the performance of the proposed vehicle detection method.

Achievement of this thesis

  • Better vehicle detection by detecting the wheels of the vehicles from the side view of vehicles.
  • Wheels can be detected in a variety of conditions and on a variety of vehicle types. Some conditions are problematic, but a better non-road model will improve them.



Limitations

Like any human work, this system also has limitations.

  1. The main limitation is that vehicles can be detected only from the side view of a still image. The wheels must be exposed so that they can be detected, and they must be circular in shape.
  2. The calculation time is a little high, so the method cannot be applied to real-time traffic monitoring. The high calculation time is caused by the varying radii of vehicle wheels.
  3. Different categories of vehicles cannot be distinguished at this time; this is left for future work.

 Future work

The methodology proposed in this thesis can be further developed for vehicle detection purposes. Some promising directions for future work on this system are:

  1. The proposed methodology can be further extended to detect various kinds of vehicles.
  2. Detection of multiple vehicles in a single image will be a challenge as we have to differentiate the various wheels of various vehicles.
  3. Classification of vehicles may be possible by differentiating the wheel features of various vehicles and creating a corresponding detection mechanism.


References

[1] Z. Sun, G. Bebis, and R. Miller, “On-road vehicle detection: A review,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 694–711, May 2006.

[2] O. Achler and M. Trivedi, “Camera Based Vehicle Detection, Tracking, and Wheel Baseline Estimation Approach,” in Proc. 7th International IEEE Conference on Intelligent Transportation Systems, pp. 743–748, Oct. 2004.

[3] N. Matthews, P. An, D. Charnley, and C. Harris, “Vehicle detection and recognition in greyscale imagery,” Control Engineering Practice, vol. 4, pp. 473–479, 1996.

[4] S. Gupte et al., “Detection and classification of vehicles,” IEEE Trans. Intell. Transport. Syst., vol. 1, no. 2, pp. 119–130, Jun. 2000.

[5] Luo-Wei Tsai, Jun-Wei Hsieh, and Kuo-Chin Fan, “Vehicle Detection Using Normalized Color and Edge Map,” IEEE Transactions on Image Processing, vol. 16, no. 3, March 2007.

[6] Ofer Achler and Mohan M. Trivedi, “Vehicle Wheel Detector using 2D Filter Banks,” in Proc. IEEE Intelligent Vehicles Symposium, Parma, Italy, June 2004.

[7] Z. Sun, G. Bebis, and R. Miller, “On-road vehicle detection using Gabor filters and support vector machines,” presented at the IEEE Int. Conf. Digital Signal Processing, Santorini, Greece, Jul. 2002.

[8] C. Papageorgiou and T. Poggio, “A trainable system for object detection,” International Journal of Computer Vision, vol. 38, no. 1, pp. 15–33, 2000.

[9] M. Bertozzi, A. Broggi, and S. Castelluccio, “A real-time oriented system for vehicle detection,” J. Syst. Arch., pp. 317–325, 1997.

[10] Linda G. Shapiro and George C. Stockman, “Computer Vision”, pp 279-325, New Jersey, Prentice-Hall (2001).

[11] J. Canny, “A Computational Approach to Edge Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679–698, June 1986.

[12] Dimitrios Ioannou, Walter Huda, and Andrew F. Laine, “Circle recognition through a 2D Hough Transform and radius histogramming,” Image and Vision Computing, vol. 17, pp. 15–26, 1999.

[13] J. C. Rojas and J. D. Crisman, “Vehicle detection in color images,” in Proc. IEEE Conf. Intelligent Transportation System, Nov. 9–11, 1997, pp. 403–408.

[14] Soo-Chang Pei and Ji-Hwei Horng, “Circular arc detection based on Hough transform,” Pattern Recognition Letters, vol. 16, pp. 615–625, June 1995.

[15] S. M. Thomas and Y. T. Chan, “A simple approach for the estimation of circular arc center and its radius,” Computer Vision, Graphics, and Image Processing, vol. 45, pp. 362–370, 1989.

[16] H.K. Yuen, J. Princen, J. Illingworth, J. Kittler, “Comparative study of Hough Transform methods for circle finding”, Image and Vision Computing 8 (1990) 71–77.

[17] V. Kastinaki et al., “A survey of video processing techniques for traffic applications,” Image, Vis., Comput., vol. 21, no. 4, pp. 359–381, Apr. 2003.
