Road To Pixels
Welcome aboard. With technology advancing rapidly, we have seen how important Image Processing has become. This repository provides a practical, hands-on walk-through of the concepts a developer needs to start their Image Processing journey.
Contents
- Basics with Images
- Successive Rotations
- Interpolations
- Interpolations-Inverse Mapping
- Basic Transformations
- Perspective Transformation
- Estimating the Transformation
- Log and Contrast Stretching
- Shading Correction
- Laplacian
- Laplacian+Gaussian
- Laplacian, Sobel, CannyEdge
- Sobel-X and Y
- Histogram Equalisation
- Normalize Histogram
- Image Temperature
- Box Filter
- GaussianFilter+Kernels
- Morphological Processing
- Morphological Text Processing
- Morphological Fingerprint Processing
- Morphological Outline
- Capture Video Frames
- Video Background Subtraction
- VideoCapture_GoogleColab
- Contours-OpenCV
- Fitting Polygons
- Hough Lines
- Adaptive+Gaussian Thresholding
- OTSU Thresholding
- Grabcut
- Discrete Fourier Transformation
- OpenCV KMeans
- Object Movement Tracking
- Live Hand Gesture Recognition
Before we jump into the concepts, let us once have a look at the definition of Image Processing.
A Glance into Image Processing
Image processing is often viewed as arbitrarily manipulating an image to achieve an aesthetic standard or to support a preferred reality. However, image processing is more accurately defined as a means of translation between the human visual system and digital imaging devices. The human visual system does not perceive the world in the same manner as digital detectors, and display devices impose additional noise and bandwidth restrictions. Salient differences between human and digital detectors will be shown, along with some basic processing steps for achieving translation. Image processing must be approached in a manner consistent with the scientific method so that others may reproduce and validate one's results. This includes recording and reporting processing actions and applying similar treatments to adequate control images. Src
There are two types of methods used for image processing, namely analog and digital image processing. Analog image processing can be used for hard copies like printouts and photographs. Image analysts apply various fundamentals of interpretation along with visual techniques. Digital image processing deals with the manipulation of digital images through a digital computer. It is a subfield of signals and systems but focuses particularly on images. The three general phases that all types of data have to undergo while using digital techniques are:
- Pre-processing
- Enhancement and Display
- Information Extraction.
Fundamental Steps in Digital Image Processing - Rafael Gonzalez - 4th Edition Src
An important point to note while going through any concept is that the images are considered in greyscale, since colour increases the complexity of the model. One may want to introduce an image processing technique using grey-level images because their inherent complexity is lower than that of colour images. In most cases, after presenting a grey-level method, it can be extended to colour images.
For getting deeper insights into any of the concepts, I suggest going through Digital Image Processing, Rafael C. Gonzalez • Richard E. Woods, 4th Edition
From here on, I will be referring to Digital Image Processing as DIP.
Disclaimer: I am not the original author of the images used. They have been taken from various Image Processing sites. I have mentioned all of the referenced sites in resources. Pardon if I missed any.
The following is the order in which I suggest going through the concepts.
1. Basics with Images - Averaging Images
Image averaging is a DIP technique used to enhance images that are corrupted by random noise. The arithmetic mean of the intensity values at each pixel position is computed over a set of K images of the same view field. The basic formula behind it is g_avg(x, y) = (1/K) * (g_1(x, y) + g_2(x, y) + … + g_K(x, y)).
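The averaging formula can be sketched on synthetic data. This is a minimal NumPy illustration (the constant "scene", the noise level, and K are hypothetical, not taken from the notebook): averaging K noisy captures of the same view field shrinks the noise standard deviation by roughly sqrt(K).

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.full((64, 64), 120.0)    # hypothetical noise-free scene
K = 50                              # number of noisy captures

# K observations of the same view field, each corrupted by random noise
noisy = [clean + rng.normal(0, 25, clean.shape) for _ in range(K)]

# pixel-wise arithmetic mean over the K images
averaged = np.mean(noisy, axis=0)

# the residual noise in the average is roughly 1/sqrt(K) of the original
print(np.std(noisy[0] - clean), np.std(averaged - clean))
```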
2. Successive Rotations - Code
The images are rotated using self-defined rotation code instead of the OpenCV built-in function. Rotating an image by 45 degrees 8 times does not produce the same result as rotating it by 90 degrees 4 times. When an image is rotated by 45 degrees, the new pixel positions and their intensity values have to be calculated by interpolation, which is an approximation. A 90-degree rotation, by contrast, maps every pixel exactly onto the grid, so no approximations are needed and the image survives the round trip unchanged.
A clear example is shown below
Rotated by 45 deg - 8 times | Rotated by 90 deg - 4 times |
---|---|
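The effect can be reproduced with a small self-defined rotation, in the spirit of the notebook (the test image and sizes are made up for illustration). Eight 45-degree steps accumulate nearest-neighbour approximation error, while four exact 90-degree steps return the original image:

```python
import numpy as np

def rotate(img, deg):
    """Rotate about the image centre via inverse mapping + nearest neighbour."""
    h, w = img.shape
    cy, cx = (h - 1) / 2, (w - 1) / 2
    t = np.deg2rad(deg)
    ys, xs = np.indices((h, w))
    # inverse map: for each destination pixel, find its source position
    xsrc = np.cos(t) * (xs - cx) + np.sin(t) * (ys - cy) + cx
    ysrc = -np.sin(t) * (xs - cx) + np.cos(t) * (ys - cy) + cy
    xi, yi = np.rint(xsrc).astype(int), np.rint(ysrc).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(img)
    out[valid] = img[yi[valid], xi[valid]]
    return out

img = np.zeros((101, 101), np.uint8)
img[40:60, 30:70] = 255             # a white rectangle as the test pattern

eight_45 = img
for _ in range(8):
    eight_45 = rotate(eight_45, 45)   # 8 x 45 deg: interpolation error builds up
four_90 = img
for _ in range(4):
    four_90 = rotate(four_90, 90)     # 4 x 90 deg: exact grid-to-grid mapping

print((four_90 != img).sum(), (eight_45 != img).sum())
```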
3. Interpolations - Code
Interpolation is used in tasks such as zooming, shrinking, rotating, and geometrically correcting digital images. It is the process of using known data to estimate values at unknown locations. To create such unknown locations to estimate, we apply a transformation, here a rotation by 45 degrees. The 3 interpolations we compare here are:
Nearest Neighbour | Bilinear | Bicubic |
---|---|---|
Here you can see a slight variation between the 3 images: the smoothness improves from left to right. Since bicubic interpolation fits a higher-order equation over a larger neighbourhood, it captures finer detail.
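The difference between the first two schemes comes down to how a fractional pixel position is sampled. A minimal sketch (the 2x2 test values are made up): nearest neighbour snaps to the closest known pixel, while bilinear weights the four surrounding pixels by distance. Bicubic extends the same idea to a cubic fit over a 4x4 neighbourhood, which is omitted here for brevity.

```python
import numpy as np

def nearest(img, y, x):
    """Nearest neighbour: take the single closest known pixel."""
    return img[int(round(y)), int(round(x))]

def bilinear(img, y, x):
    """Bilinear: distance-weighted blend of the 4 surrounding pixels."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * img[y0, x0]
            + (1 - dy) * dx * img[y0, x0 + 1]
            + dy * (1 - dx) * img[y0 + 1, x0]
            + dy * dx * img[y0 + 1, x0 + 1])

img = np.array([[0.0, 100.0],
                [100.0, 200.0]])

# sample at the centre of the 2x2 patch: nearest picks one corner value,
# bilinear returns the smooth average of all four
print(nearest(img, 0.5, 0.5), bilinear(img, 0.5, 0.5))
```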
4. Interpolation-Inverse Mapping - Code
As mentioned here, there are two methods of mapping, the first, called forward mapping, scans through the source image pixel by pixel, and copies them to the appropriate place in the destination image. The second, reverse mapping, goes through the destination image pixel by pixel and samples the correct pixel from the source image. The most important feature of inverse mapping is that every pixel in the destination image gets set to something appropriate. In the forward mapping case, some pixels in the destination might not get painted and would have to be interpolated. We calculate the image deformation as a reverse mapping.
Original | Nearest Neighbour - Inverse Mapping |
---|---|
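The "every destination pixel gets set" property is easiest to see on a simple 2x upscale. In this hedged sketch (the 4x4 source values are arbitrary but strictly positive, so a zero means "never painted"), forward mapping leaves holes while inverse mapping fills everything:

```python
import numpy as np

# a small 4x4 source with strictly positive values (hypothetical)
src = np.arange(16, dtype=np.uint8).reshape(4, 4) * 16 + 15
scale = 2
h, w = src.shape

# forward mapping: push each source pixel to one destination position;
# most destination pixels are never painted and stay 0
fwd = np.zeros((h * scale, w * scale), np.uint8)
ys, xs = np.indices(src.shape)
fwd[ys * scale, xs * scale] = src

# inverse mapping: pull every destination pixel from a source position,
# so every destination pixel gets set to something appropriate
ys, xs = np.indices((h * scale, w * scale))
inv = src[ys // scale, xs // scale]

print((fwd == 0).sum(), (inv == 0).sum())   # unpainted holes in each result
```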
5. Basic Transformations - Code
We have seen the basic transformations like rotation and scaling. Now let's see one more basic transformation known as translation.
Original | Translation |
---|---|
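Translation is the affine transform with matrix [[1, 0, tx], [0, 1, ty]]. A minimal NumPy sketch (the image and shift amounts are invented for illustration), again computed as an inverse mapping so every destination pixel is accounted for:

```python
import numpy as np

img = np.zeros((6, 6), np.uint8)
img[1:3, 1:3] = 255                  # a small white square

tx, ty = 2, 1                        # translate right by 2, down by 1

# inverse-map each destination pixel: its source is (x - tx, y - ty)
ys, xs = np.indices(img.shape)
sx, sy = xs - tx, ys - ty
valid = (sx >= 0) & (sx < 6) & (sy >= 0) & (sy < 6)
out = np.zeros_like(img)
out[valid] = img[sy[valid], sx[valid]]

print(np.argwhere(out == 255).min(axis=0))   # square now starts at row 2, col 3
```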
6. Perspective Transformation - Code
The perspective transformation deals with the projection of a 3D scene onto a 2D image plane to get better insights into the required information. The 3D object coordinates are first expressed with respect to the world frame of reference, then transformed into the camera coordinate frame, then projected onto 2D image-plane coordinates, and finally converted to pixel coordinates.
Distorted Image | OpenCV - Perspective Transf Function | Manual Correction |
---|---|---|
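Under the hood, the manual correction amounts to estimating a 3x3 homography from 4 point correspondences, which is the same linear system OpenCV's `getPerspectiveTransform` solves. A sketch with hypothetical corner coordinates (the points are made up, not from the notebook's image):

```python
import numpy as np

# four corners of a distorted quadrilateral, and the upright rectangle
# they should map to (hypothetical values)
src = np.array([[58, 63], [300, 45], [310, 280], [42, 290]], float)
dst = np.array([[0, 0], [300, 0], [300, 300], [0, 300]], float)

# each correspondence (x, y) -> (u, v) gives two linear equations in the
# 8 unknown homography entries (h33 is fixed to 1)
A, b = [], []
for (x, y), (u, v) in zip(src, dst):
    A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
    A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
h = np.linalg.solve(np.array(A), np.array(b))
H = np.append(h, 1.0).reshape(3, 3)

# apply H to a source corner in homogeneous coordinates and dehomogenise
p = H @ np.append(src[1], 1.0)
print(p[:2] / p[2])                  # lands on the matching dst corner
```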
7. Est. Transformation - Code
This is just an example of using custom transformations for the required purpose. In the below example I have tried to extract the root part from the image.
Original | Transformed |
---|---|
8. Log and Contrast Stretching - Code
One of the grey-level transformations is the logarithmic transformation. It is defined as s = c*log(1+r), where 's' and 'r' are the pixel values of the output and the input image respectively, and 'c' is a constant.
Original | Log-Transformed |
---|---|
Contrast Stretching is a simple image enhancement technique that attempts to improve the contrast in an image by stretching the range of intensity values it contains to span a desired range of values.
Original | Contrast Stretched |
---|---|
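The simplest form of contrast stretching linearly maps the observed intensity range [min, max] onto the full display range [0, 255]. A minimal sketch with a made-up low-contrast image:

```python
import numpy as np

img = np.array([[60, 80], [100, 120]], np.uint8)   # narrow intensity range

lo, hi = int(img.min()), int(img.max())
# linearly remap [lo, hi] onto [0, 255]
stretched = ((img.astype(float) - lo) * 255.0 / (hi - lo)).astype(np.uint8)

print(img.min(), img.max(), '->', stretched.min(), stretched.max())
```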
9. Shading Correction - Code
Shading correction is used to fix parts of an image that are faulty for reasons such as obstruction of the camera's light, and correcting the image is essential for further processing. In this example, we take a faulty image of a chessboard and correct it: a Gaussian blur estimates the slowly varying shading, which is then removed from the corners of the image.
Original | Corrected Image |
---|---|
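The core idea is flat-field correction: estimate the slowly varying shading field and divide it out. This sketch fabricates a chessboard with an illumination ramp; for simplicity the per-column mean (scaled by the board's true mean, 127.5) stands in for the Gaussian-blur background estimate the notebook uses:

```python
import numpy as np

# synthetic chessboard with an uneven illumination ramp (hypothetical)
ys, xs = np.indices((64, 64))
board = (((ys // 8 + xs // 8) % 2) * 255).astype(float)
shading = 0.4 + 0.6 * xs / 63.0            # darker towards the left edge
img = board * shading

# estimate the slowly varying shading field from the image itself
# (per-column mean here; a heavy Gaussian blur plays this role in practice)
background = img.mean(axis=0) / 127.5
corrected = img / background               # divide the shading out

print(np.abs(corrected - board).max())     # residual error after correction
```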
10. Laplacian - Code
A Laplacian filter is an edge detector that computes the second derivatives of an image, measuring the rate at which the first derivatives change. This determines whether a change in adjacent pixel values comes from an edge or from a continuous progression.
A Laplacian filter or kernel looks like this:
[0, 1, 0]
[1, -4, 1]
[0, 1, 0]
A point to note, however, is that the Laplacian is very sensitive to noise: it even detects edges arising from the noise in the image.
Original | Laplacian Filter |
---|---|
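Sliding the kernel above across a vertical step edge shows the characteristic double response (positive on one side of the edge, negative on the other, zero in flat regions). A small hand-rolled correlation on a made-up test image:

```python
import numpy as np

lap = np.array([[0,  1, 0],
                [1, -4, 1],
                [0,  1, 0]])

img = np.zeros((8, 8))
img[:, 4:] = 255.0            # vertical step edge between columns 3 and 4

# direct 2-D correlation with the kernel (border pixels left at 0)
out = np.zeros_like(img)
for y in range(1, 7):
    for x in range(1, 7):
        out[y, x] = (lap * img[y - 1:y + 2, x - 1:x + 2]).sum()

print(out[3])   # nonzero only on either side of the edge
```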
11. Laplacian+Gaussian - Code
As you can see from the above example, the Laplacian kernel is very sensitive to noise. Hence we use the Gaussian Filter to first smoothen the image and remove the