Description
Machine perception
Create a folder assignment2 that you will use during this assignment. Unpack the content of the assignment2.zip that you can download from the course webpage to the folder. Save the solutions for the assignments as Python scripts to the assignment2 folder. In order to complete the exercise you have to present these files to the teaching assistant. Some assignments contain questions that require sketching, writing or manual calculation. Write these answers down and bring them to the presentation as well. The tasks that are marked with ★ are optional. Without completing them you can get at most 75 points for the exercise. The maximum total amount of points for each assignment is 100. Each optional task has the amount of additional points written next to it.
Introduction
This assignment contains three exercises. In the first one you will familiarize yourself with several methods of histogram comparison in the domain of image retrieval. In the second one you will learn about convolution, which is a basic operation in image processing. In the third one you will use convolution for image filtering.
Exercise 1: Global approach to image description
In the previous assignment we used histograms created from a single channel images. Now we will extend this to multi-dimensional histograms and use histograms as global image descriptors in order to compare images.
(a) Firstly, you will implement the function myhist3 that computes a 3-D histogram from a three channel image. The images you will use are RGB but the function should also work on other color spaces. The resulting histogram is stored in a 3-D matrix. The size of the resulting histogram is determined by the parameter n_bins. The bin range calculation is exactly the same as in the previous assignment, except now you will get one index for each image channel. Iterate through the image pixels and increment the appropriate histogram cells. You can create an empty 3-D numpy array with H = np.zeros((n_bins,n_bins,n_bins)). Take care that you normalize the resulting histogram.
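A possible sketch of myhist3, assuming the input is a float image with values in [0, 1]; the per-pixel loop follows the instructions literally (it can be vectorized later):

```python
import numpy as np

def myhist3(image, n_bins):
    # image: float array with values in [0, 1], shape (H, W, 3)
    H = np.zeros((n_bins, n_bins, n_bins))
    # map each channel value to a bin index in [0, n_bins - 1];
    # the clamp handles the value 1.0, which would otherwise map to n_bins
    idx = np.minimum((image * n_bins).astype(int), n_bins - 1)
    for r, g, b in idx.reshape(-1, 3):
        H[r, g, b] += 1
    return H / H.sum()  # normalize so the histogram sums to 1
```
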
(b) In order to perform image comparison using histograms we need to implement some distance measures. These are defined for two input histograms and return a single scalar value that represents the similarity (or distance) between the two histograms. Implement a function compare_histograms that accepts two histograms and a string that identifies the distance measure you wish to calculate. You can start with the L2 metric.
The L2 metric (commonly known as Euclidean distance) treats histograms with N bins as points in N-dimensional space. For histograms h1 and h2 (that of course must have the same number of bins) the L2 distance is defined as:
L2(h1, h2) = [ Σ_i (h1(i) − h2(i))² ]^(1/2).
Also implement the following measures that are more suitable for histogram comparison:
• Chi-square distance: χ²(h1, h2) = (1/2) Σ_i (h1(i) − h2(i))² / (h1(i) + h2(i) + ε₀), where ε₀ is a very small constant that prevents division by zero.
• Intersection: I(h1, h2) = 1 − Σ_i min(h1(i), h2(i)).
Try to avoid looping over histogram values and instead use vector operations on entire matrices at once.
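A minimal sketch of compare_histograms using vector operations, assuming the measures are selected by the strings 'l2', 'chi' and 'inter' (these names are a choice made here, not prescribed by the assignment):

```python
import numpy as np

def compare_histograms(h1, h2, measure):
    # h1, h2: normalized histograms, flattened to 1-D vectors
    h1, h2 = h1.ravel(), h2.ravel()
    eps = 1e-10  # the small constant eps_0 from the chi-square definition
    if measure == 'l2':
        return np.sqrt(np.sum((h1 - h2) ** 2))
    if measure == 'chi':
        return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
    if measure == 'inter':
        return 1 - np.sum(np.minimum(h1, h2))
    raise ValueError('unknown measure: ' + measure)
```
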
(c) Test your function with the following images:
• dataset/object_01_1.png,
• dataset/object_02_1.png,
• dataset/object_03_1.png.
Question: Which image (object_02_1.png or object_03_1.png) is more similar to image object_01_1.png considering the L2 distance? How about the other three distances? We can see that all three histograms contain a strongly expressed component (one bin has a much higher value than the others). Which color does this bin represent?
(d) You will now implement a simple image retrieval system that will use histograms. Write a function that will accept the path to the image directory and the parameter n_bins and then calculate RGB histograms for all images in the directory as well as transform them to 1-D arrays. Store the histograms in an appropriate data structure. Select some image from the directory dataset/ and compute the distance between its histogram and all the other histograms you calculated before. Sort the list according to the calculated similarity and display the reference image and the first five most similar images to it. Also display the corresponding histograms. Do this for all four distance measures that you implemented earlier.
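The ranking step above can be sketched as follows; image loading and display are omitted, and the function assumes the histograms have already been computed and flattened (the L2 helper stands in for any of the measures from part (b)):

```python
import numpy as np

def l2(h1, h2):
    # L2 distance between flattened histograms (see part (b))
    return np.sqrt(np.sum((h1 - h2) ** 2))

def rank_by_similarity(hists, reference, dist=l2):
    # hists: dict mapping image name -> flattened histogram (1-D array)
    # returns (distance, name) pairs sorted from most to least similar;
    # the first entry is the reference image itself (distance 0)
    ref = hists[reference]
    return sorted((dist(ref, h), name) for name, h in hists.items())
```
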
Question: Which distance is in your opinion best suited for image retrieval? How does the retrieved sequence change if you use a different number of bins? Is the execution time affected by the number of bins?
(e) You can get a better sense of the differences in the distance values if you plot all of them at the same time. Use the function plt.plot() to display image indices on the x axis and distances to the reference image on the y axis. Display both the unsorted and the sorted image sequence and mark the most similar values using a circle (see pyplot documentation).
(f) ★ (10 points) This simple retrieval system is strongly influenced by dominant colors that carry no discriminative information. Analyze the presence of different colors by summing up all image histograms bin-wise and displaying the resulting histogram. Which bins dominate this histogram?
To address this issue, you will implement a simple frequency-based weighting technique, similar to the ones that are employed in document retrieval systems. Use the combined frequency histogram you calculated to determine the weight for each bin. The weights should be lower for bins that are strongly represented in the frequency histogram and vice versa. One way of computing this weight is to use the exponential function w_i = exp(−λ F(i)), where F(i) represents the frequency of the i-th bin and λ is a scaling constant that you have to set. There are also other ways of computing weights that you can experiment with. Before calculating histogram similarity, you should multiply each histogram with the weights vector bin-wise (and normalize the result). Finally, you can compare the retrieval process for the weighted and the unweighted histograms. Report your observations. Did the weighting help with retrieving relevant results?
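A sketch of the weighting step, assuming the histograms are already flattened and normalized; the scaling constant λ has to be tuned by hand:

```python
import numpy as np

def weighted_hists(hists, lam=10.0):
    # hists: list of flattened, normalized histograms
    F = np.sum(hists, axis=0)   # combined bin-frequency histogram
    w = np.exp(-lam * F)        # w_i = exp(-lambda * F(i)): frequent bins get low weight
    out = []
    for h in hists:
        hw = h * w              # re-weight bins
        out.append(hw / hw.sum())  # re-normalize
    return out, w
```
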
Exercise 2: Convolution
A basic operation that is used in image processing is called convolution. For easier understanding, you will first implement the 1-D version of the operation. Convolution of a kernel g(x) with a signal I(x) is defined by the following equation:
Ig(x) = g(x) ⋆ I(x) = ∫_{−∞}^{∞} g(u) I(x − u) du,
or in the case of discrete signals (i.e. images):
Ig(i) = g(i) ⋆ I(i) = Σ_{j=−∞}^{∞} g(j) I(i − j).
A nice visualization of convolution can be seen on Wikipedia1. In practice, the kernel is of finite size, thus the sum only runs from the leftmost element to the rightmost element of the kernel. Intuitively, the kernel is overlaid on the signal and the overlapping region is multiplied element-wise with the signal, then summed. This sum is the result for position i.
1http://en.wikipedia.org/wiki/Convolution
(a) Compute the convolution between the signal and kernel below (k ⋆ f) by hand.
f = [0, 0, 0, 0, 0, …]   (the full values of the signal f and the kernel k are given in the accompanying figure)
Question: Can you recognize the shape of the kernel? What is the sum of the elements in the kernel? How does the kernel affect the signal?
(c) ★ (5 points) Improve the function simple_convolution by also addressing the edges of the signal. To do this, implement one of the common methods for padding the signal so the operation can be performed on all the signal elements. Take care that the input and the output signals are the same length.
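One way to sketch simple_convolution with edge handling, using replicate padding (np.pad with mode='edge') as the padding method; an odd kernel length is assumed so the output keeps the input length:

```python
import numpy as np

def simple_convolution(I, k):
    # 1-D convolution with replicate padding so that the output
    # has the same length as the input signal I
    N = len(k) // 2                  # assumes an odd-length kernel
    Ip = np.pad(I, N, mode='edge')   # repeat the border values
    kf = k[::-1]                     # flip the kernel (convolution, not correlation)
    return np.array([np.sum(Ip[i:i + len(kf)] * kf) for i in range(len(I))])
```
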
(d) Write a function that calculates a Gaussian kernel. Use the definition:
g(x) = (1 / (√(2π) σ)) exp(−x² / (2σ²)).
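A sketch of such a kernel function, assuming we sample the Gaussian at integer positions within ±3σ (a common choice) and normalize the kernel to sum to 1:

```python
import numpy as np

def gauss_kernel(sigma):
    # sample the Gaussian on integer positions in [-3*sigma, 3*sigma];
    # normalize so the kernel sums to 1 (keeps signal energy unchanged)
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    return g / g.sum()
```
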
Question: The figure below shows two kernels (a) and (b) as well as signal (c). Sketch (do not focus on exact proportions of your drawing but rather on the understanding of what you are doing) the resulting convolved signal of the given input signal and each kernel. You can optionally also implement a convolution demo based on the signals and your convolution code, but the important part here is your understanding of the general idea of convolution.
Exercise 3: Image filtering
This exercise will teach you how to use convolution on 2-D signals to achieve image filtering. Filtering can be used to smooth or sharpen images, or to remove certain types of noise from the image. You will also experiment with some filters that are not based on convolution.
(a) An important property of the Gaussian kernel is its separability in multiple dimensions. A Gaussian kernel in 2-D space can be written as:
G(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))
        = [ (1 / (√(2π) σ)) exp(−x² / (2σ²)) ] [ (1 / (√(2π) σ)) exp(−y² / (2σ²)) ]
        = g(x) g(y),
which is a product of two 1D Gaussian kernels, each one in its own dimension. If we again write a continuous convolution
G(x, y) ∗ I(x, y) = ∫_v ∫_u g(u) g(v) I(x − u, y − v) du dv
                  = ∫_v g(v) ( ∫_u g(u) I(x − u, y − v) du ) dv
                  = g(x) ∗ [ g(y) ∗ I(x, y) ],
we can see that we get the same result if we filter a 2-D signal using a single 2-D Gaussian kernel or if we do it using two 1D Gaussian kernels that represent the separated components of the former 2-D Gaussian kernel. This technique can be used to translate a slow n-D filtering operation to a fast sequence of 1-D filtering operations.
Write a function gaussfilter that generates a Gaussian filter and applies it to a 2-D image. You can use the function cv2.filter2D() to perform the convolution using the desired kernel. Generate a 1-D Gaussian kernel and first use it to filter the image along the first dimension, then convolve the result using the same kernel, but transposed.
Hint: Numpy arrays have an attribute named T, which is used to access the transpose of the array, e.g. k_transposed = k.T.
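A sketch of the separable filtering; np.convolve is used here instead of cv2.filter2D so the snippet stays self-contained, and the two passes correspond to filtering with the kernel and its transpose:

```python
import numpy as np

def gauss_kernel(sigma):
    # 1-D Gaussian kernel, normalized to sum to 1 (as in part (d))
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    return g / g.sum()

def gaussfilter(image, sigma):
    # separable filtering: first along every row, then along every
    # column with the same kernel (the "transposed" pass)
    g = gauss_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, g, mode='same'), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode='same'), 0, tmp)
```
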
Test the function by loading the image lena.png and converting it to grayscale. Then, corrupt the image with Gaussian noise (every pixel value is offset by a random number sampled from the Gaussian distribution) and separately with salt-and-pepper noise. You can use the functions gauss_noise and sp_noise that are included with the instructions (a2_utils.py). Use the function gaussfilter to try and remove noise from both images.
Question: Which noise is better removed using the Gaussian filter?
(b) Convolution can also be used for image sharpening. Look at its definition in the
lecture slides and implement it. Test it on the image from file museum.jpg.
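The exact sharpening definition is in the lecture slides; as an illustration only, one common variant convolves the image with a kernel that combines the identity with a Laplacian-like highpass (a sketch, not necessarily the formula from the slides). The helper convolve2d is a hypothetical minimal stand-in for cv2.filter2D:

```python
import numpy as np

def convolve2d(image, kernel):
    # minimal 2-D convolution with zero padding (illustration only)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    Ip = np.pad(image, ((ph, ph), (pw, pw)))
    k = kernel[::-1, ::-1]               # flip for true convolution
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(Ip[i:i + kh, j:j + kw] * k)
    return out

# one common sharpening kernel: identity plus a Laplacian-like highpass;
# its elements sum to 1, so flat regions keep their brightness
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=float)
```
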
(c) Implement a nonlinear median filter. While a Gaussian filter locally computes a weighted average, the median filter sorts the signal values in the given filter window and uses the median value of the sorted sequence as the result. Implement a simple median filter as a function simple_median that takes the input signal I and the filter width w and returns the filtered signal.
Create a 1-D signal corrupted with salt and pepper noise and filter it using simple_median. Display the result using different window sizes. Also try to remove the noise using the Gaussian filter. What does the result look like and why?
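A sketch of simple_median; at the borders a truncated window is used, which is one of several reasonable choices for edge handling:

```python
import numpy as np

def simple_median(I, w):
    # slide a window of width w over the signal and take the median;
    # near the borders the window is truncated to the valid range
    r = w // 2
    return np.array([np.median(I[max(0, i - r):i + r + 1])
                     for i in range(len(I))])
```
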
Question: Which filter performs better at this specific task? In comparison to the Gaussian filter, which can be applied multiple times in any order, does the order matter in the case of the median filter? What is the name of filters like this?
(d) ★ (10 points) Implement a 2-D version of the median filter. Test it on an image that was corrupted by Gaussian noise and on an image that was corrupted by salt and pepper noise. Compare the results with the Gaussian filter for multiple noise intensities and filter sizes.
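A sketch of the 2-D version, using edge padding so the output size matches the input; the brute-force double loop sorts a w × w window per pixel, which is relevant to the complexity question below:

```python
import numpy as np

def median_filter_2d(image, w):
    # for every pixel take the median of its w x w neighborhood;
    # replicate padding keeps the output the same size as the input
    r = w // 2
    Ip = np.pad(image, r, mode='edge')
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.median(Ip[i:i + w, j:j + w])
    return out
```
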
Question: What is the computational complexity of the Gaussian filter operation? How about the median filter? What does it depend on? Describe the computational complexity using the O(·) notation (you can assume n log n complexity for sorting).
(e) ★ (5 points) Implement the hybrid image approach that was presented in the lectures. To do this you also have to implement the Laplacian filter, filter two images (one with the Gaussian and one with the Laplacian filter) and combine them together. You can use the images cat1.jpg and cat2.jpg as a reference since they are both of the same size. Do not convert them to grayscale; simply apply the same operations to all three channels and display the result as a color image.