Description
Machine perception
2021/2022
Create a folder assignment4 that you will use during this assignment. Unpack the content of the assignment4.zip that you download from the course webpage to the folder. Save the solutions for the assignments as Python scripts to the assignment4 folder. In order to complete the exercise you have to present these files to the teaching assistant. Some assignments contain questions that require sketching or writing a manual calculation. Write these answers down and bring them to the presentation as well. The tasks that are marked with ★ are optional. Without completing them you cannot reach the full amount of points for the exercise. The maximum total amount of points for each assignment is 100. Each optional task has the amount of additional points written next to it.
Introduction
This assignment will deal with automatic searching for correspondence points between images. Correspondences are a key step when dealing with the task of aligning images, for example when building a panorama from multiple images. Methods that find correspondences between images usually start by finding feature points in them. Feature points denote regions in the image that have a high chance of re-detection in an image where the same scene is captured from a slightly different angle or under different viewing conditions.
Exercise 1: Feature point detectors
In this exercise you will implement two frequently used feature point detectors: the Hessian algorithm and the Harris algorithm.
The Hessian detector is based on the matrix of second derivatives (also named the Hessian matrix, hence the name of the algorithm) at a point x in the image:

    H(x, σ) = [ Ixx(x, σ)  Ixy(x, σ)
                Ixy(x, σ)  Iyy(x, σ) ],
where σ explicitly states that the second derivative is computed on a smoothed image. The Hessian detector selects a point as a feature point if the determinant of the Hessian matrix exceeds a given threshold value t. If we want to create a variant of the Hessian detector that is scale independent (independent of the Gaussian filter that is used to smooth the image), we have to include a normalization factor of σ⁴ and thus we get the feature detection rule:
    det(H(x, σ)) = σ⁴ (Ixx(x, σ) Iyy(x, σ) − Ixy(x, σ)²) > t.
Implement a function hessian_points that computes the Hessian determinant using the equation above for each pixel of the input image. As this computation is very slow if done pixel by pixel, you have to implement it using vector operations (without explicit for loops). Test the function using the image from test_points.jpg as your input (do not forget to convert it to grayscale) and visualize the result.

Extend the function hessian_points by implementing a non-maximum suppression post-processing step that only retains responses that are higher than all the neighborhood responses and whose value is higher than a given threshold value thresh. The function should return the list of points that satisfy these properties. Create a function that you will use to plot the detected points (use plt.plot()) over the input image. Load the image test_points.jpg, process it and visualize the result. Set the input parameters of the Hessian detector to thresh = 100 and an initial σ of your choice, and then change their values to get a feeling for the effect of the parameters.
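A possible vectorized sketch of hessian_points; the scipy helpers and the finite-difference derivatives here are assumptions on my part, and you can substitute your own gauss/gaussdx kernels for the derivative computation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter


def hessian_points(gray, sigma, thresh):
    """Hessian detector: returns (response map, array of (x, y) points)."""
    smoothed = gaussian_filter(gray.astype(np.float64), sigma)
    # Second derivatives of the smoothed image, no explicit for loops.
    Ix = np.gradient(smoothed, axis=1)
    Iy = np.gradient(smoothed, axis=0)
    Ixx = np.gradient(Ix, axis=1)
    Ixy = np.gradient(Ix, axis=0)
    Iyy = np.gradient(Iy, axis=0)
    # Scale-normalized determinant of the Hessian.
    det = sigma ** 4 * (Ixx * Iyy - Ixy ** 2)
    # Non-maximum suppression: keep 3x3-neighborhood maxima above thresh.
    keep = (det == maximum_filter(det, size=3)) & (det > thresh)
    return det, np.argwhere(keep)[:, ::-1]  # (row, col) -> (x, y)
```

Plotting can then be done with something like plt.plot(points[:, 0], points[:, 1], 'rx') on top of plt.imshow(image).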
Question: What kind of structures in the image are detected by the algorithm? How does the parameter σ affect the result?
Implement the Harris feature point detector. This detector is based on the auto-correlation matrix that measures the level of self-similarity of a pixel neighborhood under small deformations. At the lectures you have been told that the Harris detector chooses a point as a feature point if both eigenvalues of the auto-correlation matrix for that point are large. This means that the neighborhood of the point contains well defined rectangular structures, i.e. a corner. The auto-correlation matrix is computed from the first partial derivatives at that point, subsequently smoothed using a Gaussian filter:
    C(x, σ, σ̃) = σ² [ G(x, σ̃) ∗ Ix²(x, σ)    G(x, σ̃) ∗ IxIy(x, σ)
                      G(x, σ̃) ∗ IxIy(x, σ)   G(x, σ̃) ∗ Iy²(x, σ) ],

where ∗ stands for convolution. Computing the eigenvalues λ1 and λ2 of the matrix C(x, σ, σ̃) is expensive, therefore we will use the following relations¹:
    det(C) = λ1 λ2,    trace(C) = λ1 + λ2.
¹In the following text we will omit the parameters of the auto-correlation matrix C(x, σ, σ̃) for clarity and write C instead.
If we compute the ratio r = λ1/λ2 and assume that λ1 = rλ2, we can write

    trace²(C) / det(C) = (λ1 + λ2)² / (λ1 λ2) = (rλ2 + λ2)² / (rλ2²) = (r + 1)² / r,
so we can express the feature point condition as:

    det(C) − α trace²(C) > t.
In practice we use σ̃ = 1.6σ and a fixed small α as the parameter values. Implement the condition without explicitly computing the eigenvalues of the matrix C and without for loops, as you did for the core part of the Hessian detector.
Implement the feature point detector as a function harris_points that computes the value of the condition above for all pixels, performs the non-maximum suppression post-processing step as well as thresholding using the threshold thresh. The function should return the list of points that satisfy these properties.
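A hedged sketch of harris_points in the same vectorized style; the value alpha = 0.06 and the scipy helpers are conventional assumptions, not values prescribed by the assignment text:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter


def harris_points(gray, sigma, thresh, alpha=0.06):
    """Harris detector: returns (response map, array of (x, y) points)."""
    smoothed = gaussian_filter(gray.astype(np.float64), sigma)
    Ix = np.gradient(smoothed, axis=1)
    Iy = np.gradient(smoothed, axis=0)
    s2 = 1.6 * sigma  # sigma-tilde = 1.6 * sigma, as suggested in the text
    # Components of the auto-correlation matrix, smoothed per pixel.
    Cxx = gaussian_filter(Ix * Ix, s2)
    Cyy = gaussian_filter(Iy * Iy, s2)
    Cxy = gaussian_filter(Ix * Iy, s2)
    # det(C) - alpha * trace(C)^2, without computing eigenvalues.
    response = (Cxx * Cyy - Cxy ** 2) - alpha * (Cxx + Cyy) ** 2
    keep = (response == maximum_filter(response, size=3)) & (response > thresh)
    return response, np.argwhere(keep)[:, ::-1]  # (row, col) -> (x, y)
```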
Load the image test_points.jpg and compute the Harris feature points. Compare the result with the feature points detected by the Hessian detector. Experiment with different parameter values. Do the feature points of both detectors appear on the same structures in the image?
Exercise 2: Matching local regions
One of the uses of feature points is searching for similar structures in different images. For this we will need descriptors of the regions around these points. In this assignment you will implement some simple descriptors as well as their matching.
Use the function simple_descriptors to calculate descriptors for a list of feature points. Then, write a function find_correspondences that, given two lists of descriptors, calculates the similarities between all descriptors in both lists. Use the Hellinger distance. Finally, for each descriptor from the first list, find the most similar descriptor from the second list. A good idea might be to use the same list (perhaps shuffled) for both inputs to sanity check your code.
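A minimal sketch of this step; it assumes the descriptors are non-negative, normalized histograms stacked as rows of a 2-D array, which is what histogram-based simple descriptors typically produce:

```python
import numpy as np


def hellinger(d1, d2):
    """Hellinger distance between two non-negative, normalized histograms."""
    return np.sqrt(0.5 * np.sum((np.sqrt(d1) - np.sqrt(d2)) ** 2))


def find_correspondences(desc1, desc2):
    """For each row of desc1, return (i, j) with j the nearest row of desc2."""
    s1 = np.sqrt(np.asarray(desc1, dtype=np.float64))
    s2 = np.sqrt(np.asarray(desc2, dtype=np.float64))
    # All pairwise Hellinger distances via broadcasting (no explicit loops).
    dists = np.sqrt(0.5 * np.sum((s1[:, None, :] - s2[None, :, :]) ** 2, axis=2))
    nearest = np.argmin(dists, axis=1)
    return [(i, int(j)) for i, j in enumerate(nearest)]
```

Running find_correspondences(d, d) on one descriptor list should pair every descriptor with itself, which is the suggested sanity check.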
Note: Use your own functions gauss and gaussdx from the previous assignments in order for the descriptor function to work correctly.
Write a script that loads the images graf/graf1_small.jpg and graf/graf2_small.jpg, runs the function find_correspondences and visualizes the result. Use the function display_matches from the supplementary material for visualization. Experiment with different parameters for descriptor calculation and report the changes that occur.
Implement a simple feature point matching algorithm. Write a function find_matches that is given two images as input and returns a list of matched feature points from image 1 to image 2. The function should return a list of index pairs, where the first element is the index of a feature point from the first image and the second element is the index of a feature point from the second image.
Follow the algorithm below:
• Execute a feature point detector to get stable points for both images (you can experiment with both presented detectors),
• Compute simple descriptors for all detected feature points,
• Find the best matches between descriptors in the left and right images using the Hellinger distance, i.e. compute the best matches from the left to the right image and then the other way around. In a post-processing step only select symmetric matches. A symmetric match is a match where a feature point in the left image is matched to a point in the right image and at the same time that point in the right image is matched back to the same point in the left image. This way we get a set of point pairs where each point from the left image is matched to exactly one point in the right image as well as the other way around.
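The symmetric-matching step above can be sketched on descriptor arrays alone; the helper names here are hypothetical, and the detector and descriptor calls are left out so the two-way logic is easy to test in isolation:

```python
import numpy as np


def nearest_hellinger(desc_a, desc_b):
    """Index of the Hellinger-nearest row of desc_b for every row of desc_a."""
    sa = np.sqrt(np.asarray(desc_a, dtype=np.float64))
    sb = np.sqrt(np.asarray(desc_b, dtype=np.float64))
    dists = np.sqrt(0.5 * np.sum((sa[:, None, :] - sb[None, :, :]) ** 2, axis=2))
    return np.argmin(dists, axis=1)


def symmetric_matches(desc1, desc2):
    """Keep index pairs (i, j) only when i -> j and j -> i agree."""
    fwd = nearest_hellinger(desc1, desc2)  # best right-image match for each i
    bwd = nearest_hellinger(desc2, desc1)  # best left-image match for each j
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]
```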
Use the function display_matches from the supplementary material to display all the symmetric matches. Write a script that loads the images graf/graf1_small.jpg and graf/graf2_small.jpg, runs the function find_matches and visualizes the result.
Question: What do you notice when visualizing the correspondences? How robust are the matches?
(d) (★) Incorrect matches occur when matching descriptors. Suggest and implement a simple method for eliminating at least some of these incorrect matches. You can use the ideas that were presented at the lectures or test your own ideas. Either way, you need to explain the idea behind the method as well as demonstrate that the number of incorrect matches is lowered when using the proposed method.
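One commonly used idea for this is the second-nearest-neighbor ratio test: a match is kept only when its best distance is clearly smaller than the second-best one. This is only one possible approach, and the 0.8 threshold is a conventional choice, not one fixed by the assignment:

```python
import numpy as np


def ratio_filter(dists, threshold=0.8):
    """Keep match i -> argmin(dists[i]) only when the best distance is
    clearly smaller than the second-best (ratio below threshold).

    dists is an (n1, n2) matrix of descriptor distances.
    """
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = row[order[0]], row[order[1]]
        # Ambiguous matches (best almost as far as second-best) are dropped.
        if second > 0 and best / second < threshold:
            matches.append((i, int(order[0])))
    return matches
```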
(e) (★) Implement a local feature detector of your choice (e.g. SIFT, SURF, BRIEF, HOG). Test it on the other assignments and report the changes (increased robustness etc.).
(f) (★) Record a video with your phone or webcam. Use OpenCV to detect keypoints of your choice, display them using cv2.drawKeypoints and save the video with the displayed keypoints. The video must demonstrate the keypoint detector's robustness to rotation and scale change. Make the video long enough to show both.
Exercise 3: Homography estimation
Since in this assignment we are dealing with planar images, we can try to estimate a homography matrix that maps one image to another using planar correspondences. In this exercise you will implement an algorithm that computes such a transformation using minimization of the mean square error. For additional information about the method, consult the lecture notes as well as the course literature (in the literature the term direct linear transform (DLT) is frequently used to describe the idea).
We will start with a short overview of the minimization of the mean square error on the simpler case of similarity transform estimation. The similarity transform is a linear transform that accounts for translation, rotation, and scale. The transformation of a point x can be written as x′ = f(x; p), where p = [p1, p2, p3, p4]ᵀ is a vector of parameters that define the transform such that

    x′ = [ x′ ]   [ x p1 − y p2 + p3 ]
         [ y′ ] = [ x p2 + y p1 + p4 ].
Question: Looking at the equation above, which parameters account for translation and which for rotation and scale?
Question: Write down a sketch of an algorithm that determines the similarity transform from a set of point correspondences (x1, x′1), …, (xn, x′n). For details consult the lecture notes.
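Since the equation above is linear in p, n ≥ 2 correspondences give a least-squares problem. A short illustrative sketch (not the required answer to the question, and the use of numpy's lstsq is my choice):

```python
import numpy as np


def estimate_similarity(pts, pts_prime):
    """pts, pts_prime: (n, 2) arrays of matched points; returns p1..p4."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(pts, pts_prime):
        rows.append([x, -y, 1, 0])  # x' = x*p1 - y*p2 + p3
        rows.append([y, x, 0, 1])   # y' = x*p2 + y*p1 + p4
        rhs.extend([xp, yp])
    # Least-squares solution of the stacked linear system.
    p, *_ = np.linalg.lstsq(np.asarray(rows, dtype=float),
                            np.asarray(rhs, dtype=float), rcond=None)
    return p
```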
You will implement a similar, but more complex algorithm for homography estimation. For a reference point xr in the first image we compute the corresponding point xt in the second image as:
    Hxr = xt:

    [ h11 h12 h13 ] [ xr ]   [ x′t ]
    [ h21 h22 h23 ] [ yr ] = [ y′t ] ,   where  xt = x′t / z′t,  yt = y′t / z′t,
    [ h31 h32 h33 ] [ 1  ]   [ z′t ]
where the points xr and xt are written in homogeneous coordinates². Using the equations above we get a system of two linear equations with the elements of H as unknowns:
    (h11 xr + h12 yr + h13) / (h31 xr + h32 yr + h33) = xt
    (h21 xr + h22 yr + h23) / (h31 xr + h32 yr + h33) = yt,

which can be transformed to

    h11 xr + h12 yr + h13 − xt h31 xr − xt h32 yr − xt h33 = 0
    h21 xr + h22 yr + h23 − yt h31 xr − yt h32 yr − yt h33 = 0.
If we want to estimate the parameters that determine the homography, we need at least four pairs of matched feature points. As some matches can be imprecise, we can increase the accuracy of our estimate by using a larger number of matches (x1, x′1), …, (xn, x′n). This way we get an overdetermined system of equations:
    Ah = 0:

    [ xr1 yr1 1   0   0   0   −xt1·xr1  −xt1·yr1  −xt1 ]         [ 0 ]
    [ 0   0   0   xr1 yr1 1   −yt1·xr1  −yt1·yr1  −yt1 ]         [ 0 ]
    [  …                                               ]  h  =   [ … ]
    [ xrn yrn 1   0   0   0   −xtn·xrn  −xtn·yrn  −xtn ]         [ 0 ]
    [ 0   0   0   xrn yrn 1   −ytn·xrn  −ytn·yrn  −ytn ]         [ 0 ]

with h = [h11, h12, h13, h21, h22, h23, h31, h32, h33]ᵀ,
that can (similarly to the estimation of the similarity transform) be solved as a minimization of the mean square error. If the matrix A is square, we get an exact solution of the system. In the case of an over-determined system (e.g., more than four correspondences) the matrix is not square. This problem is usually solved using the matrix AᵀA, which is square and can therefore be decomposed into eigenvectors and eigenvalues. The solution of such a system is the unit eigenvector of AᵀA that corresponds to the lowest eigenvalue (using the function eig). The same solution can be obtained more efficiently using the singular-value decomposition (SVD), as the eigenvector that corresponds to the lowest eigenvalue of AᵀA (using the function svd):

²Homogeneous coordinates for a point in space are obtained by adding a third coordinate and setting it to 1.
    A = U D Vᵀ,

where D is a diagonal matrix of singular values sorted in descending order and the columns of V are the corresponding singular vectors.
The vector h that contains the parameters of the homography matrix is obtained from the last column of the matrix V, normalized by the value of the last element:

    h = [v1,9, v2,9, …, v9,9]ᵀ / v9,9.
Write a function estimate_homography that approximates the homography between two images using a given set of matched feature points, following the algorithm below:

• Construct the matrix A using the equation above.
• Perform the matrix decomposition using the SVD algorithm: U, D, Vᵀ = svd(A).
• Compute the vector h using the equation above.
• Reorder the elements of h into the 3 × 3 matrix H (e.g. using the function reshape).
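The steps above can be sketched with numpy; the (n, 4) row layout [xr, yr, xt, yt] for the matched points is an assumption on my part, so adapt it to how you load the correspondences:

```python
import numpy as np


def estimate_homography(matches):
    """DLT homography from matched points given as rows [xr, yr, xt, yt]."""
    A = []
    for xr, yr, xt, yt in matches:
        A.append([xr, yr, 1, 0, 0, 0, -xt * xr, -xt * yr, -xt])
        A.append([0, 0, 0, xr, yr, 1, -yt * xr, -yt * yr, -yt])
    A = np.asarray(A, dtype=np.float64)
    _, _, VT = np.linalg.svd(A)
    h = VT[-1]       # last column of V = last row of V^T
    h = h / h[-1]    # normalize by the last element
    return h.reshape((3, 3))
```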
Load the New York cityscape images (newyork/image1.jpg and newyork/image2.jpg), and the hand-annotated correspondence pairs in the file newyork.txt³. Use the function display_matches to display the pairs of points. Using these pairs, estimate the homography matrix from the first image to the second image and use the function cv2.warpPerspective to transform the first image to the plane of the second image using the given homography, then display the result. Also test your algorithm with the images graf/graf1.jpg, graf/graf2.jpg and the points from graf.txt. Note: You can check the correctness of your implementation by comparing with the file newyork/H.txt.
When you get the correct result for the given points, repeat the procedure by generating the matching points automatically using the find_matches function that you have implemented in the previous exercise. Select the best matches and use them to compute the homography. Visualize the result by transforming the first image to the space of the second image as well as displaying the matches between the images. How well have you managed to estimate the homography? Does the estimate improve if you use more/fewer matching pairs? What happens if you include imperfect matches? Tune the parameters (feature point detector, its parameters, number of selected points, etc.) to get the best performance of the method.
Note: This does not need to produce perfect results every time, because DLT is not robust to outliers.
³Use the read_data function to read the points. Reshape the data into columns. The first two columns contain the x and y coordinates of the points in the first image and the second two columns contain the corresponding points for the second image.
(d) (★) Implement and test a robust homography estimation algorithm using the iterative reweighted least squares approach that you have heard about at the lectures. Test the robustness of the algorithm on the image pairs from the previous task. Experiment with different numbers of correspondences used as input to the estimation.
(e) (★) Write your own function for mapping points using the homography matrix. It should work like OpenCV's warpPerspective(): it should accept an image and a homography matrix and return the input image as remapped by the homography matrix. The size of the output should match the size of the input image.
Hint: You will need to use homogeneous coordinates.
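A possible sketch for this optional task, using inverse mapping in homogeneous coordinates with nearest-neighbour sampling; real implementations usually interpolate, which is omitted here for brevity:

```python
import numpy as np


def warp_homography(image, H):
    """Remap `image` with homography H; output has the input's size."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # For every OUTPUT pixel, look up the source pixel via H^-1
    # (inverse mapping avoids holes in the result).
    src = np.linalg.inv(H) @ coords
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(image)
    out[ys.ravel()[valid], xs.ravel()[valid]] = image[sy[valid], sx[valid]]
    return out
```

With the identity homography the function returns the input unchanged, which is a convenient first test.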
References
D. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Prentice Hall.