Description
This lab has four parts. For each part, implement the outlined functions in lab3.py and run them in the notebook lab3.ipynb. Exploratory questions are embedded directly at the end of the sections; write your answers directly into the notebook. Expected outputs are supplied as reference, but your outputs do not need to match exactly.
Upload to LumiNUS your completed lab3.py and lab3.ipynb by zipping them into a file named AXXX1_AXXX2.zip, where AXXX1 and AXXX2 are the student numbers of the group members. Submit one file per group. Missing files, incorrectly formatted or non-running code, etc. will be penalized.
Objective
This lab covers keypoint detection and feature descriptors. We will explore the Harris corner detector, a simplified version of the SIFT descriptor, and two applications of keypoint matching: image stitching and symmetry detection.
Unfortunately, we need to change the coordinate convention (the x- and y-axes are swapped compared to Lab 2; see the figure on the right) to stay consistent with the conventions of the OpenCV functions used in this lab.
Part 1: Keypoint Detection, Description, and Matching (35%)
This part covers the foundations of keypoint descriptors and matching, which will be used in later parts. We provide the wrapper describe_keypoints, which calls either the naïve descriptor or the simplified version of SIFT.
● harris_corners() detects Harris corner keypoints. For each pixel in an image:
1. Compute the image gradients I_x, I_y and compose the second moment matrix H. Note that the elements A, B, and C of matrix H are obtained by summing over a window W:

   H = [[A, B], [B, C]], where A = Σ_W I_x², B = Σ_W I_x·I_y, C = Σ_W I_y².
Consider an efficient implementation of the window-wise summation based on convolution (i.e., do not use for-loops). Weight the elements uniformly in the window for the summation. Hint: Consider using built-in functions skimage.filters.sobel_v, sobel_h and scipy.ndimage.filters.convolve.
2. Compute the corner response R = det(H) − k·(trace(H))², with k = 0.04.
3. Find local maxima using the provided non-maximum suppression; the local maxima in the response map correspond to the keypoints.
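As a sanity check, steps 1 and 2 can be sketched as below (step 3 uses the provided non-maximum suppression and is omitted). The function name, the 3×3 window, and the hand-rolled Sobel kernels are illustrative choices so the snippet only needs NumPy/SciPy; they are not the required lab3.py interface.

```python
import numpy as np
from scipy.ndimage import convolve

# Hand-rolled Sobel kernels (stand-ins for skimage.filters.sobel_v / sobel_h).
SOBEL_X = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]) / 8.0
SOBEL_Y = SOBEL_X.T

def harris_response(img, k=0.04, window_size=3):
    """Corner response R = det(H) - k * (trace(H))^2 at every pixel."""
    ix = convolve(img, SOBEL_X)                   # I_x
    iy = convolve(img, SOBEL_Y)                   # I_y
    window = np.ones((window_size, window_size))  # uniform weights over W
    a = convolve(ix * ix, window)                 # A = sum_W I_x^2
    b = convolve(ix * iy, window)                 # B = sum_W I_x * I_y
    c = convolve(iy * iy, window)                 # C = sum_W I_y^2
    return (a * c - b * b) - k * (a + c) ** 2     # det(H) - k * trace(H)^2
```

Note that the window-wise sums are themselves convolutions with a uniform kernel, which is what makes the loop-free implementation possible.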
● naive_descriptor() computes a local descriptor based on the normalized pixel intensity, i.e.

   F(x, y) = (I(x, y) − μ) / (σ + 0.0001),

where I(x, y) is the intensity at a keypoint located at (x, y), and μ and σ are the mean and standard deviation of the pixel intensities in a 5×5 patch centered at (x, y). The 0.0001 term in the denominator improves numerical stability when the patch has (near-)zero variance.
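A minimal sketch of this normalization, with the 0.0001 stability term added to σ (the usual convention); it assumes the patch lies fully inside the image, and the border handling in your lab3.py may differ:

```python
import numpy as np

def naive_descriptor(img, x, y, patch_size=5):
    """Normalized-intensity descriptor for the keypoint at (x, y).

    Remember the coordinate convention of this lab: rows are indexed by y.
    """
    r = patch_size // 2
    patch = img[y - r:y + r + 1, x - r:x + r + 1].astype(float)
    mu, sigma = patch.mean(), patch.std()
    return ((patch - mu) / (sigma + 0.0001)).flatten()  # F = (I - mu) / (sigma + 1e-4)
```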
● simple_sift() is a simplified version of the full SIFT descriptor which is not rotationally invariant. For each keypoint, take a 16×16 patch of pixels around the keypoint and compute the image gradient magnitude and orientation at each pixel. For efficiency, consider precomputing the image gradients for the entire image in advance. Divide the patch into a 4×4 grid of cells, where each cell has 4×4 = 16 pixels (see Figure 1a).
Figure 1. Schematic visualization of estimating our simplified SIFT descriptor.
For each cell, create a histogram of the gradient orientations with 8 bins of 45° each (e.g., 0° to 44° for bin 0, 45° to 89° for bin 1, etc.). Each pixel's contribution to the histogram is weighted by the pixel's gradient magnitude and by the weight from a 16×16 Gaussian kernel centered on the keypoint coordinate. Use the provided make_gaussian_kernel. Concatenate the histograms from all cells to form a 4×4×8 = 128-dimensional vector (see Figure 1b). Normalize the vector to unit length by dividing by the vector's magnitude.
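The whole pipeline for one keypoint can be sketched as follows. Here np.gradient stands in for your precomputed image gradients and kernel16 for the output of the provided make_gaussian_kernel; the function name is illustrative.

```python
import numpy as np

def simple_sift_sketch(patch16, kernel16):
    """Sketch of the 128-D descriptor for one 16x16 patch.

    patch16: 16x16 intensity patch centered on the keypoint.
    kernel16: 16x16 Gaussian weight (stand-in for make_gaussian_kernel).
    """
    gy, gx = np.gradient(patch16)                  # simple gradient stand-in
    mag = np.hypot(gx, gy) * kernel16              # magnitude * Gaussian weight
    ori = np.degrees(np.arctan2(gy, gx)) % 360     # orientation in [0, 360)
    bins = (ori // 45).astype(int) % 8             # 45-degree bins: 0-44 -> 0, etc.
    desc = np.zeros(128)
    for cy in range(4):                            # 4x4 grid of 4x4 cells
        for cx in range(4):
            cell = slice(4 * cy, 4 * cy + 4), slice(4 * cx, 4 * cx + 4)
            hist = np.zeros(8)
            np.add.at(hist, bins[cell].ravel(), mag[cell].ravel())
            desc[8 * (4 * cy + cx):8 * (4 * cy + cx) + 8] = hist
    return desc / (np.linalg.norm(desc) + 1e-12)   # normalize to unit length
```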
● top_k_matches() finds the top k matches between two feature descriptor sets F1 = {F1_1, F1_2, …, F1_i, …} and F2 = {F2_1, F2_2, …, F2_j, …}. For each descriptor in F1, find the k descriptors in F2 with the smallest Euclidean distance. Give the output as a list of length-2 tuples. For each tuple, the first element is the descriptor index i from F1 and the second element is a k-long list of length-2 tuples, each holding an index j into F2 and the Euclidean distance D(F1_i, F2_j). Hint: consider using scipy.spatial.distance.cdist().
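A minimal sketch of this matching step using the hinted cdist; the argument names are illustrative:

```python
import numpy as np
from scipy.spatial.distance import cdist

def top_k_matches(desc1, desc2, k=2):
    """For each row of desc1, the k nearest rows of desc2.

    Returns [(i, [(j, dist), ...]), ...] in the format described above.
    """
    dists = cdist(desc1, desc2)                 # pairwise Euclidean distances
    matches = []
    for i, row in enumerate(dists):
        nearest = np.argsort(row)[:k]           # indices of k smallest distances
        matches.append((i, [(int(j), float(row[j])) for j in nearest]))
    return matches
```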
● ratio_test_match() checks, for each F1_i, the ratio between its top 2 matches F2_j and F2_k. If

   D(F1_i, F2_j) / D(F1_i, F2_k) < t,

then F1_i and F2_j are a match. The ratio threshold t is given as an argument.
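Given the top_k_matches output with k = 2, the test is a one-line comparison per descriptor (function and argument names are illustrative):

```python
def ratio_test_match(two_nn_matches, ratio=0.7):
    """Keep (i, j) pairs whose nearest/second-nearest distance ratio is below `ratio`.

    two_nn_matches follows the top_k_matches output with k=2:
    [(i, [(j, d_ij), (k, d_ik)]), ...].
    """
    good = []
    for i, ((j, d1), (_k, d2)) in two_nn_matches:
        if d2 > 0 and d1 / d2 < ratio:          # distinctive nearest neighbour
            good.append((i, j))
    return good
```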
Part 2: Image Stitching (25%)
This part uses the matched descriptor pairs from Part 1 to solve for the homography between the two images. The images are then warped and stitched together. For this part, use your SIFT descriptor implementation (simple_sift) to match the keypoints of the images fp1.png and fp2.png.
If you need more details to help you with the implementation, refer to this StackOverflow post.
● For implementing the above two functions, use the provided transform_homography function, which applies a given homography to a set of points.
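The provided transform_homography is not reproduced here, but a plausible sketch of what such a function does (lift points to homogeneous coordinates, multiply by H, then de-homogenize) can help you check how you call it; the name and signature below are illustrative, not the lab's:

```python
import numpy as np

def transform_homography_sketch(src, h_matrix):
    """Apply a 3x3 homography to an (N, 2) array of (x, y) points."""
    pts = np.hstack([src, np.ones((src.shape[0], 1))])   # (N, 3) homogeneous coords
    mapped = pts @ h_matrix.T                            # apply H to every point
    return mapped[:, :2] / mapped[:, 2:3]                # divide by the w component
```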
Part 3: Mirror Symmetry Detection (25%)
For Parts 3 and 4, use the provided function compute_cv2_descriptor, which uses OpenCV's SIFT implementation, since we require rotation-invariant descriptors. Mirror symmetry is defined by a line of reflection or symmetry line (dashed red line in Fig. 3b). This part performs SIFT keypoint matching to solve for candidate points on the symmetry line. SIFT descriptors are rotationally invariant, but not mirror-invariant, so how can they be matched? We create a virtual set of "mirror" descriptors by re-assigning the cell indices and histogram bin indices.
● create_mirror_descriptors(): using the provided compute_cv2_descriptor, detect keypoints and compute descriptors from the original image I_m, denoted as F_m = {F_m1, F_m2, …, F_mi, …, F_mN}.
● shift_sift_descriptor() creates a virtual set of mirror descriptors (see Fig. 3b). Treat the image as if it had been mirrored over a virtual vertical axis (see Figure 3a for a conceptual visualization), and shift the cell indices and bin indices of each 8-dimensional histogram accordingly. Within each 8-dim. vector, since SIFT already shifts the indices to keep the dominant orientation first, the first index (0) stays in the same position; the remaining bins are reversed, i.e. [0, 1, 2, …, 7] remaps to [0, 7, 6, 5, 4, 3, 2, 1]. See lab3.py for an example of a SIFT histogram vs. its mirrored version.
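Assuming the usual row-major 4×4×8 layout of a 128-D SIFT descriptor (cell rows, cell columns, orientation bins), the remapping is two indexing operations; verify the layout against the example in lab3.py:

```python
import numpy as np

def mirror_sift_descriptor(desc):
    """Mirror one 128-D SIFT descriptor about a vertical axis (a sketch)."""
    cells = desc.reshape(4, 4, 8)             # (cell row, cell column, bin)
    cells = cells[:, ::-1, :]                 # mirror: flip the cell columns
    bin_order = [0, 7, 6, 5, 4, 3, 2, 1]      # keep bin 0, reverse bins 1..7
    return cells[:, :, bin_order].reshape(128)
```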
● match_mirror_descriptors() matches keypoints between F_m and the mirrored descriptors F_m′ = {F_m′1, F_m′2, …, F_m′i, …, F_m′N} (see Fig. 3c).
Figure 3. Symmetric feature matching.
● find_symmetry_lines() takes in pairs of matched descriptors and votes for a candidate symmetry line via the Hough transform. For a matched pair of points p and p′ (see Fig. 3d), form the line that intersects the two points; this line is perpendicular to the symmetry line. Solve for the midpoint m = (x_m, y_m) between p and p′ on the intersecting line and compute the angle θ_m that the line makes with the x-axis.
● hough_vote_mirror() performs Hough voting with parameter space (ρ, θ). Each intersecting point m has one vote defined by θ_m and ρ_m = x_m·cos(θ_m) + y_m·sin(θ_m). The line(s) of symmetry in the image will be represented by local maxima in the Hough vote space. Use the function find_peak_params (from Lab 2); num_lines is a parameter which limits the number of local maxima that are returned.
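The per-pair vote can be sketched as below: the midpoint of p and p′ lies on the symmetry line, and θ is the angle of the line joining p and p′ (the normal of the symmetry line), so (ρ, θ) is the symmetry line in Hesse normal form. The function name is illustrative.

```python
import numpy as np

def symmetry_line_vote(p, p_mirror):
    """(rho_m, theta_m) vote from one matched pair (a sketch)."""
    (x1, y1), (x2, y2) = p, p_mirror
    xm, ym = (x1 + x2) / 2.0, (y1 + y2) / 2.0      # midpoint on the symmetry line
    theta = np.arctan2(y2 - y1, x2 - x1)           # angle of the intersecting line
    rho = xm * np.cos(theta) + ym * np.sin(theta)  # rho_m = xm*cos + ym*sin
    return rho, theta
```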
Part 4: Rotation Symmetry Detection (15%)
Keypoint matching and Hough voting can also be used to detect rotational symmetry. For this part, we use the orientation and scale attributes of the cv2.KeyPoint class, named angle and size.
Figure 5. Rotational symmetry detection: a) A shape looks the same after rotations; b) A pair of matched keypoints (green) votes for one candidate center of rotation (red); c) Hough voting is used to select a maximum among the candidate centers as the predicted center of rotation.
● Using compute_cv2_descriptor, detect keypoints and compute descriptors from a given image, denoted as F = {F_1, F_2, …, F_i, …, F_N}.
● Then implement the function match_with_self, which uses top_k_matches with F given as both F1 and F2 (we are matching descriptors of an image against itself) to find the best three matches. Eliminate the trivial match (a keypoint matched with itself) and perform the ratio test on the other two matches. If no match is eliminated, keep the best two.
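A sketch of self-matching with the trivial match removed; the ratio default and the direct use of cdist (instead of your top_k_matches) are illustrative choices:

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_with_self(descs, ratio=0.8):
    """Match descriptors to themselves, dropping the trivial self-match."""
    dists = cdist(descs, descs)
    matches = []
    for i, row in enumerate(dists):
        nearest = [j for j in np.argsort(row)[:3] if j != i][:2]  # drop self-match
        j1, j2 = nearest
        if row[j2] > 0 and row[j1] / row[j2] < ratio:             # ratio test
            matches.append((i, int(j1)))
    return matches
```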
● find_rotation_centers() estimates, for a matched pair of keypoints p_i = (x_i, y_i) and p_j = (x_j, y_j), the coordinates of a candidate center of rotation:
a. Remove "parallel" matches based on the keypoint orientation, i.e. the angle. Wrap around the orientations Φ_i and Φ_j of p_i and p_j respectively to ensure they lie within [0, 2π). If Φ_i and Φ_j are within 1 degree of each other, discard p_i and p_j: similar orientations suggest "parallel" keypoints, i.e. there exists no center of rotation about which p_i can be rotated to coincide with p_j.
b. Form a line of intersection (see red line in Fig. 4) between p_i and p_j and solve for the length d of this line and the angle γ it forms with the x-axis.
c. A center of rotation c_ij = (x_c, y_c) between p_i and p_j can be defined by the angle β = (π − (Φ_j − Φ_i)) / 2 and radius r = d / (2·sin((Φ_j − Φ_i) / 2)), with coordinates x_c = x_i + r·cos(β + γ) and y_c = y_i + r·sin(β + γ). Discard the pair if the center is out of image bounds.
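The steps above can be sketched as follows. The β and r expressions follow from the isosceles triangle formed by p_i, p_j, and the center (the rotation angle that maps p_i to p_j is ΔΦ = Φ_j − Φ_i); the function name and argument order are illustrative, and you should verify the geometry against the lab's definitions and coordinate convention.

```python
import numpy as np

def rotation_center(pi, pj, phi_i, phi_j):
    """Candidate rotation center for one matched pair (a sketch).

    pi, pj: (x, y) coordinates; phi_i, phi_j: orientations in radians.
    Assumes the "parallel" case (phi_i ~ phi_j) was already discarded.
    """
    (xi, yi), (xj, yj) = pi, pj
    d = np.hypot(xj - xi, yj - yi)            # length of the line joining pi, pj
    gamma = np.arctan2(yj - yi, xj - xi)      # its angle with the x-axis
    dphi = (phi_j - phi_i) % (2 * np.pi)      # rotation angle mapping pi to pj
    beta = (np.pi - dphi) / 2                 # base angle of the isosceles triangle
    r = d / (2 * np.sin(dphi / 2))            # distance from pi to the center
    xc = xi + r * np.cos(beta + gamma)
    yc = yi + r * np.sin(beta + gamma)
    return xc, yc
```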
● hough_vote_rotation() does Hough voting on the rotation coordinates with parameter space (x, y). Matched keypoints p_i and p_j vote for the associated center location (x_c, y_c), with the vote weighted by the keypoint scales s_i and s_j. The vote weight is defined as w = q², where q = exp(−|s_i − s_j| / (s_i + s_j)). The global maximum should correspond to the center of rotation.
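The scale-similarity weight is a one-liner; pairs of keypoints with similar scales get weight close to 1, dissimilar scales are down-weighted:

```python
import numpy as np

def rotation_vote_weight(s_i, s_j):
    """Scale-similarity vote weight w = q^2 with q = exp(-|s_i - s_j| / (s_i + s_j))."""
    q = np.exp(-abs(s_i - s_j) / (s_i + s_j))
    return q ** 2
```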
● Note: pay attention to our coordinate convention; to access the point (x_i, y_i) you will need to call I[yi][xi] in your code.



