Machine perception - Assignment 1: Basic image processing and histograms

Machine perception
2021/2022
Create a folder assignment1 that you will use during this assignment. Unpack the content of assignment1.zip, which you can download from the course webpage, into the folder. Save the solutions for the assignments as Python scripts in the assignment1 folder. In order to complete the exercise you have to present these files to the teaching assistant. Some assignments contain questions that require sketching, writing, or manual calculation. Write these answers down and bring them to the presentation as well. The tasks that are marked with ★ are optional. Without completing them you can get at most [...] points for the exercise. The maximum total amount of points for each assignment is 100. Each optional task has the amount of additional points written next to it.
The purpose of this assignment is to familiarize yourself with the working of the numpy and OpenCV libraries that will be used for the practical part of this course. This assignment will cover reading image data into matrices, manipulating parts of images and channels, describing images with histograms, basic thresholding, and morphological operations.
Exercise 1: Basic image processing
(a) Read the image from the file umbrellas.jpg and display it using the following snippet:
    import numpy as np
    import cv2
    from matplotlib import pyplot as plt

    I = cv2.imread('data/umbrellas.jpg')
    plt.imshow(I)
    plt.show()
The loaded image is represented as a numpy array. You can query its size by using the following command: height, width, channels = I.shape. Just as important is the type of the matrix (in numpy accessed with I.dtype). Images are usually loaded as a matrix of type np.uint8, which represents unsigned integers with 8 bits, effectively giving the range of [0,255]. In this course, only np.uint8 and floating point types will be used for representing images. The conversion can be performed as follows: I_float = I.astype(np.float64). The alias np.float that appears in older code was removed in NumPy 1.24.
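The difference between the two types shows up as soon as values are added together. The following is a minimal sketch on a small synthetic array (an assumption, standing in for a loaded image) that prints the shape and type and compares a sum computed in uint8 with the same sum computed after conversion to floating point.

```python
import numpy as np

# Synthetic stand-in for a loaded image: 2x2 pixels, 3 channels, uint8.
I = np.array([[[200, 100, 50], [255, 255, 255]],
              [[0, 0, 0], [10, 20, 30]]], dtype=np.uint8)

height, width, channels = I.shape
print(height, width, channels, I.dtype)

# Adding two uint8 arrays wraps around modulo 256.
wrapped = I[:, :, 0] + I[:, :, 1]

# After conversion to floating point the same sum is exact.
I_float = I.astype(np.float64)
exact = I_float[:, :, 0] + I_float[:, :, 1]

print(wrapped[0, 0], exact[0, 0])
```

For the first pixel the uint8 sum of 200 and 100 wraps around, while the floating point sum keeps the true value.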
Note: Beware of assignment by reference. The command I_new=I for numpy arrays only creates a new reference to the same data and does not copy it. You can copy the data using I_new=np.copy(I).
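A short sketch of this behaviour, using a small synthetic array and the illustrative names I_ref and I_copy (both are assumptions made for this example):

```python
import numpy as np

I = np.zeros((2, 2), dtype=np.uint8)

I_ref = I            # a second name for the same data
I_copy = np.copy(I)  # an independent copy

I[0, 0] = 255        # modify the original array

print(I_ref[0, 0])   # 255, the reference sees the change
print(I_copy[0, 0])  # 0, the copy is unaffected
```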
Note: The library pyplot is used to display image data (function plt.imshow(I)). Function plt.show() must then be called in order for the window to be displayed.
(b) Convert the loaded image to grayscale. A very simple way of doing this is summing up the color channels and dividing the result by 3, effectively averaging the values. The issue, however, is that the sum easily reaches beyond the np.uint8 range. We can avoid that by casting the data to a floating point type. You can access a specific image channel using the indexing syntax like red = I[:,:,0].
Note: Loading images using cv2.imread() in fact returns the channel ordering BGR instead of RGB. This can be fixed using I = cv2.cvtColor(I, cv2.COLOR_BGR2RGB)
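The averaging approach can be sketched as follows. The function name to_gray_mean and the tiny test image are assumptions made for illustration; because all three channels are weighted equally, the channel ordering does not affect the result.

```python
import numpy as np

def to_gray_mean(I):
    """Average the three color channels of an image; returns float64 values."""
    I_float = I.astype(np.float64)  # cast first so the sum cannot wrap around
    return (I_float[:, :, 0] + I_float[:, :, 1] + I_float[:, :, 2]) / 3

# Synthetic 1x2 image: one white pixel and one mixed pixel.
I = np.array([[[255, 255, 255], [30, 60, 90]]], dtype=np.uint8)
gray = to_gray_mean(I)
print(gray)
```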
(c) Cut and display a specific part of the loaded image. Extract only one of the channels so you get a grayscale image. You can do this by indexing along the first two axes, for instance: cutout=I[130:260, 240:450, k], where k is the index of the chosen channel. You can display multiple images in a single window using plt.subplot().
Grayscale images can be displayed using different mappings (on an RGB monitor, every value needs to be mapped to an RGB triplet). Pyplot defaults to a color map named viridis, but often it is preferable to use a grayscale color map. This can be set with an additional argument to plt.imshow, like plt.imshow(I, cmap='gray').
Question: Why would you use different color maps?
(d) You can also replace only a part of the image using indexing. Write a script that inverts a rectangular part of the image. This can be done pixel by pixel in a loop or by using indexing.
Question: How is inverting a grayscale value defined for uint8?
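As a sketch of the indexing variant, the snippet below inverts a rectangle in a synthetic gradient image. It assumes the common convention that the inverse of a uint8 value v is 255 - v; the image and the rectangle coordinates are made up for the example.

```python
import numpy as np

# Synthetic grayscale image with a horizontal gradient, shape (4, 8).
I = np.tile(np.arange(0, 256, 32, dtype=np.uint8), (4, 1))

inverted = np.copy(I)
# Invert rows 1..2 and columns 2..5 in a single step using slicing.
inverted[1:3, 2:6] = 255 - inverted[1:3, 2:6]

print(I[1, 2], inverted[1, 2])
```

The slicing form replaces the explicit double loop over rows and columns and leaves pixels outside the rectangle untouched.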
(e) Perform a reduction of grayscale levels in the image. First read the image from umbrellas.jpg and convert it to grayscale. You can write your own function for grayscale conversion or use the OpenCV function cv2.cvtColor like cv2.cvtColor(I, cv2.COLOR_RGB2GRAY).
Convert the grayscale image to floating point type. Then, rescale the image values so that the largest possible value is [...]. Convert the image back to uint8 and display both the original and the modified image. Notice that both look the same. Pyplot tries to maximize the contrast in displayed images by checking their values and scaling them to cover the entire uint8 interval. If you want to avoid this, you need to set the maximum expected value when using plt.imshow(), like plt.imshow(I, vmax=255). Use this to display the resulting image so the change is visible.
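The rescaling step might look like the sketch below. The target maximum new_max = 63 is a hypothetical value chosen only for illustration, and the four-pixel image is synthetic.

```python
import numpy as np

new_max = 63  # hypothetical target maximum, chosen only for illustration

I_gray = np.array([[0, 64, 128, 255]], dtype=np.uint8)

I_float = I_gray.astype(np.float64)
# Scale [0, 255] down to [0, new_max]; astype truncates the fractional part.
I_reduced = (I_float / 255 * new_max).astype(np.uint8)

print(I_reduced)
```

When displaying I_reduced, passing vmin=0 and vmax=255 together with cmap='gray' to plt.imshow keeps pyplot from stretching the reduced range back to full contrast.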
Exercise 2: Thresholding and histograms
Thresholding an image is an operation that produces a binary image (mask) of the same size where the value of pixels is determined by whether the value of the corresponding pixels in the source image is greater or lower than the given threshold.
(a) Create a binary mask from a grayscale image. The binary mask is a matrix the same size as the image which contains 1 where some condition holds and 0 everywhere else. In this case the condition is simply the original image intensity. Use the image bird.jpg. Display both the image and the mask.
The binary mask creation can be performed by using the selection syntax of numpy and separately setting all pixels larger and smaller than the threshold to 1 and 0 respectively:
    threshold = ...
    I[I < threshold] = 0
    I[I >= threshold] = 1
Alternatively, this can also be done by using np.where(condition, x, y). If the condition holds, the value will be replaced by x, otherwise by y. Write a script that implements both ways. Experiment with different threshold values to obtain a reasonably good mask of the central object in the image.
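Both variants can be compared on a small synthetic image. In this sketch the image values and the threshold of 0.5 are assumptions for illustration; the selection variant works on a copy and tests the condition on the original array, so the second assignment is not affected by the first.

```python
import numpy as np

I = np.array([[0.1, 0.4], [0.6, 0.9]])  # synthetic grayscale image in [0, 1]
threshold = 0.5                          # hypothetical value for illustration

# Variant 1: selection syntax, applied to a copy so that I stays intact.
mask_sel = np.copy(I)
mask_sel[I < threshold] = 0
mask_sel[I >= threshold] = 1

# Variant 2: np.where builds the mask in a single call.
mask_where = np.where(I < threshold, 0, 1)

print(mask_sel)
print(mask_where)
```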
Setting the threshold manually can be tedious. We will use a representation of the image called a histogram to try and determine the threshold automatically.
(b) Write a function myhist that accepts a grayscale image and the number of bins that will be used in building a histogram. The function should return a 1D array that represents the image histogram (the size should be equal to the number of bins, of course).
The histogram is simply a count of pixels with the same (or similar) intensity for all bins. You can assume the values of the image are within the interval [0,255]. If you use fewer than 255 bins, intensities will have to be grouped together, e.g. if using 10 bins, all values on the interval [0,25] will fall into bin 0. You can create an empty numpy array with np.zeros(n_bins).
Hint: You may want to use I.reshape(-1) to unroll your image into a 1D vector.
Question: The histograms are usually normalized by dividing the result by the sum of all cells. Why is that?
Write a script that calculates and displays histograms for different numbers of bins using bird.jpg.
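One possible shape for myhist is sketched below. It assumes integer intensities in [0,255] and a bin width of 256/n_bins, which is one of several reasonable choices; the accumulator name H and the test image are assumptions for this example.

```python
import numpy as np

def myhist(I, n_bins):
    """Normalized histogram of a grayscale image with values in [0, 255]."""
    values = I.reshape(-1).astype(np.float64)
    # Bin width is 256 / n_bins; np.minimum is a safeguard for the last bin.
    idx = np.minimum((values / 256 * n_bins).astype(int), n_bins - 1)
    H = np.zeros(n_bins)
    for i in idx:
        H[i] += 1
    return H / H.sum()

I = np.array([[0, 25, 26, 255]], dtype=np.uint8)
H = myhist(I, 10)
print(H)
```

With 10 bins the values 0 and 25 land in bin 0, 26 lands in bin 1, and 255 lands in the last bin.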
(c) ★ ([...] points) Modify your function myhist to no longer assume the uint8 range for values. Instead, it should find the maximum and minimum values in the image and calculate the bin ranges based on these values. Write a script that shows the difference between both versions of the function.
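A sketch of the range-adaptive variant follows. The name myhist_range is hypothetical (the task asks for a modified myhist), and the handling of a constant image is an assumption added so that the function does not divide by zero.

```python
import numpy as np

def myhist_range(I, n_bins):
    """Normalized histogram whose bins span [I.min(), I.max()]."""
    values = I.reshape(-1).astype(np.float64)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Constant image: count everything in the first bin.
        idx = np.zeros(values.size, dtype=int)
    else:
        idx = ((values - lo) / (hi - lo) * n_bins).astype(int)
        idx = np.minimum(idx, n_bins - 1)  # the maximum value goes to the last bin
    H = np.bincount(idx, minlength=n_bins).astype(np.float64)
    return H / H.sum()

I = np.array([[100, 110, 120, 200]])
print(myhist_range(I, 4))
```

On a low-contrast image this version spreads the values over all bins, whereas a fixed [0,255] range would leave most bins empty.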
(d) ★ ([...] points) Test the myhist function on images of the same scene in different lighting conditions. One way to do this is to capture several images using your web camera and turn lights on and off. Visualize the histograms for all images for different numbers of bins and interpret the results.
(e) ★ ([...] points) Implement Otsu's method for automatic threshold calculation. It should accept a grayscale image and return the optimal threshold. Using normalized histograms, the probabilities of both classes are easy to calculate. Write a script that shows the algorithm's results on different images.
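The sketch below shows one way to compute the threshold by maximizing the between-class variance, which is the usual formulation of Otsu's method. It uses np.histogram in place of a hand-written histogram function, assumes values in [0,255], and the name otsu_threshold and the synthetic bimodal image are assumptions for the example.

```python
import numpy as np

def otsu_threshold(I, n_bins=256):
    """Return a threshold t; pixels with value < t form the darker class."""
    hist, _ = np.histogram(I.reshape(-1), bins=n_bins, range=(0, 256))
    p = hist / hist.sum()                    # normalized histogram
    width = 256 / n_bins
    levels = np.arange(n_bins) * width       # left edge of each bin
    w0 = np.cumsum(p)                        # probability of the darker class
    w1 = 1.0 - w0                            # probability of the brighter class
    mu = np.cumsum(p * levels)               # cumulative mean up to each bin
    mu_total = mu[-1]
    # Between-class variance for every split; empty classes give 0/0.
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b = (mu_total * w0 - mu) ** 2 / (w0 * w1)
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    k = int(np.argmax(sigma_b))              # last bin of the darker class
    return levels[k] + width                 # left edge of the next bin

# Synthetic bimodal image: half the pixels at 50, half at 200.
I = np.zeros((10, 10), dtype=np.uint8)
I[:, :5] = 50
I[:, 5:] = 200
t = otsu_threshold(I)
mask = (I >= t).astype(np.uint8)
print(t, mask.sum())
```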
Exercise 3: Morphological operations and regions
While thresholding can in some cases give you a good mask of the object, it is still just a global technique that can produce artifacts such as holes in the object or unwanted noise in the background. Such artifacts are best removed before further processing. Morphological operations can be used for removing them.
(a) We will perform two basic morphological operations on the image mask.png, erosion and dilation. We will also experiment with combinations of both operations, named opening and closing.
Use the following snippet and write a script that performs both operations with different sizes of the structuring element. Also combine both operations sequentially and display the results.
    n = 5  # example size only; experiment with different values
    SE = np.ones((n,n), np.uint8)  # create a square structuring element
    I_eroded = cv2.erode(I, SE)
    I_dilated = cv2.dilate(I, SE)
Question: Based on the results, which order of erosion and dilation operations produces opening and which closing?
(b) Try to clean up the mask of the image bird.jpg using morphological operations as shown in the image. Experiment with different sizes of the structuring element. You can also try different shapes, like cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(n,n)).
(c) ★ ([...] points) Write a function immask that accepts a three channel image and a binary mask and returns an image where pixel values are set to black if the corresponding pixel in the mask is equal to 0. Otherwise, the pixel value should be equal to the corresponding image pixel.
Hint: You can use the following command to merge image channels back to a three channel image: rgb = np.dstack((r,g,b))
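A minimal sketch of immask that follows the hint and processes each channel separately is given below. The test image and mask are synthetic, and treating every nonzero mask value as "keep" is an assumption.

```python
import numpy as np

def immask(I, mask):
    """Set pixels of a three channel image to black where mask == 0."""
    keep = (mask != 0)          # boolean array, True where the pixel is kept
    r = I[:, :, 0] * keep
    g = I[:, :, 1] * keep
    b = I[:, :, 2] * keep
    return np.dstack((r, g, b)).astype(I.dtype)

I = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
out = immask(I, mask)
print(out[:, :, 0])
```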
(d) Create a mask from the image in file eagle.jpg and visualize the result with immask (if available, otherwise simply display the mask). Use Otsu's method if available, else use a manually set threshold.
Question: Why is the background included in the mask and not the object? How would you fix that in general? (just inverting the mask if necessary doesn’t count)
(e) Another way to process a mask is to extract connected components. To do this you can use the function cv2.connectedComponentsWithStats [1] that accepts a binary image and returns information about the connected components present in it. Write a script that loads the image coints.jpg, calculates a mask and cleans it up using morphological operations. Your goal is to get the coins as precisely as possible. Then, using connected components, remove the coins whose area is larger than 700 pixels from the original image (replace them with white background). Display the results.
[1] Documentation available at https://bit.ly/3uFKYTY