In this post we will continue reading and experimenting with the contents of the Pluralsight course “Building Image Processing Applications Using scikit-image” by Janani Ravi.
Please note that the course uses Jupyter notebooks to hold the code and results. In this post we will instead write a modified version of the code as a Python script, using the VSCode IDE with GitHub Copilot. I would like to disclose that I am a Microsoft employee and have been using VSCode and Python for several years.
Let’s start with our setup.
# **** folder of interest ****
C:\Documents\_Image Processing\scikit-image-building-image-processing-applications\02\demos

# **** open file of interest using VSCode ****
(base) C:\Documents\_Image Processing\scikit-image-building-image-processing-applications\02\demos>code BlockViewsOnImageArrays.py

# **** execute python script of interest ****
(base) C:\Documents\_Image Processing\scikit-image-building-image-processing-applications\02\demos>python BlockViewsOnImageArrays.py

We start by changing to the folder of interest. In that folder we use VSCode to create and edit a Python script. As we progress we will run the script and display images.
# **** ****
import numpy as np                          # numpy is primary library for numeric array (and matrix) processing
from matplotlib import pyplot as plt        # pyplot is sub-library of matplotlib, pyplot is for plotting

# **** ****
import skimage.io                           # skimage is scikit-image library for image processing
from skimage import color                   # skimage.color is sub-library for converting color spaces
from skimage.util import view_as_blocks     # skimage.util is sub-library for various generic utilities (like view_as_blocks)

# **** read three_dogs image ****
three_dogs = skimage.io.imread(fname='./images/pexels-3-dogs.jpg')

# **** plot three_dogs RGB image ****
plt.imshow( three_dogs,
            interpolation='nearest')        # plot image, set interpolation to nearest
plt.title('three_dogs - RGB')               # set image title
plt.show()                                  # show image
We start by importing libraries of interest.
We then read in a JPG color image containing three dogs. The image is then displayed.
# **** convert three_dogs to grayscale ****
three_dogs = color.rgb2gray(three_dogs)

# **** plot three_dogs grayscale image ****
plt.imshow( three_dogs,
            cmap='gray')                    # plot image, set colormap to gray
plt.title('three_dogs - grayscale')         # set image title
plt.show()                                  # show image
We then convert the RGB image into grayscale. The grayscale image is then displayed.
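To make the grayscale conversion less of a black box, here is a minimal NumPy-only sketch of a weighted-sum luminance conversion. It assumes the channel weights given in the scikit-image documentation for `rgb2gray` (0.2125, 0.7154, 0.0721) and that 8-bit input is scaled to the range [0, 1]. The tiny `rgb` array and the `to_gray` helper are made up for illustration and are not part of the course code.

```python
import numpy as np

# Luminance weights as documented for skimage.color.rgb2gray (an assumption here).
WEIGHTS = np.array([0.2125, 0.7154, 0.0721])


def to_gray(rgb_uint8):
    """Hypothetical helper: weighted sum of R, G, B, scaled to [0, 1]."""
    rgb_float = rgb_uint8.astype(np.float64) / 255.0
    # The matrix product collapses the last (channel) axis.
    return rgb_float @ WEIGHTS


# 1 x 3 test image: one white, one pure red and one black pixel.
rgb = np.array([[[255, 255, 255], [255, 0, 0], [0, 0, 0]]], dtype=np.uint8)

gray = to_gray(rgb)
print(gray.shape)   # the channel axis is gone: (rows, cols)
print(gray)
```

Because the weights sum to 1, a white pixel maps to 1.0 and a black pixel to 0.0. This is why the grayscale `three_dogs` array in the script is 2D rather than 3D.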
# **** display shape of three_dogs (2D grayscale) ****
print(f'three_dogs.shape: {three_dogs.shape}')

# **** assign block shape 4 x 4 ****
block_shape = (4, 4)

# **** view three_dogs as blocks ****
three_dogs_blocks = view_as_blocks( three_dogs,
                                    block_shape=block_shape)

# **** display shape of three_dogs_blocks (H/4, W/4, 4, 4) ****
print(f'three_dogs_blocks.shape: {three_dogs_blocks.shape}')

# **** reshape three_dogs_blocks (flatten each 4 x 4 block into 16 values) ****
flattened_blocks = three_dogs_blocks.reshape(   three_dogs_blocks.shape[0],
                                                three_dogs_blocks.shape[1],
                                                -1)

# **** print shape of three_dogs_blocks ****
print(f'shape of the blocks image: {three_dogs_blocks.shape}')

# **** print shape of flattened image ****
print(f'shape of the flattened image: {flattened_blocks.shape}')

# **** mean-pooling: find the mean for each block ****
mean_blocks = np.mean(flattened_blocks, axis=2)

# **** plot mean_blocks ****
plt.imshow( mean_blocks,
            interpolation='nearest',        # plot image, set interpolation to nearest
            cmap='gray')                    # plot image, set colormap to gray
plt.title('mean_blocks - grayscale')        # set image title
plt.show()                                  # show image

In this step we view the grayscale image as non-overlapping 4 x 4 blocks and perform a pooling operation on them. Each block is flattened into 16 values and we take the mean of each block, so the 344 x 516 image is reduced to an 86 x 129 image. The result of the operation is then displayed.
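The block view is easier to follow on an array small enough to print. The sketch below uses a NumPy-only stand-in for `view_as_blocks` (the `blocks_view` helper is hypothetical and assumes the image dimensions are exact multiples of the block shape) on a 4 x 4 array with 2 x 2 blocks, and then applies the same flatten-and-mean steps as the script.

```python
import numpy as np


def blocks_view(image, block_shape):
    """Hypothetical NumPy-only stand-in for view_as_blocks on a 2D array.

    Assumes both image dimensions are exact multiples of the block shape.
    """
    bh, bw = block_shape
    h, w = image.shape
    # Split rows and columns into (grid, within-block) pairs, then move
    # the two grid axes to the front: (h/bh, w/bw, bh, bw).
    return image.reshape(h // bh, bh, w // bw, bw).swapaxes(1, 2)


# 4 x 4 image holding the values 0..15, row by row.
image = np.arange(16, dtype=np.float64).reshape(4, 4)

blocks = blocks_view(image, (2, 2))

# Flatten each 2 x 2 block into 4 values, as the script does for 4 x 4 blocks.
flattened = blocks.reshape(blocks.shape[0], blocks.shape[1], -1)

# Mean-pooling: one value per block.
mean_pooled = flattened.mean(axis=2)

print(blocks.shape)     # grid rows, grid cols, block rows, block cols
print(blocks[0, 0])     # top-left block
print(mean_pooled)
```

The top-left block holds the values 0, 1, 4 and 5, so its mean is 2.5. The pooled result has one pixel per block, which is why pooling shrinks the image.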
# **** max-pooling: find the max for each block
#      max_pooling is used to find the most prominent feature in each block ****
max_blocks = np.max(flattened_blocks, axis=2)

# **** plot max_blocks ****
plt.imshow( max_blocks,
            interpolation='nearest',        # plot image, set interpolation to nearest
            cmap='gray')                    # plot image, set colormap to gray
plt.title('max_blocks - grayscale')         # set image title
plt.show()                                  # show image
In this step we repeat the pooling operation, but this time we obtain the max value of each 4 x 4 block. The resulting image is displayed.
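As a small illustration of why max-pooling keeps the most prominent feature of a block, the sketch below pools a made-up 4 x 4 image that is black except for one bright pixel. The array and the 2 x 2 block size are assumptions chosen to keep the numbers readable; they are not from the course.

```python
import numpy as np

# Black 4 x 4 image with a single bright pixel.
image = np.zeros((4, 4))
image[1, 1] = 1.0

# View as a 2 x 2 grid of 2 x 2 blocks, then flatten each block to 4 values.
flattened = image.reshape(2, 2, 2, 2).swapaxes(1, 2).reshape(2, 2, -1)

max_pooled = flattened.max(axis=2)      # keeps the bright pixel at full strength
mean_pooled = flattened.mean(axis=2)    # dilutes it across the block

print(max_pooled)
print(mean_pooled)
```

The bright pixel survives max-pooling as 1.0, while mean-pooling reduces it to 0.25 because it is averaged with three black pixels.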
# **** median-pooling: find the median for each block ****
median_blocks = np.median(flattened_blocks, axis=2)

# **** plot median_blocks ****
plt.imshow( median_blocks,
            interpolation='nearest',        # plot image, set interpolation to nearest
            cmap='gray')                    # plot image, set colormap to gray
plt.title('median_blocks - grayscale')      # set image title
plt.show()                                  # show image

Finally, we repeat the pooling operation one last time. In this case we obtain the median value of each 4 x 4 block. The resulting image is displayed.
Comparing the three results, we can see that different pooling operations emphasize different features of the original image: the mean smooths each block, the max keeps its brightest pixel, and the median is less sensitive to outlier pixels.
The console output of the script (without the images) follows:
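The median can also be taken directly on the 4D block view by reducing over the last two axes, which skips the reshape step. The sketch below shows that variant on made-up data with one outlier pixel; the `blocks` array is an assumption standing in for the output of `view_as_blocks`, not the course data.

```python
import numpy as np

# Stand-in for a block view: a 1 x 2 grid of 2 x 2 blocks, all pixels 0.2.
blocks = np.full((1, 2, 2, 2), 0.2)

# Put a single outlier pixel into the second block.
blocks[0, 1, 0, 0] = 1.0

# Reduce over the two within-block axes instead of flattening first.
median_pooled = np.median(blocks, axis=(2, 3))
mean_pooled = np.mean(blocks, axis=(2, 3))

print(median_pooled)
print(mean_pooled)
```

The outlier pulls the mean of the second block up to 0.4, while its median stays at 0.2, which is the behavior to expect from median-pooling when a block contains a few stray pixels.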
(base) C:\Documents\_Image Processing\scikit-image-building-image-processing-applications\02\demos>python BlockViewsOnImageArrays.py
three_dogs.shape: (344, 516)
three_dogs_blocks.shape: (86, 129, 4, 4)
shape of the blocks image: (86, 129, 4, 4)
shape of the flattened image: (86, 129, 16)
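The shapes in this output follow from simple division: 344 / 4 = 86 and 516 / 4 = 129. The image used here happens to divide evenly. As far as I know, `view_as_blocks` expects every dimension to be an exact multiple of the block shape and raises an error otherwise, so for other images a crop may be needed first. The `crop_to_multiple` helper below is a hypothetical sketch of one way to do that; it is not part of scikit-image or the course.

```python
import numpy as np


def crop_to_multiple(image, block_shape):
    """Hypothetical helper: trim rows and columns from the bottom and right
    so that both dimensions are exact multiples of the block shape."""
    bh, bw = block_shape
    h, w = image.shape[:2]
    return image[: h - h % bh, : w - w % bw]


# Made-up image size that is not a multiple of 4 in either dimension.
image = np.zeros((345, 518))

cropped = crop_to_multiple(image, (4, 4))

# Number of blocks along each axis after cropping.
grid_shape = (cropped.shape[0] // 4, cropped.shape[1] // 4)

print(cropped.shape)
print(grid_shape)
```

Cropping discards at most three rows and three columns for a 4 x 4 block, which is usually negligible for a photograph. Padding the image instead would be an alternative design choice.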
Hope you learned from this exercise. I know I did.
The complete code for this post is in my GitHub repository.
Enjoy;
John