Last week I was reading a post on Medium, “First Steps in Data Science with Python NumPy”, by Kshitij Bajracharya.

What caught my attention is his opening statement: “*I’ve read that the best way to learn something is to blog about it*”. I believe Kshitij hit it right on. I agree because I have long believed that “If you can’t explain it simply, you don’t understand it well enough”, a quote attributed to Albert Einstein.

My other reason is that I am also a big fan of “The Feynman Technique: The Best Way to Learn Anything”. In a nutshell, you start by writing down the concept, followed by an explanation. Pretend you are teaching it to a student. When you run into issues (and you will), go back and learn or polish the missing points. Repeat until the description is flawless and simple. You can Google “Feynman technique” to find several descriptions of the steps. One way or the other, you need to write your explanation and then use your notes to explain the subject to your imaginary student. By writing the code and the blog post, one ends up cycling over the material several times, which is exactly what the Feynman Technique is all about.

The post works through three problems. Allow me to cover each.

**Problem 1**

Your mission, should you decide to accept it (yep, I watched Mission Impossible – Fallout over the weekend): given five cylindrical containers with different radii and heights ranging between 5 and 25 centimeters, find out:

a) The volume of water that each container can hold?

b) The total volume of water that all containers can hold?

c) Which container can hold the most volume and how much?

d) Which container can hold the least volume and how much?

e) What are the mean, median and standard deviation of the volumes of water that can be contained in the containers?

The following code starts with the necessary import and the parameters that describe the containers:

    # **** ****
    import numpy as np

    # **** problem 1 ****
    radiusAndHeight = 10
    minDim = 5
    maxDim = 25

    print("radiusAndHeight:", radiusAndHeight)
    print(" minDim:", minDim)
    print(" maxDim:", maxDim)
    print()

Next we seed the random number generator and create a set of numbers that represent the radius and height of each of the five cylinders. This is performed as follows:

    # **** generate the list of values to use ****
    # note: the (5, 2) shape reuses minDim as the row count only because
    #       both happen to equal 5 in this problem
    np.random.seed(0)
    values = np.random.randint(minDim, maxDim + 1, (minDim, int(radiusAndHeight / minDim)), 'int64')
    print(" values:", values, "values.ndim:", values.ndim)
    print("values.size:", values.size, "values.dtype:", values.dtype, "values.shape:", values.shape, "\n")

    np.random.seed(0)
    values = np.random.randint(minDim, maxDim + 1, radiusAndHeight, 'int64')
    print(" values:", values, "values.ndim:", values.ndim)
    print("values.size:", values.size, "values.dtype:", values.dtype, "values.shape:", values.shape, "\n")

The code shows two different ways that we could have generated the random values for the cylinders: directly as a 5×2 array, or as a flat array of ten values. The idea is to illustrate that data may arrive in different formats, and you should be able to manipulate it until it has the shape you need to obtain the required results.
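As a quick sanity check of that idea, the sketch below (my own addition, not part of the original script) confirms that, with the same seed, drawing the values as a 5×2 array and drawing them flat followed by a reshape yield the same numbers. The name numContainers is introduced here only for illustration.

```python
import numpy as np

minDim, maxDim = 5, 25
numContainers = 5          # hypothetical name, not used in the original script

# draw the values directly as a (5, 2) array
np.random.seed(0)
asMatrix = np.random.randint(minDim, maxDim + 1, (numContainers, 2))

# draw the same number of values as a flat array
np.random.seed(0)
asFlat = np.random.randint(minDim, maxDim + 1, numContainers * 2)

# the flat draw, once reshaped, should match the 2-D draw
sameValues = np.array_equal(asMatrix, asFlat.reshape(numContainers, 2))
print("same values:", sameValues)
```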

The following code takes the list of random values and separates them into five pairs, one for each container, representing the radius and the height of the cylinders:

    # **** reshape the values into containers ****
    numRows = 5
    numCols = 2
    print("numRows:", numRows)
    print("numCols:", numCols)

    containers = values.reshape(numRows, numCols)
    print(" containers:", containers, "containers.ndim:", containers.ndim)
    print("containers.size:", containers.size, "containers.dtype:", containers.dtype, "containers.shape:", containers.shape, "\n")

Things are looking better, but for simplicity we might want to get all the radii into one array and the heights into a separate one so we can perform the volume calculations. This operation may be performed by slicing as follows:

    # **** slice the containers ****
    radius = containers[:,0]
    height = containers[:,1]
    print("radius:", radius)
    print("height:", height)
    print()
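As a side note, the same split can be sketched by unpacking the transposed array. This is just an alternative I find handy, not what the original script does; the containers values are copied from the output shown at the end of this post.

```python
import numpy as np

# values copied from the script output
containers = np.array([[17, 20], [5, 8], [8, 12], [14, 24], [23, 9]])

# transposing gives a (2, 5) array whose two rows unpack into 1-D arrays
radius, height = containers.T
print("radius:", radius)
print("height:", height)
```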

In the height array we have the heights of all the cylinders and in the radius array their radii. We are ready to compute the volumes by recalling that the volume of a cylinder is π * r^2 * h. This operation can be performed on the two arrays as follows:

    # **** now we have the radius and height of each cylinder; let's compute the volumes ****
    volume = np.pi * (radius ** 2) * height
    print("volume:", volume)
    print()

    # **** compute the total volume of all cylinders ****
    totalVolume = volume.sum()
    print(" totalVolume:", totalVolume)

The volume array holds the volumes of the five cylinders, which answers the first question, and totalVolume answers the second. We can verify the total using an alternate approach based on a dot product:

    # **** let's verify the result by computing the total volume with an alternate approach ****
    radiusSquared = np.square(radius)
    dotProduct = np.dot(radiusSquared, height)
    totalVolumeByDotProduct = np.pi * dotProduct
    print("totalVolumeByDotProduct:", totalVolumeByDotProduct)
    print()
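Instead of comparing the two printed totals by eye, the check could be automated. The following is a minimal sketch of that idea, assuming the radius and height values from the script output; the names totalBySum, totalByDot and totalsAgree are mine.

```python
import numpy as np

# values copied from the script output
radius = np.array([17, 5, 8, 14, 23])
height = np.array([20, 8, 12, 24, 9])

totalBySum = (np.pi * radius ** 2 * height).sum()
totalByDot = np.pi * np.dot(radius ** 2, height)

# floating point results may differ in the last digits,
# so compare with a tolerance rather than with ==
totalsAgree = np.isclose(totalBySum, totalByDot)
print("totals agree:", totalsAgree)
```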

The next questions want us to determine the cylinder with the maximum and minimum volumes. We can get the results as follows:

    # **** which cylinder has the maximum volume ****
    maxVol = volume.max()
    indexOfMaxVol = volume.argmax()
    print("indexOfMaxVol:", indexOfMaxVol, "value:", volume[indexOfMaxVol])

    # **** which cylinder has the minimum volume ****
    minVol = volume.min()
    indexOfMinVol = volume.argmin()
    print("indexOfMinVol:", indexOfMinVol, "value:", volume[indexOfMinVol])
    print()

The problem wishes us to calculate some statistics on the five volumes. This can be accomplished by:

    # **** finally we can calculate the mean, median and standard deviation ****
    volumeMean = np.mean(volume)
    volumeMedian = np.median(volume)
    volumeStdDev = np.std(volume)
    print(" volumeMean:", volumeMean)
    print("volumeMedian:", volumeMedian)
    print("volumeStdDev:", volumeStdDev)
    print()
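One detail worth keeping in mind: by default np.std computes the population standard deviation (it divides by N). If the five containers were treated as a sample, the ddof parameter could be used instead. The problem statement does not say which one is wanted, so the sketch below simply shows both; the variable names are mine.

```python
import numpy as np

# radius and height values copied from the script output
volume = np.pi * np.array([17, 5, 8, 14, 23]) ** 2 * np.array([20, 8, 12, 24, 9])

populationStd = np.std(volume)        # ddof=0 (default): divides by N
sampleStd = np.std(volume, ddof=1)    # divides by N - 1

print("populationStd:", populationStd)
print("    sampleStd:", sampleStd)
```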

**Problem 2**

Now let’s take a look at the statement for the second problem.

Twenty-five cards numbered 1 through 25 are randomly distributed equally amongst 5 people. Find the sum of cards for each person such that for the 1st person, the sum is the value of the 1st card minus the sum of the rest of the cards (2, 3, 4, and 5); for the 2nd person, the sum is the value of the 2nd card minus the sum of the rest of the cards (1, 3, 4 and 5), and so on. The person for whom the sum of the cards is greatest will be the winner. Find the winner.

Let’s start by generating an array of 25 cards using the following code:

    # **** problem 2 ****

    # **** generate an array of 25 cards ****
    numbers = np.arange(1, 25 + 1)
    print(" numbers:", numbers)

We can now shuffle the cards in the array using:

    # **** shuffle the cards ****
    np.random.shuffle(numbers)
    print(" numbers:", numbers)
    print()
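Note that np.random.shuffle shuffles the array in place and relies on NumPy's global random state. As an alternative, and only as a sketch, newer NumPy versions (1.17 and later) offer a generator object whose permutation method returns a shuffled copy. The seed value of 0 below is arbitrary, and the resulting order will not match the output shown at the end of this post.

```python
import numpy as np

numbers = np.arange(1, 25 + 1)

# seeded generator; the seed value is an arbitrary choice
rng = np.random.default_rng(0)

# permutation returns a shuffled copy and leaves numbers untouched
shuffled = rng.permutation(numbers)
print(" numbers:", numbers)
print("shuffled:", shuffled)
```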

We now have to hand each person five cards. This can be performed as follows:

    # **** distribute the cards to 5 people ****
    reshapedNums = numbers.reshape(5, 5)
    print("reshapedNums:", reshapedNums)
    print()

The Identity matrix is a square matrix with the values in the main diagonal set to 1 and the rest of the elements set to 0. An identity matrix of five by five can be generated as follows:

    # **** build an identity 5x5 matrix ****
    I = np.eye(5, dtype=int)
    print("I:\n", I)
    print()

We can now generate a new 5×5 matrix with only the diagonal values extracted from the reshaped matrix. This is done with the following code:

    # **** generate a matrix with the values from the reshaped matrix
    #      found in the main diagonal ****
    diagonalMatrix = np.multiply(reshapedNums, I)
    print("diagonalMatrix:\n", diagonalMatrix)
    print()

Let’s generate a 5×5 matrix of ones (called a unit matrix here; not to be confused with the identity matrix) using the following Python code:

    # **** generate the 5x5 unit matrix ****
    U = np.ones((5,5), dtype=int)
    print("U:\n", U)
    print()

Now let’s generate a 5×5 matrix with all the values set to -1, with the exception of the main diagonal, which is set to 0, as illustrated by the following code:

    # **** set all values to -1 with the exception of the elements in the main diagonal ****
    IMinusU = I - U
    print("IMinusU:\n", IMinusU)
    print()

By multiplying reshapedNums by IMinusU element by element (np.multiply is not a matrix product), we generate a matrix with all the negative numbers we are looking for, as illustrated by:

    # **** multiply the matrices to get the actual random numbers in the matrix ****
    negDiagMatrix = np.multiply(reshapedNums, IMinusU)
    print("negDiagMatrix:\n", negDiagMatrix)
    print()

Now we can combine the last two matrices by using:

    # **** combine the matrices ****
    combinedMatrix = np.add(diagonalMatrix, negDiagMatrix)
    print("combinedMatrix:\n", combinedMatrix)
    print()

The combinedMatrix seems to match the requirements for the game. The first person (row) has the first value positive and the rest negative, the second person (row) has the second column positive and the rest negative, and so forth.
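The step-by-step construction is good for learning, but the same matrix could probably be built in one go with a single sign matrix holding +1 on the main diagonal and -1 elsewhere. The sketch below assumes the shuffled cards from the script output; signMatrix is a name I made up for this example.

```python
import numpy as np

# cards copied from the script output
reshapedNums = np.array([[12, 22, 20, 19, 3],
                         [23, 21, 17, 11, 1],
                         [4, 5, 16, 9, 14],
                         [10, 6, 18, 15, 8],
                         [25, 2, 13, 7, 24]])

# +1 on the main diagonal, -1 everywhere else
signMatrix = 2 * np.eye(5, dtype=int) - np.ones((5, 5), dtype=int)

# element-wise product, not a matrix product
combinedMatrix = reshapedNums * signMatrix
print("combinedMatrix:\n", combinedMatrix)
```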

The next step is to sum the values in each row. This can be performed as follows:

    # **** sum the rows ****
    sumMatrix = combinedMatrix.sum(axis = 1)
    print("sumMatrix:\n", sumMatrix)
    print()

Now for the winner. Keep in mind that np.argmax returns a zero-based index, so a value of 2 refers to the third person:

    # **** determine the winner ****
    winner = np.argmax(sumMatrix)
    print("winner:", winner, "value:", sumMatrix[winner])
    print()
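Each person's score is the diagonal card minus all the other cards in the row, which is the same as twice the diagonal card minus the row total. Under that observation, a shorter route to the same scores might look like the following sketch (the cards are copied from the script output, and the name scores is mine):

```python
import numpy as np

# cards copied from the script output
reshapedNums = np.array([[12, 22, 20, 19, 3],
                         [23, 21, 17, 11, 1],
                         [4, 5, 16, 9, 14],
                         [10, 6, 18, 15, 8],
                         [25, 2, 13, 7, 24]])

# score = diagonal card - (row total - diagonal card) = 2 * diagonal - row total
scores = 2 * np.diag(reshapedNums) - reshapedNums.sum(axis=1)

winner = np.argmax(scores)      # zero-based index
print("scores:", scores)
print("winner: person", winner + 1, "with score", scores[winner])
```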

**Problem 3**

You are given a rope of length 5 meters. Cut the rope into 9 equal length parts.

Nine equal parts are delimited by ten evenly spaced points (both ends of the rope plus eight interior cuts), which is why numPoints is set to 10. This can be accomplished with the following code:

    # **** problem 3 ****
    start = 0
    end = 5
    numPoints = 10
    print(" start:", start)
    print(" end:", end)
    print("numPoints:", numPoints)
    print()

    # **** determine where to cut the rope ****
    cuts = np.linspace(start, end, numPoints)
    print(" cuts:", cuts)
    print()

An alternate approach could be performed with the following code:

    # **** alternate cuts ****
    # note: with a non-integer step, np.arange may or may not include a value
    #       near the stop because of rounding; np.linspace is more predictable
    alternateCuts = np.arange(0.0, 5.0, 5.0 / 9.0)
    print("alternateCuts:", alternateCuts)
    print()

Both arrays indicate that the **first** segment should run from 0 to 0.55555556 meters. The first array ends at 5.0 meters, which is the end of the rope rather than an actual **cut**, while the second array ends with the last **cut** at 4.44444444 meters. Both approaches describe the same eight cuts.

To make both arrays only hold the places where the actual cuts should be performed on the 5 meter rope we could use the following code:

    # **** the arrays now contain only the actual cut locations ****
    cuts = cuts[1:-1]
    print(" cuts:", cuts)
    print()

    alternateCuts = alternateCuts[1:]
    print("alternateCuts:", alternateCuts)
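To convince ourselves that the cuts really produce nine equal pieces, we could measure the distance between consecutive points. This is a small sketch of that check, not part of the original script; ropeLength, numParts, points and segments are names introduced for the example.

```python
import numpy as np

ropeLength = 5.0
numParts = 9

# numParts + 1 evenly spaced points, including both ends of the rope
points = np.linspace(0.0, ropeLength, numParts + 1)

# drop the two ends to keep only the places where we actually cut
cuts = points[1:-1]

# length of each piece is the difference between consecutive points
segments = np.diff(points)
allEqual = np.allclose(segments, ropeLength / numParts)

print("number of cuts:", cuts.size)
print("number of pieces:", segments.size)
print("all pieces equal:", allEqual)
```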

Now let’s get the entire output generated by the script:

    $ python problems.py
    radiusAndHeight: 10
     minDim: 5
     maxDim: 25

     values: [[17 20]
     [ 5  8]
     [ 8 12]
     [14 24]
     [23  9]] values.ndim: 2
    values.size: 10 values.dtype: int64 values.shape: (5, 2)

     values: [17 20 5 8 8 12 14 24 23 9] values.ndim: 1
    values.size: 10 values.dtype: int64 values.shape: (10,)

    numRows: 5
    numCols: 2
     containers: [[17 20]
     [ 5  8]
     [ 8 12]
     [14 24]
     [23  9]] containers.ndim: 2
    containers.size: 10 containers.dtype: int64 containers.shape: (5, 2)

    radius: [17 5 8 14 23]
    height: [20 8 12 24 9]

    volume: [ 18158.40553775 628.31853072 2412.74315796 14778.05184249 14957.12262374]

     totalVolume: 50934.6416927
    totalVolumeByDotProduct: 50934.6416927

    indexOfMaxVol: 0 value: 18158.4055377
    indexOfMinVol: 1 value: 628.318530718

     volumeMean: 10186.9283385
    volumeMedian: 14778.0518425
    volumeStdDev: 7199.75824545

     numbers: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25]
     numbers: [12 22 20 19 3 23 21 17 11 1 4 5 16 9 14 10 6 18 15 8 25 2 13 7 24]

    reshapedNums: [[12 22 20 19  3]
     [23 21 17 11  1]
     [ 4  5 16  9 14]
     [10  6 18 15  8]
     [25  2 13  7 24]]

    I:
     [[1 0 0 0 0]
     [0 1 0 0 0]
     [0 0 1 0 0]
     [0 0 0 1 0]
     [0 0 0 0 1]]

    diagonalMatrix:
     [[12  0  0  0  0]
     [ 0 21  0  0  0]
     [ 0  0 16  0  0]
     [ 0  0  0 15  0]
     [ 0  0  0  0 24]]

    U:
     [[1 1 1 1 1]
     [1 1 1 1 1]
     [1 1 1 1 1]
     [1 1 1 1 1]
     [1 1 1 1 1]]

    IMinusU:
     [[ 0 -1 -1 -1 -1]
     [-1  0 -1 -1 -1]
     [-1 -1  0 -1 -1]
     [-1 -1 -1  0 -1]
     [-1 -1 -1 -1  0]]

    negDiagMatrix:
     [[  0 -22 -20 -19  -3]
     [-23   0 -17 -11  -1]
     [ -4  -5   0  -9 -14]
     [-10  -6 -18   0  -8]
     [-25  -2 -13  -7   0]]

    combinedMatrix:
     [[ 12 -22 -20 -19  -3]
     [-23  21 -17 -11  -1]
     [ -4  -5  16  -9 -14]
     [-10  -6 -18  15  -8]
     [-25  -2 -13  -7  24]]

    sumMatrix:
     [-52 -31 -16 -27 -23]

    winner: 2 value: -16

     start: 0
     end: 5
    numPoints: 10

     cuts: [ 0. 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778 3.33333333 3.88888889 4.44444444 5. ]

    alternateCuts: [ 0. 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778 3.33333333 3.88888889 4.44444444]

     cuts: [ 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778 3.33333333 3.88888889 4.44444444]

    alternateCuts: [ 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778 3.33333333 3.88888889 4.44444444]
    $

Hope you enjoyed this post. Keep on reading and practicing and remember that you do not know something well enough if you cannot explain it.

John

Follow me on Twitter: **@john_canessa**