Broadcasting is a feature of NumPy, the numerical array library for Python. When performing array operations, the shapes of the arguments do not always match. The good and the bad thing is that NumPy assumes what you want to happen and does it. In most cases the results are fine, but on occasion NumPy might do something that you are not expecting. This post discusses, to some degree, what broadcasting is. The idea is that we will be using it in a future post when doing some regressions for image recognition.
I will cover the cells in a Jupyter notebook. The notebook is in my GitHub repository. Note that I describe the results here, but if you want to see them you will have to go to GitHub and download or run the notebook.
A more complete description of broadcasting can be found here.
Now let’s turn to the Jupyter notebook. We will go through it cell by cell. I will be using an example from a course I took on Coursera.
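The rule NumPy applies can be stated compactly: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below mimics that rule with a hand-written helper. Note that broadcast_shape is a hypothetical name used only for illustration, not a NumPy function; the last line cross-checks it against np.broadcast, which reports the shape NumPy itself would produce.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: shape produced by broadcasting two shapes, or ValueError."""
    # Pad the shorter shape with leading 1s so both have the same length.
    ndim = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)      # equal sizes, or b is stretched to match a
        elif dim_a == 1:
            result.append(dim_b)      # a is stretched to match b
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} cannot be broadcast")
    return tuple(result)

print(broadcast_shape((3, 4), (4,)))    # (3, 4)
print(broadcast_shape((2, 3), (2, 1)))  # (2, 3)
print(broadcast_shape((4, 1), (1, 4)))  # (4, 4)

# Cross-check one case against NumPy itself.
assert broadcast_shape((3, 4), (4,)) == np.broadcast(np.empty((3, 4)), np.empty(4)).shape
```

The three shape pairs printed here are of the same kind as the ones that show up in the notebook cells.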
In cell #1 we put our only import (NumPy) and then declare a NumPy array, which is a [3, 4] matrix. Each column represents one of four foods, in the order apples, beef, eggs and potatoes. Each row holds one nutrient: carbohydrates, protein and fat. For example, the first column, which represents apples, holds 56.0 for carbohydrates, 1.2 for protein and 1.8 for fat. In the class we were told that the values are for 100 gram samples. If you take a closer look at the second column, which is for beef, reading the values as grams would mean that a 100 gram sample contains 104 grams of protein and 135 grams of fat. That adds up to 239 grams. I took a quick look on the web and it seems that the labels in this example are just there to provide context. The values seem to be off if read as grams. One possible reading, hinted at by the variable name cal used in the following cells, is that the figures are calories rather than grams.
import numpy as np

# **** on the x-axis: apples, beef, eggs and potatoes
#      on the y-axis: Carbs, Protein and Fat ****
A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])

print("A: " + str(A))
print("A.shape: " + str(A.shape))
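In cell #2 we sum all the values in the matrix. As expected, this is not what we need or want. Called without arguments, sum() collapses the whole matrix into a single number, so the result has an empty shape. With four samples of 100 grams each I would have expected that number to be 400 grams. In this case we get about 530.3.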
# **** sums all values in A ****
cal = A.sum()

print("cal: " + str(cal))
print("cal.shape: " + str(cal.shape))
In cell #3 we specify the direction for the sum operation with axis=0. We would like to get the sum of each column, since each column represents one food. Once we have those four values we should be able to get the percentages per row, which is what we are after.
# **** sum each column in A ****
cal = A.sum(axis=0)

print("cal: " + str(cal))
print("cal.shape: " + str(cal.shape))
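To make the choice of axis concrete, here is a small sketch using the same matrix. The names per_food, per_nutrient and per_food_2d are mine and do not appear in the notebook. It also shows keepdims=True, which keeps the result two dimensional.

```python
import numpy as np

A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])

per_food = A.sum(axis=0)                    # collapses the rows: one total per column
per_nutrient = A.sum(axis=1)                # collapses the columns: one total per row
per_food_2d = A.sum(axis=0, keepdims=True)  # same totals, kept as a 2-D row

print(per_food.shape)      # (4,)
print(per_nutrient.shape)  # (3,)
print(per_food_2d.shape)   # (1, 4)
```

Note that the plain axis=0 result is one dimensional, with shape (4,), not [1, 4]. For broadcasting purposes NumPy treats the two alike.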
Cell #4 shows how we can get a set of percentages. We divide A, which is a [3, 4] matrix, by cal, which is a vector of four values (its shape is (4,), which NumPy treats as [1, 4] for this operation). The resulting [3, 4] percentage matrix is what we wanted, even though strictly speaking the divisor should have been a matrix with a shape of [3, 4]. NumPy figured out what was needed and broadcast cal across the three rows so the operation would make sense and could be carried out. That is cool.
# **** compute percentages (each column in A is divided by the corresponding value in cal) ****
percentage = (A / cal) * 100.0

print("percentage:\n" + str(percentage))
print("percentage.shape: " + str(percentage.shape))
Cell #5 makes the operation somewhat more explicit by reshaping cal to [1, 4], but note that the divisor is still a single row and not a [3, 4] matrix. Broadcasting still does the rest.
# **** each column in A was divided by the corresponding column in cal ****
percentage = (A / cal.reshape(1, 4)) * 100.0

print("percentage:\n" + str(percentage))
print("percentage.shape: " + str(percentage.shape))
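What broadcasting did for us can be spelled out by hand. The sketch below is not part of the notebook. It builds the full [3, 4] divisor with np.tile (cal_full is a name I made up) and confirms that the result matches the broadcast version, and that every column of percentages adds up to 100.

```python
import numpy as np

A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])
cal = A.sum(axis=0)

# Repeat cal once per row of A, which is what broadcasting does implicitly.
cal_full = np.tile(cal, (A.shape[0], 1))
print(cal_full.shape)  # (3, 4)

explicit = (A / cal_full) * 100.0
implicit = (A / cal) * 100.0

print(np.allclose(explicit, implicit))           # True
print(np.allclose(implicit.sum(axis=0), 100.0))  # True
```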
Let’s now take a look at a different example. In cell #6 we define a vector B with four values. We then add 100 to the vector. NumPy knew what we wanted and broadcast 100 to a vector of four values, all set to 100. It then added the two and returned the expected results. Of course, if you wanted to add 100 only to the first element of B, this is not the way to do it. You would have to index that element instead, for example B[0] += 100.
# **** a different example ****
B = np.array([1, 2, 3, 4])
print("B:\n" + str(B))
print("B.shape: " + str(B.shape))
print()

B += 100
print("B:\n" + str(B))
print("B.shape: " + str(B.shape))
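The difference between adding to every element and adding to just one is shown in this short sketch. The names everything and first_only are only for illustration.

```python
import numpy as np

B = np.array([1, 2, 3, 4])

everything = B + 100         # the scalar is broadcast to every element
print(everything.tolist())   # [101, 102, 103, 104]

first_only = B.copy()
first_only[0] += 100         # indexing restricts the update to one element
print(first_only.tolist())   # [101, 2, 3, 4]
```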
In cell #7 we declare B using two brackets [[ on the left and two on the right ]]. The extra pair of brackets makes the array two dimensional, so it has a well-defined shape; in this case [1, 4]. The result is a row vector.
# **** B declared as a [1, 4] row vector ****
B = np.array([[1, 2, 3, 4]])
print("B:\n" + str(B))
print("B.shape: " + str(B.shape))

B += 100
print("B:\n" + str(B))
print("B.shape: " + str(B.shape))
In cell #8 we declare B using a different set of brackets. We declare a column vector with the same values. The shape is [4, 1]. We then add 100 and, as in the previous cell, each value in the vector ends up increased by 100 even though the shape is different.
# **** B declared as a [4, 1] column vector ****
B = np.array([[1], [2], [3], [4]])
print("B:\n" + str(B))
print("B.shape: " + str(B.shape))
print()

B += 100
print("B:\n" + str(B))
print("B.shape: " + str(B.shape))
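Row and column vectors behave alike when a scalar is added, but combining them with each other is one of those cases where the result may not be what you expect. The sketch below is my own addition, not a notebook cell. It adds the three flavors of the same four values to each other and prints the resulting shapes.

```python
import numpy as np

row = np.array([[1, 2, 3, 4]])          # shape (1, 4)
col = np.array([[1], [2], [3], [4]])    # shape (4, 1)
flat = np.array([1, 2, 3, 4])           # shape (4,)

# Both operands get stretched, so the result is a full 4 x 4 table of sums.
print((row + col).shape)   # (4, 4)

# A one-dimensional array is treated as a row, so this is also 4 x 4.
print((flat + col).shape)  # (4, 4)

# Row plus one-dimensional array stays a single row.
print((flat + row).shape)  # (1, 4)
```

If an element-by-element sum of four values was intended, the 4 x 4 result is a silent surprise and not an error, so it is worth printing shapes while experimenting.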
Let’s now look at a new example. This time we are using a two dimensional matrix. This is illustrated in cell #9. C is declared as a [2, 3] array and D as a [1, 3].
# **** yet one more example ****
C = np.array([[1, 2, 3],
              [4, 5, 6]])
print("C:\n" + str(C))
print("C.shape: " + str(C.shape))
print()

D = np.array([[100, 200, 300]])
print("D:\n" + str(D))
print("D.shape: " + str(D.shape))
In cell #10 we add C and D, returning the result in E. Given the results, it is clear that NumPy broadcast D to a [2, 3] array in order to produce the values in E.
# **** C[2,3] + D[1,3] broadcasted to: C[2,3] + D[2,3] = E[2,3] ****
E = C + D

print("E:\n" + str(E))
print("E.shape: " + str(E.shape))
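The stretched version of D can be inspected directly with np.broadcast_to. This sketch is not in the notebook and D_view is my own name. It suggests that NumPy does not copy D to fill the second row: the broadcast array is a read-only view whose row stride is zero, so both rows read the same memory.

```python
import numpy as np

C = np.array([[1, 2, 3],
              [4, 5, 6]])
D = np.array([[100, 200, 300]])

D_view = np.broadcast_to(D, C.shape)
print(D_view.shape)            # (2, 3)
print(D_view.strides[0])       # 0
print(D_view.flags.writeable)  # False

# Adding the explicit view gives the same result as letting NumPy broadcast.
print(np.array_equal(C + D, C + D_view))  # True
```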
OK, one more example and we are done. We declare a [2, 3] array and populate it with ascending integers starting at 1. This is shown in cell #11.
# **** one last example ****
F = np.array([[1, 2, 3],
              [4, 5, 6]])

print("F:\n" + str(F))
print("F.shape: " + str(F.shape))
In cell #12 we declare a columnar vector with dimensions [2, 1].
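# **** G declared as a [2, 1] column vector ****
# illustrative values; any [2, 1] column vector works here
G = np.array([[100], [200]])

print("G:\n" + str(G))
print("G.shape: " + str(G.shape))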
In cell #13 we add F and G to produce H. H ends up being a [2, 3] array. For this to work, NumPy broadcast G to a [2, 3] array, repeating its single column three times, and then added the elements.
# **** F[2,3] + G[2,1] broadcasted to: F[2,3] + G[2,3] = H[2,3] ****
H = F + G

print("H:\n" + str(H))
print("H.shape: " + str(H.shape))
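Broadcasting does not always succeed. The sketch below is my own and uses hypothetical values 100 and 200 in an array I call g_flat. It shows what happens when the two values are left as a one-dimensional array, and how adding an axis turns them into the [2, 1] column that broadcasting can use.

```python
import numpy as np

F = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
g_flat = np.array([100, 200])    # shape (2,): hypothetical per-row offsets

try:
    F + g_flat                   # trailing dimensions 3 and 2 do not match
except ValueError as err:
    print("broadcast failed:", err)

H = F + g_flat[:, np.newaxis]    # shape (2, 1) broadcasts across the columns
print(H.tolist())                # [[101, 102, 103], [204, 205, 206]]
```

In this case NumPy refuses to guess and raises an error, which is easier to spot than a result with an unexpected shape.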
If you wish to take a look at the notebook please click here.
Hope you enjoy it. We will eventually get to a linear regression example for a classifier using images. Perhaps in two or three more posts in this category.
Enjoy and keep on learning.
Please follow me on Twitter: @john_canessa