Episode 2: array/vector dimensions
Hi there,
This might be a bit pedantic, but while preparing for the instructor checkout I stumbled over the following statement in episode 2 (Analyzing Patient Data).
In the description of the axis argument, for example in numpy.mean(data, axis=1), there is a statement about checking the output dimensions:
The expression (40,) tells us we have an N×1 vector, so this is the average inflammation per day for all patients. If we average across axis 1 (columns in our 2D example), we get:
I think describing it as an N x 1 vector is a bit problematic, as it suggests the vector has 2 dimensions, which it does not. It also clashes a bit with the Stacking Arrays challenge, which touches on how to keep or remove dimensions to facilitate stacking, but does not mention that summarizing functions like numpy.mean also remove a dimension.
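Here is a minimal sketch of what I mean. The zero-filled array is only a placeholder with the lesson's 60 x 40 shape (an assumption standing in for the real inflammation data):

```python
import numpy

# Placeholder with the same shape as the lesson's data (60 patients x 40 days);
# the zeros are an assumption standing in for the real inflammation values.
data = numpy.zeros((60, 40))

# Averaging over axis 0 removes that axis entirely.
daily_mean = numpy.mean(data, axis=0)
print(daily_mean.shape)  # (40,)
print(daily_mean.ndim)   # 1

# keepdims=True keeps the reduced axis with length 1, so the result stays 2-D.
kept = numpy.mean(data, axis=0, keepdims=True)
print(kept.shape)  # (1, 40)
```

So the result of numpy.mean is 1-D unless keepdims=True is passed explicitly.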
One reason I am raising this is that you cannot stack an array of shape (40,) with an array of shape (40, 1). Also, np.stack (with its default axis=0) and np.vstack treat 1-D arrays as rows, not as the N x 1 column vectors the text suggests.
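A quick check of the stacking behaviour, again only as a sketch with zero-filled placeholder arrays (the variable names are made up for illustration):

```python
import numpy

flat = numpy.zeros(40)         # 1-D array, shape (40,)
column = numpy.zeros((40, 1))  # 2-D column, shape (40, 1)

# Mixing the two shapes fails, because numpy.stack requires identical shapes.
try:
    numpy.stack([flat, column])
    stack_failed = False
except ValueError:
    stack_failed = True
print(stack_failed)  # True

# With the defaults, each 1-D array becomes a row of the result.
stacked = numpy.stack([flat, flat])
vstacked = numpy.vstack([flat, flat])
print(stacked.shape)   # (2, 40)
print(vstacked.shape)  # (2, 40)

# numpy.column_stack is the variant that places 1-D arrays as columns.
as_columns = numpy.column_stack([flat, flat])
print(as_columns.shape)  # (40, 2)
```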
I think possible solutions would be either to change:
The expression (40,) tells us we have an N×1 vector, so this is the average inflammation per day for all patients. If we average across axis 1 (columns in our 2D example), we get:
to
The expression (40,) tells us we have a vector of length 40, so this is the average inflammation per day for all patients. If we average across axis 1 (columns in our 2D example), we get:
or to mention in the Stacking Arrays challenge that numpy.mean etc. also remove a dimension.
I am happy to create a pull request if you agree :)
Cheers, Simon
To me, "N x 1 vector" doesn't suggest 2 dimensions, but I'm not sure how others view it. I think changing the text to "a vector of length 40" would be OK, though. Any thoughts on this, @maxim-belkin?