Episode 2: array/vector dimensions

Open SRSteinkamp opened this issue 5 years ago • 1 comments

Hi there,

so this might be a bit pedantic, but in preparing for the instructor checkout I stumbled over the following expression in episode 2 (Analyzing Patient Data).

While describing the axis argument, for example in numpy.mean(data, axis=1), the lesson makes a statement about checking the output dimensions: "The expression (40,) tells us we have an N×1 vector, so this is the average inflammation per day for all patients. If we average across axis 1 (columns in our 2D example), we get:"

I think describing it as an N x 1 vector like this is a bit problematic, as it suggests the vector has 2 dimensions, which it does not. It also clashes a bit with the challenge about Stacking Arrays, which has some discussion of how to keep or remove dimensions to facilitate stacking, but does not mention that summarizing functions like numpy.mean also remove dimensions.
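For illustration, a minimal sketch of the dimension being dropped. It assumes a random 60 x 40 array as a stand-in for the lesson's inflammation data (the names `data`, `per_day`, and `per_day_2d` are made up for this example), and it averages over axis 0 so that the result has the (40,) shape quoted above:

```python
import numpy as np

# Stand-in for the inflammation data (an assumption for this sketch):
# 60 patients (rows) x 40 days (columns).
rng = np.random.default_rng(0)
data = rng.random((60, 40))

# Averaging over axis 0 drops that axis entirely.
per_day = np.mean(data, axis=0)
print(data.ndim, data.shape)        # 2 (60, 40)
print(per_day.ndim, per_day.shape)  # 1 (40,)

# keepdims=True keeps the averaged axis with length 1 instead.
per_day_2d = np.mean(data, axis=0, keepdims=True)
print(per_day_2d.ndim, per_day_2d.shape)  # 2 (1, 40)
```

The result of the plain call has a single dimension, so "a vector of length 40" describes it more directly than "N×1".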

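One reason I am bringing this up is that you cannot stack arrays of shape (40,) with arrays of shape (40, 1). Also, the stacking functions do not consistently treat a 1-D array as a column: np.stack and np.vstack place each 1-D array in its own row, while np.column_stack places it in a column, so an array of shape (40,) is not inherently the column vector that "N×1" suggests.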
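The stacking behaviour can be checked with a short sketch. The arrays here are placeholders (zeros rather than real averages), and the variable names are only for illustration:

```python
import numpy as np

vec = np.zeros(40)          # shape (40,), like the output of numpy.mean
col = vec.reshape(-1, 1)    # explicit 2-D column, shape (40, 1)

# np.stack requires identical shapes, so mixing the two fails.
stack_error = None
try:
    np.stack([vec, col])
except ValueError as err:
    stack_error = err
print(type(stack_error).__name__)  # ValueError

# With matching 1-D inputs, the orientation depends on the function used.
rows = np.stack([vec, vec])         # each 1-D array becomes a row
cols = np.column_stack([vec, vec])  # each 1-D array becomes a column
print(rows.shape)  # (2, 40)
print(cols.shape)  # (40, 2)
```

If a true column is needed, reshaping with `vec.reshape(-1, 1)` or indexing with `vec[:, np.newaxis]` makes the second dimension explicit.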

I think there are two possible solutions. Either change "The expression (40,) tells us we have an N×1 vector, so this is the average inflammation per day for all patients. If we average across axis 1 (columns in our 2D example), we get:" to "The expression (40,) tells us we have a vector of length 40, so this is the average inflammation per day for all patients. If we average across axis 1 (columns in our 2D example), we get:", or mention in the Stacking Arrays challenge that numpy.mean and similar functions also remove dimensions.

I am happy to create a pull request if you agree :)

Cheers, Simon

Jan 19 '21 10:01 SRSteinkamp

To me, "N x 1 vector" doesn't suggest 2 dimensions, but I'm not sure how others view it. I think changing the text to "a vector of length 40" would be OK, though. Any thoughts on this, @maxim-belkin?

Apr 16 '21 16:04 ldko