training Data Explanation using Spark / Interactive Analysis: pictures to show how work is distributed would be useful

First, some of your tutorial users might not know MapReduce, so showing how the problem is divided into Map and Reduce (and explaining what a combiner is) would be useful, at least for the first few examples.
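To make that concrete, here is a rough sketch of what such a first example might look like. It is plain Python, not Spark, and every function name and the two input splits are made up for illustration. It counts words by mapping each line to (word, 1) pairs, combining the pairs locally within each split, and then reducing across splits.

```python
from collections import Counter, defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word, 1) for word in line.split()]

def combine(pairs):
    # Combiner: pre-sum the counts produced on one split, so fewer
    # pairs have to be sent on to the reduce step.
    local = Counter()
    for word, n in pairs:
        local[word] += n
    return list(local.items())

def reduce_phase(pairs):
    # Reduce: merge the partial counts from all splits into final totals.
    totals = defaultdict(int)
    for word, n in pairs:
        totals[word] += n
    return dict(totals)

# Two hypothetical input splits, as if each lived on a different worker.
splits = [["a b a", "b c"], ["a c c"]]

combined = []
for split in splits:
    mapped = [pair for line in split for pair in map_phase(line)]
    combined.extend(combine(mapped))

word_counts = reduce_phase(combined)
print(word_counts)
```

The combiner is optional for correctness here: dropping it gives the same totals, it just means more pairs reach the reduce step.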

It would be useful, I think, to show how the data is divided up among a) chunks of work and b) processors (workers?), perhaps with a picture. I think the beauty of Spark is that it hides some of these details, so saying something like "to make this efficient, you want to launch a lot of chunks of work and distribute them to your workers, but Spark hides all the details of that distribution" would be helpful.
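Even without a picture, a toy simulation might get the idea across. The sketch below is plain Python using a thread pool as a stand-in for workers. It is not how Spark is implemented, and the chunk count, worker count, and function names are all assumptions chosen for illustration. The point is only that there are more chunks (16) than workers (4), and the pool decides which worker handles which chunk.

```python
from concurrent.futures import ThreadPoolExecutor

def make_chunks(data, num_chunks):
    # Split the data into at most num_chunks roughly equal slices.
    size = -(-len(data) // num_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def process_chunk(chunk):
    # The work done on one chunk: a partial sum of squares.
    return sum(x * x for x in chunk)

data = list(range(1000))
chunks = make_chunks(data, num_chunks=16)

# Four workers share sixteen chunks; the pool handles the assignment.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(process_chunk, chunks))

total = sum(partial_sums)
print(len(chunks), total)
```

In Spark, I believe the rough equivalent is passing a slice count to `sc.parallelize` and then calling `map` and `reduce` on the result, with the scheduler doing the job of the pool above.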

Aug 29 '13 21:08 jowens

John, you're my hero right now. Thanks for reporting the issues you ran into. Really.

Aug 30 '13 04:08 andyk

Oh good. At the outset of the tutorial, it was promised they'd all be fixed within minutes, so now I feel better about reporting these. Count on more tomorrow, most of them non-Ingress-related.

Aug 30 '13 05:08 jowens

Oh man, I guess I could learn a lesson about overpromising from this. In any case, we will address the issues you've been reporting, even if it does happen after many, many minutes.

Aug 30 '13 05:08 andyk