socialsci-workshop

SAFI farms cognitive load

Open mikerspencer opened this issue 4 years ago • 4 comments

The SAFI farms dataset has quite a high cognitive load for those without regional, development or farming knowledge. The lessons may work better for a wider audience with a more general social dataset, e.g. a census.

mikerspencer Apr 23 '21 13:04

@mikerspencer -- As maintainers, we have often discussed the possibility of changing the dataset for these lessons. However, this is a major change that would need to be approved by the Curriculum Advisory Council. Since you have brought up the issue, we would be happy to relay the suggestion of transitioning to a different dataset.

To ensure we accurately summarize your concerns, could you respond with a list of reasons why you believe the SAFI dataset is a poor fit for these lessons? Additionally, are there any datasets you would suggest the lessons use instead?

Thanks!

atheobold May 09 '21 15:05

Hi Allison,

Problems:

  • Dataset is location-specific; place names are difficult for those not familiar with the area
  • Specific terms, e.g. mud daub, which are difficult for non-English speakers (or anyone not making houses from mud)
  • Agricultural knowledge required, e.g. irrigation, crop types

Solutions/ideas:

  • What types of data do we need: categorical, numeric, nested?
  • What data structures do we need? A single file/table, but also lookups/joins
  • Dataset which doesn't depend on place, unless perhaps global?
  • Topic which is widely accessible

I've previously used subsets of the Stack Overflow developer data with success, but generic population census data could be even better.

I appreciate that change will be difficult and there's likely no perfect answer.

mikerspencer May 10 '21 10:05

I want to vote strongly against Stack Overflow developer data. The target audience of these curricula is social scientists, so the data should be survey data, not logs.

The data also needs to be plausibly messy for cleaning with OpenRefine.

Census data could be okay, but I would actually prefer to incorporate more about reading the metadata into the spreadsheets lesson (which is supposed to be a precursor to the rest of the lessons), so that over the course of a workshop people learn enough about the data for it to make sense later.

brownsarahm May 10 '21 18:05

I agree with @mikerspencer. However, I don't think the SAFI dataset itself is problematic; a more detailed introduction to the background of the study might help social scientists feel more at home. The lesson introduction provides a brief description of the dataset, which I think is helpful, but it needs more detail that is specific to social science. I attended an in-person Carpentries workshop in Norway, and the facilitator provided pictures of the different types of roofs found in the dataset, added information about the family structures in the area, and so on. The context she built made the dataset "come to life", so to speak. But I guess it is really up to the workshop holders to give this added information.

USNRyan Jan 09 '24 08:01