python-novice-gapminder

Downloading the dataset and starting in the right directory

Open fa2k opened this issue 4 years ago • 5 comments

We did the first half of this lesson as an online workshop today. The steps that caused the most problems were starting Jupyter Lab from the terminal / command prompt and accessing the files.

We had added a section on downloading and unzipping the dataset to the installation instructions on the course website (https://uio-carpentry.github.io/2021-03-02-uio-python-online/). I tried to instruct people to go to the desktop during the Running and Quitting episode.

Problems:

  • For some users, the Anaconda Powershell prompt started in C:\>.
  • For centrally managed PCs, the desktop was on a network drive. Some people found the data at that path.
  • Others had OneDrive on the desktop, and weren't able to find the data there at all.

Especially in an online workshop, it is hard to deal with these issues. The Setup instructions for this lesson say to unpack in the "root directory", but I think they actually mean the "home directory". Either way, it's not easy for a novice to know what the root or home directory is. Furthermore, if the console starts in C:, they get a permission error when trying to make a new notebook.

It would be great if we could find a bullet-proof procedure to navigate to some directory. This is quite essential when trying to do something useful in Jupyter Lab.
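One way to make the root / home / current directory distinction concrete is to print all three from a notebook cell or a Python prompt. This is only a minimal sketch using the standard library; the function name describe_locations is made up for illustration and is not part of the lesson.

```python
from pathlib import Path


def describe_locations():
    """Return the three directories that are easy to confuse."""
    cwd = Path.cwd()
    return {
        "home": Path.home(),        # the user's home directory
        "current": cwd,             # where Python / Jupyter Lab was started
        "root": Path(cwd.anchor),   # top of the current drive or filesystem
    }


for label, path in describe_locations().items():
    print(f"{label:>8}: {path}")
```

If "current" and "root" print the same path, the session was started at the top of the drive, which matches the C:\> case described above.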

fa2k avatar Mar 02 '21 19:03 fa2k

Thanks for noting this, I am sure others have experienced the same in their own institutional environments - would you be willing to submit a PR with improved instructions?

alee avatar Mar 02 '21 23:03 alee

After some discussion, we decided to try to make a longer set of setup instructions that includes opening the notebook and checking that the data folder is there. This will be on the course website. The plan is to hold installation help sessions before the workshop, where people can come if the procedure doesn't work for them easily. I think this set of instructions will go in the workshop template repo, so it may not be a PR to this one. But if I'm able to finish that, I'll try to make some changes here too.

fa2k avatar Mar 03 '21 18:03 fa2k

The magic command %cd and os.chdir can do the trick; #559 attempts to address this but may require some revision.
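As a rough illustration of the os.chdir route (which works in any Python session, whereas %cd is only available inside IPython/Jupyter), something like the following could be run in the first notebook cell. The helper name change_to and the folder in the commented-out call are assumptions for the example and are not taken from #559.

```python
import os
from pathlib import Path


def change_to(folder):
    """Change the working directory to `folder` and return the new location."""
    target = Path(folder)
    if not target.is_dir():
        # Fail with a readable message instead of a bare OS error.
        raise FileNotFoundError(f"Cannot find the folder: {target}")
    os.chdir(target)
    return Path.cwd()


# Hypothetical usage, assuming the lesson folder was unpacked on the Desktop:
# change_to(Path.home() / "Desktop" / "python-novice-gapminder")
```

Note that this does not help when Jupyter Lab itself cannot create a notebook in its start directory; it only changes where later cells look for files.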

alee avatar Aug 16 '21 22:08 alee

We did a shortened version of the lesson today and provided time before we started for anyone who needed help with setup. We asked that everyone have the dataset downloaded, but I think it would have been helpful to have instructions like those @fa2k included in their workshop page. We ended up having to pause in the middle of 'Reading Tabular Data into DataFrames' because students were not able to access their data. It probably took 20 minutes to get everyone settled and back on track.

If we include more specific details on how to download the data and where to place it, that episode might go more smoothly. Another thing we could add during 'Running and Quitting' is a small section to check that the data folder is available. We used tab completion to help many students find their data folder. So maybe once we introduce Jupyter Notebook, we can have the students check in a cell that they can access that folder. I think most of our issues were with OneDrive.
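A check of that kind could look roughly like the cell below. It is a sketch only: the function name find_csv_files is invented here, and it assumes the dataset was unpacked into a folder called data next to the notebook, as in the lesson setup.

```python
from pathlib import Path


def find_csv_files(data_dir="data"):
    """Return the sorted names of the CSV files in `data_dir`."""
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(
            f"No '{data_dir}' folder found in {Path.cwd()}"
        )
    return sorted(p.name for p in folder.glob("*.csv"))


try:
    print(find_csv_files())
except FileNotFoundError as err:
    # Show where Python is looking, so a helper can spot the mismatch.
    print(err)
```

The error message includes the current working directory, which is usually the first thing a helper needs to know when the folder is not found.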

gracieflores avatar Mar 17 '22 18:03 gracieflores

We ran into a similar issue to this for loading the data (specifically with OneDrive). One of our helpers proposed this fix for the user:

import pandas as pd

# Store the current directory (%pwd is an IPython magic, so this only works in a notebook)
data_folder = %pwd
# Append the data sub-folder
data_folder = data_folder + '/data'
# Append the name of the CSV file you want to read in
data_oceania = pd.read_csv(data_folder + '/gapminder_gdp_oceania.csv')

Happy to make a PR if this would be useful to include somewhere.
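For comparison, the same idea can be written with pathlib, which avoids the IPython-only %pwd and the hand-written separators. This is a sketch under the same assumption as the fix above (a data folder in the current working directory); the function name gapminder_csv is hypothetical.

```python
from pathlib import Path


def gapminder_csv(file_name, data_dir="data"):
    """Build an absolute path to a CSV file in the data folder."""
    return Path.cwd() / data_dir / file_name


csv_path = gapminder_csv("gapminder_gdp_oceania.csv")
print(csv_path, "exists:", csv_path.exists())

# With pandas installed, the file could then be read with:
# import pandas as pd
# data_oceania = pd.read_csv(csv_path)
```

Printing the full path and whether it exists before calling read_csv makes it easier to see when the notebook is looking in a OneDrive or network location rather than where the data was unpacked.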

kaitlinnewson avatar Oct 04 '23 19:10 kaitlinnewson