Lab problems Spring 2024

Jon Marshall edited this page Feb 5, 2024 · 21 revisions
  • Lab 6 -- Add a section that uses all the variables in their actual units, rather than scaled to mean zero and unit variance, and then runs an OLS regression of bike rentals on some combination of dummy variables and interval and ratio variables, reporting the results with something like Statsmodels. Note that the bike rental data has already scaled its variables to range from 0 to 1, but it encodes categorical variables as integers, and the Modules Team just left them that way as regressors. The lab covers three important things (data splitting, scaling for ML problems, and overfitting), so we should keep those in Lab 6 and maybe just start with the causal interpretation of OLS.

  • Lab 7 -- Fix outdated references to buildings ('Barrows Hall'). Fix the solution for "states you have visited" so that it uses a list of visited states and the `in` operator. Give notice about tileset behavior, e.g. 1) there is no default tileset when you use Datahub from off campus (at least not for me), so you have to supply a custom tileset and attribution; 2) Stamen Toner will run in web applications on registered servers (e.g. Datahub), but if you download your Python notebook as HTML, the Stamen Toner tiles can no longer be loaded when you open the HTML file in a web browser, and you get a 404 error; 3) Datahub sometimes has problems rendering map layers, so you may need to restart the kernel and rerun the whole notebook to get the map to display (and it can sometimes disappear again).
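
For the Lab 6 item, here is a rough sketch of what the new section could look like. It is not the lab's actual code: the data are synthetic, and the column names (`season`, `temp_scaled`, `temp_c`, `rentals`) and the rescaling constant `TEMP_MAX_C` are made up for illustration. It converts a 0 to 1 variable back to real units, wraps the integer-coded categorical in `C(...)` so Statsmodels expands it into dummy variables instead of treating the codes as a number, and prints the regression summary.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the bike rental data (all names and values are hypothetical).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "season": rng.integers(1, 5, size=n),       # categorical stored as integer codes 1..4
    "temp_scaled": rng.uniform(0, 1, size=n),   # variable already scaled to [0, 1]
})

# Undo the 0-1 scaling so the coefficient reads in real units.
# TEMP_MAX_C is an assumed constant, not taken from the lab's data.
TEMP_MAX_C = 40.0
df["temp_c"] = df["temp_scaled"] * TEMP_MAX_C

# Simulated outcome: 50 extra rentals per degree C, plus a bump in season 3.
df["rentals"] = (
    1000
    + 50 * df["temp_c"]
    + 300 * (df["season"] == 3)
    + rng.normal(0, 100, size=n)
)

# C(season) tells the formula interface to build dummy variables
# (season 1 becomes the reference category).
model = smf.ols("rentals ~ C(season) + temp_c", data=df).fit()
print(model.summary())
```

With the temperature in degrees, the `temp_c` coefficient reads as rentals per degree, and each `C(season)` coefficient reads as the difference from the reference season.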
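
For the Lab 7 "states you have visited" fix, a minimal sketch of the list-plus-`in` approach might look like the following. The state names and the helper `have_visited` are placeholders, not the lab's actual solution.

```python
# Hypothetical list of visited states; students would fill in their own.
visited_states = ["California", "Nevada", "Oregon"]


def have_visited(state, visited=visited_states):
    """Return True if `state` appears in the list of visited states."""
    return state in visited


# Check a couple of states with the membership test.
for state in ["California", "Texas"]:
    status = "visited" if have_visited(state) else "not visited"
    print(f"{state}: {status}")
```

The `in` operator replaces a chain of `==` comparisons joined by `or`, so adding a state means editing only the list.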
