Lab problems Spring 2024
-
All Nashville labs -- remove outlying age data (suspect age == 100)
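One way the age cleanup could look in pandas is sketched below. The dataframe `stops` and the column name `subject_age` are placeholders for illustration, not names taken from the labs, and the sketch assumes the goal is to drop rows whose recorded age is exactly 100.

```python
import pandas as pd

# Toy stand-in for the Nashville data; "subject_age" is a hypothetical column name.
stops = pd.DataFrame({"subject_age": [23, 41, 100, 35, 100, 67]})

# Keep every row whose recorded age is not exactly 100.
# Note: rows with a missing age (NaN) are kept by this comparison.
cleaned = stops[stops["subject_age"] != 100].reset_index(drop=True)

print(len(stops) - len(cleaned), "rows dropped")
```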
-
Lab 6 -- Add a section that uses all the variables in their actual units, rather than scaled to mean zero and unit variance, then run an OLS regression of bike rentals on some combination of dummy variables and interval and ratio variables, and report the results with something like Statsmodels. Note that the bike rental data has already scaled its variables to range from 0 to 1, but it encodes categorical variables as integers, and the Modules Team left them that way as regressors. The lab covers three important things (data splitting, scaling for ML problems, and overfitting), so we should keep those in Lab 6 and maybe just start with the causal interpretation of OLS.
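A rough sketch of what the new section might contain, using the Statsmodels formula API. The data here is synthetic and the column names (`rentals`, `temp_c`, `workingday`, `season`) are assumptions for illustration, not the lab's actual variables. Wrapping the integer-coded category in `C()` makes Statsmodels expand it into dummy variables instead of treating the codes as a numeric regressor.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the bike rental data (all column names are hypothetical).
rng = np.random.default_rng(0)
n = 200
bikes = pd.DataFrame({
    "season": rng.integers(1, 5, size=n),      # integer-coded category, 1..4
    "workingday": rng.integers(0, 2, size=n),  # 0/1 dummy variable
    "temp_c": rng.uniform(-5, 35, size=n),     # temperature in degrees Celsius
})
bikes["rentals"] = (
    100 + 12 * bikes["temp_c"] + 40 * bikes["workingday"]
    + rng.normal(0, 25, size=n)
).round()

# C(season) expands the integer codes into dummies (season 1 is the baseline).
model = smf.ols("rentals ~ temp_c + workingday + C(season)", data=bikes).fit()
print(model.summary())
```

With unscaled regressors, the coefficient on `temp_c` reads directly as the change in rentals per additional degree Celsius, which is the interpretation the scaled version hides.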
-
Lab 7 -- Fix outdated references to buildings ('Barrows Hall'). Fix the solution for "states you have visited" to use a list of visited states and an `in` operator. Give notice about tileset behavior, e.g. 1) there is no default tileset when you are using Datahub from off campus (at least not for me), and you have to give a custom tileset and attribution; 2) Stamen Toner will run in web applications on registered servers (e.g. Datahub), but if you download your Python notebook as HTML, the Stamen Toner tiles no longer load when you open the HTML file in a web browser, and you get a 404 error; 3) sometimes Datahub has problems rendering map layers, so you may need to restart the kernel and rerun the whole notebook to get the map to display (and it can sometimes disappear again)
-
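For the Lab 7 "states you have visited" fix, a minimal sketch of the list-plus-`in` approach follows. The state names and the helper `have_visited` are made up for illustration; the lab's actual prompt may be shaped differently.

```python
# Hypothetical list of states a student has visited.
visited = ["California", "Nevada", "Tennessee"]


def have_visited(state, visited_states):
    """Return True if `state` appears in the list of visited states."""
    return state in visited_states


# The same membership test also works inside a comprehension.
all_states = ["California", "Oregon", "Nevada", "Texas", "Tennessee"]
not_yet = [s for s in all_states if s not in visited]

print(have_visited("Nevada", visited))
print(not_yet)
```
-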
Lab 8 -- fix the last map to specify the `Choropleth` class rather than GeoJSON; make sure the prompts are clear and have updated links to Folium
-
Lab 12 -- where students add the color codes for the race categories in the traffic stop data, use something more Pythonic than looping through a list and a dataframe; series.map or series.apply would probably be best (.map takes a dictionary as an argument)
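A short sketch of the dictionary-based `.map` approach is below. The column name `subject_race`, the categories, and the colors are placeholders, not the lab's actual coding. Categories missing from the dictionary come back as NaN, so the sketch fills them with a fallback color.

```python
import pandas as pd

# Toy stand-in for the traffic stop data; "subject_race" is a hypothetical column name.
stops = pd.DataFrame(
    {"subject_race": ["white", "black", "hispanic", "white", "other"]}
)

# Hypothetical color coding: one dictionary lookup replaces the explicit loop.
race_colors = {"white": "blue", "black": "red", "hispanic": "green"}

# Series.map looks up each value in the dictionary; unmatched values become NaN.
stops["color"] = stops["subject_race"].map(race_colors).fillna("gray")

print(stops)
```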
-
Labs 13 & 14 -- Geopandas and rtree packages are added to Datahub image. You can do pip install instead of !pip install going forward