python

Intro to Data Science Programming

My aim with this course is to give students experience with Python in the context of Data Science.

[Link to course webpage in progress as IU migrates its online assets to an updated platform.]

tools

| Tool | Note |
| --- | --- |
| Jupyter | Weekly labs are hosted in Jupyter notebooks, and students are encouraged to use them to test code for their projects, or simply to try out and learn from the code in the weekly notebook. |
| Miniconda | Since the Anaconda distribution gets bloated easily, students build pip-based environments using Miniconda for this class. |
| GitHub | Gives students real-world experience with version control. This is useful for project group work and weekly exercises, and it helps me and the TAs track individual students’ progress. |
| Streamlit | Introduces students to end-to-end development for data science models. |
| Gradescope | Lets me autograde weekly exercises more easily. Students turn in their GitHub repositories; each week, they see a new way to incorporate what they learn into the “data science development pipeline”. |
| Docker | Right now, this just provides the framework needed for the exercise autograders (e.g., Gradescope). In the future, though, I intend to incorporate it into the course curriculum.[1] |

Footnotes

  1. Docker is a very widely used tool in the tech industry, and thus an incredibly valuable skill to have as a data scientist. But, from what I can tell, it’s often undervalued in higher-ed.