W02 changes #38

Merged
merged 2 commits into from
Apr 28, 2021
10 changes: 5 additions & 5 deletions slides.py
Original file line number Diff line number Diff line change
@@ -60,13 +60,13 @@ def full_slides():


def compile_all():
files = (file.with_suffix(".tex") for file, _, _ in iter_all())
files = (file for file, _, _ in iter_all(ext="tex"))
with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
pool.map(pdflatex, files)


def compile_git():
files = (file.with_suffix(".tex") for file, _, _ in iter_all() if check_git(file.with_suffix(".tex")))
files = (file for file, _, _ in iter_all(ext="tex") if check_git(file.with_suffix(".tex")))
with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
pool.map(pdflatex, files)

@@ -80,7 +80,7 @@ def fits_identifier(week, slide):
return week == week_id
return week == week_id and slide == slide_id

files = (file.with_suffix(".tex") for file, week, slide in iter_all() if fits_identifier(week, slide))
files = (file for file, week, slide in iter_all(ext="tex") if fits_identifier(week, slide))
with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
pool.map(pdflatex, files)

@@ -92,9 +92,9 @@ def cleanup():


# Helper functions
def iter_all():
def iter_all(ext="pdf"):
folder_pattern = re.compile("w(\d{2})_")
slide_pattern = re.compile("t(\d{2,3})_[\w_]+\.pdf")
slide_pattern = re.compile("t(\d{2,3})_[\w_]+\." + ext)
for week_folder in GIT_REPO.iterdir():
week_number = folder_pattern.match(week_folder.name)
if week_number is None: # folder does not match pattern
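The slides.py change above threads the file extension through `iter_all` instead of rewriting suffixes with `.with_suffix(".tex")` at every call site. A minimal sketch of the pattern construction after the refactor (folder and slide names follow the repo's `w02_evaluation`/`t01_big_picture` convention; `re.escape` is a hardening tweak not in the original):

```python
import re

def make_patterns(ext="pdf"):
    # Week folders look like "w02_evaluation"; slides like "t01_big_picture.pdf".
    # Passing ext in lets compile_all()/compile_git() iterate .tex files directly.
    folder_pattern = re.compile(r"w(\d{2})_")
    slide_pattern = re.compile(r"t(\d{2,3})_[\w_]+\." + re.escape(ext))
    return folder_pattern, slide_pattern

folder_pat, slide_pat = make_patterns(ext="tex")
print(folder_pat.match("w02_evaluation").group(1))     # prints 02
print(slide_pat.match("t01_big_picture.tex").group(1)) # prints 01
```

With `ext="tex"` the slide pattern no longer matches compiled PDFs, which is exactly why the callers can drop the suffix rewriting.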
Binary file modified slides/w02_evaluation.pdf
Binary file not shown.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Binary file modified w02_evaluation/t01_big_picture.pdf
Binary file not shown.
2 changes: 1 addition & 1 deletion w02_evaluation/t01_big_picture.tex
@@ -55,7 +55,7 @@
\end{itemize}
\end{itemize}
\begin{center}
\includegraphics[width=.5\textwidth]{overfitting}
\includegraphics[width=.5\textwidth]{images/overfitting1}
\end{center}
Usually model performance gets better with more data/higher model complexity
and then worse, but see \lit{\href{https://arxiv.org/pdf/1912.02292.pdf}{Nakkiran et al. 2019}}%\url{https://openai.com/blog/deep-double-descent/}.
Binary file modified w02_evaluation/t02_evaluation.pdf
Binary file not shown.
35 changes: 19 additions & 16 deletions w02_evaluation/t02_evaluation.tex
@@ -99,22 +99,24 @@
with measurement error $\epsilon$.

\begin{center}
\includegraphics[width=.7\textwidth]{poly}
\includegraphics[width=.7\textwidth]{images/poly}
\end{center}

Assume data generating process unknown. Approximate with $d$th-degree polynomial:
Assume data generating process unknown.
Approximate with $d$th-degree polynomial:
\[ f(\mathbf{x} | \mathbf{\theta}) = \theta_0 + \theta_1 x + \cdots + \theta_d x^d = \sum_{j = 0}^{d} \theta_j x^j \]

\framebreak

How should we choose $d$?

\begin{center}
\includegraphics[width=.5\textwidth]{poly-train}
\includegraphics[width=.5\textwidth]{images/poly-train}
\end{center}

d=1: MSE = 0.036 -- clear underfitting, d=3: MSE = 0.003 -- ok?, d=9: MSE =
0.001 -- clear overfitting
d=1: MSE = 0.036 -- clear underfitting,
d=3: MSE = 0.003 -- ok?,
d=9: MSE = 0.001 -- clear overfitting

Simply using the training error seems to be a bad idea.

@@ -123,30 +123,31 @@
\begin{frame}[c,allowframebreaks]{Outer Loss Example: Polynomial Regression}

\begin{center}
\includegraphics[width=.7\textwidth]{polyt}
\includegraphics[width=.7\textwidth]{images/polyt}
\end{center}

\framebreak

How should we choose $d$?

\begin{center}
\includegraphics[width=.5\textwidth]{poly-test}
\includegraphics[width=.5\textwidth]{images/poly-test}
\end{center}

d=1: MSE = 0.038 -- clear underfitting, d=3: MSE = 0.002 -- ok?, d=9: MSE =
0.046 -- clear overfitting
d=1: MSE = 0.038 -- clear underfitting,
d=3: MSE = 0.002 -- ok?,
d=9: MSE = 0.046 -- clear overfitting

\framebreak

\begin{center}
\includegraphics[width=.9\textwidth]{bias-variance}
\includegraphics[width=.9\textwidth]{images/bias-variance}
\end{center}

\end{frame}
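The train-versus-test pattern on these polynomial-regression slides can be reproduced on synthetic data. Everything below (data-generating process, noise level, sample sizes) is an illustrative assumption, so the MSE values will not match the slide's numbers, but the qualitative story does: training error only falls with degree $d$, while the held-out error picks out a middle-complexity model.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative data, not the slides' dataset

x = np.sort(rng.uniform(0.0, 1.0, 40))
f = lambda x: np.sin(2 * np.pi * x)        # stand-in for the unknown process
y = f(x) + rng.normal(0.0, 0.15, x.size)   # measurement error epsilon

x_train, y_train = x[::2], y[::2]          # 20 training points
x_test, y_test = x[1::2], y[1::2]          # 20 held-out points

results = {}
for d in (1, 3, 9):
    theta = np.polyfit(x_train, y_train, d)  # least-squares polynomial fit
    mse_train = float(np.mean((np.polyval(theta, x_train) - y_train) ** 2))
    mse_test = float(np.mean((np.polyval(theta, x_test) - y_test) ** 2))
    results[d] = (mse_train, mse_test)
    print(f"d={d}: train MSE={mse_train:.3f}, test MSE={mse_test:.3f}")
```

Because the degree-$d$ model class is nested inside the degree-$(d+1)$ class, training MSE can never increase with $d$, which is why the slides call selecting $d$ by training error "a bad idea".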

\begin{frame}[c]{General Trade-Off Between Error and Complexity}
\includegraphics[width=\textwidth]{overfitting}
\includegraphics[width=\textwidth]{images/overfitting2}
\end{frame}

\begin{frame}[c]{Resampling}
@@ -172,7 +175,7 @@

\begin{center}
% FIGURE SOURCE: https://docs.google.com/presentation/d/1sKtnj5nIQrcOGU7rTisMsppUGOk7UX2gbjKhtQmTX7g/edit?usp=sharing
\includegraphics[height=.5\textheight]{crossvalidation.png}
\includegraphics[height=.5\textheight]{images/crossvalidation}
\end{center}
10-fold cross-validation is common.
\end{frame}
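The resampling scheme on this slide amounts to partitioning the example indices into $k$ disjoint folds and rotating the held-out fold. A minimal index-level sketch (shuffling and the helper name are assumptions, not the slides' code):

```python
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Split indices 0..n-1 into k disjoint folds after shuffling."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

folds = kfold_indices(50, k=10)
for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # fit on train_idx, evaluate on test_idx; average the k held-out errors
```

Every example is held out exactly once, so the averaged fold errors use all of the data for evaluation while never evaluating a model on its own training points.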
@@ -257,15 +260,15 @@
\end{itemize}

\begin{center}
\includegraphics[height=.35\textheight]{learning-curve}
\includegraphics[height=.35\textheight]{images/learning-curve}
\end{center}

\framebreak

Ideal learning curve:

\begin{center}
\includegraphics[height=.7\textheight]{learning-curve-ideal}
\includegraphics[height=.7\textheight]{images/learning-curve-ideal}
\end{center}

\framebreak
@@ -281,7 +284,7 @@
\end{itemize}

\begin{center}
\includegraphics[width=.7\textwidth]{learning-curve-underfitting}
\includegraphics[width=.7\textwidth]{images/learning-curve-underfitting}
\end{center}

\framebreak
@@ -294,7 +297,7 @@
\end{itemize}

\begin{center}
\includegraphics[width=.7\textwidth]{learning-curve-overfitting}
\includegraphics[width=.7\textwidth]{images/learning-curve-overfitting}
\end{center}

\end{enumerate}
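A learning curve like the ones on these slides is just error as a function of training-set size, with a fixed validation set. A sketch under assumed conditions (linear ground truth, Gaussian noise; none of this is the slides' data):

```python
import numpy as np

rng = np.random.default_rng(1)  # illustrative setup, not the slides' data

# Fixed validation set drawn from the same (assumed) data-generating process
x_val = rng.uniform(-1.0, 1.0, 200)
y_val = 2.0 * x_val + rng.normal(0.0, 0.3, 200)

def fit_and_eval(n):
    """Fit a line on n fresh training points; return train and validation MSE."""
    x_tr = rng.uniform(-1.0, 1.0, n)
    y_tr = 2.0 * x_tr + rng.normal(0.0, 0.3, n)
    slope, intercept = np.polyfit(x_tr, y_tr, 1)
    err_tr = float(np.mean((slope * x_tr + intercept - y_tr) ** 2))
    err_val = float(np.mean((slope * x_val + intercept - y_val) ** 2))
    return err_tr, err_val

for n in (5, 20, 80, 320):
    err_tr, err_val = fit_and_eval(n)
    print(f"n={n:3d}: train MSE={err_tr:.3f}, val MSE={err_val:.3f}")
```

As $n$ grows, the two curves should close in on each other near the noise floor, which is the "ideal" shape the slides describe; a persistent gap signals overfitting, two high plateaus signal underfitting.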
Binary file added w02_evaluation/t03_benchmarking.pdf
Binary file not shown.
14 changes: 7 additions & 7 deletions w02_evaluation/t03_benchmarking.tex
@@ -92,7 +92,7 @@
\end{itemize}

\begin{center}
\includegraphics[height=.5\textheight]{tests_overview.png}
\includegraphics[height=.5\textheight]{images/tests_overview}
\end{center}

\end{frame}
@@ -110,7 +110,7 @@

\medskip
\begin{minipage}{0.25\textwidth}
\includegraphics[width=\textwidth]{mcnemar_1.png}
\includegraphics[width=\textwidth]{images/mcnemar_1}
\end{minipage}
\begin{minipage}{0.74\textwidth}
\begin{itemize}
@@ -133,7 +133,7 @@
Even if the models have the \textbf{same} errors (indicating equal performance), cells B and C may be different because the models may misclassify different instances.
\end{minipage}
\begin{minipage}[c]{0.25\linewidth}
\includegraphics[width=\textwidth]{mcnemar_1.png}
\includegraphics[width=\textwidth]{images/mcnemar_1}
\end{minipage}

\medskip
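The disagreement cells B and C discussed above are all the McNemar statistic uses. A sketch with the common continuity correction (the slides' exact variant may differ):

```python
def mcnemar_statistic(b, c):
    """Chi-squared statistic from the two disagreement cells of the 2x2 table,
    with the standard continuity correction: (|B - C| - 1)^2 / (B + C)."""
    if b + c == 0:
        return 0.0  # models never disagree; no evidence either way
    return (abs(b - c) - 1) ** 2 / (b + c)

# Models disagree on 30 instances: model 1 wrong / model 2 right on 10, reverse on 20
print(mcnemar_statistic(10, 20))  # prints 2.7
```

Under the null hypothesis of equal performance the statistic is approximately chi-squared with one degree of freedom, so values above about 3.84 reject at the 5% level.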
@@ -290,7 +290,7 @@
\end{itemize}

\begin{center}
\includegraphics[height=.5\textheight]{crit-diff-nemenyi}
\includegraphics[height=.5\textheight]{images/crit-diff-nemenyi}
\end{center}

\framebreak
@@ -316,7 +316,7 @@
significantly different from the baseline
\end{itemize}
\begin{center}
\includegraphics[height=.6\textheight]{crit-diff-bd}
\includegraphics[height=.6\textheight]{images/crit-diff-bd}
\end{center}

\end{frame}
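The critical-difference plots referenced above rest on a single quantity: two learners differ significantly when their mean ranks differ by more than $\mathrm{CD} = q_\alpha \sqrt{k(k+1)/(6N)}$ for $k$ learners over $N$ datasets. A sketch (the $q_\alpha$ value below is an illustrative constant taken from the Studentized-range tables in Demšar 2006; verify it before real use):

```python
import math

def critical_difference(q_alpha, k, n_datasets):
    """Nemenyi critical difference for mean ranks of k learners over N datasets."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_datasets))

# e.g. q_alpha approx. 2.569 for k=4 learners at alpha=0.05 (assumed table value)
print(round(critical_difference(2.569, 4, 20), 3))
```

The Bonferroni-Dunn variant on the slide uses the same formula with a smaller $q_\alpha$, since only comparisons against the baseline are made.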
@@ -329,14 +329,14 @@
\bigskip
Boxplots
\begin{center}
\includegraphics[height=.65\textheight]{multiple-boxplots}
\includegraphics[height=.65\textheight]{images/multiple-boxplots}
\end{center}

\framebreak

Rank plots
\begin{center}
\includegraphics[height=.7\textheight]{multiple-ranks}
\includegraphics[height=.7\textheight]{images/multiple-ranks}
\end{center}

\end{frame}
Binary file modified w02_evaluation/t04_nested_evaluation.pdf
Binary file not shown.
27 changes: 12 additions & 15 deletions w02_evaluation/t04_nested_evaluation.tex
@@ -33,10 +33,8 @@
\maketitle

\begin{frame}[c]{Motivation}
Selecting the best model from a set of potential candidates (e.g.\ different
classes of learners, different hyperparameter settings, different feature
sets, different preprocessing\ldots) is an important part of most machine
learning problems. However,
Selecting the best model from a set of potential candidates (e.g.\ different classes of learners, different hyperparameter settings, different feature sets, different preprocessing\ldots) is an important part of most machine learning problems.
However,

\begin{itemize}
\item cannot evaluate selected learner on the same
@@ -66,17 +64,16 @@
\framebreak

\begin{center}
\includegraphics[height=.5\textheight]{example-nested-resampling}
\includegraphics[height=.5\textheight]{images/example-nested-resampling}
\end{center}

\begin{itemize}
\item shown is best ``tuning error'' (i.e.\ performance of
model with fixed $\conf$ in cross-validation) after $k$ tuning iterations
\item shown is best ``tuning error'' (i.e.\ performance of model with fixed $\conf$ in cross-validation) after $k$ tuning iterations
\item evaluated for different data set sizes
\end{itemize}

\begin{center}
\includegraphics[height=.6\textheight]{dist-tuning1}
\includegraphics[height=.6\textheight]{images/dist-tuning1}
\end{center}

\begin{itemize}
@@ -88,7 +85,7 @@
\framebreak

\begin{center}
\includegraphics[height=.55\textheight]{dist-tuning2}
\includegraphics[height=.55\textheight]{images/dist-tuning2}
\end{center}

\begin{itemize}
@@ -108,7 +105,7 @@
preprocessing) evaluated \textbf{on training data}
\item test set only touched once, so no way of ``cheating''
\item test dataset is only used once \emph{after} model is completely
trained (including e.g.\ deciding hyper-parameter values)
trained (including e.g.\ deciding hyperparameter values)
\item performance estimates from test set now \textbf{unbiased estimates} of the true performance

\framebreak
@@ -138,7 +135,7 @@
resampling

\begin{center}
\includegraphics[height=0.6\textheight]{Nested_Resampling.png}
\includegraphics[height=0.6\textheight]{images/Nested_Resampling}
\end{center}

\framebreak
@@ -153,7 +150,7 @@
\end{footnotesize}

\begin{center}
\includegraphics[height=0.55\textheight]{Nested_Resampling.png}
\includegraphics[height=0.55\textheight]{images/Nested_Resampling}
\end{center}

\framebreak
@@ -168,7 +165,7 @@
\end{footnotesize}

\begin{center}
\includegraphics[height=0.55\textheight]{Nested_Resampling.png}
\includegraphics[height=0.55\textheight]{images/Nested_Resampling}
\end{center}

\framebreak
@@ -179,7 +176,7 @@
\end{footnotesize}

\begin{center}
\includegraphics[height=0.6\textheight]{Nested_Resampling.png}
\includegraphics[height=0.6\textheight]{images/Nested_Resampling}
\end{center}

\end{frame}
@@ -191,7 +188,7 @@
resampling:

\begin{center}
\includegraphics[width=0.8\textwidth]{nested-resampling-example}
\includegraphics[width=0.8\textwidth]{images/nested-resampling-example}
\end{center}
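The outer/inner structure described on these slides can be sketched end to end, reusing polynomial-degree selection as the hyperparameter to tune. Data, candidate degrees, and fold counts are illustrative assumptions, not the slides' setup; the point is the structure: the inner CV sees only the outer training split, and each outer test fold is touched exactly once.

```python
import numpy as np

rng = np.random.default_rng(2)  # illustrative data, not the slides' dataset
x = rng.uniform(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, 60)

def mse(theta, xs, ys):
    return float(np.mean((np.polyval(theta, xs) - ys) ** 2))

def inner_cv_select(x_tr, y_tr, degrees=(1, 3, 9), k=5):
    """Inner loop: pick the degree with the best k-fold CV error,
    using only the outer training split."""
    folds = np.array_split(np.arange(len(x_tr)), k)
    def cv_error(d):
        errs = []
        for i, te in enumerate(folds):
            tr = np.concatenate([f for j, f in enumerate(folds) if j != i])
            theta = np.polyfit(x_tr[tr], y_tr[tr], d)
            errs.append(mse(theta, x_tr[te], y_tr[te]))
        return float(np.mean(errs))
    return min(degrees, key=cv_error)

# Outer loop: unbiased estimate of the whole selection-plus-fitting procedure
outer = np.array_split(rng.permutation(len(x)), 3)
outer_errors = []
for i, te in enumerate(outer):
    tr = np.concatenate([f for j, f in enumerate(outer) if j != i])
    best_d = inner_cv_select(x[tr], y[tr])
    theta = np.polyfit(x[tr], y[tr], best_d)  # refit on the full outer training set
    outer_errors.append(mse(theta, x[te], y[te]))
print(f"nested CV estimate: {np.mean(outer_errors):.3f}")
```

Averaging the outer-fold errors estimates the performance of the *procedure* (tune, then refit), which is exactly the quantity the slides argue a single shared CV cannot estimate without bias.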

