
Commit d11b1c7

MLE lecture edits

1 parent 41a2667

File tree

1 file changed: lectures/mle.md

Lines changed: 125 additions & 72 deletions
@@ -22,7 +22,6 @@ import pandas as pd
from math import exp, log
```

## Introduction

Consider a situation where a policymaker is trying to estimate how much revenue a proposed wealth tax
@@ -78,15 +77,16 @@ The data is derived from the
[Survey of Consumer Finances](https://en.wikipedia.org/wiki/Survey_of_Consumer_Finances) (SCF).

The following code imports this data and reads it into an array called `sample`.

```{code-cell} ipython3
:tags: [hide-input]

url = 'https://media.githubusercontent.com/media/QuantEcon/high_dim_data/update_scf_noweights/SCF_plus/SCF_plus_mini_no_weights.csv'
df = pd.read_csv(url)
df = df.dropna()
df = df[df['year'] == 2016]
-df = df.loc[df['n_wealth'] > 0 ]  # restricting data to net worth > 0
df = df.loc[df['n_wealth'] > 1 ]  # restricting data to net worth > 1
rv = df['n_wealth'].sample(n=n, random_state=1234)
rv = rv.to_numpy() / 100_000
sample = rv
@@ -157,22 +157,64 @@ ax.hist(ln_sample, density=True, bins=200, histtype='stepfilled', alpha=0.8)
plt.show()
```

+++ {"user_expressions": []}

Now our job is to obtain the maximum likelihood estimates of $\mu$ and $\sigma$, which
we denote by $\hat{\mu}$ and $\hat{\sigma}$.

These estimates can be found by maximizing the likelihood function given the
data.
The pdf of a lognormally distributed random variable $X$ is given by:

$$
f(x) = \frac{1}{x}\frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{\ln x-\mu}{\sigma}\right)^2\right)
$$

Since $\ln X$ is normally distributed, this is the same as

$$
f(x) = \frac{1}{x} \phi(\ln x)
$$

where $\phi$ is the pdf of $\ln X$, which is normally distributed with mean $\mu$ and variance $\sigma^2$.
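As a quick numerical check of this change of variables, SciPy's lognormal pdf can be compared with $\phi(\ln x)/x$; here is a minimal sketch with arbitrary illustrative values of $\mu$ and $\sigma$ (these numbers are not from the lecture):

```python
import numpy as np
from scipy.stats import lognorm, norm

# Illustrative (hypothetical) parameter values
μ, σ = 0.5, 1.2
x = np.linspace(0.1, 10, 50)

# SciPy parameterizes the lognormal with shape s = σ and scale = exp(μ)
lhs = lognorm(s=σ, scale=np.exp(μ)).pdf(x)

# (1/x) times the pdf of ln X evaluated at ln x
rhs = norm(loc=μ, scale=σ).pdf(np.log(x)) / x

print(np.allclose(lhs, rhs))  # True: the two expressions agree
```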
For a sample $x = (x_1, x_2, \cdots, x_n)$ the _likelihood function_ is given by

$$
\begin{aligned}
L(\mu, \sigma | x) & = \prod_{i=1}^{n} f(x_i | \mu, \sigma) \\
& = \prod_{i=1}^{n} \frac{1}{x_i} \phi(\ln x_i)
\end{aligned}
$$

Taking the log of both sides gives us the _log likelihood function_, which is

$$
\begin{aligned}
\ell(\mu, \sigma | x) & = -\sum_{i=1}^{n} \ln x_i + \sum_{i=1}^n \ln \phi(\ln x_i) \\
& = -\sum_{i=1}^{n} \ln x_i - \frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln \sigma^2 - \frac{1}{2\sigma^2}
\sum_{i=1}^n (\ln x_i - \mu)^2
\end{aligned}
$$

To find where this function is maximised we take its partial derivatives with respect to $\mu$ and $\sigma^2$ and set them equal to $0$.

Let's first find the MLE of $\mu$:

$$
\begin{aligned}
\frac{\partial \ell}{\partial \mu} & = \frac{1}{\sigma^2} \sum_{i=1}^n (\ln x_i - \mu) = 0 \\
\Rightarrow & \sum_{i=1}^n \ln x_i - n \mu = 0 \\
\Rightarrow & \hat{\mu} = \frac{\sum_{i=1}^n \ln x_i}{n}
\end{aligned}
$$

Now let's find the MLE of $\sigma$:

$$
\begin{aligned}
\frac{\partial \ell}{\partial \sigma^2} & = - \frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^n (\ln x_i - \mu)^2 = 0 \\
\Rightarrow & \frac{n}{2\sigma^2} = \frac{1}{2\sigma^4} \sum_{i=1}^n (\ln x_i - \mu)^2 \\
\Rightarrow & \hat{\sigma} = \left( \frac{\sum_{i=1}^{n}(\ln x_i - \hat{\mu})^2}{n} \right)^{1/2}
\end{aligned}
$$

Now that we have derived the expressions for $\hat{\mu}$ and $\hat{\sigma}$,
let's compute them for our wealth sample.
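As a sanity check on the algebra, the same values should come out of a direct numerical maximization of the log likelihood. Here is a minimal sketch (not part of the lecture's code), assuming `ln_sample`, NumPy, and SciPy are available as set up above:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params):
    # Negative log likelihood in (μ, σ); the -Σ ln w_i term is constant
    # in the parameters, so it is dropped from the optimization.
    μ, σ = params
    n = len(ln_sample)
    ll = (-n / 2 * np.log(2 * np.pi)
          - n * np.log(σ)
          - np.sum((ln_sample - μ)**2) / (2 * σ**2))
    return -ll

res = minimize(neg_log_likelihood, x0=(0.0, 1.0),
               method='L-BFGS-B', bounds=[(None, None), (1e-6, None)])
print(res.x)                                  # numerical maximizer
print(np.mean(ln_sample), np.std(ln_sample))  # closed-form μ_hat and σ_hat
```

The two pairs of numbers should agree closely, since the closed forms are the exact maximizers.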
```{code-cell} ipython3
μ_hat = np.mean(ln_sample)
@@ -200,7 +242,6 @@ ax.legend()
plt.show()
```

Our estimated lognormal distribution appears to be a decent fit for the overall data.

We now use {eq}`eq:est_rev` to calculate total revenue.
@@ -268,7 +309,6 @@ tr_pareto
The number is very different!

```{code-cell} ipython3
tr_pareto / tr_lognorm
```
@@ -280,7 +320,6 @@ We see that choosing the right distribution is extremely important.
Let's compare the fitted Pareto distribution to the histogram:

```{code-cell} ipython3
fig, ax = plt.subplots()
ax.set_xlim(-1, 20)
ax.set_ylim(0, 1.75)
@@ -292,68 +331,11 @@ ax.legend()
plt.show()
```

+++ {"user_expressions": []}

We observe that in this case the fit for the Pareto distribution is not very
good, so we can probably reject it.

-## Exponential distribution
-
-What other distributions could we try?
-
-Suppose we assume that the distribution is [exponential](https://en.wikipedia.org/wiki/Exponential_distribution)
-with parameter $\lambda > 0$.
-
-The maximum likelihood estimate of $\lambda$ is given by
-
-$$
-\hat{\lambda} = \frac{n}{\sum_{i=1}^n w_i}
-$$
-
-Let's calculate it.
-
-```{code-cell} ipython3
-λ_hat = 1/np.mean(sample)
-λ_hat
-```
-
-Now let's compute total revenue:
-
-```{code-cell} ipython3
-dist_exp = expon(scale = 1/λ_hat)
-tr_expo = total_revenue(dist_exp)
-tr_expo
-```
-
-Again, calculated revenue is very different.
-
-```{code-cell} ipython3
-tr_expo / tr_lognorm
-```
-
-But once again, when we plot the fitted distribution against the data we see it
-is a bad fit.
-
-```{code-cell} ipython3
-fig, ax = plt.subplots()
-ax.set_xlim(-1, 20)
-ax.hist(sample, density=True, bins=5000, histtype='stepfilled', alpha=0.5)
-ax.plot(x, dist_exp.pdf(x), 'k-', lw=0.5, label='exponential pdf')
-ax.legend()
-plt.show()
-```
-
-So we can reject this calculation.

## What is the best distribution?

There is no "best" distribution --- every choice we make is an assumption.
@@ -370,6 +352,7 @@ We set an arbitrary threshold of $500,000 and read the data into `sample_tail`.
```{code-cell} ipython3
:tags: [hide-input]

df_tail = df.loc[df['n_wealth'] > 500_000 ]
df_tail.head()
rv_tail = df_tail['n_wealth'].sample(n=10_000, random_state=4321)
@@ -410,7 +393,6 @@ ax.legend()
plt.show()
```

While the lognormal distribution was a good fit for the entire dataset,
it is not a good fit for the right hand tail.
@@ -435,6 +417,7 @@ ax.plot(x, dist_pareto_tail.pdf(x), 'k-', lw=0.5, label='pareto pdf')
plt.show()
```

+++ {"user_expressions": []}

The Pareto distribution is a better fit for the right hand tail of our dataset.
@@ -451,3 +434,73 @@ There are other more rigorous tests, such as the [Kolmogorov-Smirnov test](https
We omit the details.
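For readers who want to try such a test, here is a minimal sketch using SciPy's `kstest`. It assumes the `sample` array and the estimates `μ_hat` and `σ_hat` computed earlier in the lecture; note that testing on the same data used for fitting biases the p-value.

```python
import numpy as np
from scipy.stats import lognorm, kstest

# Fitted lognormal in SciPy's parameterization: shape s = σ_hat, scale = exp(μ_hat)
fitted = lognorm(s=σ_hat, scale=np.exp(μ_hat))

# Compare the empirical CDF of the data with the fitted CDF
result = kstest(sample, fitted.cdf)
print(result.statistic, result.pvalue)
```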
## Exercises

```{exercise-start}
:label: mle_ex1
```

Suppose we assume wealth is [exponentially](https://en.wikipedia.org/wiki/Exponential_distribution)
distributed with parameter $\lambda > 0$.

The maximum likelihood estimate of $\lambda$ is given by

$$
\hat{\lambda} = \frac{n}{\sum_{i=1}^n w_i}
$$

1. Compute $\hat{\lambda}$ for our initial sample.
2. Use $\hat{\lambda}$ to find the total revenue.

```{exercise-end}
```
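For reference, the estimator quoted in the exercise is the maximizer of the exponential log likelihood; a short derivation sketch (not in the lecture itself):

$$
\ell(\lambda | w) = \sum_{i=1}^{n} \ln\left(\lambda e^{-\lambda w_i}\right)
= n \ln \lambda - \lambda \sum_{i=1}^{n} w_i,
\qquad
\frac{\partial \ell}{\partial \lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} w_i = 0
\quad \Rightarrow \quad
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} w_i}
$$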
```{solution-start} mle_ex1
:class: dropdown
```

```{code-cell} ipython3
λ_hat = 1/np.mean(sample)
λ_hat
```

```{code-cell} ipython3
dist_exp = expon(scale = 1/λ_hat)
tr_expo = total_revenue(dist_exp)
tr_expo
```

+++ {"user_expressions": []}

```{solution-end}
```

```{exercise-start}
:label: mle_ex2
```

Plot the exponential distribution against the sample and check whether it is a good fit.

```{exercise-end}
```

```{solution-start} mle_ex2
:class: dropdown
```

```{code-cell} ipython3
fig, ax = plt.subplots()
ax.set_xlim(-1, 20)

ax.hist(sample, density=True, bins=5000, histtype='stepfilled', alpha=0.5)
ax.plot(x, dist_exp.pdf(x), 'k-', lw=0.5, label='exponential pdf')
ax.legend()

plt.show()
```

+++ {"user_expressions": []}

Clearly, this distribution is not a good fit for our data.

```{solution-end}
```
