Merge pull request #13 from Youngsun-Lee-DS/main
chapter 4 update
issactoast authored Feb 27, 2022
2 parents b177c60 + 4de36ac commit e3d73ab
Showing 62 changed files with 2,561 additions and 114 deletions.
144 changes: 144 additions & 0 deletions 04-time-series-features.Rmd
@@ -0,0 +1,144 @@
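# Time series features {#chap4}

The `feasts` package includes functions for computing **FE**atures **A**nd **S**tatistics from **T**ime **S**eries.
We have already seen several time series features in earlier chapters.
For example, autocorrelations were presented as a feature of a time series.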

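## Some simple statistics

The `features()` function computes numerical summaries of a time series, such as the mean, minimum, and maximum.

### Mean

For example, using the `tourism` data (quarterly visitor numbers in Australia), passing **mean** computes the mean of every series.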

```{r}
tourism %>%
janitor::clean_names() %>%
features(trips, list(mean = mean)) %>%
arrange(mean)
```
The output shows that Kangaroo Island, in the state of South Australia, had the lowest average number of visits.

### Quantiles

**quantile** computes the minimum, first quartile, median, third quartile, and maximum.

```{r}
tourism %>%
janitor::clean_names() %>%
features(trips, quantile)
```
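Here 0% is the minimum and 100% is the maximum.

### Combining several features

Passing several functions in a `list()` computes the mean together with the minimum, first quartile, median, third quartile, and maximum in a single call.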

```{r}
tourism %>%
janitor::clean_names() %>%
features(trips, list(avg = mean, quantile))
```
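To make the idea of one row of features per series concrete, the following is a minimal base R sketch rather than what `features()` does internally. The data frame `toy`, its columns `key` and `trips`, and the object names are hypothetical and exist only for illustration.

```r
# Toy data: two hypothetical series identified by a key column
toy <- data.frame(
  key   = rep(c("A", "B"), each = 5),
  trips = c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50)
)

# One feature vector per series: the mean plus the five default quantiles
feature_rows <- lapply(split(toy$trips, toy$key), function(v) {
  c(avg = mean(v), quantile(v))
})

# Bind the per-series vectors into a table with one row per key
feature_table <- do.call(rbind, feature_rows)
print(feature_table)
```

Each named element of the returned vector becomes a column, which is why a named entry such as `avg = mean` yields a column called `avg` while the quantiles keep their percentage labels.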

## ACF

Autocorrelation was introduced earlier, in Chapter 1.

### feat_acf
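`feat_acf()` returns a set of features that summarize the ACF of a series.

* acf1: the first autocorrelation coefficient of the series

* acf10: the sum of squares of the first ten autocorrelation coefficients

* diff1_acf1: the first autocorrelation coefficient of the differenced series

* diff1_acf10: the sum of squares of the first ten autocorrelation coefficients of the differenced series

* diff2_acf1: the first autocorrelation coefficient of the twice-differenced series

* diff2_acf10: the sum of squares of the first ten autocorrelation coefficients of the twice-differenced series

* season_acf1: the autocorrelation coefficient at the first seasonal lag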

```{r}
tourism %>%
janitor::clean_names() %>%
features(trips, feat_acf)
```
Because the `tourism` data are quarterly, `season_acf1` in the output above is the autocorrelation coefficient at lag 4.
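The feature definitions above can be sketched with base R's `acf()`. This is an illustration on a simulated quarterly series, not the `feat_acf()` implementation; the series `y` and the helper `acf_coefs()` are hypothetical names introduced here.

```r
set.seed(123)

# Hypothetical quarterly series: trend + repeating seasonal pattern + noise
n <- 80
y <- 0.1 * seq_len(n) +
  rep(c(2, -1, 0.5, -1.5), length.out = n) +
  rnorm(n, sd = 0.5)

# Autocorrelation coefficients at lags 1..lag_max (the lag-0 value is dropped)
acf_coefs <- function(x, lag_max = 10) {
  as.numeric(acf(x, lag.max = lag_max, plot = FALSE)$acf)[-1]
}

r  <- acf_coefs(y)                         # original series
r1 <- acf_coefs(diff(y))                   # first differences
r2 <- acf_coefs(diff(y, differences = 2))  # second differences

acf_features <- c(
  acf1        = r[1],
  acf10       = sum(r^2),
  diff1_acf1  = r1[1],
  diff1_acf10 = sum(r1^2),
  diff2_acf1  = r2[1],
  diff2_acf10 = sum(r2^2),
  season_acf1 = r[4]  # lag 4 is the first seasonal lag for quarterly data
)
print(round(acf_features, 3))
```

For monthly data the same sketch would read the seasonal feature from lag 12 instead of lag 4.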

## STL

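STL decomposition was also covered in Chapter 3.
STL stands for Seasonal and Trend decomposition using Loess, and it is a robust method for decomposing time series.

A time series decomposition splits each observation $y_{t}$ into a trend component $T_{t}$, a seasonal component $S_{t}$, and a remainder component $R_{t}$, which is what is left after the trend and seasonal components are subtracted from $y_{t}$.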

\[
y_{t}=T_{t}+S_{t}+R_{t}
\]

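For strongly trended data, the seasonally adjusted data $T_{t}+R_{t}$ should vary much more than the remainder component $R_{t}$.
Therefore $\frac{var(R_{t})}{var(T_{t}+R_{t})}$ should be relatively small.
For data with little or no trend, the two variances are about the same and the ratio can even exceed 1, which is why the definition is truncated at 0.
The strength of trend is defined as follows and takes a value between 0 and 1.

\[
F_{t}=\max\left(0,1-\frac{var(R_{t})}{var(T_{t}+R_{t})}\right)
\]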

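The strength of seasonality is defined in the same way, using the detrended data $S_{t}+R_{t}$ in place of the seasonally adjusted data.

\[
F_{s}=\max\left(0,1-\frac{var(R_{t})}{var(S_{t}+R_{t})}\right)
\]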
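The two strength measures can be computed by hand from any STL fit. The sketch below uses base R's `stl()` on a simulated quarterly series; the `s.window = "periodic"` setting is an assumption made for simplicity and is not the default used by `feat_stl()`, so the numbers will not match its output exactly. The helper `strength()` is a hypothetical name.

```r
set.seed(42)

# Hypothetical quarterly series with a clear trend and seasonal pattern
n <- 80
y <- ts(
  0.5 * seq_len(n) + rep(c(5, -3, 1, -3), length.out = n) + rnorm(n),
  frequency = 4
)

# STL decomposition; the window setting is an assumption for this sketch
comp      <- stl(y, s.window = "periodic")$time.series
trend     <- as.numeric(comp[, "trend"])
seasonal  <- as.numeric(comp[, "seasonal"])
remainder <- as.numeric(comp[, "remainder"])

# max(0, 1 - var(R) / var(component + R)), as in the two formulas above
strength <- function(component, remainder) {
  max(0, 1 - var(remainder) / var(component + remainder))
}

trend_strength    <- strength(trend, remainder)
seasonal_strength <- strength(seasonal, remainder)
print(c(trend = trend_strength, seasonal = seasonal_strength))
```

Because the simulated trend and seasonal pattern are both large relative to the noise, both values should come out close to 1.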

### feat_stl
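`feat_stl()` computes features based on the STL decomposition.
Along with the strength of trend and seasonality, it returns the following values.

* seasonal_peak_year: the timing of the seasonal peak, that is, the quarter (or month) in which the seasonal component is largest

* seasonal_trough_year: the timing of the seasonal trough, that is, the quarter (or month) in which the seasonal component is smallest

* spikiness: the variance of the leave-one-out variances of the remainder component $R_{t}$, which measures how prevalent spikes are in $R_{t}$

* linearity: the linearity of the trend component $T_{t}$, based on the coefficient of a linear regression fitted to the trend

* curvature: the curvature of the trend component $T_{t}$, based on the coefficient of an orthogonal quadratic regression fitted to the trend

* stl_e_acf1: the first autocorrelation coefficient of the remainder series, that is, what is left after removing the trend component $T_{t}$ and the seasonal component $S_{t}$

* stl_e_acf10: the sum of squares of the first ten autocorrelation coefficients of the remainder series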

```{r}
tourism %>%
janitor::clean_names() %>%
features(trips, feat_stl)
```
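Some of the features listed above can be sketched in base R to show what they measure. This is an illustration under stated assumptions, not the `feat_stl()` source: it uses `stl()` with `s.window = "periodic"` on a simulated series, all object names are hypothetical, and the indexing convention that `feat_stl()` uses for the peak and trough may differ from the 1-based positions computed here.

```r
set.seed(7)

# Hypothetical quarterly series: peak in quarter 1, trough in quarter 4
n <- 80
y <- ts(
  0.5 * seq_len(n) + rep(c(5, -1, 1, -5), length.out = n) + rnorm(n),
  frequency = 4
)

comp      <- stl(y, s.window = "periodic")$time.series
trend     <- as.numeric(comp[, "trend"])
seasonal  <- as.numeric(comp[, "seasonal"])
remainder <- as.numeric(comp[, "remainder"])

# Spikiness: variance of the leave-one-out variances of the remainder
loo_var   <- vapply(seq_along(remainder),
                    function(i) var(remainder[-i]), numeric(1))
spikiness <- var(loo_var)

# Linearity and curvature: coefficients on the orthogonal linear and
# quadratic terms of a regression fitted to the trend component
fit       <- lm(trend ~ poly(seq_along(trend), degree = 2))
linearity <- unname(coef(fit)[2])
curvature <- unname(coef(fit)[3])

# Peak and trough: position within the year of the largest and smallest
# seasonal value (the periodic seasonal component repeats every 4 quarters)
first_year      <- seasonal[1:4]
seasonal_peak   <- which.max(first_year)
seasonal_trough <- which.min(first_year)

print(c(spikiness = spikiness, linearity = linearity, curvature = curvature))
print(c(peak = seasonal_peak, trough = seasonal_trough))
```

A single large outlier in the remainder changes one leave-one-out variance much more than the others, which is what drives spikiness up.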

The results above can be visualized as follows, with the strength of trend on the x-axis and the strength of seasonality on the y-axis.
```{r}
tourism %>%
janitor::clean_names() %>%
features(trips, feat_stl) %>%
ggplot(aes(x = trend_strength, y = seasonal_strength_year,
col = purpose)) +
geom_point() +
facet_wrap(vars(state))
```
The plot shows that holiday travel has the strongest seasonality.

```{r}
# Find the series with the strongest seasonality and plot it
tourism %>%
features(Trips, feat_stl) %>%
filter(seasonal_strength_year == max(seasonal_strength_year)) %>%
left_join(tourism, by = c("State", "Region", "Purpose")) %>%
ggplot(aes(x = Quarter, y = Trips)) +
geom_line() +
facet_grid(vars(State, Region, Purpose))
```


1 change: 1 addition & 0 deletions _bookdown.yml
@@ -19,5 +19,6 @@ rmd_files: ["index.Rmd",
"01-intro-to-tsibble.Rmd",
"02-timeseries-decomposition.Rmd",
"03-time-series-decomposition.Rmd",
+"04-time-series-features.Rmd",
"05-the-forecasters-toolbox.Rmd",
"references.Rmd"]
Binary file modified bookdown.rds
Binary file not shown.
22 changes: 9 additions & 13 deletions docs/01-intro-to-tsibble.md

Large diffs are not rendered by default.

22 changes: 11 additions & 11 deletions docs/02-timeseries-decomposition.md
@@ -205,17 +205,17 @@ white.noise.r
#> # A tsibble: 100 x 3 [1]
#> t y yavg
#> <int> <dbl> <dbl>
-#> 1 1 -0.126 NA
-#> 2 2 0.126 NA
-#> 3 3 0.635 NA
-#> 4 4 2.25 NA
-#> 5 5 0.772 0.732
-#> 6 6 -0.612 0.634
-#> 7 7 0.758 0.761
-#> 8 8 -0.771 0.480
-#> 9 9 -0.414 -0.0534
-#> 10 10 -0.241 -0.256
-#> # with 90 more rows
+#> 1 1 0.534 NA
+#> 2 2 -0.362 NA
+#> 3 3 -2.00 NA
+#> 4 4 1.20 NA
+#> 5 5 0.988 0.0712
+#> 6 6 -0.425 -0.121
+#> 7 7 -0.917 -0.232
+#> 8 8 1.11 0.391
+#> 9 9 0.835 0.318
+#> 10 10 -0.943 -0.0678
+#> # ... with 90 more rows
```

```r
82 changes: 41 additions & 41 deletions docs/03-time-series-decomposition.md
@@ -53,7 +53,7 @@ gdp_national %>% head(10)
#> 8 1990 미국 5963. 23889 1.9 393592 516987 252120
#> 9 1990 스웨덴 262. 30594 0.8 57507 54835 8567
#> 10 1990 스위스 266 39608 3.7 63784 69681 6653
-#> # with 2 more variables: unemployment_rate <dbl>, CPI <dbl>
+#> # ... with 2 more variables: unemployment_rate <dbl>, CPI <dbl>
```

```r
@@ -177,7 +177,7 @@ aus_production %>%
#> 8 1957 Q4 320 6152 222 582 4735 6 2.00
#> 9 1958 Q1 272 5758 199 554 4608 5 1.78
#> 10 1958 Q2 233 5641 229 620 5196 7 2.19
-#> # with 208 more rows
+#> # ... with 208 more rows
```

```r
@@ -244,17 +244,17 @@ classical_decomp
#> # Key: Series_ID, .model [1]
#> Series_ID .model Month Employed trend seasonal random season_adjust
#> <chr> <chr> <mth> <dbl> <dbl> <dbl> <dbl> <dbl>
-#> 1 CEU4200000001 "classi… 1990 Jan 13256. NA -75.5 NA 13331.
-#> 2 CEU4200000001 "classi… 1990 Feb 12966. NA -273. NA 13239.
-#> 3 CEU4200000001 "classi… 1990 Mar 12938. NA -253. NA 13191.
-#> 4 CEU4200000001 "classi… 1990 Apr 13012. NA -190. NA 13203.
-#> 5 CEU4200000001 "classi… 1990 May 13108. NA -88.9 NA 13197.
-#> 6 CEU4200000001 "classi… 1990 Jun 13183. NA -10.4 NA 13193.
-#> 7 CEU4200000001 "classi… 1990 Jul 13170. 13178. -13.3 5.65 13183.
-#> 8 CEU4200000001 "classi… 1990 Aug 13160. 13161. -9.99 8.80 13169.
-#> 9 CEU4200000001 "classi… 1990 Sep 13113. 13141. -87.4 59.9 13201.
-#> 10 CEU4200000001 "classi… 1990 Oct 13185. 13117. 34.6 33.8 13151.
-#> # with 347 more rows
+#> 1 CEU4200000001 "classic~ 1990 1 13256. NA -75.5 NA 13331.
+#> 2 CEU4200000001 "classic~ 1990 2 12966. NA -273. NA 13239.
+#> 3 CEU4200000001 "classic~ 1990 3 12938. NA -253. NA 13191.
+#> 4 CEU4200000001 "classic~ 1990 4 13012. NA -190. NA 13203.
+#> 5 CEU4200000001 "classic~ 1990 5 13108. NA -88.9 NA 13197.
+#> 6 CEU4200000001 "classic~ 1990 6 13183. NA -10.4 NA 13193.
+#> 7 CEU4200000001 "classic~ 1990 7 13170. 13178. -13.3 5.65 13183.
+#> 8 CEU4200000001 "classic~ 1990 8 13160. 13161. -9.99 8.80 13169.
+#> 9 CEU4200000001 "classic~ 1990 9 13113. 13141. -87.4 59.9 13201.
+#> 10 CEU4200000001 "classic~ 1990 10 13185. 13117. 34.6 33.8 13151.
+#> # ... with 347 more rows
```

```r
@@ -475,20 +475,20 @@ seasonal_step_4

Table: (\#tab:unnamed-chunk-11)Seasonal component calculation for the additive model

-| seasonal| seasonal_step_2| seasonal_step_3| seasonal_step_5|Month |
-|-----------:|---------------:|---------------:|---------------:|:--------|
-| -75.461230| NA| -75.48017| -75.461230|1990 Jan |
-| -273.051173| NA| -273.07011| -273.051173|1990 Feb |
-| -253.195856| NA| -253.21480| -253.195856|1990 Mar |
-| -190.219599| NA| -190.23854| -190.219599|1990 Apr |
-| -88.923022| NA| -88.94196| -88.923022|1990 May |
-| -10.388349| NA| -10.40729| -10.388349|1990 Jun |
-| -13.311661| -7.662500| -13.33060| -13.311661|1990 Jul |
-| -9.992695| -1.195833| -10.01164| -9.992695|1990 Aug |
-| -87.379333| -27.470833| -87.39828| -87.379333|1990 Sep |
-| 34.634747| 68.454167| 34.61580| 34.634747|1990 Oct |
-| 394.300408| 372.362500| 394.28147| 394.300408|1990 Nov |
-| 572.987764| 610.708333| 572.96882| 572.987764|1990 Dec |
+| seasonal| seasonal_step_2| seasonal_step_3| seasonal_step_5|Month |
+|-----------:|---------------:|---------------:|---------------:|:-------|
+| -75.461230| NA| -75.48017| -75.461230|1990 1 |
+| -273.051173| NA| -273.07011| -273.051173|1990 2 |
+| -253.195856| NA| -253.21480| -253.195856|1990 3 |
+| -190.219599| NA| -190.23854| -190.219599|1990 4 |
+| -88.923022| NA| -88.94196| -88.923022|1990 5 |
+| -10.388349| NA| -10.40729| -10.388349|1990 6 |
+| -13.311661| -7.662500| -13.33060| -13.311661|1990 7 |
+| -9.992695| -1.195833| -10.01164| -9.992695|1990 8 |
+| -87.379333| -27.470833| -87.39828| -87.379333|1990 9 |
+| 34.634747| 68.454167| 34.61580| 34.634747|1990 10 |
+| 394.300408| 372.362500| 394.28147| 394.300408|1990 11 |
+| 572.987764| 610.708333| 572.96882| 572.987764|1990 12 |


```r
@@ -561,20 +561,20 @@ seasonal_mult <- seasonal_mult %>%

Table: (\#tab:unnamed-chunk-14)Seasonal component calculation for the multiplicative model

-| month(Month)| seasonal| seasonal_step_2| seasonal_step_3|Month | seasonal_step_4|
-|------------:|---------:|---------------:|---------------:|:--------|---------------:|
-| 1| 0.9949463| NA| 0.9949310|1990 Jan | 0.9949463|
-| 2| 0.9814765| NA| 0.9814614|1990 Feb | 0.9814765|
-| 3| 0.9827143| NA| 0.9826991|1990 Mar | 0.9827143|
-| 4| 0.9869857| NA| 0.9869705|1990 Apr | 0.9869857|
-| 5| 0.9938970| NA| 0.9938817|1990 May | 0.9938970|
-| 6| 0.9992581| NA| 0.9992427|1990 Jun | 0.9992581|
-| 7| 0.9990583| 0.9994185| 0.9990429|1990 Jul | 0.9990583|
-| 8| 0.9993224| 0.9999091| 0.9993070|1990 Aug | 0.9993224|
-| 9| 0.9941725| 0.9979095| 0.9941572|1990 Sep | 0.9941725|
-| 10| 1.0024237| 1.0052188| 1.0024083|1990 Oct | 1.0024237|
-| 11| 1.0267098| 1.0284473| 1.0266940|1990 Nov | 1.0267098|
-| 12| 1.0390354| 1.0467532| 1.0390193|1990 Dec | 1.0390354|
+| month(Month)| seasonal| seasonal_step_2| seasonal_step_3|Month | seasonal_step_4|
+|------------:|---------:|---------------:|---------------:|:-------|---------------:|
+| 1| 0.9949463| NA| 0.9949310|1990 1 | 0.9949463|
+| 2| 0.9814765| NA| 0.9814614|1990 2 | 0.9814765|
+| 3| 0.9827143| NA| 0.9826991|1990 3 | 0.9827143|
+| 4| 0.9869857| NA| 0.9869705|1990 4 | 0.9869857|
+| 5| 0.9938970| NA| 0.9938817|1990 5 | 0.9938970|
+| 6| 0.9992581| NA| 0.9992427|1990 6 | 0.9992581|
+| 7| 0.9990583| 0.9994185| 0.9990429|1990 7 | 0.9990583|
+| 8| 0.9993224| 0.9999091| 0.9993070|1990 8 | 0.9993224|
+| 9| 0.9941725| 0.9979095| 0.9941572|1990 9 | 0.9941725|
+| 10| 1.0024237| 1.0052188| 1.0024083|1990 10 | 1.0024237|
+| 11| 1.0267098| 1.0284473| 1.0266940|1990 11 | 1.0267098|
+| 12| 1.0390354| 1.0467532| 1.0390193|1990 12 | 1.0390354|


```r