update: 2025-01-20 ~ 2025-01-26 (#74)

haeramkeem · Jan 27, 2025 · abde334
1 parent 7f2fe4b

Showing 8 changed files with 134 additions and 11 deletions.
diff --git a/content/gardens/arch/drafts/Processor Event Based Sampling, PEBS (Intel Arch).md b/content/gardens/arch/drafts/Processor Event Based Sampling, PEBS (Intel Arch).md
@@ -0,0 +1,10 @@
+---
+tags:
+  - arch
+  - arch-intel
+aliases:
+  - Processor Event Based Sampling
+  - PEBS
+---
+> [!fail]- This post is in #draft status.
+> - [ ] Add content
diff --git a/content/gardens/database/common/story/Database - 어떤 언어가 하탈까?.md b/content/gardens/database/common/story/Database - 어떤 언어가 하탈까?.md
@@ -7,12 +7,12 @@ date: 2024-07-04
 
 - Which language is best suited to database development?
 
-|                 | C                                                                                                | C++                                                                                                                                          | Java                                                                                                          |
-| --------------- | ------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- |
-| RDBMS           | [PostgreSQL](https://github.com/postgres/postgres)<br>[SQLite](https://github.com/sqlite/sqlite) | [MariaDB](https://github.com/MariaDB/server)<br>[MySQL](https://github.com/mysql/mysql-server)<br>[DuckDB](https://github.com/duckdb/duckdb) |                                                                                                               |
-| NoSQL           |                                                                                                  | [MongoDB](https://github.com/mongodb/mongo)                                                                                                  | [Elasticsearch](https://github.com/elastic/elasticsearch)<br>[Cassandra](https://github.com/apache/cassandra) |
-| In-memory       | [Redis](https://github.com/redis/redis)<br>[Memcached](https://github.com/memcached/memcached)   | [CacheLib](https://github.com/facebook/CacheLib)                                                                                             |                                                                                                               |
-| Engine/KV Store | [WiredTiger](https://github.com/wiredtiger/wiredtiger)                                           | [RocksDB](https://github.com/facebook/rocksdb)<br>[LevelDB](https://github.com/google/leveldb)                                               |                                                                                                               |
+|                 | C                                                                                                | C++                                                                                                                                                                                                                                                  | Java                                                                                                          | Go                                                                                                 |
+| --------------- | ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
+| Relational      | [PostgreSQL](https://github.com/postgres/postgres)<br>[SQLite](https://github.com/sqlite/sqlite) | [MariaDB](https://github.com/MariaDB/server)<br>[MySQL](https://github.com/mysql/mysql-server)<br>[DuckDB](https://github.com/duckdb/duckdb)<br>[ClickHouse](https://github.com/ClickHouse/ClickHouse)<br>[Impala](https://github.com/apache/impala) |                                                                                                               | [CockroachDB](https://github.com/cockroachdb/cockroach)<br>[TiDB](https://github.com/pingcap/tidb) |
+| NoSQL           |                                                                                                  | [MongoDB](https://github.com/mongodb/mongo)<br>[ScyllaDB](https://github.com/scylladb/scylladb)<br>[RethinkDB](https://github.com/rethinkdb/rethinkdb)                                                                                               | [Elasticsearch](https://github.com/elastic/elasticsearch)<br>[Cassandra](https://github.com/apache/cassandra) |                                                                                                    |
+| In-memory       | [Redis](https://github.com/redis/redis)<br>[Memcached](https://github.com/memcached/memcached)   | [CacheLib](https://github.com/facebook/CacheLib)                                                                                                                                                                                                     |                                                                                                               |                                                                                                    |
+| Engine/KV Store | [WiredTiger](https://github.com/wiredtiger/wiredtiger)                                           | [RocksDB](https://github.com/facebook/rocksdb)<br>[LevelDB](https://github.com/google/leveldb)<br>[FoundationDB](https://github.com/apple/foundationdb)<br>[ForestDB](https://github.com/couchbase/forestdb)                                         |                                                                                                               | [etcd](https://github.com/etcd-io/etcd)                                                            |
 
 - I checked just in case, but there is almost no Go.
 - Better to just work hard on [[(Garden) C, Cpp|C, C++]].
diff --git a/...24.sosp.sigops.org/(논문) Tiered Memory Management - Access Latency is the Key.md b/...24.sosp.sigops.org/(논문) Tiered Memory Management - Access Latency is the Key.md
@@ -5,6 +5,8 @@ tags:
   - os-memory
 date: 2025-01-14
 title: "(논문) Tiered Memory Management: Access Latency is the Key!"
+aliases:
+  - Colloid
 ---
 > [!info] Colloid links
 > - [Paper](https://dl.acm.org/doi/10.1145/3694715.3695968)

diff --git a/...emory/papers/colloid.2024.sosp.sigops.org/full/3. Colloid (Colloid, SOSP 24).md b/...emory/papers/colloid.2024.sosp.sigops.org/full/3. Colloid (Colloid, SOSP 24).md
@@ -111,7 +111,7 @@ $$
 p = {R_{D} \over R_{D} + R_{A}}
 $$
 
-- And by Little's law, $L_{D}$ and $L_{A}$ are as follows.
+- And by [[Little's Law (Storage)|Little's Law]], $L_{D}$ and $L_{A}$ are as follows.
 
 $$
 L_{D} = {O_{D}\over R_{D}}, L_{A} = {O_{A}\over R_{A}}

diff --git a/....org/full/4. Colloid with Existing Memory Tiering Systems (Colloid, SOSP 24).md b/....org/full/4. Colloid with Existing Memory Tiering Systems (Colloid, SOSP 24).md
@@ -3,7 +3,7 @@ tags:
   - os
   - os-memory
   - paper-review
-date: 2025-01-14
+date: 2025-01-25
 title: "(논문) Tiered Memory Management: Access Latency is the Key!, SOSP 2024 (4. Colloid with existing memory tiering systems)"
 ---
 > [!info] This post summarizes the paper [Tiered Memory Management: Access Latency is the Key! (SOSP 2024)](https://dl.acm.org/doi/10.1145/3694715.3695968).
@@ -18,4 +18,48 @@ title: "(논문) Tiered Memory Management: Access Latency is the Key!, SOSP 2024
 > - [[5. Evaluation (Colloid, SOSP 24)|5. Evaluation]]
 > - [[6-7. Related Work and Conclusion (Colloid, SOSP 24)|6-7. Related Work and Conclusion]]
 
-> [!fail] This post is still in #draft status.
+## 4.0. Overview
+
+> [!tip] NXSECTION
+> - `4.0` is an overview; the paper itself has no such section.
+
+- To summarize [[3. Colloid (Colloid, SOSP 24)|Section 3]], *Colloid* plays two roles:
+	- *Latency measurement*: measuring latency
+	- *Page placement algorithm*: deciding, based on the measured latency, how many hot pages to place in the default and alternate tiers
+- A tiered memory system therefore needs two more components:
+	- *Access tracking*: which pages are hot?
+	- *Page migration strategy*: how should pages be moved?
+- *Colloid* does not handle these two roles; the existing tiered memory systems do.
+- That is, [[4. Colloid with Existing Memory Tiering Systems (Colloid, SOSP 24)|Section 4]] describes the implementation details of integrating *Colloid* into the existing SOTA tiered memory systems [[(논문) HeMem - Scalable Tiered Memory Management for Big Data Applications and Real NVM|HeMem]], [[(논문) MEMTIS - Efficient Memory Tiering with Dynamic Page Classification and Page Size Determination|MEMTIS]], and [[(논문) TPP - Transparent Page Placement for CXL-Enabled Tiered-Memory|TPP]].
+- Concretely, it describes how the following are additionally implemented in each of the three systems:
+	- Latency measurement for each system
+	- The page placement algorithm for each system
+	- Which pages to migrate
+		- That is, computing each page's access probability ($p$) to decide which pages to migrate
+- The following are reused from the existing systems as they are:
+	- Each system's access tracking mechanism
+		- This is easy to confuse with *Colloid*'s access probability:
+		- Page access tracking uses the existing mechanism; *Colloid* computes access probabilities from the information that tracking gathers and uses them for latency balancing.
+	- Each system's page migration strategy (how to move pages, when to move them, and so on)
+
+## 4.1. HeMem with Colloid
+
+- HeMem works roughly as follows:
+	1. A busy-polling thread periodically reads [[Processor Event Based Sampling, PEBS (Intel Arch)|Processor Event Based Sampling, PEBS]] samples to measure per-page access frequency.
+	2. It maintains a hot list and a cool list for the pages in each tier; a page is added to the hot list once its PEBS-measured frequency exceeds a threshold.
+	3. There is a second threshold on the measured frequency (`COOLING_THRESHOLD`): when any page reaches it, cooling halves the frequency of every page.
+	4. Page migration runs asynchronously once every fixed 10ms quantum.
+- The additions required for *Colloid* are as follows.
+	- Latency measurement is handled by the page migration thread from step (4).
+		- That is, every 10ms, when the thread from step (4) performs page migration, it reads the [[Caching Home Agent, CHA (Intel Arch)|CHA]] to measure queue occupancy and request rate.
+	- *Colloid*'s page placement algorithm is also implemented in the thread from step (4).
+		- That is, it computes $\Delta p$ as described in [[3. Colloid (Colloid, SOSP 24)|Section 3]], and
+		- it computes the per-page access probability from the per-page frequency measured with PEBS in step (1).
+			- Concretely, since HeMem already measures per-page frequency with PEBS,
+			- the per-page access probability is the per-page frequency divided by the sum of all frequencies.
+		- It then uses these per-page access probabilities to select the pages that make up $\Delta p$.
+			- To make page selection more efficient, it maintains lists that differ somewhat from HeMem's.
+			- HeMem splits the frequency range from 0 to `COOLING_THRESHOLD` in two, keeping pages in the upper part on the hot list and pages in the lower part on the cool list;
+			- *Colloid* instead divides this frequency range into five equal sub-ranges called *bins* and maintains five lists.
+			- It then scans from the highest bin downward, picking pages so that the sum of their $p$ stays less than or equal to $\Delta p$.
+			- In effect this roughly sorts pages by $p$ to minimize page migration overhead: by checking high-$p$ pages first, it migrates as few pages as possible while keeping the total $p$ at or below $\Delta p$.
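The page selection described in section 4.1 of the note above can be sketched in Python. This is only an illustration under stated assumptions, not HeMem's or Colloid's actual code: the function names, the `COOLING_THRESHOLD` value, and the choice to skip a page that does not fit the remaining budget (rather than stop scanning) are all hypothetical. Only the five-bin split, the hottest-first scan, and the probability formula (per-page frequency divided by the sum of all frequencies) come from the note.

```python
from collections import defaultdict

NUM_BINS = 5            # the note says Colloid splits the frequency range into 5 bins
COOLING_THRESHOLD = 20  # hypothetical value, chosen only for this example


def access_probabilities(freq):
    """Per-page access probability = per-page frequency / sum of all frequencies."""
    total = sum(freq.values())
    if total == 0:
        return {page: 0.0 for page in freq}
    return {page: f / total for page, f in freq.items()}


def bin_index(f):
    """Map a frequency in [0, COOLING_THRESHOLD] to one of NUM_BINS equal-width bins."""
    width = COOLING_THRESHOLD / NUM_BINS
    return min(int(f / width), NUM_BINS - 1)


def select_pages(freq, delta_p):
    """Scan bins from hottest to coldest and pick pages whose summed
    probability stays at or below delta_p (assumed greedy policy)."""
    p = access_probabilities(freq)
    bins = defaultdict(list)
    for page, f in freq.items():
        bins[bin_index(f)].append(page)

    selected = []
    budget = delta_p
    for b in range(NUM_BINS - 1, -1, -1):
        for page in bins[b]:
            # Skip pages that were never accessed or that would overshoot delta_p.
            if 0 < p[page] <= budget:
                selected.append(page)
                budget -= p[page]
    return selected


# Hypothetical per-page PEBS counts.
freq = {"a": 10, "b": 6, "c": 3, "d": 1}
print(select_pages(freq, 0.6))
```

Because pages are only grouped into bins and never fully sorted, the scan visits high-frequency pages first without paying for a sort, which is the point the note makes about migration overhead.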
diff --git a/content/gardens/storage/(Garden) Storage.md b/content/gardens/storage/(Garden) Storage.md
@@ -12,6 +12,7 @@ date: 2024-04-23
 - Glossary
 	- [[Input Output per second, IOps (storage)|IOps]]
 	- [[Latency (Storage)|Latency]]
+	- [[Little's Law (Storage)|Little's Law]]
 	- [[Logical Block Addressing, LBA (Storage)|Logical Block Addressing, LBA]]
 	- [[Throughput (Storage)|Throughput]]
 

diff --git a/content/gardens/storage/common/terms/Little's Law (Storage).md b/content/gardens/storage/common/terms/Little's Law (Storage).md
@@ -0,0 +1,59 @@
+---
+tags:
+  - storage
+  - terms
+date: 2025-01-20
+aliases:
+  - Little's Law
+---
+> [!info]- References
+> - [Wikipedia](https://en.wikipedia.org/wiki/Little%27s_law)
+> - [Tistory](https://performance.tistory.com/3)
+
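+## A Little Law Launched by a Dwarf
+
+- This is a theorem in queueing theory formulated by *John Little*.
+- Consider the following example.
+
+> Suppose [[index|Mr. Kim, the host]] visits the Constitutional Court to watch the trial of Mr. Yoon of Yongsan-gu.
+> But there were so many people! Kim counted: over 2 hours, 200 people arrived at the Constitutional Court and joined the line, and each person waited 30 minutes on average.
+> What was the average length of the line?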
+
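+- Thinking it through:
+	- Suppose the line held 1 person on average. Since each person waits 30 minutes (0.5 hours) on average, it would take 50 hours for 100 people to get through.
+	- Now suppose the line held 2 people on average. Two people wait 30 minutes at a time and then enter, so it would take 25 hours for 100 people to get through.
+	- In reality, 100 people got through in 1 hour (200 people in 2 hours), so the line must have held 50 people on average.
+- Put differently, (visitors per hour) $\times$ (average waiting time) = (average queue length).
+	- In the example above, (visitors per hour) is 100 people and
+	- (average waiting time) is 0.5 hours,
+	- so their product, 50 people, is the (average queue length).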
+- Written as a formula, this becomes the following.
+	- Let $\lambda$ be the (visitors per hour),
+	- $W$ the (average waiting time), and
+	- $L$ the (average queue length). Then:
+
+$$
+L = \lambda \times W
+$$
+
+- This relation is Little's law.
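+- The same relation carries over to queues in computer systems.
+- For example, apply the law to the queue of a NIC that processes network packets.
+	- $L$ is then the average occupancy of the queue,
+	- $\lambda$ is the packet arrival rate (packets received per unit time), and
+	- $W$ is the time a packet spends in the queue.
+		- If a request is removed from the queue only after its packet has been processed, this value is the *response time*.
+		- It can also be thought of as the *latency*.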
+
+## Application
+
+- The formula can also be used to compute latency. Taking the usage in [[(논문) Tiered Memory Management - Access Latency is the Key|Colloid]] as is:
+- Plugging the memory request occupancy ($O$) and arrival rate ($R$) provided by the [[Caching Home Agent, CHA (Intel Arch)|CHA]] into the formula above gives the memory request latency ($D$).
+- Occupancy corresponds to $L$ in Little's law, arrival rate to $\lambda$, and latency to $W$.
+- So, by the formula above, the latency ($D$) is:
+
+$$
+D = {O \over R}
+$$
+
+- This is $L = \lambda \times W$ rearranged into $W = L / \lambda$.
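As a quick check of the arithmetic in the Little's Law note above, the Python sketch below recomputes the courthouse example and the latency form $D = O / R$. The function names and the counter readings (48 outstanding requests, 0.5 requests per nanosecond) are made up for illustration; they are not values from the Colloid paper or from real CHA counters.

```python
def average_queue_length(arrival_rate, avg_wait):
    """Little's law: L = lambda * W."""
    return arrival_rate * avg_wait


def latency_from_occupancy(occupancy, arrival_rate):
    """Little's law rearranged: W = L / lambda, i.e. D = O / R."""
    if arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return occupancy / arrival_rate


# Courthouse example: 200 visitors in 2 hours, each waiting 0.5 hours on average.
visitors_per_hour = 200 / 2
print(average_queue_length(visitors_per_hour, 0.5))  # 50.0

# Hypothetical readings: 48 requests in flight on average,
# 0.5 requests arriving per nanosecond, so the result is in nanoseconds.
print(latency_from_occupancy(48, 0.5))  # 96.0
```

The units of the result follow from the units of the rate: a rate in requests per nanosecond yields a latency in nanoseconds.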
diff --git a/content/index.md b/content/index.md
@@ -1,5 +1,13 @@
 ---
 title: 매디쏜 디지딸 갈든
+date: 2023-10-24
+aliases:
+  - 김해람
+  - 주인장
+  - 김씨
+  - 주인장 김씨
+  - 동작구 김씨
+  - 디지털 귀농
 ---
 <a href="https://mdg.haeramk.im">
     <div align="center">
@@ -9,7 +17,7 @@ title: 매디쏜 디지딸 갈든
 
 ## Haeram Kim and the Digital Return to Farming
 
-I named this place Madison Digital Garden after [Madison Square Garden](https://en.wikipedia.org/wiki/Madison_Square_Garden) in New York, but I still don't really know what [digital gardening](https://maggieappleton.com/garden-history) is. Rather than a pretty, neatly decorated garden, I think farming for a living suits this place better. So this is less a space for digital gardening than a space for a city dweller's farming for a living, that is, a [[index|digital return to farming]].
+I named this place Madison Digital Garden after [Madison Square Garden](https://en.wikipedia.org/wiki/Madison_Square_Garden) in New York, but I still don't really know what [digital gardening](https://maggieappleton.com/garden-history) is. Rather than a pretty, neatly decorated garden, I think this place is a bit closer to farming for a living. So this is less a space for digital gardening than a space for a city dweller's farming for a living, that is, a [[index|digital return to farming]].
 
 These are roughly the crops I am growing.
 
@@ -28,7 +36,6 @@ title: 매디쏜 디지딸 갈든
 - [[(Garden) Storage|Storage]]
 - [[(Garden) Software Engineering|Software Engineering]]
 - [[(Garden) Web|Web Development]]
-- [[아까이브 갈든 - Archive Garden|Archive Garden]] : holds crops that are not organized but are too good ("아까운") to throw away (basically a storeroom).
 
 And I make my living from [this kind of thing](https://www.linkedin.com/in/haeram-kim-277404220), which has nothing to do with either gardening or farming.