diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4e83a68c547..aa1032f3d3f 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -90,8 +90,6 @@ cargo fmt
 make clippy
 ```
 
-See [Rustdoc of TiKV](https://tikv.github.io) for TiKV code documentation.
-
 See the [style doc](https://github.com/rust-lang/rfcs/blob/master/style-guide/README.md) and the [API guidelines](https://rust-lang-nursery.github.io/api-guidelines/) for details on the conventions.
 
 Please follow this style to make TiKV easy to review, maintain, and develop.
@@ -133,6 +131,8 @@ This is a rough outline of what a contributor's workflow looks like:
 - Our CI system automatically tests all pull requests.
 - Our bot will merge your PR. It can be summoned by commenting `/merge` or adding the `S: CanMerge` label (requires tests to pass and two approvals. You might have to ask your reviewer to do this).
 
+See [Rustdoc of TiKV](https://tikv.github.io) for TiKV code documentation.
+
 Thanks for your contributions!
 
 ### Finding something to work on
@@ -149,7 +149,7 @@ The TiKV team actively develops and maintains a bunch of dependencies used in Ti
 - [grpc-rs](https://github.com/tikv/grpc-rs): The gRPC library for Rust built on the gRPC C Core library and Rust Futures
 - [fail-rs](https://github.com/tikv/fail-rs): Fail points for Rust
 
-See more on [TiKV Community](https://github.com/tikv/community).
+See more in [TiKV Community](https://github.com/tikv/community).
 
 ### Format of the commit message
 
diff --git a/README.md b/README.md
index 9a7eec5d84e..191f474444b 100644
--- a/README.md
+++ b/README.md
@@ -86,6 +86,7 @@ We provide multiple deployment methods, but it is recommended to use our Ansible
 You can use [`tidb-docker-compose`](https://github.com/pingcap/tidb-docker-compose/) to quickly test TiKV and TiDB on a single machine. This is the easiest way. For other ways, see [TiDB documentation](https://docs.pingcap.com/).
 
 - Try TiKV separately
+  - [Deploy TiKV Using Docker Stack](https://tikv.org/docs/4.0/tasks/try/docker-stack/): To quickly test TiKV separately without TiDB on a single machine
   - [Deploy TiKV Using Docker](https://tikv.org/docs/4.0/tasks/deploy/docker/): To deploy a multi-node TiKV testing cluster using Docker
   - [Deploy TiKV Using Binary Files](https://tikv.org/docs/4.0/tasks/deploy/binary/): To deploy a TiKV cluster using binary files on a single node or on multiple nodes
 
@@ -124,13 +125,13 @@ Quick links:
 
 ### Security audit
 
-A third-party security auditing was performed by Cure53. See the full report [here](./docs/Security-Audit.pdf).
+A third-party security auditing was performed by Cure53. See the full report [here](./security/Security-Audit.pdf).
 
 ### Reporting Security Vulnerabilities
 
 To report a security vulnerability, please send an email to [TiKV-security](mailto:tikv-security@lists.cncf.io) group.
 
-See [Security](./SECURITY.md) for the process and policy followed by the TiKV project.
+See [Security](./security/SECURITY.md) for the process and policy followed by the TiKV project.
 
 ## Communication
 
diff --git a/docs/2018-ROADMAP.md b/docs/2018-ROADMAP.md
deleted file mode 100644
index 64433187b4e..00000000000
--- a/docs/2018-ROADMAP.md
+++ /dev/null
@@ -1,52 +0,0 @@
----
-title: TiKV Roadmap
-category: Roadmap
----
-
-# TiKV Roadmap
-
-This document defines the roadmap for TiKV development.
- -## Raft - -- [x] Region Merge - Merge small Regions together to reduce overhead -- [x] Local Read Thread - Process read requests in a local read thread -- [x] Split Region in Batch - Speed up Region split for large Regions -- [x] Raft Learner - Support Raft learner to smooth the configuration change process -- [x] Raft Pre-vote - Support Raft pre-vote to avoid unnecessary leader election on network isolation -- [ ] Joint Consensus - Change multi members safely -- [X] Multi-thread Raftstore - Process Region Raft logic in multiple threads -- [X] Multi-thread Apply Pool - Apply Region Raft committed entries in multiple threads - -## Engine - -- [ ] Titan - Separate large key-values from LSM-Tree -- [ ] Pluggable Engine Interface - Clean up the engine wrapper code and provide more extendibility - -## Storage - -- [ ] Flow Control - Do flow control in scheduler to avoid write stall in advance - -## Transaction - -- [X] Optimize transaction conflicts -- [X] Distributed GC - Distribute MVCC garbage collection control to TiKV - -## Coprocessor - -- [x] Streaming - Cut large data set into small chunks to optimize memory consumption -- [ ] Chunk Execution - Process data in chunk to improve performance -- [ ] Request Tracing - Provide per-request execution details - -## Tools - -- [x] TiKV Importer - Speed up data importing by SST file ingestion - -## Client - -- [ ] TiKV Client (Rust crate) -- [ ] Batch gRPC Message - Reduce message overhead - -## PD - -- [ ] Optimize Region Metadata - Save Region metadata in detached storage engine diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md deleted file mode 100644 index e594c6b75ca..00000000000 --- a/docs/ROADMAP.md +++ /dev/null @@ -1,66 +0,0 @@ ---- -title: TiKV Roadmap -category: Roadmap ---- - -# TiKV Roadmap - -This document defines the roadmap for TiKV development. 
- -## Engine - -- [ ] Titan - Use the separated key-value engine in production -- [ ] Pluggable Engine Interface - Clean up the engine wrapper code and provide more extendibility - -## Raft - -- [ ] Joint Consensus - Change multi members safely -- [ ] Quiescent Region - Reduce the heartbeat cost of inactive Regions -- [ ] Raft engine - Customize engine to replace RocksDB -- [ ] Remove KV RocksDB WAL - Use Raft log as the WAL of state machine engine -- [ ] Crossing datacenter optimization - - [ ] Raft witness role - Can only vote but not sync logs from Raft leader - - [ ] Follower snapshot - Get the snapshot from the follower when a new node is added - - [ ] Chain replication - Get Raft logs from the non-leader node - -## Distributed Transaction - -- [ ] Flow control - Do flow control in scheduler to avoid write stall in advance -- [ ] History version - Use another place to save history version data -- [ ] Performance - - [ ] Optimize storage format - - [ ] Reduce the latency of getting timestamp - - [ ] Optimize transaction conflicts - -## Coprocessor - -- [ ] More pushdown expressions support -- [ ] Common expression pipeline optimization -- [ ] Batch executor - Process multiple rows at a time -- [ ] Vectorized expression evaluation - Evaluate expressions by column -- [ ] Chunk - Compute over the data stored in the continuous memory -- [ ] Explain support - Evaluate the cost of execution - -## Tools - -- [ ] Use Raft learner to support Backup/Restore/Replication - -## Client - -- [ ] Rust -- [ ] Go -- [ ] Java -- [ ] C++ - -## PD - -- [ ] Visualization - Show the cluster status visually -- [ ] Learner scheduler - Schedule learner Regions to specified nodes -- [ ] Improve hot Region scheduler -- [ ] Improve simulator -- [ ] Range scheduler - Schedule Regions in the specified range - -## Misc - -- [ ] Tolerate corrupted Region - TiKV can be started even when the data in some Regions is corrupted -- [ ] Support huge Region whose size is larger than 1 GB \ No newline at end of file diff --git a/docs/V2.1/2018-ROADMAP.md b/docs/V2.1/2018-ROADMAP.md deleted file mode 100644 index 64433187b4e..00000000000 --- a/docs/V2.1/2018-ROADMAP.md +++ /dev/null @@ -1,52 +0,0 @@ ---- -title: TiKV Roadmap -category: Roadmap ---- - -# TiKV Roadmap - -This document defines the roadmap for TiKV development. 
- -## Raft - -- [x] Region Merge - Merge small Regions together to reduce overhead -- [x] Local Read Thread - Process read requests in a local read thread -- [x] Split Region in Batch - Speed up Region split for large Regions -- [x] Raft Learner - Support Raft learner to smooth the configuration change process -- [x] Raft Pre-vote - Support Raft pre-vote to avoid unnecessary leader election on network isolation -- [ ] Joint Consensus - Change multi members safely -- [X] Multi-thread Raftstore - Process Region Raft logic in multiple threads -- [X] Multi-thread Apply Pool - Apply Region Raft committed entries in multiple threads - -## Engine - -- [ ] Titan - Separate large key-values from LSM-Tree -- [ ] Pluggable Engine Interface - Clean up the engine wrapper code and provide more extendibility - -## Storage - -- [ ] Flow Control - Do flow control in scheduler to avoid write stall in advance - -## Transaction - -- [X] Optimize transaction conflicts -- [X] Distributed GC - Distribute MVCC garbage collection control to TiKV - -## Coprocessor - -- [x] Streaming - Cut large data set into small chunks to optimize memory consumption -- [ ] Chunk Execution - Process data in chunk to improve performance -- [ ] Request Tracing - Provide per-request execution details - -## Tools - -- [x] TiKV Importer - Speed up data importing by SST file ingestion - -## Client - -- [ ] TiKV Client (Rust crate) -- [ ] Batch gRPC Message - Reduce message overhead - -## PD - -- [ ] Optimize Region Metadata - Save Region metadata in detached storage engine diff --git a/docs/V2.1/ROADMAP.md b/docs/V2.1/ROADMAP.md deleted file mode 100644 index e594c6b75ca..00000000000 --- a/docs/V2.1/ROADMAP.md +++ /dev/null @@ -1,66 +0,0 @@ ---- -title: TiKV Roadmap -category: Roadmap ---- - -# TiKV Roadmap - -This document defines the roadmap for TiKV development. 
- -## Engine - -- [ ] Titan - Use the separated key-value engine in production -- [ ] Pluggable Engine Interface - Clean up the engine wrapper code and provide more extendibility - -## Raft - -- [ ] Joint Consensus - Change multi members safely -- [ ] Quiescent Region - Reduce the heartbeat cost of inactive Regions -- [ ] Raft engine - Customize engine to replace RocksDB -- [ ] Remove KV RocksDB WAL - Use Raft log as the WAL of state machine engine -- [ ] Crossing datacenter optimization - - [ ] Raft witness role - Can only vote but not sync logs from Raft leader - - [ ] Follower snapshot - Get the snapshot from the follower when a new node is added - - [ ] Chain replication - Get Raft logs from the non-leader node - -## Distributed Transaction - -- [ ] Flow control - Do flow control in scheduler to avoid write stall in advance -- [ ] History version - Use another place to save history version data -- [ ] Performance - - [ ] Optimize storage format - - [ ] Reduce the latency of getting timestamp - - [ ] Optimize transaction conflicts - -## Coprocessor - -- [ ] More pushdown expressions support -- [ ] Common expression pipeline optimization -- [ ] Batch executor - Process multiple rows at a time -- [ ] Vectorized expression evaluation - Evaluate expressions by column -- [ ] Chunk - Compute over the data stored in the continuous memory -- [ ] Explain support - Evaluate the cost of execution - -## Tools - -- [ ] Use Raft learner to support Backup/Restore/Replication - -## Client - -- [ ] Rust -- [ ] Go -- [ ] Java -- [ ] C++ - -## PD - -- [ ] Visualization - Show the cluster status visually -- [ ] Learner scheduler - Schedule learner Regions to specified nodes -- [ ] Improve hot Region scheduler -- [ ] Improve simulator -- [ ] Range scheduler - Schedule Regions in the specified range - -## Misc - -- [ ] Tolerate corrupted Region - TiKV can be started even when the data in some Regions is corrupted -- [ ] Support huge Region whose size is larger than 1 GB \ No newline at end of file diff --git a/docs/V2.1/adopters.md b/docs/V2.1/adopters.md deleted file mode 100644 index 10ccf41e170..00000000000 --- a/docs/V2.1/adopters.md +++ /dev/null @@ -1,81 +0,0 @@ ---- -title: TiKV Adopters -category: adopters ---- - -# TiKV Adopters - -This is a list of TiKV adopters in various industries. 
- -| Company | Industry | Success Story | -| :--- | :--- | :--- | -|[Mobike](https://en.wikipedia.org/wiki/Mobike)|Ridesharing|[English](https://www.pingcap.com/blog/Use-Case-TiDB-in-Mobike/); [Chinese](https://www.pingcap.com/cases-cn/user-case-mobike/)| -|[Jinri Toutiao](https://en.wikipedia.org/wiki/Toutiao)|Mobile News Platform|[Chinese](https://www.pingcap.com/cases-cn/user-case-toutiao/)| -|[Yiguo.com](https://www.crunchbase.com/organization/shanghai-yiguo-electron-business)|E-commerce|[English](https://www.datanami.com/2018/02/22/hybrid-database-capturing-perishable-insights-yiguo/); [Chinese](https://www.pingcap.com/cases-cn/user-case-yiguo)| -|[Shopee](https://en.wikipedia.org/wiki/Shopee)|E-commerce|[English](https://www.pingcap.com/success-stories/tidb-in-shopee/); [Chinese](https://www.pingcap.com/cases-cn/user-case-shopee/)| -|[Yuanfudao.com](https://www.crunchbase.com/organization/yuanfudao)|EdTech|[English](https://www.pingcap.com/blog/2017-08-08-tidbforyuanfudao/); [Chinese](https://www.pingcap.com/cases-cn/user-case-yuanfudao/)| -|[Xiaomi](https://en.wikipedia.org/wiki/Xiaomi)|Consumer Electronics|[Chinese](https://pingcap.com/cases-cn/user-case-xiaomi/)| -|[LY.com](https://www.crunchbase.com/organization/ly-com)|Travel|[Chinese](https://www.pingcap.com/cases-cn/user-case-tongcheng/)| -|[Qunar.com](https://www.crunchbase.com/organization/qunar-com)|Travel|[Chinese](https://www.pingcap.com/cases-cn/user-case-qunar/)| -|[Hulu](https://www.hulu.com)|Entertainment|| -|[VIPKID](https://www.crunchbase.com/organization/vipkid)|EdTech|| -|[Lenovo](https://en.wikipedia.org/wiki/Lenovo)|Enterprise Technology|| -|[Bank of Beijing](https://en.wikipedia.org/wiki/Bank_of_Beijing)|Banking|[Chinese](https://pingcap.com/cases-cn/user-case-beijing-bank/)| -|[Industrial and Commercial Bank of China](https://en.wikipedia.org/wiki/Industrial_and_Commercial_Bank_of_China)|Banking|| -|[iQiyi](https://en.wikipedia.org/wiki/IQiyi)|Media and Entertainment|[English](https://www.pingcap.com/success-stories/tidb-in-iqiyi/); [Chinese](https://pingcap.com/cases-cn/user-case-iqiyi/)| -|[BookMyShow](https://www.crunchbase.com/organization/bookmyshow)|Media and Entertainment|[English](https://www.pingcap.com/success-stories/tidb-in-bookmyshow/)| -|[Yimian Data](https://www.crunchbase.com/organization/yimian-data)|Big Data|[Chinese](https://www.pingcap.com/cases-cn/user-case-yimian)| -|[CAASDATA](https://www.caasdata.com/)|Big Data|[Chinese](https://pingcap.com/cases-cn/user-case-kasi/)| -|[Phoenix New Media](https://www.crunchbase.com/organization/phoenix-new-media)|Media|[Chinese](https://www.pingcap.com/cases-cn/user-case-ifeng/)| -|[Mobikok](http://www.mobikok.com/en/)|AdTech|[Chinese](https://pingcap.com/cases-cn/user-case-mobikok/)| -|[LinkDoc Technology](https://www.crunchbase.com/organization/linkdoc-technology)|HealthTech|[Chinese](https://www.pingcap.com/cases-cn/user-case-linkdoc/)| -|[G7 Networks](https://www.english.g7.com.cn/)| Logistics|[Chinese](https://www.pingcap.com/cases-cn/user-case-g7/)| -|[Hive-Box](http://www.fcbox.com/en/pc/index.html#/)|Logistics|[Chinese](https://pingcap.com/cases-cn/user-case-fengchao/)| -|[360 Finance](https://www.crunchbase.com/organization/360-finance)|FinTech|[Chinese](https://www.pingcap.com/cases-cn/user-case-360/)| -|[GAEA](http://www.gaea.com/en/)|Gaming|[English](https://www.pingcap.com/blog/2017-05-22-Comparison-between-MySQL-and-TiDB-with-tens-of-millions-of-data-per-day/); [Chinese](https://www.pingcap.com/cases-cn/user-case-gaea-ad/)| -|[YOOZOO 
Games](https://www.crunchbase.com/organization/yoozoo-games)|Gaming|[Chinese](https://pingcap.com/cases-cn/user-case-youzu/)| -|[Seasun Games](https://www.crunchbase.com/organization/seasun)|Gaming|[Chinese](https://pingcap.com/cases-cn/user-case-xishanju/)| -|[NetEase Games](https://game.163.com/en/)|Gaming|| -|[FUNYOURS JAPAN](http://company.funyours.co.jp/)|Gaming|[Chinese](https://pingcap.com/cases-cn/user-case-funyours-japan/)| -|[Zhaopin.com](https://www.crunchbase.com/organization/zhaopin)|Recruiting|| -|[Panda.tv](https://www.crunchbase.com/organization/panda-tv)|Live Streaming|| -|[Hoodinn](https://www.crunchbase.com/organization/hoodinn)|Gaming|| -|[Ping++](https://www.crunchbase.com/organization/ping-5)|Mobile Payment|[Chinese](https://pingcap.com/cases-cn/user-case-ping++/)| -|[Hainan eKing Technology](https://www.crunchbase.com/organization/hainan-eking-technology)|Enterprise Technology|[Chinese](https://pingcap.com/cases-cn/user-case-ekingtech/)| -|[LianLian Tech](http://www.10030.com.cn/web/)|Mobile Payment|| -|[Tongdun Technology](https://www.crunchbase.com/organization/tongdun-technology)|FinTech|| -|[Wacai](https://www.crunchbase.com/organization/wacai)|FinTech|| -|[Tree Finance](https://www.treefinance.com.cn/)|FinTech|| -|[2Dfire.com](http://www.2dfire.com/)|FoodTech|[Chinese](https://www.pingcap.com/cases-cn/user-case-erweihuo/)| -|[Happigo.com](https://www.crunchbase.com/organization/happigo-com)|E-commerce|| -|[Mashang Consumer Finance](https://www.crunchbase.com/organization/ms-finance)|FinTech|| -|[Tencent OMG](https://en.wikipedia.org/wiki/Tencent)|Media|| -|[Terren](http://webterren.com.zigstat.com/)|Media|| -|[LeCloud](https://www.crunchbase.com/organization/letv-2)|Media|| -|[Miaopai](https://en.wikipedia.org/wiki/Miaopai)|Media|| -|[Snowball Finance](https://www.crunchbase.com/organization/snowball-finance)|FinTech|| -|[Yimutian](http://www.ymt.com/)|E-commerce|| -|[Gengmei](https://www.crunchbase.com/organization/gengmei)|Plastic Surgery|| -|[Acewill](https://www.crunchbase.com/organization/acewill)|FoodTech|| -|[Keruyun](https://www.crunchbase.com/organization/keruyun-technology-beijing-co-ltd)|SaaS|[Chinese](https://pingcap.com/cases-cn/user-case-keruyun/)| -|[Youju Tech](https://www.ujuz.cn/)|E-Commerce|| -|[Maizuo](https://www.crunchbase.com/organization/maizhuo)|E-Commerce|| -|[Mogujie](https://www.crunchbase.com/organization/mogujie)|E-Commerce|| -|[Zhuan Zhuan](https://www.crunchbase.com/organization/zhuan-zhuan)|Online Marketplace|[English](https://www.pingcap.com/success-stories/tidb-in-zhuanzhuan/); [Chinese](https://pingcap.com/cases-cn/user-case-zhuanzhuan/)| -|[Shuangchuang Huipu](http://scphjt.com/)|FinTech|| -|[Meizu](https://en.wikipedia.org/wiki/Meizu)|Media|| -|[SEA group](https://sea-group.org/?lang=en)|Gaming|| -|[Sogou](https://en.wikipedia.org/wiki/Sogou)|MediaTech|| -|[Chunyu Yisheng](https://www.crunchbase.com/organization/chunyu)|HealthTech|| -|[Meituan-Dianping](https://en.wikipedia.org/wiki/Meituan-Dianping)|Food Delivery|[English](https://www.pingcap.com/success-stories/tidb-in-meituan-dianping/); [Chinese](https://pingcap.com/cases-cn/user-case-meituan/)| -|[Qutoutiao](https://www.crunchbase.com/organization/qutoutiao)|Social Network|| -|[QuantGroup](https://www.crunchbase.com/organization/quantgroup)|FinTech|| -|[FINUP](https://www.crunchbase.com/organization/finup)|FinTech|| -|[Meili Finance](https://www.crunchbase.com/organization/meili-jinrong)|FinTech|| -|[Guolian 
Securities](https://www.crunchbase.com/organization/guolian-securities)|Financial Services|| -|[Founder Securities](https://www.linkedin.com/company/founder-securities-co-ltd-/)|Financial Services|| -|[China Telecom Shanghai](http://sh.189.cn/en/index.html)|Telecom|| -|[State Administration of Taxation](https://en.wikipedia.org/wiki/State_Administration_of_Taxation)|Finance|| -|[Wuhan Antian Information Technology](https://www.avlsec.com/)|Enterprise Technology|| -|[Ausnutria Dairy](https://www.crunchbase.com/organization/ausnutria-dairy)|FoodTech|| -|[Qingdao Telaidian](https://www.teld.cn/)|Electric Car Charger|[Chinese](https://pingcap.com/cases-cn/user-case-telaidian/)| \ No newline at end of file diff --git a/docs/V2.1/clients/go-client-api.md b/docs/V2.1/clients/go-client-api.md deleted file mode 100644 index 3fd508650f2..00000000000 --- a/docs/V2.1/clients/go-client-api.md +++ /dev/null @@ -1,358 +0,0 @@ ---- -title: Try Two Types of APIs -summary: Learn how to use the Raw Key-Value API and the Transactional Key-Value API in TiKV. -category: user guide ---- - -# Try Two Types of APIs - -To apply to different scenarios, TiKV provides [two types of APIs](../overview.md#two-types-of-apis) for developers: the Raw Key-Value API and the Transactional Key-Value API. This document uses two examples to guide you through how to use the two APIs in TiKV. The usage examples are based on multiple nodes for testing. You can also quickly try the two types of APIs on a single machine. - -> **Warning:** Do not use these two APIs together in the same cluster, otherwise they might corrupt each other's data. - -## Try the Raw Key-Value API - -To use the Raw Key-Value API in applications developed in the Go language, take the following steps: - -1. Install the necessary packages. - - ```bash - export GO111MODULE=on - go mod init rawkv-demo - go get github.com/pingcap/tidb@master - ``` - -2. Import the dependency packages. - - ```go - import ( - "fmt" - "github.com/pingcap/tidb/config" - "github.com/pingcap/tidb/store/tikv" - ) - ``` - -3. Create a Raw Key-Value client. - - ```go - cli, err := tikv.NewRawKVClient([]string{"192.168.199.113:2379"}, config.Security{}) - ``` - - Description of two parameters in the above command: - - - `string`: a list of PD servers’ addresses - - `config.Security`: used to establish TLS connections, usually left empty when you do not need TLS - -4. Call the Raw Key-Value client methods to access the data on TiKV. The Raw Key-Value API contains the following methods, and you can also find them at [GoDoc](https://godoc.org/github.com/pingcap/tidb/store/tikv#RawKVClient). 
- - ```go - type RawKVClient struct - func (c *RawKVClient) Close() error - func (c *RawKVClient) ClusterID() uint64 - func (c *RawKVClient) Delete(key []byte) error - func (c *RawKVClient) Get(key []byte) ([]byte, error) - func (c *RawKVClient) Put(key, value []byte) error - func (c *RawKVClient) Scan(startKey, endKey []byte, limit int) (keys [][]byte, values [][]byte, err error) - ``` - -### Usage example of the Raw Key-Value API - -```go -package main - -import ( - "fmt" - - "github.com/pingcap/tidb/config" - "github.com/pingcap/tidb/store/tikv" -) - -func main() { - cli, err := tikv.NewRawKVClient([]string{"192.168.199.113:2379"}, config.Security{}) - if err != nil { - panic(err) - } - defer cli.Close() - - fmt.Printf("cluster ID: %d\n", cli.ClusterID()) - - key := []byte("Company") - val := []byte("PingCAP") - - // put key into tikv - err = cli.Put(key, val) - if err != nil { - panic(err) - } - fmt.Printf("Successfully put %s:%s to tikv\n", key, val) - - // get key from tikv - val, err = cli.Get(key) - if err != nil { - panic(err) - } - fmt.Printf("found val: %s for key: %s\n", val, key) - - // delete key from tikv - err = cli.Delete(key) - if err != nil { - panic(err) - } - fmt.Printf("key: %s deleted\n", key) - - // get key again from tikv - val, err = cli.Get(key) - if err != nil { - panic(err) - } - fmt.Printf("found val: %s for key: %s\n", val, key) -} -``` - -The result is like: - -```bash -INFO[0000] [pd] create pd client with endpoints [192.168.199.113:2379] -INFO[0000] [pd] leader switches to: http://127.0.0.1:2379, previous: -INFO[0000] [pd] init cluster id 6554145799874853483 -cluster ID: 6554145799874853483 -Successfully put Company:PingCAP to tikv -found val: PingCAP for key: Company -key: Company deleted -found val: for key: Company -``` - -RawKVClient is a client of the TiKV server and only supports the GET/PUT/DELETE/SCAN commands. The RawKVClient can be safely and concurrently accessed by multiple goroutines, as long as it is not closed. Therefore, for one process, one client is enough generally. - -### Possible Error - -- If you see this error: - - ```bash - build rawkv-demo: cannot load github.com/pingcap/pd/pd-client: cannot find module providing package github.com/pingcap/pd/pd-client - ``` - - You can run `GO111MODULE=on go get -u github.com/pingcap/tidb@master` to fix it. - -- If you got this error when you run `go get -u github.com/pingcap/tidb@master`: - - ``` - go: github.com/golang/lint@v0.0.0-20190409202823-959b441ac422: parsing go.mod: unexpected module path "golang.org/x/lint" - ``` - - You can run `go mod edit -replace github.com/golang/lint=golang.org/x/lint@latest` to fix it. [Refer Link](https://github.com/golang/lint/issues/446#issuecomment-483638233) - -## Try the Transactional Key-Value API - -The Transactional Key-Value API is more complicated than the Raw Key-Value API. Some transaction related concepts are listed as follows. For more details, see the [KV package](https://github.com/pingcap/tidb/tree/master/kv). - -- Storage - - Like the RawKVClient, a Storage is an abstract TiKV cluster. - -- Snapshot - - A Snapshot is the state of a Storage at a particular point of time, which provides some readonly methods. The multiple times read from a same Snapshot is guaranteed consistent. - -- Transaction - - Like the transactions in SQL, a Transaction symbolizes a series of read and write operations performed within the Storage. Internally, a Transaction consists of a Snapshot for reads, and a MemBuffer for all writes. 
The default isolation level of a Transaction is Snapshot Isolation. - -To use the Transactional Key-Value API in applications developed by golang, take the following steps: - -1. Install the necessary packages. - - ```bash - export GO111MODULE=on - go mod init txnkv-demo - go get github.com/pingcap/tidb@master - ``` - -2. Import the dependency packages. - - ```go - import ( - "flag" - "fmt" - "os" - - "github.com/juju/errors" - "github.com/pingcap/tidb/kv" - "github.com/pingcap/tidb/store/tikv" - "github.com/pingcap/tidb/terror" - - goctx "golang.org/x/net/context" - ) - ``` - -3. Create Storage using a URL scheme. - - ```go - driver := tikv.Driver{} - storage, err := driver.Open("tikv://192.168.199.113:2379") - ``` - -4. (Optional) Modify the Storage using a Transaction. - - The lifecycle of a Transaction is: _begin → {get, set, delete, scan} → {commit, rollback}_. - -5. Call the Transactional Key-Value API's methods to access the data on TiKV. The Transactional Key-Value API contains the following methods: - - ```go - Begin() -> Txn - Txn.Get(key []byte) -> (value []byte) - Txn.Set(key []byte, value []byte) - Txn.Iter(begin, end []byte) -> Iterator - Txn.Delete(key []byte) - Txn.Commit() - ``` - -### Usage example of the Transactional Key-Value API - -```go -package main - -import ( - "flag" - "fmt" - "os" - - "github.com/juju/errors" - "github.com/pingcap/tidb/kv" - "github.com/pingcap/tidb/store/tikv" - "github.com/pingcap/tidb/terror" - - goctx "golang.org/x/net/context" -) - -type KV struct { - K, V []byte -} - -func (kv KV) String() string { - return fmt.Sprintf("%s => %s (%v)", kv.K, kv.V, kv.V) -} - -var ( - store kv.Storage - pdAddr = flag.String("pd", "192.168.199.113:2379", "pd address:192.168.199.113:2379") -) - -// Init initializes information. -func initStore() { - driver := tikv.Driver{} - var err error - store, err = driver.Open(fmt.Sprintf("tikv://%s", *pdAddr)) - terror.MustNil(err) -} - -// key1 val1 key2 val2 ... 
-func puts(args ...[]byte) error { - tx, err := store.Begin() - if err != nil { - return errors.Trace(err) - } - - for i := 0; i < len(args); i += 2 { - key, val := args[i], args[i+1] - err := tx.Set(key, val) - if err != nil { - return errors.Trace(err) - } - } - err = tx.Commit(goctx.Background()) - if err != nil { - return errors.Trace(err) - } - - return nil -} - -func get(k []byte) (KV, error) { - tx, err := store.Begin() - if err != nil { - return KV{}, errors.Trace(err) - } - v, err := tx.Get(k) - if err != nil { - return KV{}, errors.Trace(err) - } - return KV{K: k, V: v}, nil -} - -func dels(keys ...[]byte) error { - tx, err := store.Begin() - if err != nil { - return errors.Trace(err) - } - for _, key := range keys { - err := tx.Delete(key) - if err != nil { - return errors.Trace(err) - } - } - err = tx.Commit(goctx.Background()) - if err != nil { - return errors.Trace(err) - } - return nil -} - -func scan(keyPrefix []byte, limit int) ([]KV, error) { - tx, err := store.Begin() - if err != nil { - return nil, errors.Trace(err) - } - it, err := tx.Iter(kv.Key(keyPrefix), nil) - if err != nil { - return nil, errors.Trace(err) - } - defer it.Close() - var ret []KV - for it.Valid() && limit > 0 { - ret = append(ret, KV{K: it.Key()[:], V: it.Value()[:]}) - limit-- - it.Next() - } - return ret, nil -} - -func main() { - pdAddr := os.Getenv("PD_ADDR") - if pdAddr != "" { - os.Args = append(os.Args, "-pd", pdAddr) - } - flag.Parse() - initStore() - - // set - err := puts([]byte("key1"), []byte("value1"), []byte("key2"), []byte("value2")) - terror.MustNil(err) - - // get - kv, err := get([]byte("key1")) - terror.MustNil(err) - fmt.Println(kv) - - // scan - ret, err := scan([]byte("key"), 10) - for _, kv := range ret { - fmt.Println(kv) - } - - // delete - err = dels([]byte("key1"), []byte("key2")) - terror.MustNil(err) -} -``` - -The result is like: - -```bash -INFO[0000] [pd] create pd client with endpoints [192.168.199.113:2379] -INFO[0000] [pd] leader switches to: http://192.168.199.113:2379, previous: -INFO[0000] [pd] init cluster id 6563858376412119197 -key1 => value1 ([118 97 108 117 101 49]) -key1 => value1 ([118 97 108 117 101 49]) -key2 => value2 ([118 97 108 117 101 50]) -``` diff --git a/docs/V2.1/op-guide/ansible-deployment-scale.md b/docs/V2.1/op-guide/ansible-deployment-scale.md deleted file mode 100644 index 67b383400d0..00000000000 --- a/docs/V2.1/op-guide/ansible-deployment-scale.md +++ /dev/null @@ -1,373 +0,0 @@ ---- -title: Scale a TiKV Cluster Using TiDB-Ansible -summary: Use TiDB-Ansible to scale out or scale in a TiKV cluster. -category: operations ---- - -# Scale a TiKV Cluster Using TiDB-Ansible - -This document describes how to use TiDB-Ansible to scale out or scale in a TiKV cluster without affecting the online services. - -> **Note:** This document applies to the TiKV deployment using Ansible. If your TiKV cluster is deployed in other ways, see [Scale a TiKV Cluster](horizontal-scale.md). - -Assume that the topology is as follows: - -| Name | Host IP | Services | -| ---- | ------- | -------- | -| node1 | 172.16.10.1 | PD1, Monitor | -| node2 | 172.16.10.2 | PD2 | -| node3 | 172.16.10.3 | PD3 | -| node4 | 172.16.10.4 | TiKV1 | -| node5 | 172.16.10.5 | TiKV2 | -| node6 | 172.16.10.6 | TiKV3 | - -## Scale out a TiKV cluster - -This section describes how to increase the capacity of a TiKV cluster by adding a TiKV or PD node. 
- -### Add TiKV nodes - -For example, if you want to add two TiKV nodes (node101, node102) with the IP addresses `172.16.10.101` and `172.16.10.102`, take the following steps: - -1. Edit the `inventory.ini` file and append the TiKV node information in `tikv_servers`: - - ```ini - [tidb_servers] - - [pd_servers] - 172.16.10.1 - 172.16.10.2 - 172.16.10.3 - - [tikv_servers] - 172.16.10.4 - 172.16.10.5 - 172.16.10.6 - 172.16.10.101 - 172.16.10.102 - - [monitoring_servers] - 172.16.10.1 - - [grafana_servers] - 172.16.10.1 - - [monitored_servers] - 172.16.10.1 - 172.16.10.2 - 172.16.10.3 - 172.16.10.4 - 172.16.10.5 - 172.16.10.6 - 172.16.10.101 - 172.16.10.102 - ``` - - Now the topology is as follows: - - | Name | Host IP | Services | - | ---- | ------- | -------- | - | node1 | 172.16.10.1 | PD1, Monitor | - | node2 | 172.16.10.2 | PD2 | - | node3 | 172.16.10.3 | PD3 | - | node4 | 172.16.10.4 | TiKV1 | - | node5 | 172.16.10.5 | TiKV2 | - | node6 | 172.16.10.6 | TiKV3 | - | **node101** | **172.16.10.101** | **TiKV4** | - | **node102** | **172.16.10.102** | **TiKV5** | - -2. Initialize the newly added node: - - ```bash - ansible-playbook bootstrap.yml -l 172.16.10.101,172.16.10.102 - ``` - - > **Note:** If an alias is configured in the `inventory.ini` file, for example, `node101 ansible_host=172.16.10.101`, use `-l` to specify the alias when executing `ansible-playbook`. For example, `ansible-playbook bootstrap.yml -l node101,node102`. This also applies to the following steps. - -3. Deploy the newly added node: - - ```bash - ansible-playbook deploy.yml -l 172.16.10.101,172.16.10.102 - ``` - -4. Start the newly added node: - - ```bash - ansible-playbook start.yml -l 172.16.10.101,172.16.10.102 - ``` - -5. Update the Prometheus configuration and restart: - - ```bash - ansible-playbook rolling_update_monitor.yml --tags=prometheus - ``` - -6. Monitor the status of the entire cluster and the newly added nodes by opening a browser to access the monitoring platform: `http://172.16.10.1:3000`. - -### Add a PD node - -To add a PD node (node103) with the IP address `172.16.10.103`, take the following steps: - -1. Edit the `inventory.ini` file and append the PD node information in `pd_servers`: - - ```ini - [tidb_servers] - - [pd_servers] - 172.16.10.1 - 172.16.10.2 - 172.16.10.3 - 172.16.10.103 - - [tikv_servers] - 172.16.10.4 - 172.16.10.5 - 172.16.10.6 - - [monitoring_servers] - 172.16.10.1 - - [grafana_servers] - 172.16.10.1 - - [monitored_servers] - 172.16.10.1 - 172.16.10.2 - 172.16.10.3 - 172.16.10.103 - 172.16.10.4 - 172.16.10.5 - 172.16.10.6 - ``` - - Now the topology is as follows: - - | Name | Host IP | Services | - | ---- | ------- | -------- | - | node1 | 172.16.10.1 | PD1, Monitor | - | node2 | 172.16.10.2 | PD2 | - | node3 | 172.16.10.3 | PD3 | - | **node103** | **172.16.10.103** | **PD4** | - | node4 | 172.16.10.4 | TiKV1 | - | node5 | 172.16.10.5 | TiKV2 | - | node6 | 172.16.10.6 | TiKV3 | - -2. Initialize the newly added node: - - ```bash - ansible-playbook bootstrap.yml -l 172.16.10.103 - ``` - -3. Deploy the newly added node: - - ```bash - ansible-playbook deploy.yml -l 172.16.10.103 - ``` - -4. Login the newly added PD node and edit the starting script: - - ```bash - {deploy_dir}/scripts/run_pd.sh - ``` - - 1. Remove the `--initial-cluster="xxxx" \` configuration. - 2. Add `--join="http://172.16.10.1:2379" \`. The IP address (`172.16.10.1`) can be any of the existing PD IP addresses in the cluster. - 3. 
Manually start the PD service in the newly added PD node: - - ```bash - {deploy_dir}/scripts/start_pd.sh - ``` - - 4. Use `pd-ctl` to check whether the new node is added successfully: - - ```bash - ./pd-ctl -u "http://172.16.10.1:2379" - ``` - - > **Note:** `pd-ctl` is a command used to check the number of PD nodes. - -5. Apply a rolling update to the entire cluster: - - ```bash - ansible-playbook rolling_update.yml - ``` - -6. Update the Prometheus configuration and restart: - - ```bash - ansible-playbook rolling_update_monitor.yml --tags=prometheus - ``` - -7. Monitor the status of the entire cluster and the newly added node by opening a browser to access the monitoring platform: `http://172.16.10.1:3000`. - -## Scale in a TiKV cluster - -This section describes how to decrease the capacity of a TiKV cluster by removing a TiKV or PD node. - -> **Warning:** In decreasing the capacity, if your cluster has a mixed deployment of other services, do not perform the following procedures. The following examples assume that the removed nodes have no mixed deployment of other services. - -### Remove a TiKV node - -To remove a TiKV node (node6) with the IP address `172.16.10.6`, take the following steps: - -1. Remove the node from the cluster using `pd-ctl`: - - 1. View the store ID of node6: - - ```bash - ./pd-ctl -u "http://172.16.10.1:2379" -d store - ``` - - 2. Remove node6 from the cluster, assuming that the store ID is 10: - - ```bash - ./pd-ctl -u "http://172.16.10.1:2379" -d store delete 10 - ``` - -2. Use Grafana or `pd-ctl` to check whether the node is successfully removed: - - ```bash - ./pd-ctl -u "http://172.16.10.1:2379" -d store 10 - ``` - - > **Note:** It takes some time to remove the node. If the status of the node you remove becomes Tombstone, then this node is successfully removed. - -3. After the node is successfully removed, stop the services on node6: - - ```bash - ansible-playbook stop.yml -l 172.16.10.6 - ``` - -4. Edit the `inventory.ini` file and remove the node information: - - ```ini - [tidb_servers] - - [pd_servers] - 172.16.10.1 - 172.16.10.2 - 172.16.10.3 - - [tikv_servers] - 172.16.10.4 - 172.16.10.5 - #172.16.10.6 # the removed node - - [monitoring_servers] - 172.16.10.1 - - [grafana_servers] - 172.16.10.1 - - [monitored_servers] - 172.16.10.1 - 172.16.10.2 - 172.16.10.3 - 172.16.10.4 - 172.16.10.5 - #172.16.10.6 # the removed node - ``` - - Now the topology is as follows: - - | Name | Host IP | Services | - | ---- | ------- | -------- | - | node1 | 172.16.10.1 | PD1, Monitor | - | node2 | 172.16.10.2 | PD2 | - | node3 | 172.16.10.3 | PD3 | - | node4 | 172.16.10.4 | TiKV1 | - | node5 | 172.16.10.5 | TiKV2 | - | **node6** | **172.16.10.6** | **TiKV3 removed** | - -5. Update the Prometheus configuration and restart: - - ```bash - ansible-playbook rolling_update_monitor.yml --tags=prometheus - ``` - -6. Monitor the status of the entire cluster by opening a browser to access the monitoring platform: `http://172.16.10.1:3000`. - -### Remove a PD node - -To remove a PD node (node2) with the IP address `172.16.10.2`, take the following steps: - -1. Remove the node from the cluster using `pd-ctl`: - - 1. View the name of node2: - - ```bash - ./pd-ctl -u "http://172.16.10.1:2379" -d member - ``` - - 2. Remove node2 from the cluster, assuming that the name is pd2: - - ```bash - ./pd-ctl -u "http://172.16.10.1:2379" -d member delete name pd2 - ``` - -2. 
Use Grafana or `pd-ctl` to check whether the node is successfully removed: - - ```bash - ./pd-ctl -u "http://172.16.10.1:2379" -d member - ``` - -3. After the node is successfully removed, stop the services on node2: - - ```bash - ansible-playbook stop.yml -l 172.16.10.2 - ``` - -4. Edit the `inventory.ini` file and remove the node information: - - ```ini - [tidb_servers] - - [pd_servers] - 172.16.10.1 - #172.16.10.2 # the removed node - 172.16.10.3 - - [tikv_servers] - 172.16.10.4 - 172.16.10.5 - 172.16.10.6 - - [monitoring_servers] - 172.16.10.1 - - [grafana_servers] - 172.16.10.1 - - [monitored_servers] - 172.16.10.1 - #172.16.10.2 # the removed node - 172.16.10.3 - 172.16.10.4 - 172.16.10.5 - 172.16.10.6 - ``` - - Now the topology is as follows: - - | Name | Host IP | Services | - | ---- | ------- | -------- | - | node1 | 172.16.10.1 | PD1, Monitor | - | **node2** | **172.16.10.2** | **PD2 removed** | - | node3 | 172.16.10.3 | PD3 | - | node4 | 172.16.10.4 | TiKV1 | - | node5 | 172.16.10.5 | TiKV2 | - | node6 | 172.16.10.6 | TiKV3 | - -5. Perform a rolling update to the entire TiKV cluster: - - ```bash - ansible-playbook rolling_update.yml - ``` - -6. Update the Prometheus configuration and restart: - - ```bash - ansible-playbook rolling_update_monitor.yml --tags=prometheus - ``` - -7. To monitor the status of the entire cluster, open a browser to access the monitoring platform: `http://172.16.10.1:3000`. \ No newline at end of file diff --git a/docs/V2.1/op-guide/coprocessor-config.md b/docs/V2.1/op-guide/coprocessor-config.md deleted file mode 100644 index e216d7e8561..00000000000 --- a/docs/V2.1/op-guide/coprocessor-config.md +++ /dev/null @@ -1,83 +0,0 @@ ---- -title: TiKV Coprocessor Configuration -summary: Learn how to configure Coprocessor in TiKV. -category: operations ---- - -# TiKV Coprocessor Configuration - -Coprocessor is the component that handles most of the read requests from TiDB. Unlike Storage, it is more high-leveled that it not only fetches KV data but also does computing like filter or aggregation. TiKV is used as a distribution computing engine and Coprocessor is also used to reduce data serialization and traffic. This document describes how to configure TiKV Coprocessor. - -## Configuration - -Most Coprocessor configurations are in the `[readpool.coprocessor]` section and some configurations are in the `[server]` section. - -### `[readpool.coprocessor]` - -There are three thread pools for handling high priority, normal priority and low priority requests respectively. TiDB point select is high priority, range scan is normal priority and background jobs like table analyzing is low priority. - -#### `high-concurrency` - -- Specifies the thread pool size for handling high priority Coprocessor requests -- Default value: number of cores * 0.8 (> 8 cores) or 8 (<= 8 cores) -- Minimum value: 1 -- It must be larger than zero but should not exceed the number of CPU cores of the host machine -- On a machine with more than 8 CPU cores, its default value is NUM_CPUS * 0.8. Otherwise it is 8 -- If you are running multiple TiKV instances on the same machine, make sure that the sum of this configuration item does not exceed the number of CPU cores. For example, assuming that you have a 48 core server running 3 TiKVs, then the `high-concurrency` value for each instance should be less than 16 -- Do not set it to a too small value, otherwise your read request QPS is limited. 
On the other hand, a larger value is not always the optimal choice because there might be larger resource contention - -#### `normal-concurrency` - -- Specifies the thread pool size for handling normal priority Coprocessor requests -- Default value: number of cores * 0.8 (> 8 cores) or 8 (<= 8 cores) -- Minimum value: 1 - -#### `low-concurrency` - -- Specifies the thread pool size for handling low priority Coprocessor requests -- Default value: number of cores * 0.8 (> 8 cores) or 8 (<= 8 cores) -- Minimum value: 1 -- Generally, you don’t need to ensure that the sum of high + normal + low < the number of CPU cores, because a single Coprocessor request is handled by only one of them - -#### `max-tasks-per-worker-high` - -- Specifies the max number of running operations for each thread in high priority thread pool -- Default value: number of cores * 0.8 (> 8 cores) or 8 (<= 8 cores) -- Minimum value: 1 -- Because actually a throttle of the thread-pool level instead of single thread level is performed, the max number of running operations for the thread pool is limited to `max-tasks-per-worker-high * high-concurrency`. If the number of running operations exceeds this configuration, new operations are simply rejected without being handled and it contains an error header telling that TiKV is busy -- Generally, you don’t need to adjust this configuration unless you are following trustworthy advice - -#### `max-tasks-per-worker-normal` - -- Specifies the max running operations for each thread in the normal priority thread pool -- Default value: 2000 -- Minimum value: 2000 - -#### `max-tasks-per-worker-low` - -- Specifies the max running operations for each thread in the low priority thread pool -- Default value: 2000 -- Minimum value: 2000 - -#### `stack-size` - -- Sets the stack size for each thread in the three thread pools -- Default value: 10MB -- Minimum value: 2MB -- For large requests, you need a large stack to handle. Some Coprocessor requests are extremely large, change with caution - -### `[server]` - -#### `end-point-recursion-limit` - -- Sets the max allowed recursions when decoding Coprocessor DAG expressions -- Default value: 1000 -- Minimum value: 100 -- Smaller value might cause large Coprocessor DAG requests to fail - -#### `end-point-request-max-handle-duration` - -- Sets the max allowed waiting time for each request -- Default value: 60s -- Minimum value: 60s -- When there are many backlog Coprocessor requests, new requests might wait in queue. If the waiting time of a request exceeds this configuration, it is rejected with the TiKV busy error and is not handled \ No newline at end of file diff --git a/docs/V2.1/op-guide/deploy-tikv-using-ansible.md b/docs/V2.1/op-guide/deploy-tikv-using-ansible.md deleted file mode 100644 index 267d7eb1bf1..00000000000 --- a/docs/V2.1/op-guide/deploy-tikv-using-ansible.md +++ /dev/null @@ -1,567 +0,0 @@ ---- -title: Install and Deploy TiKV Using Ansible -summary: Use TiDB-Ansible to deploy a TiKV cluster on multiple nodes. -category: operations ---- - -# Install and Deploy TiKV Using Ansible - -This guide describes how to install and deploy TiKV using Ansible. Ansible is an IT automation tool that can configure systems, deploy software, and orchestrate more advanced IT tasks such as continuous deployments or zero downtime rolling updates. - -[TiDB-Ansible](https://github.com/pingcap/tidb-ansible) is a TiDB cluster deployment tool developed by PingCAP, based on Ansible playbook. 
TiDB-Ansible enables you to quickly deploy a new TiKV cluster which includes PD, TiKV, and the cluster monitoring modules. - -> **Warning:** For the production environment, use TiDB-Ansible to deploy your TiKV cluster. If you only want to try TiKV out and explore the features, see [Install and Deploy TiKV using Docker Compose](deploy-tikv-using-docker-compose.md) on a single machine. - -## Prepare - -Before you start, make sure you have: - -1. Several target machines that meet the following requirements: - - - 4 or more machines - - A standard TiKV cluster contains 6 machines. You can use 4 machines for testing. - - - CentOS 7.3 (64 bit) or later with Python 2.7 installed, x86_64 architecture (AMD64) - - Network between machines - - > **Note:** When you deploy TiKV using Ansible, use SSD disks for the data directory of TiKV and PD nodes. Otherwise, the system will not perform well. For more details, see [Software and Hardware Requirements](https://github.com/pingcap/docs/blob/master/op-guide/recommendation.md). - -2. A Control Machine that meets the following requirements: - - > **Note:** The Control Machine can be one of the target machines. - - - CentOS 7.3 (64 bit) or later with Python 2.7 installed - - Access to the Internet - - Git installed - -## Step 1: Install system dependencies on the Control Machine - -Log in to the Control Machine using the `root` user account, and run the corresponding command according to your operating system. - -- If you use a Control Machine installed with CentOS 7, run the following command: - - ``` - # yum -y install epel-release git curl sshpass - # yum -y install python-pip - ``` - -- If you use a Control Machine installed with Ubuntu, run the following command: - - ``` - # apt-get -y install git curl sshpass python-pip - ``` - -## Step 2: Create the `tidb` user on the Control Machine and generate the SSH key - -Make sure you have logged in to the Control Machine using the `root` user account, and then run the following command. - -1. Create the `tidb` user. - - ``` - # useradd -m -d /home/tidb tidb - ``` - -2. Set a password for the `tidb` user account. - - ``` - # passwd tidb - ``` - -3. Configure sudo without password for the `tidb` user account by adding `tidb ALL=(ALL) NOPASSWD: ALL` to the end of the sudo file: - - ``` - # visudo - tidb ALL=(ALL) NOPASSWD: ALL - ``` -4. Generate the SSH key. - - Execute the `su` command to switch the user from `root` to `tidb`. Create the SSH key for the `tidb` user account and hit the Enter key when `Enter passphrase` is prompted. After successful execution, the SSH private key file is `/home/tidb/.ssh/id_rsa`, and the SSH public key file is `/home/tidb/.ssh/id_rsa.pub`. - - ``` - # su - tidb - $ ssh-keygen -t rsa - Generating public/private rsa key pair. - Enter file in which to save the key (/home/tidb/.ssh/id_rsa): - Created directory '/home/tidb/.ssh'. - Enter passphrase (empty for no passphrase): - Enter same passphrase again: - Your identification has been saved in /home/tidb/.ssh/id_rsa. - Your public key has been saved in /home/tidb/.ssh/id_rsa.pub. - The key fingerprint is: - SHA256:eIBykszR1KyECA/h0d7PRKz4fhAeli7IrVphhte7/So tidb@172.16.10.49 - The key's randomart image is: - +---[RSA 2048]----+ - |=+o+.o. | - |o=o+o.oo | - | .O.=.= | - | . B.B + | - |o B * B S | - | * + * + | - | o + . | - | o E+ . | - |o ..+o. | - +----[SHA256]-----+ - ``` - -## Step 3: Download TiDB-Ansible to the Control Machine - -1. Log in to the Control Machine using the `tidb` user account and enter the `/home/tidb` directory. 
- -2. Download the corresponding TiDB-Ansible version from the [TiDB-Ansible project](https://github.com/pingcap/tidb-ansible). The default folder name is `tidb-ansible`. - - - Download the 2.0 GA version: - - ```bash - $ git clone -b release-2.0 https://github.com/pingcap/tidb-ansible.git - ``` - - - Download the master version: - - ```bash - $ git clone https://github.com/pingcap/tidb-ansible.git - ``` - - > **Note:** It is required to download `tidb-ansible` to the `/home/tidb` directory using the `tidb` user account. If you download it to the `/root` directory, a privilege issue occurs. - - If you have questions regarding which version to use, email to info@pingcap.com for more information or [file an issue](https://github.com/pingcap/tidb-ansible/issues/new). - -## Step 4: Install Ansible and its dependencies on the Control Machine - -Make sure you have logged in to the Control Machine using the `tidb` user account. - -It is required to use `pip` to install Ansible and its dependencies, otherwise a compatibility issue occurs. Currently, the TiDB 2.0 GA version and the master version are compatible with Ansible 2.4 and Ansible 2.5. - -1. Install Ansible and the dependencies on the Control Machine: - - ```bash - $ cd /home/tidb/tidb-ansible - $ sudo pip install -r ./requirements.txt - ``` - - Ansible and the related dependencies are in the `tidb-ansible/requirements.txt` file. - -2. View the version of Ansible: - - ```bash - $ ansible --version - ansible 2.5.0 - ``` - -## Step 5: Configure the SSH mutual trust and sudo rules on the Control Machine - -Make sure you have logged in to the Control Machine using the `tidb` user account. - -1. Add the IPs of your target machines to the `[servers]` section of the `hosts.ini` file. - - ```bash - $ cd /home/tidb/tidb-ansible - $ vi hosts.ini - [servers] - 172.16.10.1 - 172.16.10.2 - 172.16.10.3 - 172.16.10.4 - 172.16.10.5 - 172.16.10.6 - - [all:vars] - username = tidb - ntp_server = pool.ntp.org - ``` - -2. Run the following command and input the `root` user account password of your target machines. - - ```bash - $ ansible-playbook -i hosts.ini create_users.yml -u root -k - ``` - - This step creates the `tidb` user account on the target machines, and configures the sudo rules and the SSH mutual trust between the Control Machine and the target machines. - -> **Note:** To configure the SSH mutual trust and sudo without password manually, see [How to manually configure the SSH mutual trust and sudo without password](https://github.com/pingcap/docs/blob/master/op-guide/ansible-deployment.md#how-to-manually-configure-the-ssh-mutual-trust-and-sudo-without-password). - -## Step 6: Install the NTP service on the target machines - -> **Note:** If the time and time zone of all your target machines are same, the NTP service is on and is normally synchronizing time, you can ignore this step. See [How to check whether the NTP service is normal](https://github.com/pingcap/docs/blob/master/op-guide/ansible-deployment.md#how-to-check-whether-the-ntp-service-is-normal). - -Make sure you have logged in to the Control Machine using the `tidb` user account, run the following command: - -```bash -$ cd /home/tidb/tidb-ansible -$ ansible-playbook -i hosts.ini deploy_ntp.yml -u tidb -b -``` - -The NTP service is installed and started using the software repository that comes with the system on the target machines. The default NTP server list in the installation package is used. The related `server` parameter is in the `/etc/ntp.conf` configuration file. 
- -To make the NTP service start synchronizing as soon as possible, the system executes the `ntpdate` command to set the local date and time by polling `ntp_server` in the `hosts.ini` file. The default server is `pool.ntp.org`, and you can also replace it with your NTP server. - -## Step 7: Configure the CPUfreq governor mode on the target machine - -For details about CPUfreq, see [the CPUfreq Governor documentation](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/power_management_guide/cpufreq_governors). - -Set the CPUfreq governor mode to `performance` to make full use of CPU performance. - -### Check the governor modes supported by the system - -You can run the `cpupower frequency-info --governors` command to check the governor modes which the system supports: - -``` -# cpupower frequency-info --governors -analyzing CPU 0: - available cpufreq governors: performance powersave -``` - -Taking the above code for example, the system supports the `performance` and `powersave` modes. - -> **Note:** As the following shows, if it returns "Not Available", it means that the current system does not support CPUfreq configuration and you can skip this step. - -``` -# cpupower frequency-info --governors -analyzing CPU 0: - available cpufreq governors: Not Available -``` - -### Check the current governor mode - -You can run the `cpupower frequency-info --policy` command to check the current CPUfreq governor mode: - -``` -# cpupower frequency-info --policy -analyzing CPU 0: - current policy: frequency should be within 1.20 GHz and 3.20 GHz. - The governor "powersave" may decide which speed to use - within this range. -``` - -As the above code shows, the current mode is `powersave` in this example. - -### Change the governor mode - -- You can run the following command to change the current mode to `performance`: - - ``` - # cpupower frequency-set --governor performance - ``` - -- You can also run the following command to set the mode on the target machine in batches: - - ``` - $ ansible -i hosts.ini all -m shell -a "cpupower frequency-set --governor performance" -u tidb -b - ``` - -## Step 8: Mount the data disk ext4 filesystem with options on the target machines - -Log in to the Control Machine using the `root` user account. - -Format your data disks to the ext4 filesystem and mount the filesystem with the `nodelalloc` and `noatime` options. It is required to mount the `nodelalloc` option, or else the Ansible deployment cannot pass the test. The `noatime` option is optional. - -> **Note:** If your data disks have been formatted to ext4 and have mounted the options, you can uninstall it by running the `# umount /dev/nvme0n1` command, follow the steps starting from editing the `/etc/fstab` file, and remount the filesystem with options. - -Take the `/dev/nvme0n1` data disk as an example: - -1. View the data disk. - - ``` - # fdisk -l - Disk /dev/nvme0n1: 1000 GB - ``` - -2. Create the partition table. - - ``` - # parted -s -a optimal /dev/nvme0n1 mklabel gpt -- mkpart primary ext4 1 -1 - ``` - -3. Format the data disk to the ext4 filesystem. - - ``` - # mkfs.ext4 /dev/nvme0n1 - ``` - -4. View the partition UUID of the data disk. - - In this example, the UUID of `nvme0n1` is `c51eb23b-195c-4061-92a9-3fad812cc12f`. 
- - ``` - # lsblk -f - NAME FSTYPE LABEL UUID MOUNTPOINT - sda - ├─sda1 ext4 237b634b-a565-477b-8371-6dff0c41f5ab /boot - ├─sda2 swap f414c5c0-f823-4bb1-8fdf-e531173a72ed - └─sda3 ext4 547909c1-398d-4696-94c6-03e43e317b60 / - sr0 - nvme0n1 ext4 c51eb23b-195c-4061-92a9-3fad812cc12f - ``` - -5. Edit the `/etc/fstab` file and add the mount options. - - ``` - # vi /etc/fstab - UUID=c51eb23b-195c-4061-92a9-3fad812cc12f /data1 ext4 defaults,nodelalloc,noatime 0 2 - ``` - -6. Mount the data disk. - - ``` - # mkdir /data1 - # mount -a - ``` - -7. Check using the following command. - - ``` - # mount -t ext4 - /dev/nvme0n1 on /data1 type ext4 (rw,noatime,nodelalloc,data=ordered) - ``` - - If the filesystem is ext4 and `nodelalloc` is included in the mount options, you have successfully mount the data disk ext4 filesystem with options on the target machines. - -## Step 9: Edit the `inventory.ini` file to orchestrate the TiKV cluster - -Edit the `tidb-ansible/inventory.ini` file to orchestrate the TiKV cluster. The standard TiKV cluster contains 6 machines: 3 PD nodes and 3 TiKV nodes. - -- Deploy at least 3 instances for TiKV. -- Do not deploy TiKV together with PD on the same machine. -- Use the first PD machine as the monitoring machine. - -> **Note:** -> -> - Leave `[tidb_servers]` in the `inventory.ini` file empty, because this deployment is for the TiKV cluster, not the TiDB cluster. -> - It is required to use the internal IP address to deploy. If the SSH port of the target machines is not the default 22 port, you need to add the `ansible_port` variable. For example, `TiDB1 ansible_host=172.16.10.1 ansible_port=5555`. - -You can choose one of the following two types of cluster topology according to your scenario: - -- [The cluster topology of a single TiKV instance on each TiKV node](#option-1-use-the-cluster-topology-of-a-single-tikv-instance-on-each-tikv-node) - - In most cases, it is recommended to deploy one TiKV instance on each TiKV node for better performance. However, if the CPU and memory of your TiKV machines are much better than the required in [Hardware and Software Requirements](https://github.com/pingcap/docs/blob/master/op-guide/recommendation.md), and you have more than two disks in one node or the capacity of one SSD is larger than 2 TB, you can deploy no more than 2 TiKV instances on a single TiKV node. 
- -- [The cluster topology of multiple TiKV instances on each TiKV node](#option-2-use-the-cluster-topology-of-multiple-tikv-instances-on-each-tikv-node) - -### Option 1: Use the cluster topology of a single TiKV instance on each TiKV node - -| Name | Host IP | Services | -|-------|-------------|----------| -| node1 | 172.16.10.1 | PD1 | -| node2 | 172.16.10.2 | PD2 | -| node3 | 172.16.10.3 | PD3 | -| node4 | 172.16.10.4 | TiKV1 | -| node5 | 172.16.10.5 | TiKV2 | -| node6 | 172.16.10.6 | TiKV3 | - -Edit the `inventory.ini` file as follows: - -```ini -[tidb_servers] - -[pd_servers] -172.16.10.1 -172.16.10.2 -172.16.10.3 - -[tikv_servers] -172.16.10.4 -172.16.10.5 -172.16.10.6 - -[monitoring_servers] -172.16.10.1 - -[grafana_servers] -172.16.10.1 - -[monitored_servers] -172.16.10.1 -172.16.10.2 -172.16.10.3 -172.16.10.4 -172.16.10.5 -172.16.10.6 -``` - -### Option 2: Use the cluster topology of multiple TiKV instances on each TiKV node - -Take two TiKV instances on each TiKV node as an example: - -| Name | Host IP | Services | -|-------|-------------|------------------| -| node1 | 172.16.10.1 | PD1 | -| node2 | 172.16.10.2 | PD2 | -| node3 | 172.16.10.3 | PD3 | -| node4 | 172.16.10.4 | TiKV1-1, TiKV1-2 | -| node5 | 172.16.10.5 | TiKV2-1, TiKV2-2 | -| node6 | 172.16.10.6 | TiKV3-1, TiKV3-2 | - -```ini -[tidb_servers] - -[pd_servers] -172.16.10.1 -172.16.10.2 -172.16.10.3 - -[tikv_servers] -TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy tikv_port=20171 labels="host=tikv1" -TiKV1-2 ansible_host=172.16.10.4 deploy_dir=/data2/deploy tikv_port=20172 labels="host=tikv1" -TiKV2-1 ansible_host=172.16.10.5 deploy_dir=/data1/deploy tikv_port=20171 labels="host=tikv2" -TiKV2-2 ansible_host=172.16.10.5 deploy_dir=/data2/deploy tikv_port=20172 labels="host=tikv2" -TiKV3-1 ansible_host=172.16.10.6 deploy_dir=/data1/deploy tikv_port=20171 labels="host=tikv3" -TiKV3-2 ansible_host=172.16.10.6 deploy_dir=/data2/deploy tikv_port=20172 labels="host=tikv3" - -[monitoring_servers] -172.16.10.1 - -[grafana_servers] -172.16.10.1 - -[monitored_servers] -172.16.10.1 -172.16.10.2 -172.16.10.3 -172.16.10.4 -172.16.10.5 -172.16.10.6 - -... - -[pd_servers:vars] -location_labels = ["host"] -``` - -Edit the parameters in the service configuration file: - -1. For the cluster topology of multiple TiKV instances on each TiKV node, you need to edit the `block-cache-size` parameter in `tidb-ansible/conf/tikv.yml`: - - - `rocksdb defaultcf block-cache-size(GB)`: MEM * 80% / TiKV instance number * 30% - - `rocksdb writecf block-cache-size(GB)`: MEM * 80% / TiKV instance number * 45% - - `rocksdb lockcf block-cache-size(GB)`: MEM * 80% / TiKV instance number * 2.5% (128 MB at a minimum) - - `raftdb defaultcf block-cache-size(GB)`: MEM * 80% / TiKV instance number * 2.5% (128 MB at a minimum) - -2. For the cluster topology of multiple TiKV instances on each TiKV node, you need to edit the `high-concurrency`, `normal-concurrency` and `low-concurrency` parameters in the `tidb-ansible/conf/tikv.yml` file: - - ``` - readpool: - coprocessor: - # Notice: if CPU_NUM > 8, default thread pool size for coprocessors - # will be set to CPU_NUM * 0.8. - # high-concurrency: 8 - # normal-concurrency: 8 - # low-concurrency: 8 - ``` - - Recommended configuration: `number of instances * parameter value = CPU_Vcores * 0.8`. - -3. 
If multiple TiKV instances are deployed on a same physical disk, edit the `capacity` parameter in `conf/tikv.yml`: - - - `capacity`: total disk capacity / number of TiKV instances (the unit is GB) - -## Step 10: Edit variables in the `inventory.ini` file - -1. Edit the `deploy_dir` variable to configure the deployment directory. - - The global variable is set to `/home/tidb/deploy` by default, and it applies to all services. If the data disk is mounted on the `/data1` directory, you can set it to `/data1/deploy`. For example: - - ```bash - ## Global variables - [all:vars] - deploy_dir = /data1/deploy - ``` - - **Note:** To separately set the deployment directory for a service, you can configure the host variable while configuring the service host list in the `inventory.ini` file. It is required to add the first column alias, to avoid confusion in scenarios of mixed services deployment. - - ```bash - TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy - ``` - -2. Set the `deploy_without_tidb` variable to `True`. - - ```bash - deploy_without_tidb = True - ``` - -> **Note:** If you need to edit other variables, see [the variable description table](https://github.com/pingcap/docs/blob/master/op-guide/ansible-deployment.md#edit-other-variables-optional). - -## Step 11: Deploy the TiKV cluster - -When `ansible-playbook` executes the Playbook, the default concurrent number is 5. If many target machines are deployed, you can add the `-f` parameter to specify the concurrency, such as `ansible-playbook deploy.yml -f 10`. - -The following example uses `tidb` as the user who runs the service. - -1. Check the `tidb-ansible/inventory.ini` file to make sure `ansible_user = tidb`. - - ```bash - ## Connection - # ssh via normal user - ansible_user = tidb - ``` - -2. Make sure the SSH mutual trust and sudo without password are successfully configured. - - - Run the following command and if all servers return `tidb`, then the SSH mutual trust is successfully configured: - - ```bash - ansible -i inventory.ini all -m shell -a 'whoami' - ``` - - - Run the following command and if all servers return `root`, then sudo without password of the `tidb` user is successfully configured: - - ```bash - ansible -i inventory.ini all -m shell -a 'whoami' -b - ``` - -3. Download the TiKV binary to the Control Machine. - - ```bash - ansible-playbook local_prepare.yml - ``` - -4. Initialize the system environment and modify the kernel parameters. - - ```bash - ansible-playbook bootstrap.yml - ``` - -5. Deploy the TiKV cluster. - - ```bash - ansible-playbook deploy.yml - ``` - -6. Start the TiKV cluster. - - ```bash - ansible-playbook start.yml - ``` - -You can check whether the TiKV cluster has been successfully deployed using the following command: - -```bash -curl 172.16.10.1:2379/pd/api/v1/stores -``` - -If you want to try the Go client, see [Try Two Types of APIs](../clients/go-client-api.md). - -## Stop the TiKV cluster - -If you want to stop the TiKV cluster, run the following command: - -```bash -ansible-playbook stop.yml -``` - -## Destroy the TiKV cluster - -> **Warning:** Before you clean the cluster data or destroy the TiKV cluster, make sure you do not need it any more. 
- -- If you do not need the data any more, you can clean up the data for test using the following command: - - ``` - ansible-playbook unsafe_cleanup_data.yml - ``` - -- If you do not need the TiKV cluster any more, you can destroy it using the following command: - - ```bash - ansible-playbook unsafe_cleanup.yml - ``` - - > **Note:** If the deployment directory is a mount point, an error might be reported, but the implementation result remains unaffected. You can just ignore the error. \ No newline at end of file diff --git a/docs/V2.1/op-guide/deploy-tikv-using-binary.md b/docs/V2.1/op-guide/deploy-tikv-using-binary.md deleted file mode 100644 index 15c9ba2407a..00000000000 --- a/docs/V2.1/op-guide/deploy-tikv-using-binary.md +++ /dev/null @@ -1,155 +0,0 @@ ---- -title: Install and Deploy TiKV Using Binary Files -summary: Use binary files to deploy a TiKV cluster on a single machine or on multiple nodes for testing. -category: operations ---- - -# Install and Deploy TiKV Using Binary Files - -This guide describes how to deploy a TiKV cluster using binary files. - -> **Warning:** Do not use binary files to deploy the TiKV cluster in the production environment. For production, [use Ansible to deploy the TiKV cluster](deploy-tikv-using-ansible.md). - -- To quickly understand and try TiKV, see [Deploy the TiKV cluster on a single machine](#deploy-the-tikv-cluster-on-a-single-machine). -- To try TiKV out and explore the features, see [Deploy the TiKV cluster on multiple nodes for testing](#deploy-the-tikv-cluster-on-multiple-nodes-for-testing). - -## Deploy the TiKV cluster on a single machine - -This section describes how to deploy TiKV on a single machine installed with the Linux system. Take the following steps: - -1. Download the official binary package. - - ```bash - # Download the package. - wget https://download.pingcap.org/tidb-latest-linux-amd64.tar.gz - wget http://download.pingcap.org/tidb-latest-linux-amd64.sha256 - - # Check the file integrity. If the result is OK, the file is correct. - sha256sum -c tidb-latest-linux-amd64.sha256 - - # Extract the package. - tar -xzf tidb-latest-linux-amd64.tar.gz - cd tidb-latest-linux-amd64 - ``` - -2. Start PD. - - ```bash - ./bin/pd-server --name=pd1 \ - --data-dir=pd1 \ - --client-urls="http://127.0.0.1:2379" \ - --peer-urls="http://127.0.0.1:2380" \ - --initial-cluster="pd1=http://127.0.0.1:2380" \ - --log-file=pd1.log - ``` - -3. Start TiKV. - - To start the 3 TiKV instances, open a new terminal tab or window, come to the `tidb-latest-linux-amd64` directory, and start the instances using the following command: - - ```bash - ./bin/tikv-server --pd-endpoints="127.0.0.1:2379" \ - --addr="127.0.0.1:20160" \ - --data-dir=tikv1 \ - --log-file=tikv1.log - - ./bin/tikv-server --pd-endpoints="127.0.0.1:2379" \ - --addr="127.0.0.1:20161" \ - --data-dir=tikv2 \ - --log-file=tikv2.log - - ./bin/tikv-server --pd-endpoints="127.0.0.1:2379" \ - --addr="127.0.0.1:20162" \ - --data-dir=tikv3 \ - --log-file=tikv3.log - ``` - -You can use the [pd-ctl](https://github.com/pingcap/pd/tree/master/tools/pd-ctl) tool to verify whether PD and TiKV are successfully deployed: - -``` -./bin/pd-ctl store -d -u http://127.0.0.1:2379 -``` - -If the state of all the TiKV instances is "Up", you have successfully deployed a TiKV cluster. - -## Deploy the TiKV cluster on multiple nodes for testing - -This section describes how to deploy TiKV on multiple nodes. If you want to test TiKV with a limited number of nodes, you can use one PD instance to test the entire cluster. 
- -Assume that you have four nodes, you can deploy 1 PD instance and 3 TiKV instances. For details, see the following table: - -| Name | Host IP | Services | -| :-- | :-- | :------------------- | -| Node1 | 192.168.199.113 | PD1 | -| Node2 | 192.168.199.114 | TiKV1 | -| Node3 | 192.168.199.115 | TiKV2 | -| Node4 | 192.168.199.116 | TiKV3 | - -To deploy a TiKV cluster with multiple nodes for test, take the following steps: - -1. Download the official binary package on each node. - - ```bash - # Download the package. - wget https://download.pingcap.org/tidb-latest-linux-amd64.tar.gz - wget http://download.pingcap.org/tidb-latest-linux-amd64.sha256 - - # Check the file integrity. If the result is OK, the file is correct. - sha256sum -c tidb-latest-linux-amd64.sha256 - - # Extract the package. - tar -xzf tidb-latest-linux-amd64.tar.gz - cd tidb-latest-linux-amd64 - ``` - -2. Start PD on Node1. - - ```bash - ./bin/pd-server --name=pd1 \ - --data-dir=pd1 \ - --client-urls="http://192.168.199.113:2379" \ - --peer-urls="http://192.168.199.113:2380" \ - --initial-cluster="pd1=http://192.168.199.113:2380" \ - --log-file=pd1.log - ``` - -3. Log in and start TiKV on other nodes: Node2, Node3 and Node4. - - Node2: - - ```bash - ./bin/tikv-server --pd-endpoints="192.168.199.113:2379" \ - --addr="192.168.199.114:20160" \ - --data-dir=tikv1 \ - --log-file=tikv1.log - ``` - - Node3: - - ```bash - ./bin/tikv-server --pd-endpoints="192.168.199.113:2379" \ - --addr="192.168.199.115:20160" \ - --data-dir=tikv2 \ - --log-file=tikv2.log - ``` - - Node4: - - ```bash - ./bin/tikv-server --pd-endpoints="192.168.199.113:2379" \ - --addr="192.168.199.116:20160" \ - --data-dir=tikv3 \ - --log-file=tikv3.log - ``` - -You can use the [pd-ctl](https://github.com/pingcap/pd/tree/master/tools/pd-ctl) tool to verify whether PD and TiKV are successfully deployed: - -``` -./pd-ctl store -d -u http://192.168.199.113:2379 -``` - -The result displays the store count and detailed information regarding each store. If the state of all the TiKV instances is "Up", you have successfully deployed a TiKV cluster. - -## What's next? - -If you want to try the Go client, see [Try Two Types of APIs](../clients/go-client-api.md). \ No newline at end of file diff --git a/docs/V2.1/op-guide/deploy-tikv-using-docker-compose.md b/docs/V2.1/op-guide/deploy-tikv-using-docker-compose.md deleted file mode 100644 index abdf13407b0..00000000000 --- a/docs/V2.1/op-guide/deploy-tikv-using-docker-compose.md +++ /dev/null @@ -1,162 +0,0 @@ ---- -title: Install and Deploy TiKV Using Docker Compose -summary: Use Docker Compose to quickly deploy a TiKV testing cluster on a single machine. -category: operations ---- - -# Install and Deploy TiKV Using Docker Compose - -This guide describes how to quickly deploy a TiKV testing cluster using [Docker Compose](https://github.com/pingcap/tidb-docker-compose/) on a single machine. Currently, this installation method only supports the Linux system. - -> **Warning:** Do not use Docker Compose to deploy the TiKV cluster in the production environment. For production, [use Ansible to deploy the TiKV cluster](deploy-tikv-using-ansible.md). - -## Prerequisites - -Make sure you have installed the following items on your machine: - -- Docker (17.06.0 or later) and Docker Compose - - ```bash - sudo yum install docker docker-compose - ``` - -- Git - - ``` - sudo yum install git - ``` - -## Install - -Download `tidb-docker-compose`. 
- -```bash -git clone https://github.com/pingcap/tidb-docker-compose.git -``` - -## Prepare cluster - -In this example, let's run a simple cluster with only 1 PD server and 1 TiKV server. See the following for the basic docker compose configuration file: - -```yaml -version: '2.1' - -services: - pd0: - image: pingcap/pd:latest - ports: - - "2379" - volumes: - - ./config/pd.toml:/pd.toml:ro - - ./data:/data - - ./logs:/logs - command: - - --name=pd0 - - --client-urls=http://0.0.0.0:2379 - - --peer-urls=http://0.0.0.0:2380 - - --advertise-client-urls=http://pd0:2379 - - --advertise-peer-urls=http://pd0:2380 - - --initial-cluster=pd0=http://pd0:2380 - - --data-dir=/data/pd0 - - --config=/pd.toml - - --log-file=/logs/pd0.log - restart: on-failure - - tikv0: - image: pingcap/tikv:latest - volumes: - - ./config/tikv.toml:/tikv.toml:ro - - ./data:/data - - ./logs:/logs - command: - - --addr=0.0.0.0:20160 - - --advertise-addr=tikv0:20160 - - --data-dir=/data/tikv0 - - --pd=pd0:2379 - - --config=/tikv.toml - - --log-file=/logs/tikv0.log - depends_on: - - "pd0" - restart: on-failure -``` - -All the following example docker compose config file will contain this base config. - -## Example - -### Run [YCSB](https://github.com/pingcap/go-ycsb) to connect to the TiKV cluster: - -1. Create a `ycsb-docker-compose.yml` file, add the above base config to this file, and then append the following section: - - ```yaml - ycsb: - image: pingcap/go-ycsb - ``` - -2. Start the cluster: - - ```bash - rm -rf data logs - docker-compose -f ycsb-docker-compose.yml pull - docker-compose -f ycsb-docker-compose.yml up -d - ``` - -3. Start YCSB: - - ```bash - docker-compose -f ycsb-docker-compose.yml run ycsb shell tikv -p tikv.pd=pd0:2379 - ``` - -4. Run YCSB: - - ```bash - INFO[0000] [pd] create pd client with endpoints [pd0:2379] - INFO[0000] [pd] leader switches to: http://pd0:2379, previous: - INFO[0000] [pd] init cluster id 6628733331417096653 - » read a - Read empty for a - » insert a field0=0 - Insert a ok - » read a - Read a ok - field0="0" - ``` - -### Use [Titan](https://github.com/meitu/titan) to connect to the TiKV cluster via the Redis protocol: - -1. Create a `titan-docker-compose.yml` file, add the above base config to this file, and then append the following section: - - ```yaml - titan: - image: meitu/titan - ports: - - "7369:7369" - command: - - --pd-addrs=tikv://pd0:2379 - depends_on: - - "tikv0" - restart: on-failure - ``` - -2. Start the cluster: - - ```bash - rm -rf data logs - docker-compose -f titan-docker-compose.yml pull - docker-compose -f titan-docker-compose.yml up -d - ``` - -3. Use `redis-cli` to communicate with Titan: - - ```bash - redis-cli -p 7369 - 127.0.0.1:7369> set a 1 - OK - 127.0.0.1:7369> get a - "1" - ``` - -## What's next? - -+ If you want to try the Go client, see [Try Two Types of APIs](../clients/go-client-api.md). You need to build your docker image and add it to the docker compose config file like above YCSB or Titan does. -+ If you want to run a full cluster with monitor support, please follow the [tidb-docker-compose guide](https://github.com/pingcap/tidb-docker-compose/blob/master/README.md), comment the `tidb` and `tispark` sections out in the [values.yaml](https://github.com/pingcap/tidb-docker-compose/blob/master/compose/values.yaml), generate the new docker compose config, then add your own binary image and run it. 
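Before wiring in your own client image, it can help to confirm that PD has registered the TiKV store started by the compose file. The following is a minimal sketch that assumes the `ycsb-docker-compose.yml` file above and that Docker Compose published `pd0`'s port 2379 on an ephemeral host port:

```bash
# Resolve the host address that Docker Compose mapped to pd0's port 2379.
PD_ADDR=$(docker-compose -f ycsb-docker-compose.yml port pd0 2379)

# Ask the PD API for the registered stores; tikv0 should be listed with state "Up".
curl -s "http://${PD_ADDR}/pd/api/v1/stores"
```

The same check works for the Titan example if you substitute `titan-docker-compose.yml`.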
diff --git a/docs/V2.1/op-guide/deploy-tikv-using-docker.md b/docs/V2.1/op-guide/deploy-tikv-using-docker.md deleted file mode 100644 index 9150bd86c50..00000000000 --- a/docs/V2.1/op-guide/deploy-tikv-using-docker.md +++ /dev/null @@ -1,157 +0,0 @@ ---- -title: Install and Deploy TiKV Using Docker -summary: Use Docker to deploy a TiKV cluster on multiple nodes. -category: operations ---- - -# Install and Deploy TiKV Using Docker - -This guide describes how to deploy a multi-node TiKV cluster using Docker. - -> **Warning:** Do not use Docker to deploy the TiKV cluster in the production environment. For production, [use Ansible to deploy the TiKV cluster](deploy-tikv-using-ansible.md). - -## Prerequisites - -Make sure that Docker is installed on each machine. - -For more details about prerequisites, see [Hardware and Software Requirements](https://github.com/pingcap/docs/blob/master/op-guide/recommendation.md). - -## Deploy the TiKV cluster on multiple nodes - -Assume that you have 6 machines with the following details: - -| Name | Host IP | Services | Data Path | -| --------- | ------------- | ---------- | --------- | -| Node1 | 192.168.1.101 | PD1 | /data | -| Node2 | 192.168.1.102 | PD2 | /data | -| Node3 | 192.168.1.103 | PD3 | /data | -| Node4 | 192.168.1.104 | TiKV1 | /data | -| Node5 | 192.168.1.105 | TiKV2 | /data | -| Node6 | 192.168.1.106 | TiKV3 | /data | - -If you want to test TiKV with a limited number of nodes, you can also use one PD instance to test the entire cluster. - -### Step 1: Pull the latest images of TiKV and PD from Docker Hub - -Start Docker and pull the latest images of TiKV and PD from [Docker Hub](https://hub.docker.com) using the following command: - -```bash -docker pull pingcap/tikv:latest -docker pull pingcap/pd:latest -``` - -### Step 2: Log in and start PD - -Log in to the three PD machines and start PD respectively: - -1. Start PD1 on Node1: - - ```bash - docker run -d --name pd1 \ - -p 2379:2379 \ - -p 2380:2380 \ - -v /etc/localtime:/etc/localtime:ro \ - -v /data:/data \ - pingcap/pd:latest \ - --name="pd1" \ - --data-dir="/data/pd1" \ - --client-urls="http://0.0.0.0:2379" \ - --advertise-client-urls="http://192.168.1.101:2379" \ - --peer-urls="http://0.0.0.0:2380" \ - --advertise-peer-urls="http://192.168.1.101:2380" \ - --initial-cluster="pd1=http://192.168.1.101:2380,pd2=http://192.168.1.102:2380,pd3=http://192.168.1.103:2380" - ``` - -2. Start PD2 on Node2: - - ```bash - docker run -d --name pd2 \ - -p 2379:2379 \ - -p 2380:2380 \ - -v /etc/localtime:/etc/localtime:ro \ - -v /data:/data \ - pingcap/pd:latest \ - --name="pd2" \ - --data-dir="/data/pd2" \ - --client-urls="http://0.0.0.0:2379" \ - --advertise-client-urls="http://192.168.1.102:2379" \ - --peer-urls="http://0.0.0.0:2380" \ - --advertise-peer-urls="http://192.168.1.102:2380" \ - --initial-cluster="pd1=http://192.168.1.101:2380,pd2=http://192.168.1.102:2380,pd3=http://192.168.1.103:2380" - ``` - -3. 
Start PD3 on Node3: - - ```bash - docker run -d --name pd3 \ - -p 2379:2379 \ - -p 2380:2380 \ - -v /etc/localtime:/etc/localtime:ro \ - -v /data:/data \ - pingcap/pd:latest \ - --name="pd3" \ - --data-dir="/data/pd3" \ - --client-urls="http://0.0.0.0:2379" \ - --advertise-client-urls="http://192.168.1.103:2379" \ - --peer-urls="http://0.0.0.0:2380" \ - --advertise-peer-urls="http://192.168.1.103:2380" \ - --initial-cluster="pd1=http://192.168.1.101:2380,pd2=http://192.168.1.102:2380,pd3=http://192.168.1.103:2380" - ``` - -### Step 3: Log in and start TiKV - -Log in to the three TiKV machines and start TiKV respectively: - -1. Start TiKV1 on Node4: - - ```bash - docker run -d --name tikv1 \ - -p 20160:20160 \ - -v /etc/localtime:/etc/localtime:ro \ - -v /data:/data \ - pingcap/tikv:latest \ - --addr="0.0.0.0:20160" \ - --advertise-addr="192.168.1.104:20160" \ - --data-dir="/data/tikv1" \ - --pd="192.168.1.101:2379,192.168.1.102:2379,192.168.1.103:2379" - ``` - -2. Start TiKV2 on Node5: - - ```bash - docker run -d --name tikv2 \ - -p 20160:20160 \ - -v /etc/localtime:/etc/localtime:ro \ - -v /data:/data \ - pingcap/tikv:latest \ - --addr="0.0.0.0:20160" \ - --advertise-addr="192.168.1.105:20160" \ - --data-dir="/data/tikv2" \ - --pd="192.168.1.101:2379,192.168.1.102:2379,192.168.1.103:2379" - ``` - -3. Start TiKV3 on Node6: - - ```bash - docker run -d --name tikv3 \ - -p 20160:20160 \ - -v /etc/localtime:/etc/localtime:ro \ - -v /data:/data \ - pingcap/tikv:latest \ - --addr="0.0.0.0:20160" \ - --advertise-addr="192.168.1.106:20160" \ - --data-dir="/data/tikv3" \ - --pd="192.168.1.101:2379,192.168.1.102:2379,192.168.1.103:2379" - ``` - -You can check whether the TiKV cluster has been successfully deployed using the following command: - -``` -curl 192.168.1.101:2379/pd/api/v1/stores -``` - -If the state of all the TiKV instances is "Up", you have successfully deployed a TiKV cluster. - -## What's next? - -If you want to try the Go client, see [Try Two Types of APIs](../clients/go-client-api.md). \ No newline at end of file diff --git a/docs/V2.1/op-guide/grpc-config.md b/docs/V2.1/op-guide/grpc-config.md deleted file mode 100644 index ce1389f0c85..00000000000 --- a/docs/V2.1/op-guide/grpc-config.md +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: gRPC Configuration -summary: Learn how to configure gRPC. -category: operations ---- - -# gRPC Configuration - -TiKV uses gRPC, a remote procedure call (RPC) framework, to build a distributed transactional key-value database. gRPC is designed to be high-performance, but ill-configured gRPC leads to performance regression of TiKV. This document describes how to configure gRPC. - -## grpc-compression-type - -- Compression type for the gRPC channel -- Default: "none" -- Available values are “none”, “deflate” and “gzip” -- To exchange the CPU time for network I/O, you can set it to “deflate” or “gzip”. It is useful when the network bandwidth is limited - -## grpc-concurrency - -- The size of the thread pool that drives gRPC -- Default: 4. It is suitable for a commodity computer. You can double the size if TiKV is deployed in a high-end server (32 core+ CPU) -- Higher concurrency is for higher QPS, but it consumes more CPU - -## grpc-concurrent-stream - -- The number of max concurrent streams/requests on a connection -- Default: 1024. 
It is suitable for most workload -- Increase the number if you find that most of your requests are not time consuming, e.g., RawKV Get - -## grpc-keepalive-time - -- Time to wait before sending out a ping to check whether the server is still alive. This is only for the communication between TiKV instances -- Default: 10s - -## grpc-keepalive-timeout - -- Time to wait before closing the connection without receiving the `keepalive` ping ACK -- Default: 3s - -## grpc-raft-conn-num - -- The number of connections with each TiKV server to send Raft messages -- Default: 10 - -## grpc-stream-initial-window-size - -- Amount to Read Ahead on individual gRPC streams -- Default: 2MB -- Larger values can help throughput on high-latency connections \ No newline at end of file diff --git a/docs/V2.1/op-guide/horizontal-scale.md b/docs/V2.1/op-guide/horizontal-scale.md deleted file mode 100644 index e008581ee54..00000000000 --- a/docs/V2.1/op-guide/horizontal-scale.md +++ /dev/null @@ -1,124 +0,0 @@ ---- -title: Scale a TiKV Cluster -summary: Learn how to scale out or scale in a TiKV cluster. -category: operations ---- - -# Scale a TiKV Cluster - -You can scale out a TiKV cluster by adding nodes to increase the capacity without affecting online services. You can also scale in a TiKV cluster by deleting nodes to decrease the capacity without affecting online services. - -> **Note:** If your TiKV cluster is deployed using Ansible, see [Scale the TiKV Cluster Using TiDB-Ansible](ansible-deployment-scale.md). - -## Scale out or scale in PD - -Before increasing or decreasing the capacity of PD, you can view details of the current PD cluster. Assume you have three PD servers with the following details: - -| Name | ClientUrls | PeerUrls | -|:-----|:------------------|:------------------| -| pd1 | http://host1:2379 | http://host1:2380 | -| pd2 | http://host2:2379 | http://host2:2380 | -| pd3 | http://host3:2379 | http://host3:2380 | - -Get the information about the existing PD nodes through `pd-ctl`: - -```bash -./pd-ctl -u http://host1:2379 ->> member -``` - -For the usage of `pd-ctl`, see [PD Control User Guide](../tools/pd-control.md). - -### Add a PD node dynamically - -You can add a new PD node to the current PD cluster using the `join` parameter. To add `pd4`, use `--join` to specify the client URL of any PD server in the PD cluster, like: - -```bash -./bin/pd-server --name=pd4 \ - --client-urls="http://host4:2379" \ - --peer-urls="http://host4:2380" \ - --join="http://host1:2379" -``` - -### Remove a PD node dynamically - -You can remove `pd4` using `pd-ctl`: - -```bash -./pd-ctl -u http://host1:2379 ->> member delete pd4 -``` - -### Replace a PD node dynamically - -You might want to replace a PD node in the following scenarios: - -- You need to replace a faulty PD node with a healthy PD node. -- You need to replace a healthy PD node with a different PD node. - -To replace a PD node, first add a new node to the cluster, migrate all the data from the node you want to remove, and then remove the node. - -You can only replace one PD node at a time. If you want to replace multiple nodes, repeat the above steps until you have replaced all nodes. After completing each step, you can verify the process by checking the information of all nodes. - -## Scale out or scale in TiKV - -Before increasing or decreasing the capacity of TiKV, you can view details of the current TiKV cluster. 
Get the information about the existing TiKV nodes through `pd-ctl`: - -```bash -./pd-ctl -u http://host1:2379 ->> store -``` - -### Add a TiKV node dynamically - -To add a new TiKV node dynamically, start a TiKV node on a new machine. The newly started TiKV node will automatically register in the existing PD of the cluster. - -To reduce the pressure of the existing TiKV nodes, PD loads balance automatically, which means PD gradually migrates some data to the new TiKV node. - -### Remove a TiKV node dynamically - -To remove a TiKV node safely, you need to inform PD in advance. After that, PD is able to migrate the data on this TiKV node to other TiKV nodes, ensuring that data have enough replicas. - -For example, to remove the TiKV node with the store id 1, you can complete this using `pd-ctl`: - -```bash -./pd-ctl -u http://host1:2379 ->> store delete 1 -``` - -Then you can check the state of this TiKV node: - -```bash -./pd-ctl -u http://host1:2379 ->> store 1 -{ - "store": { - "id": 1, - "address": "127.0.0.1:20160", - "state": 1, - "state_name": "Offline" - }, - "status": { - ... - } -} -``` - -You can verify the state of this store using `state_name`: - - - `state_name=Up`: This store is in service. - - `state_name=Disconnected`: The heartbeats of this store cannot be detected currently, which might be caused by a failure or network interruption. - - `state_name=Down`: PD does not receive heartbeats from the TiKV store for more than an hour (the time can be configured using `max-down-time`). At this time, PD adds a replica for the data on this store. - - `state_name=Offline`: This store is shutting down, but the store is still in service. - - `state_name=Tombstone`: This store is shut down and has no data on it, so the instance can be removed. - -### Replace a TiKV node dynamically - -You might want to replace a TiKV node in the following scenarios: - -- You need to replace a faulty TiKV node with a healthy TiKV node. -- You need to replace a healthy TiKV node with a different TiKV node. - -To replace a TiKV node, first add a new node to the cluster, migrate all the data from the node you want to remove, and then remove the node. - -You can only replace one TiKV node at a time. To verify whether a node has been made offline, you can check the state information of the node in process. After verifying, you can make the next node offline. \ No newline at end of file diff --git a/docs/V2.1/op-guide/key-metrics.md b/docs/V2.1/op-guide/key-metrics.md deleted file mode 100644 index a5477d5cc02..00000000000 --- a/docs/V2.1/op-guide/key-metrics.md +++ /dev/null @@ -1,57 +0,0 @@ ---- -title: Key Metrics -summary: Learn some key metrics displayed on the Grafana Overview dashboard. -category: operations ---- - -# Key Metrics - -If your TiKV cluster is deployed using Ansible or Docker Compose, the monitoring system is deployed at the same time. For more details, see [Overview of the TiKV Monitoring Framework](monitor-overview.md). - -The Grafana dashboard is divided into a series of sub-dashboards which include Overview, PD, TiKV, and so on. You can use various metrics to help you diagnose the cluster. - -For routine operations, you can get an overview of the component (PD, TiKV) status and the entire cluster from the Overview dashboard, where the key metrics are displayed. This document provides a detailed description of these key metrics. 
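When a panel on the Overview dashboard looks abnormal, it is often useful to query the underlying data directly from Prometheus before digging into individual components. The sketch below assumes Prometheus is reachable at `localhost:9090` (adjust the address to your monitoring deployment) and uses the generic `up` series, which reports whether each scrape target is alive:

```bash
# List the scrape targets (PD, TiKV, exporters) and whether they are currently up.
# A value of 1 means the target is being scraped successfully, 0 means it is not.
curl -s 'http://localhost:9090/api/v1/query?query=up' | python -m json.tool
```

Replace `up` with the metric behind the panel you are investigating to see its raw samples.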
- -## Key metrics description - -To understand the key metrics displayed on the Overview dashboard, check the following table: - -Service | Panel Name | Description | Normal Range ----- | ---------------- | ---------------------------------- | -------------- -Services Port Status | Services Online | the online nodes number of each service | -Services Port Status | Services Offline | the offline nodes number of each service | -PD | Storage Capacity | the total storage capacity of the TiKV cluster | -PD | Current Storage Size | the occupied storage capacity of the TiKV cluster | -PD | Number of Regions | the total number of Regions of the current cluster | -PD | Leader Balance Ratio | the leader ratio difference of the nodes with the biggest leader ratio and the smallest leader ratio | It is less than 5% for a balanced situation and becomes bigger when you restart a node. -PD | Region Balance Ratio | the region ratio difference of the nodes with the biggest Region ratio and the smallest Region ratio | It is less than 5% for a balanced situation and becomes bigger when you add or remove a node. -PD | Store Status -- Up Stores | the number of TiKV nodes that are up | -PD | Store Status -- Disconnect Stores | the number of TiKV nodes that encounter abnormal communication within a short time | -PD | Store Status -- LowSpace Stores | the number of TiKV nodes with an available space of less than 80% | -PD | Store Status -- Down Stores | the number of TiKV nodes that are down | The normal value is `0`. If the number is bigger than `0`, it means some node(s) are abnormal. -PD | Store Status -- Offline Stores | the number of TiKV nodes (still providing service) that are being made offline | -PD | Store Status -- Tombstone Stores | the number of TiKV nodes that are successfully offline | -PD | 99% completed_cmds_duration_seconds | the 99th percentile duration to complete a pd-server request | less than 5ms -PD | handle_requests_duration_seconds | the request duration of a PD request | -TiKV | leader | the number of leaders on each TiKV node | -TiKV | region | the number of Regions on each TiKV node | -TiKV | CPU | the CPU usage ratio on each TiKV node | -TiKV | Memory | the memory usage on each TiKV node | -TiKV | store size | the data amount on each TiKV node | -TiKV | cf size | the data amount on different CFs in the cluster | -TiKV | channel full | `No data points` is displayed in normal conditions. If a monitoring value displays, it means the corresponding TiKV node fails to handle the messages | -TiKV | server report failures | `No data points` is displayed in normal conditions. If `Unreachable` is displayed, it means TiKV encounters a communication issue. | -TiKV | scheduler pending commands | the number of commits on queue | Occasional value peaks are normal. -TiKV | coprocessor pending requests | the number of requests on queue | `0` or very small -TiKV | coprocessor executor count | the number of various query operations | -TiKV | coprocessor request duration | the time consumed by TiKV queries | -TiKV | raft store CPU | the CPU usage ratio of the raftstore thread | Currently, it is a single thread. A value of over 80% indicates that the CPU usage ratio is very high. 
-TiKV | Coprocessor CPU | the CPU usage ratio of the TiKV query thread, related to the application; complex queries consume a great deal of CPU | -System Info | Vcores | the number of CPU cores | -System Info | Memory | the total memory | -System Info | CPU Usage | the CPU usage ratio, 100% at a maximum | -System Info | Load [1m] | the overload within 1 minute | -System Info | Memory Available | the size of the available memory | -System Info | Network Traffic | the statistics of the network traffic | -System Info | TCP Retrans | the statistics about network monitoring and TCP | -System Info | IO Util | the disk usage ratio, 100% at a maximum; generally you need to consider adding a new node when the usage ratio is up to 80% ~ 90% | \ No newline at end of file diff --git a/docs/V2.1/op-guide/label-config.md b/docs/V2.1/op-guide/label-config.md deleted file mode 100644 index e142d173e4f..00000000000 --- a/docs/V2.1/op-guide/label-config.md +++ /dev/null @@ -1,96 +0,0 @@ ---- -title: Label Configuration -summary: Learn how to configure labels. -category: operations ---- - -# Label Configuration - -TiKV uses labels to label its location information and PD schedulers according to the topology of the cluster, to maximize TiKV's capability of disaster recovery. This document describes how to configure labels. - -## TiKV reports the topological information - -In order for PD to get the topology of the cluster, TiKV reports the topological information to PD according to the startup parameter or configuration of TiKV. Assume that the topology has three structures: zone > rack > host, use labels to specify the following information for each TiKV: - -- Startup parameter: - - ``` - tikv-server --labels zone=,rack=,host= - ``` - -- Configuration: - - ``` - [server] - labels = "zone=,rack=,host=" - ``` -## PD understands the TiKV topology - -After getting the topology of the TiKV cluster, PD also needs to know the hierarchical relationship of the topology. You can configure it through the PD configuration or `pd-ctl`: - -- PD configuration: - - ``` - [replication] - max-replicas = 3 - location-labels = ["zone", "rack", "host"] - ``` - -- PD controller: - - ``` - pd-ctl >> config set location-labels zone,rack,host - ``` - -To make PD understand that the labels represents the TiKV topology, keep `location-labels` corresponding to the TiKV `labels` name. See the following example. - -### Example - -PD makes optimal scheduling according to the topological information. You just need to care about what kind of topology can achieve the desired effect. - -If you use 3 replicas and hope that the TiDB cluster is always highly available even when a data zone goes down, you need at least 4 data zones. - -Assume that you have 4 data zones, each zone has 2 racks, and each rack has 2 hosts. 
You can start 2 TiKV instances on each host as follows: - -Startup TiKV: - -``` -# zone=z1 -tikv-server --labels zone=z1,rack=r1,host=h1 -tikv-server --labels zone=z1,rack=r1,host=h2 -tikv-server --labels zone=z1,rack=r2,host=h1 -tikv-server --labels zone=z1,rack=r2,host=h2 - -# zone=z2 -tikv-server --labels zone=z2,rack=r1,host=h1 -tikv-server --labels zone=z2,rack=r1,host=h2 -tikv-server --labels zone=z2,rack=r2,host=h1 -tikv-server --labels zone=z2,rack=r2,host=h2 - -# zone=z3 -tikv-server --labels zone=z3,rack=r1,host=h1 -tikv-server --labels zone=z3,rack=r1,host=h2 -tikv-server --labels zone=z3,rack=r2,host=h1 -tikv-server --labels zone=z3,rack=r2,host=h2 - -# zone=z4 -tikv-server --labels zone=z4,rack=r1,host=h1 -tikv-server --labels zone=z4,rack=r1,host=h2 -tikv-server --labels zone=z4,rack=r2,host=h1 -tikv-server --labels zone=z4,rack=r2,host=h2 -``` - -Configure PD: - -``` -use `pd-ctl` connect the PD: -# pd-ctl ->> config set location-labels zone,rack,host -``` - -Now the cluster can work well. 16 TiKV instances are distributed across 4 data zones, 8 racks and 16 machines. In this case, PD schedules different replicas of each datum to different data zones. - -- If one of the data zones goes down, the high availability of the TiDB cluster is not affected. -- If the data zone cannot recover within a period of time, PD will remove the replica from this data zone. - -PD maximizes the disaster recovery of the cluster according to the current topology. Therefore, if you want to reach a certain level of disaster recovery, deploy many machines in different sites according to the topology. The number of machines must be more than the number of `max-replicas`. \ No newline at end of file diff --git a/docs/V2.1/op-guide/monitor-a-tikv-cluster.md b/docs/V2.1/op-guide/monitor-a-tikv-cluster.md deleted file mode 100644 index 872eb36b0fe..00000000000 --- a/docs/V2.1/op-guide/monitor-a-tikv-cluster.md +++ /dev/null @@ -1,235 +0,0 @@ ---- -title: Monitor a TiKV Cluster -summary: Learn how to monitor the state of a TiKV cluster. -category: operations ---- - -# Monitor a TiKV Cluster - -Currently, you can use two types of interfaces to monitor the state of the TiKV cluster: - -- [The component state interface](#the-component-state-interface): use the HTTP interface to get the internal information of a component, which is called the component state interface. -- [The metrics interface](#the-metrics-interface): use the Prometheus interface to record the detailed information of various operations in the components, which is called the metrics interface. - -## The component state interface - -You can use this type of interface to monitor the basic information of components. This interface can get the details of the entire TiKV cluster and can act as the interface to monitor Keepalive. - -### The PD server - -The API address of the Placement Driver (PD) is `http://${host}:${port}/pd/api/v1/${api_name}` - -The default port number is 2379. - -For detailed information about various API names, see [PD API doc](https://cdn.rawgit.com/pingcap/docs/master/op-guide/pd-api-v1.html). - -You can use the interface to get the state of all the TiKV instances and the information about load balancing. It is the most important and frequently-used interface to get the state information of all the TiKV nodes. 
See the following example for the information about a 3-instance TiKV cluster deployed on a single machine: - -```bash -curl http://127.0.0.1:2379/pd/api/v1/stores -{ - "count": 3, - "stores": [ - { - "store": { - "id": 1, - "address": "127.0.0.1:20161", - "version": "2.1.0-rc.2", - "state_name": "Up" - }, - "status": { - "capacity": "937 GiB", - "available": "837 GiB", - "leader_weight": 1, - "region_count": 1, - "region_weight": 1, - "region_score": 1, - "region_size": 1, - "start_ts": "2018-09-29T00:05:47Z", - "last_heartbeat_ts": "2018-09-29T00:23:46.227350716Z", - "uptime": "17m59.227350716s" - } - }, - { - "store": { - "id": 2, - "address": "127.0.0.1:20162", - "version": "2.1.0-rc.2", - "state_name": "Up" - }, - "status": { - "capacity": "937 GiB", - "available": "837 GiB", - "leader_weight": 1, - "region_count": 1, - "region_weight": 1, - "region_score": 1, - "region_size": 1, - "start_ts": "2018-09-29T00:05:47Z", - "last_heartbeat_ts": "2018-09-29T00:23:45.65292648Z", - "uptime": "17m58.65292648s" - } - }, - { - "store": { - "id": 7, - "address": "127.0.0.1:20160", - "version": "2.1.0-rc.2", - "state_name": "Up" - }, - "status": { - "capacity": "937 GiB", - "available": "837 GiB", - "leader_count": 1, - "leader_weight": 1, - "leader_score": 1, - "leader_size": 1, - "region_count": 1, - "region_weight": 1, - "region_score": 1, - "region_size": 1, - "start_ts": "2018-09-29T00:05:47Z", - "last_heartbeat_ts": "2018-09-29T00:23:44.853636067Z", - "uptime": "17m57.853636067s" - } - } - ] -} -``` - -## The metrics interface - -You can use this type of interface to monitor the state and performance of the entire cluster. The metrics data is displayed in Prometheus and Grafana. See [Use Prometheus and Grafana](#use-prometheus-and-grafana) for how to set up the monitoring system. - -You can get the following metrics for each component: - -### The PD server - -- the total number of times that the command executes -- the total number of times that a certain command fails -- the duration that a command succeeds -- the duration that a command fails -- the duration that a command finishes and returns result - -### The TiKV server - -- Garbage Collection (GC) monitoring -- the total number of times that the TiKV command executes -- the duration that Scheduler executes commands -- the total number of times of the Raft propose command -- the duration that Raft executes commands -- the total number of times that Raft commands fail -- the total number of times that Raft processes the ready state - -## Use Prometheus and Grafana - -This section introduces the deployment architecture of Prometheus and Grafana in TiKV, and how to set up and configure the monitoring system. - -### The deployment architecture - -See the following diagram for the deployment architecture: - -![deployment architecture of Prometheus and Grafana in TiKV](../../images/monitor-architecture.png) - -> **Note:** You must add the Prometheus Pushgateway addresses to the startup parameters of the PD and TiKV components. - -### Set up the monitoring system - -See the following links for your reference: - -- Prometheus Pushgateway: [https://github.com/prometheus/pushgateway](https://github.com/prometheus/pushgateway) - -- Prometheus Server: [https://github.com/prometheus/prometheus#install](https://github.com/prometheus/prometheus#install) - -- Grafana: [http://docs.grafana.org](http://docs.grafana.org/) - -## Manual configuration - -This section describes how to manually configure PD and TiKV, PushServer, Prometheus, and Grafana. 
- -> **Note:** If your TiKV cluster is deployed using Ansible or Docker Compose, the configuration is automatically done, and generally, you do not need to configure it manually again. If your TiKV cluster is deployed using Docker, you can follow the configuration steps below. - -### Configure PD and TiKV - -+ PD: update the `toml` configuration file with the Pushgateway address and the the push frequency: - - ```toml - [metric] - # prometheus client push interval, set "0s" to disable prometheus. - interval = "15s" - # prometheus pushgateway address, leaves it empty will disable prometheus. - address = "host:port" - ``` - -+ TiKV: update the `toml` configuration file with the Pushgateway address and the the push frequency. Set the `job` field to `"tikv"`. - - ```toml - [metric] - # the Prometheus client push interval. Setting the value to 0s stops Prometheus client from pushing. - interval = "15s" - # the Prometheus pushgateway address. Leaving it empty stops Prometheus client from pushing. - address = "host:port" - # the Prometheus client push job name. Note: A node id will automatically append, e.g., "tikv_1". - job = "tikv" - ``` - -### Configure PushServer - -Generally, you can use the default port `9091` and do not need to configure PushServer. - -### Configure Prometheus - -Add the Pushgateway address to the `yaml` configuration file: - -```yaml - scrape_configs: -# The job name is added as a label `job=` to any timeseries scraped from this config. -- job_name: 'TiKV' - - # Override the global default and scrape targets from this job every 5 seconds. - scrape_interval: 5s - - honor_labels: true - - static_configs: - - targets: ['host:port'] # use the Pushgateway address -labels: - group: 'production' - ``` - -### Configure Grafana - -#### Create a Prometheus data source - -1. Log in to the Grafana Web interface. - - - Default address: [http://localhost:3000](http://localhost:3000) - - Default account name: admin - - Default password: admin - -2. Click the Grafana logo to open the sidebar menu. - -3. Click "Data Sources" in the sidebar. - -4. Click "Add data source". - -5. Specify the data source information: - - - Specify the name for the data source. - - For Type, select Prometheus. - - For Url, specify the Prometheus address. - - Specify other fields as needed. - -6. Click "Add" to save the new data source. - -#### Create a Grafana dashboard - -1. Click the Grafana logo to open the sidebar menu. - -2. On the sidebar menu, click "Dashboards" -> "Import" to open the "Import Dashboard" window. - -3. Click "Upload .json File" to upload a JSON file (Download [TiDB Grafana Config](https://grafana.com/tidb)). - -4. Click "Save & Open". - -5. A Prometheus dashboard is created. \ No newline at end of file diff --git a/docs/V2.1/op-guide/monitor-overview.md b/docs/V2.1/op-guide/monitor-overview.md deleted file mode 100644 index d8d5139c051..00000000000 --- a/docs/V2.1/op-guide/monitor-overview.md +++ /dev/null @@ -1,26 +0,0 @@ ---- -title: Overview of the TiKV Monitoring Framework -summary: Use Prometheus and Grafana to build the TiKV monitoring framework. -category: operations ---- - -# Overview of the TiKV Monitoring Framework - -The TiKV monitoring framework adopts two open-source projects: [Prometheus](https://github.com/prometheus/prometheus) and [Grafana](https://github.com/grafana/grafana). TiKV uses Prometheus to store the monitoring and performance metrics, and uses Grafana to visualize these metrics. 
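Grafana can only display what it can read from Prometheus, so the two must be connected through a data source. Besides the manual steps described in the monitoring guide, the data source can also be created through Grafana's HTTP API. The following is a sketch only: it assumes Grafana's default address and credentials (`admin:admin` on port 3000) and a Prometheus server at `http://127.0.0.1:9090`, and the data source name is just a placeholder:

```bash
# Create a Prometheus data source in Grafana through its HTTP API.
# The name, credentials, and addresses below are examples; adjust them to your deployment.
curl -s -u admin:admin \
  -H 'Content-Type: application/json' \
  -X POST 'http://localhost:3000/api/datasources' \
  -d '{
        "name": "tikv-prometheus",
        "type": "prometheus",
        "url": "http://127.0.0.1:9090",
        "access": "proxy",
        "isDefault": true
      }'
```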
- -## About Prometheus in TiKV - -As a time series database, Prometheus has a multi-dimensional data model and flexible query language. Prometheus consists of multiple components. Currently, TiKV uses the following components: - -- Prometheus Server: to scrape and store time series data -- Client libraries: to customize necessary metrics in the application -- Pushgateway: to receive the data from Client Push for the Prometheus main server -- AlertManager: for the alerting mechanism - -The diagram is as follows: - -![Prometheus in TiKV](../../images/prometheus-in-tikv.png) - -## About Grafana in TiKV - -[Grafana](https://github.com/grafana/grafana) is an open-source project for analyzing and visualizing metrics. TiKV uses Grafana to display the performance metrics. \ No newline at end of file diff --git a/docs/V2.1/op-guide/namespace-config.md b/docs/V2.1/op-guide/namespace-config.md deleted file mode 100644 index 48b15ac0baa..00000000000 --- a/docs/V2.1/op-guide/namespace-config.md +++ /dev/null @@ -1,125 +0,0 @@ ---- -title: Namespace Configuration -summary: Learn how to configure namespace in TiKV. -category: operations ---- - -# Namespace Configuration - -Namespace is a mechanism used to meet the requirements of resource isolation. In this mechanism, TiKV supports dividing all the TiKV nodes in the cluster among multiple separate namespaces and classifying Regions into the corresponding namespace by using a custom namespace classifier. - -In this case, there is actually a constraint for the scheduling policy: the namespace that a Region belongs to should match the namespace of TiKV where each replica of this Region resides. PD continuously performs the constraint check during runtime. When it finds unmatched namespaces, it will schedule the Regions to make the replica distribution conform to the namespace configuration. - -In a typical TiDB cluster, the most common resource isolation requirement is resource isolation based on the SQL table schema -- for example, using non-overlapping hosts to carry data for different services. Therefore, PD provides `tableNamespaceClassifier` based on the SQL table schema by default. You can also adjust the PD server's `--namespace-classifier` parameter to use another custom classifier. - -## Configure namespace - -You can use `pd-ctl` to configure the namespace based on the table schema. All related operations are integrated in the `table_ns` subcommand. Here is an example. - -1. Create 2 namespaces: - - ```bash - ./bin/area-pd-ctl - » table_ns - { - "count": 0, - "namespaces": [] - } - - » table_ns create ns1 - » table_ns create ns2 - » table_ns - { - "count": 2, - "namespaces": [ - { - "ID": 30, - "Name": "ns1" - }, - { - "ID": 31, - "Name": "ns2" - } - ] - } - ``` - -Then two namespaces, `ns1` and `ns2`, are created. But they do not work because they are not bound with any TiKV nodes or tables. - -2. Divide some TiKV nodes to the 2 namespaces: - - ```bash - » table_ns set_store 1 ns1 - » table_ns set_store 2 ns1 - » table_ns set_store 3 ns1 - » table_ns set_store 4 ns2 - » table_ns set_store 5 ns2 - » table_ns set_store 6 ns2 - » table_ns - { - "count": 2, - "namespaces": [ - { - "ID": 30, - "Name": "ns1", - "store_ids": { - "1": true, - "2": true, - "3": true - } - }, - { - "ID": 31, - "Name": "ns2", - "store_ids": { - "4": true, - "5": true, - "6": true - } - } - ] - } - ``` - -3. 
Divide some tables to the corresponding namespace (the table ID information can be obtained through TiDB's API): - - ```bash - » table_ns add ns1 1001 - » table_ns add ns2 1002 - » table_ns - { - "count": 2, - "namespaces": [ - { - "ID": 30, - "Name": "ns1", - "table_ids": { - "1001": true - }, - "store_ids": { - "1": true, - "2": true, - "3": true - } - }, - { - "ID": 31, - "Name": "ns2", - "table_ids": { - "1002": true - }, - "store_ids": { - "4": true, - "5": true, - "6": true - } - } - ] - } - ``` - -The namespace configuration is finished. PD will schedule the replicas of table 1001 to TiKV nodes 1,2, and 3 and schedule the replicas of table 1002 to TiKV nodes 4, 5, and 6. - -In addition, PD supports some other `table_ns` subcommands, such as the `remove` and `rm_store` commands which remove the table and TiKV node from the specified namespace respectively. PD also supports setting different scheduling configurations within the namespace. For more details, see [PD Control User Guide](https://github.com/pingcap/docs/blob/master/tools/pd-control.md). - -When the namespace configuration is updated, the namespace constraint may be violated. It will take a while for PD to complete the scheduling process. You can view all Regions that violate the constraint using the `pd-ctl` command `region check incorrect-ns`. \ No newline at end of file diff --git a/docs/V2.1/op-guide/pd-scheduler-config.md b/docs/V2.1/op-guide/pd-scheduler-config.md deleted file mode 100644 index d79d2cc0fc3..00000000000 --- a/docs/V2.1/op-guide/pd-scheduler-config.md +++ /dev/null @@ -1,100 +0,0 @@ ---- -title: PD Scheduler Configuration -summary: Learn how to configure PD Scheduler. -category: operations ---- - -# PD Scheduler Configuration - -PD Scheduler is responsible for scheduling the storage and computing resources. PD has many kinds of schedulers to meet the requirements in different scenarios. PD Scheduler is one of the most important component in PD. - -The basic workflow of PD Scheduler is as follows. First, the scheduler is triggered according to `minAdjacentSchedulerInterval` defined in `ScheduleController`. Then it tries to select the source store and the target store, create the corresponding operators and send a message to TiKV to do some operations. - -## Usage description - -This section describes the usage of PD Scheduler parameters. - -### `max-merge-region-keys && max-merge-region-size` - -If the Region size is smaller than `max-merge-region-size` and the number of keys in the Region is smaller than `max-merge-region-keys` at the same time, the Region will try to merge with adjacent Regions. The default value of both the two parameters is 0. Currently, `merge` is not enabled by default. - -### `split-merge-interval` - -`split-merge-interval` is the minimum interval time to allow merging after split. The default value is "1h". - -### `max-snapshot-count` - -If the snapshot count of one store is larger than the value of `max-snapshot-count`, it will never be used as a source or target store. The default value is 3. - -### `max-pending-peer-count` - -If the pending peer count of one store is larger than the value of `max-pending-peer-count`, it will never be used as a source or target store. The default value is 16. - -### `max-store-down-time` - -`max-store-down-time` is the maximum duration after which a store is considered to be down if it has not reported heartbeats. The default value is “30m”. 
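Most of the parameters in this document can be inspected and adjusted at runtime through `pd-ctl`, without restarting PD. The commands below are a sketch only: they assume PD is reachable at `http://host1:2379`, the values are illustrative rather than recommendations, and the exact invocation may differ between pd-ctl versions (the interactive mode shown elsewhere in these guides works as well):

```bash
# Print the current scheduling configuration.
./pd-ctl -u http://host1:2379 -d config show

# Adjust two of the limits described in this document at runtime.
# The values here are examples, not tuning advice.
./pd-ctl -u http://host1:2379 -d config set max-snapshot-count 3
./pd-ctl -u http://host1:2379 -d config set max-store-down-time 30m
```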
- -### `leader-schedule-limit` - -`leader-schedule-limit` is the maximum number of coexistent leaders that are under scheduling. The default value is 4. - -### `region-schedule-limit` - -`region-schedule-limit` is the maximum number of coexistent Regions that are under scheduling. The default value is 4. - -### `replica-schedule-limit` - -`replica-schedule-limit` is the maximum number of coexistent replicas that are under scheduling. The default value is 8. - -### `merge-schedule-limit` - -`merge-schedule-limit` is the maximum number of coexistent merges that are under scheduling. The default value is 8. - -### `tolerant-size-ratio` - -`tolerant-size-ratio` is the ratio of buffer size for the balance scheduler. The default value is 5.0. - -### `low-space-ratio` - -`low-space-ratio` is the lowest usage ratio of a storage which can be regarded as low space. When a storage is in low space, the score turns to be high and varies inversely with the available size. - -### `high-space-ratio` - -`high-space-ratio` is the highest usage ratio of storage which can be regarded as high space. High space means there is a lot of available space of the storage, and the score varies directly with the used size. - -### `disable-raft-learner` - -`disable-raft-learner` is the option to disable `AddNode` and use `AddLearnerNode` instead. - -### `disable-remove-down-replica` - -`disable-remove-down-replica` is the option to prevent replica checker from removing replicas whose status are down. - -### `disable-replace-offline-replica` - -`disable-replace-offline-replica` is the option to prevent the replica checker from replacing offline replicas. - -### `disable-make-up-replica` - -`disable-make-up-replica` is the option to prevent the replica checker from making up replicas when the count of replicas is less than expected. - -### `disable-remove-extra-replica` - -`disable-remove-extra-replica` is the option to prevent the replica checker from removing extra replicas. - -### `disable-location-replacement` - -`disable-location-replacement` is the option to prevent the replica checker from moving the replica to a better location. - -## Customization - -The default schedulers include `balance-leader`, `balance-region` and `hot-region`. In addition, you can also customize the schedulers. For each scheduler, the configuration has three variables: `type`, `args` and `disable`. - -Here is an example to enable the `evict-leader` scheduler in the `config.toml` file: - -``` -[[schedule.schedulers]] -type = "evict-leader" -args = ["1"] -disable = false -``` \ No newline at end of file diff --git a/docs/V2.1/op-guide/recommendation.md b/docs/V2.1/op-guide/recommendation.md deleted file mode 100644 index f16566a13ce..00000000000 --- a/docs/V2.1/op-guide/recommendation.md +++ /dev/null @@ -1,79 +0,0 @@ ---- -title: Software and Hardware Requirements -summary: Learn the software and hardware requirements for deploying and running TiKV. -category: operations ---- - -# Software and Hardware Requirements - -As an open source distributed Key-Value database with high performance, TiKV can be deployed in the Intel architecture server and major virtualization environments and runs well. TiKV supports most of the major hardware networks and Linux operating systems. - -TiKV must work together with [Placement Driver](https://github.com/pingcap/pd/) (PD). PD is the cluster manager of TiKV, which periodically checks replication constraints to balance load and data automatically. 
- -## Linux OS version requirements - -| Linux OS Platform | Version | -| :-----------------------:| :----------: | -| Red Hat Enterprise Linux | 7.3 or later | -| CentOS | 7.3 or later | -| Oracle Enterprise Linux | 7.3 or later | -| Ubuntu LTS | 16.04 or later | - -> **Note:** -> -> - For Oracle Enterprise Linux, TiKV supports the Red Hat Compatible Kernel (RHCK) and does not support the Unbreakable Enterprise Kernel provided by Oracle Enterprise Linux. -> - A large number of TiKV tests have been run on the CentOS 7.3 system, and in our community there are a lot of best practices in which TiKV is deployed on the Linux operating system. Therefore, it is recommended to deploy TiKV on CentOS 7.3 or later. -> - The support for the Linux operating systems above includes the deployment and operation in physical servers as well as in major virtualized environments like VMware, KVM and XEN. - -## Server requirements - -You can deploy and run TiKV on the 64-bit generic hardware server platform in the Intel x86-64 architecture. The requirements and recommendations about server hardware configuration for development, test and production environments are as follows: - -### Development and test environments - -| Component | CPU | Memory | Local Storage | Network | Instance Number (Minimum Requirement) | -| :------: | :-----: | :-----: | :----------: | :------: | :----------------: | -| PD | 4 core+ | 8 GB+ | SAS, 200 GB+ | Gigabit network card | 1 | -| TiKV | 8 core+ | 32 GB+ | SAS, 200 GB+ | Gigabit network card | 3 | -| | | | | Total Server Number | 4 | - -> **Note**: -> -> - Do not deploy PD and TiKV on the same server. -> - For performance-related test, do not use low-performance storage and network hardware configuration, in order to guarantee the correctness of the test result. - -### Production environment - -| Component | CPU | Memory | Hard Disk Type | Network | Instance Number (Minimum Requirement) | -| :-----: | :------: | :------: | :------: | :------: | :-----: | -| PD | 4 core+ | 8 GB+ | SSD | 10 Gigabit network card (2 preferred) | 3 | -| TiKV | 16 core+ | 32 GB+ | SSD | 10 Gigabit network card (2 preferred) | 3 | -| Monitor | 8 core+ | 16 GB+ | SAS | Gigabit network card | 1 | -| | | | | Total Server Number | 7 | - -> **Note**: -> -> - It is strongly recommended to use higher configuration in the production environment. -> - It is recommended to keep the size of TiKV hard disk within 2 TB if you are using PCI-E SSD disks or within 1.5 TB if you are using regular SSD disks. - -## Network requirements - -TiKV requires the following network port configuration to run. Based on the TiKV deployment in actual environments, the administrator can open relevant ports in the network side and host side. 
- -| Component | Default Port | Description | -| :--:| :--: | :-- | -| TiKV | 20160 | the TiKV communication port | -| PD | 2380 | the inter-node communication port within the PD cluster | -| Pump | 8250 | the Pump communication port | -| Drainer | 8249 | the Drainer communication port | -| Prometheus | 9090 | the communication port for the Prometheus service| -| Pushgateway | 9091 | the aggregation and report port for TiKV, and PD monitor | -| Node_exporter | 9100 | the communication port to report the system information of each TiKV cluster node | -| Blackbox_exporter | 9115 | the `Blackbox_exporter` communication port, used to monitor the ports in the TiKV cluster | -| Grafana | 3000 | the port for the external Web monitoring service and client (Browser) access| -| Grafana | 8686 | the `grafana_collector` communication port, used to export the Dashboard as the PDF format | -| Kafka_exporter | 9308 | the `Kafka_exporter` communication port, used to monitor the binlog Kafka cluster | - -## Web browser requirements - -Based on the Prometheus and Grafana platform, TiKV provides a visual data monitoring solution to monitor the TiKV cluster status. To access the Grafana monitor interface, it is recommended to use a higher version of Microsoft IE, Google Chrome or Mozilla Firefox. diff --git a/docs/V2.1/op-guide/rocksdb-option-config.md b/docs/V2.1/op-guide/rocksdb-option-config.md deleted file mode 100644 index 6cbf0b9c20c..00000000000 --- a/docs/V2.1/op-guide/rocksdb-option-config.md +++ /dev/null @@ -1,385 +0,0 @@ ---- -title: RocksDB Option Configuration -summary: Learn how to configure RocksDB options. -category: operations ---- - -# RocksDB Option Configuration - -TiKV uses RocksDB as its underlying storage engine for storing both Raft logs and KV (key-value) pairs. [RocksDB](https://github.com/facebook/rocksdb/wiki) is a highly customizable persistent key-value store that can be tuned to run on a variety of production environments, including pure memory, Flash, hard disks or HDFS. It supports various compression algorithms and good tools for production support and debugging. - -## Configuration - -TiKV creates two RocksDB instances called `rocksdb` and `raftdb` separately. - -- `rocksdb` has three column families: - - - `rocksdb.defaultcf` is used to store actual KV pairs of TiKV - - `rocksdb.writecf` is used to store the commit information in the MVCC model - - `rocksdb.lockcf` is used to store the lock information in the MVCC model - -- `raftdb` has only one column family called `raftdb.defaultcf`, which is used to store the Raft logs. - -Each RocksDB instance and column family is configurable. Below explains the details of DBOptions for tuning the RocksDB instance and CFOptions for tuning the column family. - -### DBOptions - -#### max-background-jobs - -- The maximum number of concurrent background jobs (compactions and flushes) - -#### max-background-flushes - -- The maximum number of concurrent background memtable flush jobs - -#### max-sub-compactions - -- The maximum number of threads that will concurrently perform a compaction job by breaking the job into multiple smaller ones that run simultaneously - -#### max-open-files - -- The number of open files that can be used by RocksDB. You may need to increase this if your database has a large working set -- Value -1 means files opened are always kept open. 
You can estimate the number of files based on `target_file_size_base` and `target_file_size_multiplier` for level-based compaction
-- If `max-open-files` is -1, RocksDB prefetches index blocks and filter blocks into the block cache at startup, so if your database has a large working set, it can take several minutes to open RocksDB
-
-#### max-manifest-file-size
-
-- The maximum size of RocksDB's MANIFEST file. For details, see [MANIFEST](https://github.com/facebook/rocksdb/wiki/MANIFEST)
-
-#### create-if-missing
-
-- If it is true, the database is created when it is missing
-
-#### wal-recovery-mode
-
-RocksDB WAL (write-ahead log) recovery mode:
-
-- `0`: TolerateCorruptedTailRecords, tolerates incomplete records in trailing data on all logs
-- `1`: AbsoluteConsistency, tolerates no corruption in the WAL (all I/O errors are treated as corruption)
-- `2`: PointInTimeRecovery, recovers to point-in-time consistency
-- `3`: SkipAnyCorruptedRecords, skips corrupted records; used for recovery after a disaster
-
-#### wal-dir
-
-- RocksDB write-ahead logs directory path. This specifies the absolute directory path for write-ahead logs
-- If it is empty, the log files will be in the same directory as data
-- When you set the path to the RocksDB directory in memory, such as `/dev/shm`, you may want to set `wal-dir` to a directory on persistent storage. For details, see [RocksDB documentation](https://github.com/facebook/rocksdb/wiki/How-to-persist-in-memory-RocksDB-database)
-
-#### wal-ttl-seconds
-
-See [wal-size-limit](#wal-size-limit)
-
-#### wal-size-limit
-
-`wal-ttl-seconds` and `wal-size-limit` affect how archived write-ahead logs are deleted:
-
-- If both are set to 0, logs are deleted immediately and do not get into the archive
-- If `wal-ttl-seconds` is 0 and `wal-size-limit` is not 0, WAL files are checked every 10 minutes; if their total size is greater than `wal-size-limit`, the earliest WAL files are deleted until the total size is within the limit. All empty files are deleted
-- If `wal-ttl-seconds` is not 0 and `wal-size-limit` is 0, WAL files are checked every `wal-ttl-seconds` / 2, and those older than `wal-ttl-seconds` are deleted
-- If both are not 0, WAL files are checked every 10 minutes and both the TTL and size checks are performed, with the TTL check first
-- When you set the path to the RocksDB directory in memory, such as `/dev/shm`, you may want to set `wal-ttl-seconds` to a value greater than 0 (for example, 86400) and back up your RocksDB data on a regular basis. For details, see [RocksDB documentation](https://github.com/facebook/rocksdb/wiki/How-to-persist-in-memory-RocksDB-database)
-
-#### wal-bytes-per-sync
-
-- Allows the OS to incrementally synchronize the WAL to the disk while the log is being written
-
-#### max-total-wal-size
-
-- Once the total size of write-ahead logs exceeds this size, RocksDB starts forcing the flush of column families whose memtables are backed by the oldest live WAL file
-- If it is set to 0, the WAL size limit is dynamically set to [sum of all write_buffer_size * max_write_buffer_number] * 4
-
-#### enable-statistics
-
-- RocksDB statistics provide cumulative statistics over time. 
Turning statistics on will introduce about 5%-10% overhead for RocksDB, but it is worthwhile to know the internal status of RocksDB - -#### stats-dump-period - -- Dumps statistics periodically in information logs - -#### compaction-readahead-size - -- According to [RocksDB FAQ](https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ): if you want to use RocksDB on multi disks or spinning disks, you should set this value to at least 2MB - -#### writable-file-max-buffer-size - -- The maximum buffer size that is used by `WritableFileWrite` - -#### use-direct-io-for-flush-and-compaction - -- Uses `O_DIRECT` for both reads and writes in background flush and compactions - -#### rate-bytes-per-sec - -- Limits the disk I/O of compaction and flush -- Compaction and flush can cause terrible spikes if they exceed a certain threshold. It is recommended to set this to 50% ~ 80% of the disk throughput for a more stable result. But for heavy write workload, limiting compaction and flush speed can cause write stalls too - -#### enable-pipelined-write - -- Enables/Disables the pipelined write. For details, see [Pipelined Write](https://github.com/facebook/rocksdb/wiki/Pipelined-Write) - -#### bytes-per-sync - -- Allows OS to incrementally synchronize files to the disk while the files are being written asynchronously in the background - -#### info-log-max-size - -- Specifies the maximum size of the RocksDB log file -- If the log file is larger than `max_log_file_size`, a new log file will be created -- If max_log_file_size == 0, all logs will be written to one log file - -#### info-log-roll-time - -- Time for the RocksDB log file to roll (in seconds) -- If it is specified with non-zero value, the log file will be rolled when its active time is longer than `log_file_time_to_roll` - -#### info-log-keep-log-file-num - -- The maximum number of RocksDB log files to be kept - -#### info-log-dir - -- Specifies the RocksDB info log directory -- If it is empty, the log files will be in the same directory as data -- If it is non-empty, the log files will be in the specified directory, and the absolute path of RocksDB data directory will be used as the prefix of the log file name - -### CFOptions - -#### compression-per-level - -- Per level compression. The compression method (if any) is used to compress a block - - - no: kNoCompression - - snappy: kSnappyCompression - - zlib: kZlibCompression - - bzip2: kBZip2Compression - - lz4: kLZ4Compression - - lz4hc: kLZ4HCCompression - - zstd: kZSTD - -- For details, see [Compression of RocksDB](https://github.com/facebook/rocksdb/wiki/Compression) - -#### block-size - -- Approximate size of user data packed per block. The block size specified here corresponds to the uncompressed data - -#### bloom-filter-bits-per-key - -- If you're doing point lookups, you definitely want to turn bloom filters on. Bloom filter is used to avoid unnecessary disk read -- Default: 10, which yields ~1% false positive rate -- Larger values will reduce false positive rate, but will increase memory usage and space amplification - -#### block-based-bloom-filter - -- False: one `sst` file has a corresponding bloom filter -- True: every block has a corresponding bloom filter - -#### level0-file-num-compaction-trigger - -- The number of files to trigger level-0 compaction -- A value less than 0 means that level-0 compaction will not be triggered by the number of files - -#### level0-slowdown-writes-trigger - -- Soft limit on the number of level-0 files. 
The write performance is slowed down at this point - -#### level0-stop-writes-trigger - -- The maximum number of level-0 files. The write operation is stopped at this point - -#### write-buffer-size - -- The amount of data to build up in memory (backed up by an unsorted log on the disk) before it is converted to a sorted on-disk file - -#### max-write-buffer-number - -- The maximum number of write buffers that are built up in memory - -#### min-write-buffer-number-to-merge - -- The minimum number of write buffers that will be merged together before writing to the storage - -#### max-bytes-for-level-base - -- Controls the maximum total data size for the base level (level 1). - -#### target-file-size-base - -- Target file size for compaction - -#### max-compaction-bytes - -- The maximum bytes for `compaction.max_compaction_bytes` - -#### compaction-pri - -There are four different algorithms to pick files to compact: - -- `0`: ByCompensatedSize -- `1`: OldestLargestSeqFirst -- `2`: OldestSmallestSeqFirst -- `3`: MinOverlappingRatio - -#### block-cache-size - -- Caches uncompressed blocks -- Big block-cache can speed up the read performance. Generally, this should be set to 30%-50% of the system's total memory - -#### cache-index-and-filter-blocks - -- Indicates if index/filter blocks will be put to the block cache -- If it is not specified, each "table reader" object will pre-load the index/filter blocks during table initialization - -#### pin-l0-filter-and-index-blocks - -- Pins level0 filter and index blocks in the cache - -#### read-amp-bytes-per-bit - -Enables read amplification statistics -- value => memory usage (percentage of loaded blocks memory) -- 0 => disable -- 1 => 12.50 % -- 2 => 06.25 % -- 4 => 03.12 % -- 8 => 01.56 % -- 16 => 00.78 % - -#### dynamic-level-bytes - -- Picks the target size of each level dynamically -- This feature can reduce space amplification. It is highly recommended to setit to true. For details, see [Dynamic Level Size for Level-Based Compaction]( https://rocksdb.org/blog/2015/07/23/dynamic-level.html) - -## Template - -This template shows the default RocksDB configuration for TiKV: - -``` -[rocksdb] -max-background-jobs = 8 -max-sub-compactions = 1 -max-open-files = 40960 -max-manifest-file-size = "20MB" -create-if-missing = true -wal-recovery-mode = 2 -wal-dir = "/tmp/tikv/store" -wal-ttl-seconds = 0 -wal-size-limit = 0 -max-total-wal-size = "4GB" -enable-statistics = true -stats-dump-period = "10m" -compaction-readahead-size = 0 -writable-file-max-buffer-size = "1MB" -use-direct-io-for-flush-and-compaction = false -rate-bytes-per-sec = 0 -enable-pipelined-write = true -bytes-per-sync = "0MB" -wal-bytes-per-sync = "0KB" -info-log-max-size = "1GB" -info-log-roll-time = "0s" -info-log-keep-log-file-num = 10 -info-log-dir = "" - -# Column Family default used to store actual data of the database. 
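-# The values below follow the CFOptions described above: levels 0-1 are stored
-# uncompressed, levels 2-4 use lz4, and levels 5-6 use zstd. block-cache-size
-# for this column family is generally set to 30%-50% of the system's total memory.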
-[rocksdb.defaultcf] -compression-per-level = ["no", "no", "lz4", "lz4", "lz4", "zstd", "zstd"] -block-size = "64KB" -bloom-filter-bits-per-key = 10 -block-based-bloom-filter = false -level0-file-num-compaction-trigger = 4 -level0-slowdown-writes-trigger = 20 -level0-stop-writes-trigger = 36 -write-buffer-size = "128MB" -max-write-buffer-number = 5 -min-write-buffer-number-to-merge = 1 -max-bytes-for-level-base = "512MB" -target-file-size-base = "8MB" -max-compaction-bytes = "2GB" -compaction-pri = 3 -block-cache-size = "1GB" -cache-index-and-filter-blocks = true -pin-l0-filter-and-index-blocks = true -read-amp-bytes-per-bit = 0 -dynamic-level-bytes = true - -# Options for Column Family write -# Column Family write used to store commit information in MVCC model -[rocksdb.writecf] -compression-per-level = ["no", "no", "lz4", "lz4", "lz4", "zstd", "zstd"] -block-size = "64KB" -write-buffer-size = "128MB" -max-write-buffer-number = 5 -min-write-buffer-number-to-merge = 1 -max-bytes-for-level-base = "512MB" -target-file-size-base = "8MB" -# In normal cases it should be tuned to 10%-30% of the system's total memory. -block-cache-size = "256MB" -level0-file-num-compaction-trigger = 4 -level0-slowdown-writes-trigger = 20 -level0-stop-writes-trigger = 36 -cache-index-and-filter-blocks = true -pin-l0-filter-and-index-blocks = true -compaction-pri = 3 -read-amp-bytes-per-bit = 0 -dynamic-level-bytes = true - -[rocksdb.lockcf] -compression-per-level = ["no", "no", "no", "no", "no", "no", "no"] -block-size = "16KB" -write-buffer-size = "128MB" -max-write-buffer-number = 5 -min-write-buffer-number-to-merge = 1 -max-bytes-for-level-base = "128MB" -target-file-size-base = "8MB" -block-cache-size = "256MB" -level0-file-num-compaction-trigger = 1 -level0-slowdown-writes-trigger = 20 -level0-stop-writes-trigger = 36 -cache-index-and-filter-blocks = true -pin-l0-filter-and-index-blocks = true -compaction-pri = 0 -read-amp-bytes-per-bit = 0 -dynamic-level-bytes = true - -[raftdb] -max-sub-compactions = 1 -max-open-files = 40960 -max-manifest-file-size = "20MB" -create-if-missing = true -enable-statistics = true -stats-dump-period = "10m" -compaction-readahead-size = 0 -writable-file-max-buffer-size = "1MB" -use-direct-io-for-flush-and-compaction = false -enable-pipelined-write = true -allow-concurrent-memtable-write = false -bytes-per-sync = "0MB" -wal-bytes-per-sync = "0KB" -info-log-max-size = "1GB" -info-log-roll-time = "0s" -info-log-keep-log-file-num = 10 -info-log-dir = "" - -[raftdb.defaultcf] -compression-per-level = ["no", "no", "lz4", "lz4", "lz4", "zstd", "zstd"] -block-size = "64KB" -write-buffer-size = "128MB" -max-write-buffer-number = 5 -min-write-buffer-number-to-merge = 1 -max-bytes-for-level-base = "512MB" -target-file-size-base = "8MB" -# should tune to 256MB~2GB. -block-cache-size = "256MB" -level0-file-num-compaction-trigger = 4 -level0-slowdown-writes-trigger = 20 -level0-stop-writes-trigger = 36 -cache-index-and-filter-blocks = true -pin-l0-filter-and-index-blocks = true -compaction-pri = 0 -read-amp-bytes-per-bit = 0 -dynamic-level-bytes = true -``` \ No newline at end of file diff --git a/docs/V2.1/op-guide/security-config.md b/docs/V2.1/op-guide/security-config.md deleted file mode 100644 index 1d28bffedf9..00000000000 --- a/docs/V2.1/op-guide/security-config.md +++ /dev/null @@ -1,21 +0,0 @@ ---- -title: TiKV Security Configuration -summary: Learn about the security configuration in TiKV. 
-category: operations ---- - -# TiKV Security Configuration - -TiKV has SSL/TLS integration to encrypt the data exchanged between nodes. This document describes the security configuration in the TiKV cluster. - -## ca-path = "/path/to/ca.pem" - -The path to the file that contains the PEM encoding of the server’s CA certificates. - -## cert-path = "/path/to/cert.pem" - -The path to the file that contains the PEM encoding of the server’s certificate chain. - -## key-path = "/path/to/key.pem" - -The path to the file that contains the PEM encoding of the server’s private key. \ No newline at end of file diff --git a/docs/V2.1/op-guide/storage-config.md b/docs/V2.1/op-guide/storage-config.md deleted file mode 100644 index 9ab99a8fe46..00000000000 --- a/docs/V2.1/op-guide/storage-config.md +++ /dev/null @@ -1,108 +0,0 @@ ---- -title: TiKV Storage Configuration -summary: Learn how to configure TiKV Storage. -category: operations ---- - -# TiKV Storage Configuration - -In TiKV, Storage is the component responsible for handling read and write requests. Note that if you are using TiKV with TiDB, most read requests are handled by the Coprocessor component instead of Storage. - -## Configuration - -There are two sections related to Storage: `[readpool.storage]` and `[storage]`. - -### `[readpool.storage]` - -This configuration section mainly affects storage read operations. Most read requests from TiDB are not controlled by this configuration section. For configuring the read requests from TiDB, see [Coprocessor configurations](coprocessor-config.md). - -There are 3 thread pools for handling read operations, namely read-high, read-normal and read-low, which process high-priority, normal-priority and low-priority read requests respectively. The priority can be specified by corresponding fields in the gRPC request. - -#### `high-concurrency` - -- Specifies the thread pool size for handling high priority requests -- Default value: 4. It means at most 4 CPU cores are used -- Minimum value: 1 -- It must be larger than zero but should not exceed the number of CPU cores of the host machine -- If you are running multiple TiKV instances on the same machine, make sure that the sum of this configuration item does not exceed number of CPU cores. For example, assuming that you have a 48 core server running 3 TiKVs, then the `high-concurrency` value for each instance should be less than 16 -- Do not set this configuration item to a too small value, otherwise your read request QPS is limited. On the other hand, larger value is not always the most optimal choice because there could be larger resource contention - - -#### `normal-concurrency` - -- Specifies the thread pool size for handling normal priority requests -- Default value: 4 -- Minimum value: 1 - -#### `low-concurrency` - -- Specifies the thread pool size for handling low priority requests -- Default value: 4 -- Minimum value: 1 -- Generally, you don’t need to ensure that the sum of high + normal + low < number of CPU cores, because a single request is handled by only one of them - -#### `max-tasks-per-worker-high` - -- Specifies the max number of running operations for each thread in the read-high thread pool, which handles high priority read requests. 
Because a throttle of the thread-pool level instead of single thread level is performed, the max number of running operations for the read-high thread pool is limited to `max-tasks-per-worker-high * high-concurrency` -- Default value: 2000 -- Minimum value: 2000 -- If the number of running operations exceeds this configuration, new operations are simply rejected without being handled and it will contain an error header telling that TiKV is busy -- Generally, you don’t need to adjust this configuration unless you are following trustworthy advice - -#### `max-tasks-per-worker-normal` - -- Specifies the max running operations for each thread in the read-normal thread pool, which handles normal priority read requests. -- Default value: 2000 -- Minimum value: 2000 - -#### `max-tasks-per-worker-low` - -- Specifies the max running operations for each thread in the read-low thread pool, which handles low priority read requests -- Default value: 2000 -- Minimum value: 2000 - -#### `stack-size` - -- Sets the stack size for each thread in the three thread pools. For large requests, you need a large stack to handle -- Default value: 10MB -- Minimum value: 2MB - -### `[storage]` - -This configuration section mainly affects storage write operations, including where data is stored and the TiKV component Scheduler. Scheduler is the core component in Storage that coordinates and processes write requests. It contains a channel to coordinate requests and a thread pool to process requests. - -#### `data-dir` - -- Specifies the path to the data directory -- Default value: /tmp/tikv/store -- Make sure that the data directory is moved before changing this configuration - -#### `scheduler-notify-capacity` - -- Specifies the Scheduler channel size -- Default value: 10240 -- Do not set it too small, otherwise TiKV might crash -- Do not set it too large, because it might consume more memory -- Generally, you don’t need to adjust this configuration unless you are following trustworthy advice - -#### `scheduler-concurrency` - -- Specifies the number of slots of Scheduler’s latch, which controls concurrent write requests -- Default value: 2048000 -- You can set it to a larger value to reduce latch contention if there are a lot of write requests. But it will consume more memory - -#### `scheduler-worker-pool-size` - -- Specifies the Scheduler’s thread pool size. Write requests are finally handled by each worker thread of this thread pool -- Default value: 8 (>= 16 cores) or 4 (< 16 cores) -- Minimum value: 1 -- This configuration must be set larger than zero but should not exceed the number of CPU cores of the host machine -- On machines with more than 16 CPU cores, the default value of this configuration is 8, otherwise 4 -- If you have heavy write requests, you can set this configuration to a larger value. If you are running multiple TiKV instances on the same machine, make sure that the sum of this configuration item does not exceed the number of CPU cores -- You should not set this configuration item to a too small value, otherwise your write request QPS is limited. 
On the other hand, a larger value is not always the most optimal choice because there could be larger resource contention - -#### `scheduler-pending-write-threshold` - -- Specifies the maximum allowed byte size of pending writes -- Default value: 100MB -- If the size of pending write bytes exceeds this threshold, new requests are simply rejected with the “scheduler too busy” error and not be handled \ No newline at end of file diff --git a/docs/V2.1/overview.md b/docs/V2.1/overview.md deleted file mode 100644 index 42b31f4f37b..00000000000 --- a/docs/V2.1/overview.md +++ /dev/null @@ -1,60 +0,0 @@ ---- -title: Overview of TiKV -summary: Learn about the key features, architecture, and two types of APIs of TiKV. -category: overview ---- - -# Overview of TiKV - -TiKV (The pronunciation is: /'taɪkeɪvi:/ tai-K-V, etymology: titanium) is a distributed Key-Value database which is based on the design of Google Spanner and HBase, but it is much simpler without dependency on any distributed file system. - -As the storage layer of TiDB, TiKV can work separately and does not depend on the SQL layer of TiDB. To apply to different scenarios, TiKV provides [two types of APIs](#two-types-of-apis) for developers: the Raw Key-Value API and the Transactional Key-Value API. - -The key features of TiKV are as follows: - -- **Geo-Replication** - - TiKV uses [Raft](http://raft.github.io/) and the [Placement Driver](https://github.com/pingcap/pd/) to support Geo-Replication. - -- **Horizontal scalability** - - With Placement Driver and carefully designed Raft groups, TiKV excels in horizontal scalability and can easily scale to 100+ TBs of data. - -- **Consistent distributed transactions** - - Similar to Google's Spanner, TiKV supports externally-consistent distributed transactions. - -- **Coprocessor support** - - Similar to HBase, TiKV implements a Coprocessor framework to support distributed computing. - -- **Cooperates with [TiDB](https://github.com/pingcap/tidb)** - - Thanks to the internal optimization, TiKV and TiDB can work together to be a compelling database solution with high horizontal scalability, externally-consistent transactions, and support for RDMBS and NoSQL design patterns. - -## Architecture - -The TiKV server software stack is as follows: - -![The TiKV software stack](../images/tikv_stack.png) - -- **Placement Driver:** Placement Driver (PD) is the cluster manager of TiKV. PD periodically checks replication constraints to balance load and data automatically. -- **Store:** There is a RocksDB within each Store and it stores data into local disk. -- **Region:** Region is the basic unit of Key-Value data movement. Each Region is replicated to multiple Nodes. These multiple replicas form a Raft group. -- **Node:** A physical node in the cluster. Within each node, there are one or more Stores. Within each Store, there are many Regions. - -When a node starts, the metadata of the Node, Store and Region are recorded into PD. The status of each Region and Store is reported to PD regularly. - -## Two types of APIs - -TiKV provides two types of APIs for developers: - -- [The Raw Key-Value API](clients/go-client-api.md#try-the-raw-key-value-api) - - If your application scenario does not need distributed transactions or MVCC (Multi-Version Concurrency Control) and only need to guarantee the atomicity towards one key, you can use the Raw Key-Value API. 
- -- [The Transactional Key-Value API](clients/go-client-api.md#try-the-transactional-key-value-api) - - If your application scenario requires distributed ACID transactions and the atomicity of multiple keys within a transaction, you can use the Transactional Key-Value API. - -Compared to the Transactional Key-Value API, the Raw Key-Value API is more performant with lower latency and easier to use. \ No newline at end of file diff --git a/docs/V2.1/tools/pd-control.md b/docs/V2.1/tools/pd-control.md deleted file mode 100644 index 90fb38d03d1..00000000000 --- a/docs/V2.1/tools/pd-control.md +++ /dev/null @@ -1,808 +0,0 @@ ---- -title: PD Control User Guide -summary: Use PD Control to obtain the state information of a cluster and tune a cluster. -category: tools ---- - -# PD Control User Guide - -As a command line tool of PD, PD Control obtains the state information of the cluster and tunes the cluster. - -## Source code compiling - -1. [Go](https://golang.org/) Version 1.11 or later -2. In the root directory of the [PD project](https://github.com/pingcap/pd), use the `make` command to compile and generate `bin/pd-ctl` - -> **Note:** Generally, you do not need to compile source code because the PD Control tool already exists in the released Binary or Docker. For developer users, the `make` command can be used to compile source code. - -## Usage - -Single-command mode: - -```bash -./pd-ctl store -d -u http://127.0.0.1:2379 -``` - -Interactive mode: - -```bash -./pd-ctl -u http://127.0.0.1:2379 -``` - -Use environment variables: - -```bash -export PD_ADDR=http://127.0.0.1:2379 -./pd-ctl -``` - -Use TLS to encrypt: - -```bash -./pd-ctl -u https://127.0.0.1:2379 --cacert="path/to/ca" --cert="path/to/cert" --key="path/to/key" -``` - -## Command line flags - -### \-\-pd,-u - -+ PD address -+ Default address: http://127.0.0.1:2379 -+ Environment variable: PD_ADDR - -### \-\-detach,-d - -+ Use single command line mode (not entering readline) -+ Default: false - -### --cacert - -+ Specify the path to the certificate file of the trusted CA in PEM format -+ Default: "" - -### --cert - -+ Specify the path to the certificate of SSL in PEM format -+ Default: "" - -### --key - -+ Specify the path to the certificate key file of SSL in PEM format, which is the private key of the certificate specified by `--cert` -+ Default: "" - -### --version,-V - -+ Print the version information and exit -+ Default: false - -## Command - -### `cluster` - -Use this command to view the basic information of the cluster. - -Usage: - -```bash ->> cluster // To show the cluster information -{ - "id": 6493707687106161130, - "max_peer_count": 3 -} -``` - -### `config [show | set