Add feature: Support AWS with Cilium
Signed-off-by: lou-lan <[email protected]>
lou-lan committed Jul 31, 2024
1 parent c39e413 commit 23d9467
Showing 12 changed files with 628 additions and 464 deletions.
36 changes: 36 additions & 0 deletions .github/workflows/auto-pr-ci.yaml
@@ -330,6 +330,42 @@ jobs:
      cni: flannel
    secrets: inherit

  e2e_ipv4-ubuntu-latest-cilium:
    needs: [ call_build_ci_image, prepare ]
    if: ${{ always() && needs.prepare.outputs.e2e_enabled == 'true' && needs.prepare.outputs.ipfamily_ipv4only_e2e == 'true' }}
    uses: ./.github/workflows/call-e2e.yaml
    with:
      ref: ${{ needs.prepare.outputs.ref }}
      ipfamily: ipv4
      e2e_labels: ${{ needs.prepare.outputs.e2e_labels }}
      kind_node_image: ${{ needs.prepare.outputs.kindNodeImage }}
      cni: cilium
    secrets: inherit

  e2e_ipv6-ubuntu-latest-cilium:
    needs: [ call_build_ci_image, prepare ]
    if: ${{ always() && needs.prepare.outputs.e2e_enabled == 'true' && needs.prepare.outputs.ipfamily_ipv6only_e2e == 'true' }}
    uses: ./.github/workflows/call-e2e.yaml
    with:
      ref: ${{ needs.prepare.outputs.ref }}
      ipfamily: ipv6
      e2e_labels: ${{ needs.prepare.outputs.e2e_labels }}
      kind_node_image: ${{ needs.prepare.outputs.kindNodeImage }}
      cni: cilium
    secrets: inherit

  e2e_dual-ubuntu-latest-cilium:
    needs: [ call_build_ci_image, prepare ]
    if: ${{ always() && needs.prepare.outputs.e2e_enabled == 'true' && needs.prepare.outputs.ipfamily_dual_e2e == 'true' }}
    uses: ./.github/workflows/call-e2e.yaml
    with:
      ref: ${{ needs.prepare.outputs.ref }}
      ipfamily: dual
      e2e_labels: ${{ needs.prepare.outputs.e2e_labels }}
      kind_node_image: ${{ needs.prepare.outputs.kindNodeImage }}
      cni: cilium
    secrets: inherit

  e2e_ipv4-ubuntu-latest-weave:
    needs: [call_build_ci_image, prepare]
    if: ${{ always() && needs.prepare.outputs.e2e_enabled == 'true' && needs.prepare.outputs.ipfamily_ipv4only_e2e == 'true' }}
1 change: 1 addition & 0 deletions docs/mkdocs.yml
@@ -89,6 +89,7 @@ nav:
- Failover: usage/EgressGatewayFailover.md
- Move EgressIP: usage/MoveIP.md
- Run EgressGateway on Aliyun Cloud: usage/Aliyun.md
- Run EgressGateway on AWS Cloud: usage/AwsWithCilium.md
- Troubleshooting: usage/Troubleshooting.md
- Concepts:
- Architecture: concepts/Architecture.md
149 changes: 149 additions & 0 deletions docs/usage/AwsWithCilium.en.md
@@ -0,0 +1,149 @@
# Using EgressGateway with AWS Cilium CNI

## Introduction

This article introduces the use of EgressGateway in a Cilium CNI networking environment on AWS Kubernetes. EgressGateway supports multiple nodes as high-availability (HA) exit gateways for pods. You can use EgressGateway to save on public IP costs while achieving fine-grained control over pods that need to access external networks.

Compared to Cilium's Egress feature, EgressGateway supports HA. If you don't need HA, consider using Cilium's Egress feature first.

The following sections will guide you step-by-step to install EgressGateway, create a sample pod, and configure an egress policy for the pod to access the internet via the gateway node.

## Create Cluster and Install Cilium

Refer to the [Cilium Quick Installation Guide](https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default) to create an AWS cluster and install Cilium. At the time of writing, the Cilium version used is 1.15.6. If you encounter unexpected issues with other versions, please let us know.

Ensure that the EC2 nodes added to your Kubernetes cluster have [public IPs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-instance-addressing.html). To verify this, connect to a node with `ssh root@host` and run:

```shell
curl ipinfo.io
```

Using curl, you should see a response that includes your node's public IP.
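
To script this check rather than read the JSON by eye, the `ip` field can be pulled out of the response. The following is a minimal sketch: `extract_ip` is a hypothetical helper name, and it assumes the response has the same shape as the sample `ipinfo.io` output shown later in this guide.

```shell
# Hypothetical helper: print the value of the "ip" field from an
# ipinfo.io-style JSON document read on stdin. Uses sed only, so jq is not needed.
extract_ip() {
  sed -n 's/.*"ip"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage on a node (requires network access):
#   curl -s ipinfo.io | extract_ip
```

The printed address can then be compared with the EXTERNAL-IP column of `kubectl get nodes -owide`.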

## Install EgressGateway

Add the helm repository and install EgressGateway. `feature.enableIPv4=true` enables IPv4 and `feature.enableIPv6=false` disables IPv6. `feature.clusterCIDR.extraCidr[0]=172.16.0.0/16` adds the cluster's CIDR to the ranges that are excluded from the gateway, so traffic to those addresses is not sent through it.

```shell
helm repo add egressgateway https://spidernet-io.github.io/egressgateway/
helm repo update

helm install egress --wait \
  --debug egressgateway/egressgateway \
  --set feature.enableIPv4=true \
  --set feature.enableIPv6=false \
  --set 'feature.clusterCIDR.extraCidr[0]=172.16.0.0/16'
```
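
The effect of `extraCidr` can be pictured as a CIDR membership test on the destination address. The sketch below is an illustration only, not EgressGateway's implementation: `ip_to_int` and `in_cidr` are hypothetical helpers, and it assumes that a destination inside `172.16.0.0/16` is treated as cluster-internal and skips the gateway.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Split the address on dots into $1..$4 (unquoted on purpose).
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if IPv4 address $1 lies inside CIDR $2 (for example 172.16.0.0/16).
in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  fi
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Usage:
#   in_cidr 172.16.51.74 172.16.0.0/16 && echo "cluster-internal, bypasses the gateway"
```

Under that assumption, a pod IP such as `172.16.51.74` falls inside the excluded range, while a public address such as `34.239.162.85` does not.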

## Create EgressGateway CR

List the current nodes.

```shell
~ kubectl get nodes -A -owide
NAME                             STATUS   ROLES    AGE   VERSION               INTERNAL-IP      EXTERNAL-IP
ip-172-16-103-117.ec2.internal   Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.103.117   34.239.162.85
ip-172-16-61-234.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.61.234    54.147.15.230
ip-172-16-62-200.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.62.200    54.147.16.130
```

We select `ip-172-16-103-117.ec2.internal` and `ip-172-16-62-200.ec2.internal` as gateway nodes. Label the nodes with `egress=true`.

```shell
kubectl label node ip-172-16-103-117.ec2.internal egress=true
kubectl label node ip-172-16-62-200.ec2.internal egress=true
```

Create the EgressGateway CR, using `egress: "true"` to select nodes as exit gateways.

```yaml
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressGateway
metadata:
  name: "egressgateway"
spec:
  nodeSelector:
    selector:
      matchLabels:
        egress: "true"
```
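
Assuming the manifest above is saved as `egressgateway.yaml` (a hypothetical file name), one way to create the object and read it back is sketched below. `apply_manifest` is a helper name chosen for illustration; it only wraps the standard `kubectl apply -f` and `kubectl get -f` commands.

```shell
# Apply a manifest, then print the resulting object(s) as YAML so the
# status reported by the cluster can be inspected.
# Argument: $1 = path to the manifest file.
apply_manifest() {
  kubectl apply -f "$1" && kubectl get -f "$1" -o yaml
}

# Usage (requires kubectl access to the cluster):
#   apply_manifest egressgateway.yaml
```
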

## Create a Test Pod

List the current nodes.

```shell
~ kubectl get nodes -A -owide
NAME                             STATUS   ROLES    AGE   VERSION               INTERNAL-IP      EXTERNAL-IP
ip-172-16-103-117.ec2.internal   Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.103.117   34.239.162.85
ip-172-16-61-234.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.61.234    54.147.15.230
ip-172-16-62-200.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.62.200    54.147.16.130
```

We select `ip-172-16-61-234.ec2.internal` to run the pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mock-app
  labels:
    app: mock-app
spec:
  nodeName: ip-172-16-61-234.ec2.internal
  containers:
    - name: nginx
      image: nginx
```

After applying the manifest, confirm that the pods are in the Running state.

```shell
~ kubectl get pods -o wide
NAME                                        READY   STATUS    RESTARTS   AGE   IP               NODE                             NOMINATED NODE   READINESS GATES
egressgateway-agent-zw426                   1/1     Running   0          15m   172.16.103.117   ip-172-16-103-117.ec2.internal   <none>           <none>
egressgateway-agent-zw728                   1/1     Running   0          15m   172.16.61.234    ip-172-16-61-234.ec2.internal    <none>           <none>
egressgateway-controller-6cc84c6985-9gbgd   1/1     Running   0          15m   172.16.51.178    ip-172-16-61-234.ec2.internal    <none>           <none>
mock-app                                    1/1     Running   0          12m   172.16.51.74     ip-172-16-61-234.ec2.internal    <none>           <none>
```
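
When scripting this step, `kubectl wait` can replace repeated polling of `kubectl get pods`. A minimal sketch, where `wait_for_pod` is a hypothetical helper and the 120-second timeout is an arbitrary choice:

```shell
# Block until the named pod reports the Ready condition, or fail on timeout.
# Arguments: $1 = pod name, $2 = namespace (defaults to "default").
wait_for_pod() {
  kubectl wait --for=condition=Ready "pod/$1" -n "${2:-default}" --timeout=120s
}

# Usage (requires kubectl access to the cluster):
#   wait_for_pod mock-app
```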

## Create EgressPolicy CR

Create the following EgressPolicy CR. `spec.appliedTo.podSelector` matches the pod created above, and `spec.egressGatewayName` names the gateway created earlier. `spec.egressIP.useNodeIP` specifies that the gateway node's IP is used as the address for accessing the internet.

```yaml
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressPolicy
metadata:
  name: test-egw-policy
  namespace: default
spec:
  egressIP:
    useNodeIP: true
  appliedTo:
    podSelector:
      matchLabels:
        app: mock-app
  egressGatewayName: egressgateway
```

### Test Exit IP Address

Use `kubectl exec` to enter the container and run `curl ipinfo.io`. The response shows that the pod reaches the internet through a gateway node: `ipinfo.io` returns the gateway node's public IP rather than the IP of the node where the pod runs. Because EgressGateway implements HA in active-backup mode, when the node holding the egress IP (EIP) changes, the pod's traffic automatically switches to a matching backup node, and the exit IP changes accordingly.

```shell
kubectl exec -it -n default mock-app -- bash
curl ipinfo.io
{
  "ip": "34.239.162.85",
  "hostname": "ec2-34-239-162-85.compute-1.amazonaws.com",
  "city": "Ashburn",
  "region": "Virginia",
  "country": "US",
  "loc": "39.0437,-77.4875",
  "org": "AS14618 Amazon.com, Inc.",
  "postal": "20147",
  "timezone": "America/New_York",
  "readme": "https://ipinfo.io/missingauth"
}
```
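
To confirm programmatically that the exit IP belongs to a gateway node, the address returned by `ipinfo.io` can be compared with the external IPs of the nodes labeled `egress=true`. The sketch below assumes the label used in this guide and that `useNodeIP` makes the gateway node's ExternalIP the visible source address, as in the sample output; `gateway_external_ips` and `ip_in_list` are hypothetical helper names.

```shell
# Print the ExternalIP of every node labeled egress=true, space-separated.
gateway_external_ips() {
  kubectl get nodes -l egress=true \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}'
}

# Return 0 if IP $1 appears in the space-separated list $2.
ip_in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires kubectl access and curl inside the pod):
#   egress_ip=$(kubectl exec -n default mock-app -- curl -s ipinfo.io/ip)
#   ip_in_list "$egress_ip" "$(gateway_external_ips)" && echo "egress via gateway"
```

Running the same comparison after a gateway failover is one way to observe the exit IP moving to the backup node.
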
153 changes: 153 additions & 0 deletions docs/usage/AwsWithCilium.zh.md
@@ -0,0 +1,153 @@
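# Using EgressGateway with Cilium CNI on AWS

## Introduction

This article describes how to run EgressGateway in a Cilium CNI network environment on AWS Kubernetes. EgressGateway supports multiple nodes acting as highly available (HA) egress gateways for pods. You can use EgressGateway to save on public IP costs while applying fine-grained control to the pods that need to access external networks.

Compared with Cilium's Egress feature, EgressGateway supports HA. If you do not need HA, consider Cilium's Egress feature first.

The following sections walk you through installing EgressGateway, creating a sample pod, and configuring an egress policy so that the pod accesses the internet through an egress gateway node.

## Create a Cluster and Install Cilium

Refer to the [Cilium installation guide](https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default) to create an AWS cluster and install Cilium. At the time of writing, the Cilium version used was 1.15.6. If you see unexpected behavior with other versions, please let us know.

The EC2 nodes that join your Kubernetes cluster must have [public IPs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-instance-addressing.html). To test this, connect to a node with `ssh root@host` and run: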

```shell
curl ipinfo.io
```

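The curl response should include your node's public IP.

## Install EgressGateway

Add the helm repository and install EgressGateway. `feature.enableIPv4=true` enables IPv4 and `feature.enableIPv6=false` disables IPv6. `feature.clusterCIDR.extraCidr[0]=172.16.0.0/16` excludes traffic destined for the cluster's CIDR from being sent through the gateway.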

```shell
helm repo add egressgateway https://spidernet-io.github.io/egressgateway/
helm repo update

helm install egress --wait \
  --debug egressgateway/egressgateway \
  --set feature.enableIPv4=true \
  --set feature.enableIPv6=false \
  --set 'feature.clusterCIDR.extraCidr[0]=172.16.0.0/16'
```

## Create EgressGateway CR

List the current nodes.

```shell
~ kubectl get nodes -A -owide
NAME                             STATUS   ROLES    AGE   VERSION               INTERNAL-IP      EXTERNAL-IP
ip-172-16-103-117.ec2.internal   Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.103.117   34.239.162.85
ip-172-16-61-234.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.61.234    54.147.15.230
ip-172-16-62-200.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.62.200    54.147.16.130
```

We select `ip-172-16-103-117.ec2.internal` and `ip-172-16-62-200.ec2.internal` as gateway nodes. Label the nodes with `egress=true`.

```shell
kubectl label node ip-172-16-103-117.ec2.internal egress=true
kubectl label node ip-172-16-62-200.ec2.internal egress=true
```

Create the EgressGateway CR, using `egress: "true"` to select nodes as egress gateways.

```yaml
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressGateway
metadata:
  name: "egressgateway"
spec:
  nodeSelector:
    selector:
      matchLabels:
        egress: "true"
```

## Create a Test Pod

List the current nodes.

```shell
~ kubectl get nodes -A -owide
NAME                             STATUS   ROLES    AGE   VERSION               INTERNAL-IP      EXTERNAL-IP
ip-172-16-103-117.ec2.internal   Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.103.117   34.239.162.85
ip-172-16-61-234.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.61.234    54.147.15.230
ip-172-16-62-200.ec2.internal    Ready    <none>   25m   v1.30.0-eks-036c24b   172.16.62.200    54.147.16.130
```

We select the `ip-172-16-61-234.ec2.internal` node to run the pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mock-app
  labels:
    app: mock-app
spec:
  nodeName: ip-172-16-61-234.ec2.internal
  containers:
    - name: nginx
      image: nginx
```

Confirm that the pods are in the Running state.

```shell
~ kubectl get pods -o wide
NAME                                        READY   STATUS    RESTARTS   AGE   IP               NODE                             NOMINATED NODE   READINESS GATES
egressgateway-agent-zw426                   1/1     Running   0          15m   172.16.103.117   ip-172-16-103-117.ec2.internal   <none>           <none>
egressgateway-agent-zw728                   1/1     Running   0          15m   172.16.61.234    ip-172-16-61-234.ec2.internal    <none>           <none>
egressgateway-controller-6cc84c6985-9gbgd   1/1     Running   0          15m   172.16.51.178    ip-172-16-61-234.ec2.internal    <none>           <none>
mock-app                                    1/1     Running   0          12m   172.16.51.74     ip-172-16-61-234.ec2.internal    <none>           <none>
```

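## Create EgressPolicy CR

Create the following EgressPolicy CR. `spec.appliedTo.podSelector` matches the pod created above, and `spec.egressGatewayName` names the gateway created earlier. `spec.egressIP.useNodeIP` specifies that the node's IP is used as the address for accessing the internet.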

```yaml
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressPolicy
metadata:
  name: test-egw-policy
  namespace: default
spec:
  egressIP:
    useNodeIP: true
  appliedTo:
    podSelector:
      matchLabels:
        app: mock-app
  egressGatewayName: egressgateway
```
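
### Test Exit IP Address

Use `kubectl exec` to enter the container and run `curl ipinfo.io`. The response shows that the pod on the current node already reaches the internet through a gateway node, and `ipinfo.io` echoes that host's IP. Because EgressGateway implements HA in active-backup mode, when the node holding the egress IP (EIP) changes, the pod automatically switches to a matching backup node, and the exit IP changes as well.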

```shell
kubectl exec -it -n default mock-app -- bash
curl ipinfo.io
{
  "ip": "34.239.162.85",
  "hostname": "ec2-34-239-162-85.compute-1.amazonaws.com",
  "city": "Ashburn",
  "region": "Virginia",
  "country": "US",
  "loc": "39.0437,-77.4875",
  "org": "AS14618 Amazon.com, Inc.",
  "postal": "20147",
  "timezone": "America/New_York",
  "readme": "https://ipinfo.io/missingauth"
}
```