[Docs] Add the K8s agent set up document #6184

3 changes: 3 additions & 0 deletions docs/deployment/agents/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,8 @@ If you are using a managed deployment of Flyte, you will need to contact your de
- Configuring your Flyte deployment for the SnowFlake agent.
* - {ref}`OpenAI Batch <deployment-agent-setup-openai-batch>`
- Submit requests to OpenAI GPT models for asynchronous batch processing.
* - {ref}`K8s Service <deployment-agent-setup-k8sservice>`
- Configuring your Flyte deployment for the K8s service agent.
```

```{toctree}
Expand All @@ -49,4 +51,5 @@ sagemaker_inference
sensor
snowflake
openai_batch
k8sservice
```
178 changes: 178 additions & 0 deletions docs/deployment/agents/k8sservice.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
.. _deployment-agent-setup-k8sservice:

Kubernetes (K8s) Service Agent
==============================

The Kubernetes (K8s) Service Agent lets machine learning (ML) users run non-training tasks, such as data loading, caching, and processing, concurrently with training jobs in a Kubernetes cluster.
This capability is particularly valuable in deep learning applications, such as training Graph Neural Networks (GNNs).

This guide explains how to set up the K8s Service Agent in your Flyte deployment.

Spin up a cluster
-----------------

.. tabs::

  .. group-tab:: Flyte binary

    You can spin up a demo cluster using the following command:

    .. code-block:: bash

      flytectl demo start

    Or install Flyte using the :ref:`flyte-binary helm chart <deployment-deployment-cloud-simple>`.

  .. group-tab:: Flyte core

    If you've installed Flyte using the
    `flyte-core helm chart <https://github.com/flyteorg/flyte/tree/master/charts/flyte-core>`__, please ensure:

    * You have the correct kubeconfig and have selected the correct Kubernetes context.
    * You have configured the correct flytectl settings in ``~/.flyte/config.yaml``.

.. note::

  Add the Flyte chart repo to Helm if you're installing via the Helm charts.

  .. code-block:: bash

    helm repo add flyteorg https://flyteorg.github.io/flyte

Specify agent configuration
----------------------------

Enable the K8s service agent by adding the following config to the relevant YAML file(s):

.. code-block:: yaml

  tasks:
    task-plugins:
      enabled-plugins:
        - agent-service
      default-for-task-types:
        - dataservicetask: agent-service

.. code-block:: yaml

  plugins:
    agent-service:
      agents:
        k8sservice-agent:
          endpoint: <AGENT_ENDPOINT>
          insecure: true
      agentForTaskTypes:
      - dataservicetask: k8sservice-agent
      - sensor: k8sservice-agent

Substitute ``<AGENT_ENDPOINT>`` with the endpoint of your K8s service agent.
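
As a rough illustration of what an in-cluster endpoint can look like, the sketch below builds a gRPC DNS target from a Kubernetes Service name, namespace, and port. The service name ``flyteagent``, the namespace ``flyte``, and the port ``8000`` are assumptions made for this example; check them against your own agent deployment before using the result.

```shell
# Hypothetical values: adjust to match the Service that fronts your agent.
AGENT_SERVICE="flyteagent"   # assumed Service name
AGENT_NAMESPACE="flyte"      # assumed namespace
AGENT_PORT=8000              # assumed gRPC port

# In-cluster DNS name of a Service: <service>.<namespace>.svc.cluster.local
AGENT_ENDPOINT="dns:///${AGENT_SERVICE}.${AGENT_NAMESPACE}.svc.cluster.local:${AGENT_PORT}"

echo "$AGENT_ENDPOINT"
```

The printed value is what would replace ``<AGENT_ENDPOINT>`` in the configuration above, provided the assumed names match your cluster.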


Set up RBAC
----------------------

The K8s service agent creates a StatefulSet and exposes a Service endpoint for the StatefulSet's pods.
RBAC must be configured so that the agent can create, read, update, and delete StatefulSets and Services.

The following Role, ``flyte-flyteagent-role``, grants these permissions:

.. code-block:: yaml

  # Example Role that lets the data service create, update, and delete
  # resources in the sandbox flyte namespace
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    name: flyte-flyteagent-role
    namespace: flyte
    labels:
      app.kubernetes.io/name: flyteagent
      app.kubernetes.io/instance: flyte
  rules:
    - apiGroups:
        - apps
      resources:
        - statefulsets
        - statefulsets/status
        - statefulsets/scale
        - statefulsets/finalizers
      verbs:
        - get
        - list
        - watch
        - create
        - update
        - delete
        - patch
    - apiGroups:
        - ""
      resources:
        - pods
        - configmaps
        - serviceaccounts
        - secrets
        - pods/exec
        - pods/log
        - pods/status
        - services
      verbs:
        - '*'



The following RoleBinding, ``flyte-flyteagent-rolebinding``, binds the role ``flyte-flyteagent-role`` to the ``flyteagent`` service account:

.. code-block:: yaml

  # Example RoleBinding that grants the role above to the flyteagent
  # service account in the sandbox flyte namespace
  apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    name: flyte-flyteagent-rolebinding
    namespace: flyte
    labels:
      app.kubernetes.io/name: flyteagent
      app.kubernetes.io/instance: flyte
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: Role
    name: flyte-flyteagent-role
  subjects:
    - kind: ServiceAccount
      name: flyteagent
      namespace: flyte
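
Once the Role and RoleBinding have been applied (for example with ``kubectl apply -f``), one way to sanity-check them is to ask the API server whether the service account may perform each operation. The sketch below uses ``kubectl auth can-i`` with impersonation. It assumes the ``flyte`` namespace and ``flyteagent`` service account from the manifests above, and it assumes that your own user is allowed to impersonate service accounts.

```shell
NAMESPACE="flyte"
SERVICE_ACCOUNT="flyteagent"
# Fully qualified user name that Kubernetes assigns to a service account.
SUBJECT="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"

# Check the CRUD verbs the agent needs on StatefulSets and Services.
for resource in statefulsets.apps services; do
  for verb in create get update delete; do
    if kubectl auth can-i "$verb" "$resource" --as="$SUBJECT" -n "$NAMESPACE" >/dev/null 2>&1; then
      echo "allowed: $verb $resource"
    else
      # Printed when access is denied or when the check itself could not run.
      echo "NOT CONFIRMED: $verb $resource"
    fi
  done
done
```

Any ``NOT CONFIRMED`` line suggests that the Role, the RoleBinding, or the service account name is worth rechecking.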

Upgrade the deployment
----------------------

.. tabs::

  .. group-tab:: Flyte binary

    .. tabs::

      .. group-tab:: Demo cluster

        .. code-block:: bash

          kubectl rollout restart deployment flyte-sandbox -n flyte

      .. group-tab:: Helm chart

        .. code-block:: bash

          helm upgrade <RELEASE_NAME> flyteorg/flyte-binary -n <YOUR_NAMESPACE> --values <YOUR_YAML_FILE>

        Replace ``<RELEASE_NAME>`` with the name of your release (e.g., ``flyte-backend``),
        ``<YOUR_NAMESPACE>`` with the name of your namespace (e.g., ``flyte``),
        and ``<YOUR_YAML_FILE>`` with the name of your YAML file.

  .. group-tab:: Flyte core

    .. code-block:: bash

      helm upgrade <RELEASE_NAME> flyteorg/flyte-core -n <YOUR_NAMESPACE> --values values-override.yaml

    Replace ``<RELEASE_NAME>`` with the name of your release (e.g., ``flyte``)
    and ``<YOUR_NAMESPACE>`` with the name of your namespace (e.g., ``flyte``).

Wait for the upgrade to complete. You can check the status of the deployment pods by running the following command:

.. code-block:: bash

  kubectl get pods -n flyte
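
Instead of polling the pod list by hand, you can let ``kubectl`` block until the rollout finishes. The sketch below does this for the demo cluster's ``flyte-sandbox`` deployment. The 120-second timeout is an arbitrary choice for illustration, and for a Helm installation the deployment name will differ.

```shell
NAMESPACE="flyte"
DEPLOYMENT="flyte-sandbox"   # demo cluster deployment; differs for Helm installs
TIMEOUT="120s"               # arbitrary value chosen for this example

# Block until the rollout completes or the timeout expires.
if kubectl rollout status "deployment/${DEPLOYMENT}" -n "$NAMESPACE" --timeout="$TIMEOUT"; then
  kubectl get pods -n "$NAMESPACE"
else
  echo "Rollout of ${DEPLOYMENT} was not confirmed within ${TIMEOUT}" >&2
fi
```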