Update walk through and user guides (#1288)
**Pull Request Checklist**
- [x] Fixes #1273 
- [ ] ~Tests added~ Docs only
- [x] Documentation/examples added
- [x] [Good commit messages](https://cbea.ms/git-commit/) and/or PR title

**Description of PR**
Main changes:
* Surface the "hera developer features" section in the walk through before the "authentication" section, so users
  see Hera's unique features earlier
  * Added the "Set class defaults" section here
* Separate the experimental features into their own section, as it's big enough to stand alone
* Rename the "advanced features" section to "further reading" to be less scary
* Remove the under-utilised Pydantic integration section and move the relevant text to the user guides - a new
  "Integrated Pydantic Support" section in script-basics.md for the RunnerScriptConstructor
* Add more motivation to the Runner IO guide to make it more obvious why it's needed

---------

Signed-off-by: Elliot Gunton <[email protected]>
elliotgunton authored Dec 11, 2024
1 parent 93da5f1 commit 964f400
Showing 12 changed files with 241 additions and 126 deletions.
8 changes: 7 additions & 1 deletion Makefile
@@ -125,10 +125,16 @@ examples: ## Generate documentation files for examples
@(cd docs && poetry run python generate.py)

.PHONY: build-docs
build-docs: ## Generate (and host) documentation locally
build-docs: ## Generate documentation locally
@python -m pip install --exists-action=w --no-cache-dir -r docs/requirements.txt
@python -m mkdocs build --clean --site-dir build/docs/html --config-file mkdocs.yml

# If you run this target mkdocs will watch the `docs` folder, so any changes
# will be reflected in your browser momentarily (without refreshing!)
.PHONY: host-docs
host-docs: ## Host and open the documentation locally (and rebuild automatically)
@python -m mkdocs serve --open --clean --config-file mkdocs.yml

.PHONY: regenerate-example
regenerate-example: ## Regenerates the yaml for a single example, using EXAMPLE_FILENAME envvar
regenerate-example: install
@@ -1,92 +1,4 @@
# Advanced Hera Features

This section showcases Hera's features beyond the essentials covered in the walk through. Note that these features do
not exist in Argo, as they are specific to Hera.

## Pre-Build Hooks

Hera offers a pre-build hook feature through `hera.shared.register_pre_build_hook`, which gives you great flexibility
to do pre-build processing on any type of `template` or `Workflow`. For example, it can be used to conditionally set
the `image` of a `Script`, or to set which cluster to submit a `Workflow` to.

To use this feature, you can write a function that takes an object of type `template` or `Workflow`, does some
processing on the object, then returns it.

For a simple example, we'll write a function that adds an annotation with a key of "hera-annotation", and value of
"This workflow was written in Hera!"

```py
from hera.shared import register_pre_build_hook
from hera.workflows import Workflow


@register_pre_build_hook
def set_workflow_default_labels(workflow: Workflow) -> Workflow:
    if workflow.annotations is None:
        workflow.annotations = {}

    workflow.annotations["hera-annotation"] = "This workflow was written in Hera!"
    return workflow
```

Now, any time `build` is called on the Workflow (e.g. to submit it or dump it to yaml), it will add in the annotation!

## Load YAML from File

Hera's `Workflow` classes offer a collection of `to` and `from` functions for `dict`, `yaml` and `file`. This
means you can load YAML files and manipulate them as Hera objects!

```py
with Workflow.from_file("./workflow.yaml") as w:
    w.entrypoint = "my-new-dag-entrypoint"

    with DAG(name="my-new-dag-entrypoint"):
        ...  # Add some tasks!

w.create()  # And submit to Argo directly from Hera!
```

The following are all valid assertions:

```py
with Workflow(name="w") as w:
    pass

assert w == Workflow.from_dict(w.to_dict())
assert w == Workflow.from_yaml(w.to_yaml())
assert w == Workflow.from_file(w.to_file())
```

## Submit WorkflowTemplates and ClusterWorkflowTemplates as Workflows

This feature is available for `WorkflowTemplates` and `ClusterWorkflowTemplates`, and helps you, as a dev, iterate on
your `WorkflowTemplate` until it's ready to be deployed. Calling `create_as_workflow` on a `WorkflowTemplate` creates a
`Workflow` on the fly, submits it directly to the Argo cluster, and gives it a generated name, so you don't need to
submit the `WorkflowTemplate` itself first. This saves you from repeatedly deleting and resubmitting your
`WorkflowTemplate` just to run `argo submit --from WorkflowTemplate/my-wt` while iterating on it.

```py
with WorkflowTemplate(
    name="my-wt",
    namespace="my-namespace",
    workflows_service=ws,
) as wt:
    cowsay = Container(name="cowsay", image="docker/whalesay", command=["cowsay", "foo"])
    with Steps(name="steps"):
        cowsay()

wt.create_as_workflow(generate_name="my-wt-test-1-")  # submitted and given a generated name by Argo like "my-wt-test-1-abcde"
wt.create_as_workflow()  # submitted and given a generated name by Argo like "my-wtabcde"
wt.create_as_workflow()  # submitted and given a generated name by Argo like "my-wtvwxyz"
```

`generate_name` is an optional parameter if you want to control the exact prefix of the generated name, similar to the
regular `Workflow`; otherwise the name of the `WorkflowTemplate` is used verbatim as the `generate_name` prefix. The
submitted Workflow always uses `generate_name`, so you can call it multiple times in a row without naming conflicts.

## Experimental Features
# Experimental Features

From time to time, Hera will release a new feature under the "experimental feature" flag while we develop the feature
and ensure stability. Once the feature is stable and we have decided to support it long-term, it will "graduate" into
2 changes: 2 additions & 0 deletions docs/user-guides/script-annotations.md
@@ -121,6 +121,8 @@ def read_dict_artifact(dict_artifact: Annotated[dict, Artifact(loader=ArtifactLo
return dict_artifact["my-key"]
```

#### Pydantic Integration

A dictionary artifact has no validation on its contents, so writing safe code relies on you knowing, or manually
validating, the keys it contains. Instead, by specifying a Pydantic type, the dictionary can be automatically
validated and parsed into that type:
28 changes: 28 additions & 0 deletions docs/user-guides/script-basics.md
@@ -193,3 +193,31 @@ give a real `image` here, but we assume it exists in this example. Finally, the
passed to `source` is dumped to a file, and then the filename is passed as the final `arg` to the `command`. Therefore,
the `source` will actually contain a list of parameters as dictionaries, which are dumped to a file which is passed to
`hera.workflows.runner`. Of course, this is all handled for you!

#### Integrated Pydantic Support

As Argo deals with a limited set of YAML objects (and YAML is generally a superset of JSON), Pydantic support is
practically built in to Hera through Pydantic's serialization to and from JSON. Using Pydantic objects (instead of
dictionaries) in Runner Script templates makes them less error-prone and easier to write! Using Pydantic classes in
function inputs is as simple as inheriting from Pydantic's `BaseModel`.
[Read more about Pydantic models here](https://docs.pydantic.dev/latest/usage/models/).

```py
from pydantic import BaseModel

from hera.workflows import script


class MyModel(BaseModel):
    my_int: int
    my_string: str


@script(constructor="runner")
def my_pydantic_function(my_pydantic_input: MyModel):
    print(my_pydantic_input.my_string, my_pydantic_input.my_int)
```

Your functions can also return objects that are serialized, passed to another `Step` as a string argument, and then
de-serialized in another function. This flow can be seen in
[the callable scripts example](../examples/workflows/scripts/callable_script.md).
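Conceptually, the hand-off between steps is just JSON: the returned object is dumped to a string, passed along as a
parameter, and parsed back into a typed object in the next function. Here is a minimal stdlib-only sketch of that
round trip (the class and function names are illustrative, not part of Hera's API, and a plain dataclass stands in
for a Pydantic model):

```py
import json
from dataclasses import asdict, dataclass


@dataclass
class MyModel:
    my_int: int
    my_string: str


def producer() -> str:
    # The runner serializes the returned object to a JSON string...
    return json.dumps(asdict(MyModel(my_int=42, my_string="hello")))


def consumer(raw: str) -> MyModel:
    # ...and the next step parses the string back into a typed object.
    return MyModel(**json.loads(raw))


round_tripped = consumer(producer())
print(round_tripped.my_int, round_tripped.my_string)  # 42 hello
```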

Read on to [Script Annotations](script-annotations.md) to learn how to write Script template functions even more
effectively!
58 changes: 54 additions & 4 deletions docs/user-guides/script-runner-io.md
@@ -1,8 +1,59 @@
# Script Runner IO

Hera provides the `Input` and `Output` Pydantic classes which can be used to more succinctly write your
script function inputs and outputs, and requires use of the Hera Runner. Use of these classes also requires the
`"script_pydantic_io"` experimental feature flag to be enabled:
Hera provides the `Input` and `Output` Pydantic classes to help combat sprawling function declarations of inputs and
outputs when using script annotations. They have the added bonus of letting you return values by name, instead of
setting outputs by position in a `Tuple`.

They let you go from a function declaration and `return` like this:

```py
@script(constructor="runner")
def my_function(
    artifact_int: Annotated[int, Artifact(name="artifact-input", loader=ArtifactLoader.json)],
    param_int: Annotated[int, Parameter(name="param-input")] = 42,
    an_object: Annotated[MyObject, Parameter(name="obj-input")] = MyObject(
        a_dict={"my-key": "a-value"}, a_str="hello world!"
    ),
) -> Tuple[
    Annotated[int, Parameter(name="param-int-output")],
    Annotated[int, Parameter(name="another-param-int-output")],
    Annotated[str, Parameter(name="a-str-param-output")],
]:
    print(param_int)
    ...
    return 42, -1, "Hello, world!"  # Hope I didn't mix these up!
```

to a function that uses the `Input` and `Output` classes:

```py
from hera.workflows.io import Input, Output


class MyInput(Input):
    artifact_int: Annotated[int, Artifact(name="artifact-input", loader=ArtifactLoader.json)]
    param_int: Annotated[int, Parameter(name="param-input")] = 42
    an_object: Annotated[MyObject, Parameter(name="obj-input")] = MyObject(
        a_dict={"my-key": "a-value"}, a_str="hello world!"
    )


class MyOutput(Output):
    param_int_output: Annotated[int, Parameter(name="param-int-output")]
    another_param_int_output: Annotated[int, Parameter(name="another-param-int-output")]
    a_str_param_output: Annotated[str, Parameter(name="a-str-param-output")]


@script(constructor="runner")
def my_function(my_input: MyInput) -> MyOutput:
    print(my_input.param_int)
    ...
    return MyOutput(
        param_int_output=42,
        another_param_int_output=-1,
        a_str_param_output="Hello, world!",
    )
```

Using the IO classes requires the Hera Runner, and the `"script_pydantic_io"` experimental feature flag must be
enabled:

```py
global_config.experimental_features["script_pydantic_io"] = True
@@ -99,7 +150,6 @@ class MyOutput(Output):
@script(constructor="runner")
def pydantic_io() -> MyOutput:
return MyOutput(exit_code=1, result="Test!", param_int=42, artifact_int=my_input.param_int)
```

See the full Pydantic IO example [here](../examples/workflows/experimental/script_runner_io.md)!
2 changes: 1 addition & 1 deletion docs/walk-through/authentication.md
@@ -1,4 +1,4 @@
# Authentication
# Authenticating in Hera

The way you authenticate generally depends on your organization's setup. You can either authenticate directly against the Argo server, handle authentication through a reverse proxy, or do both.

18 changes: 18 additions & 0 deletions docs/walk-through/experimental-hera-features.md
@@ -0,0 +1,18 @@

# Experimental Hera Features

From time to time, Hera will release a new feature under the "experimental feature" flag while we develop the feature
and ensure stability. Once the feature is stable and we have decided to support it long-term, it will "graduate" into
a fully-supported feature.

To enable experimental features you must set the feature by name to `True` in the `global_config.experimental_features`
dictionary before using the feature:

```py
global_config.experimental_features["NAME_OF_FEATURE"] = True
```

Note that experimental features are subject to breaking changes in future releases of the same major version. We will
usually announce changes in [the Hera slack channel](https://cloud-native.slack.com/archives/C03NRMD9KPY).

Read about current experimental features in [the user guide](../user-guides/experimental-features.md).
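The flag-gating pattern behind this is simple: a feature checks the global dictionary before running and fails early
with a helpful message if the flag is off. A rough sketch of the idea (illustrative only, not Hera's actual
internals):

```py
# Illustrative only: a minimal experimental-feature gate.
experimental_features = {"script_pydantic_io": False}


def require_feature(name: str) -> None:
    # Raise early with a helpful message if the flag was not enabled.
    if not experimental_features.get(name, False):
        raise ValueError(
            f"Feature '{name}' is experimental; "
            f"set experimental_features['{name}'] = True to use it."
        )


experimental_features["script_pydantic_io"] = True
require_feature("script_pydantic_io")  # no error once enabled
```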
118 changes: 118 additions & 0 deletions docs/walk-through/hera-developer-features.md
@@ -0,0 +1,118 @@
# Hera Developer Features

Here we showcase more features that help developers write Workflows in Hera. Note that these features do not exist in
Argo, as they are specific to Hera as a Python library.

## Set Class Defaults

You can set basic default values for the attributes of Hera's custom classes, such as `Script`, `Container` and
`Workflow`, using `hera.shared.global_config.set_class_defaults`. Pass the class you want to set defaults on, followed
by kwargs for the attributes you want to set and their default values. For example, to set some default values for any
`Container` objects you create:

```py
from hera.shared import global_config
from hera.workflows import Container

global_config.set_class_defaults(
    Container,
    image="my-image:latest",
    image_pull_policy="Always",
    command=["cowsay"],
)
```

And then use the `Container` class in your Workflows as normal, but now the `Container` will have pre-populated default
attributes when created:

```py
with Workflow(name="w") as w:
    do_a_cowsay = Container(name="cowsay-container", args=["Hello, world!"])
    with Steps(name="steps"):
        do_a_cowsay()
```

Notice how we do not need to set `image` or `command`!
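The merge behaves like keyword defaults: explicit constructor arguments win over the class defaults. A rough
stdlib-only sketch of the idea (illustrative only, not Hera's actual implementation):

```py
# Illustrative only: class defaults as a per-class dict of fallback kwargs,
# where explicit constructor arguments take precedence.
_class_defaults: dict = {}


def set_class_defaults(cls: type, **defaults) -> None:
    _class_defaults.setdefault(cls, {}).update(defaults)


class Container:
    def __init__(self, **kwargs):
        # Defaults first, then explicit kwargs override them.
        merged = {**_class_defaults.get(Container, {}), **kwargs}
        self.__dict__.update(merged)


set_class_defaults(Container, image="my-image:latest", command=["cowsay"])
c = Container(name="cowsay-container", args=["Hello, world!"])
print(c.image, c.command)  # my-image:latest ['cowsay']
```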

## Pre-Build Hooks

If [class defaults](#set-class-defaults) don't meet your needs, Hera also offers a pre-build hook feature through
`hera.shared.register_pre_build_hook`, which gives you great flexibility to do pre-build processing on any type of
`template` or `Workflow`. For example, it can be used to conditionally set the `image` of a `Script`, or to set which
cluster to submit a `Workflow` to.

To use this feature, you can write a function that takes an object of type `template` or `Workflow`, does some
processing on the object, then returns it.

For a simple example, we'll write a function that adds an annotation with a key of "hera-annotation", and value of "This
workflow was written in Hera!"

```py
from hera.shared import register_pre_build_hook
from hera.workflows import Workflow


@register_pre_build_hook
def set_workflow_default_labels(workflow: Workflow) -> Workflow:
    if workflow.annotations is None:
        workflow.annotations = {}

    workflow.annotations["hera-annotation"] = "This workflow was written in Hera!"
    return workflow
```

Now, any time `build` is called on the Workflow (e.g. to submit it or dump it to yaml), it will add in the annotation!
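The underlying pattern is a simple hook registry: registered functions run in order over the object at build time,
each returning the (possibly modified) object. A stripped-down sketch of the mechanism, using a plain dict in place of
a `Workflow` (illustrative only, not Hera's internals):

```py
_pre_build_hooks = []


def register_pre_build_hook(func):
    # Decorator: remember the hook and return the function unchanged.
    _pre_build_hooks.append(func)
    return func


@register_pre_build_hook
def add_hera_annotation(workflow: dict) -> dict:
    workflow.setdefault("annotations", {})["hera-annotation"] = "This workflow was written in Hera!"
    return workflow


def build(workflow: dict) -> dict:
    # Each registered hook processes the object before it is built.
    for hook in _pre_build_hooks:
        workflow = hook(workflow)
    return workflow


print(build({"name": "w"}))
```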

## Load YAML from File

Hera's `Workflow` classes offer a collection of `to` and `from` functions for `dict`, `yaml` and `file`. This
means you can load YAML files and manipulate them as Hera objects!

```py
with Workflow.from_file("./workflow.yaml") as w:
    w.entrypoint = "my-new-dag-entrypoint"

    with DAG(name="my-new-dag-entrypoint"):
        ...  # Add some tasks!

w.create()  # And submit to Argo directly from Hera!
```

The following are all valid assertions:

```py
with Workflow(name="w") as w:
    pass

assert w == Workflow.from_dict(w.to_dict())
assert w == Workflow.from_yaml(w.to_yaml())
assert w == Workflow.from_file(w.to_file())
```

## Submit WorkflowTemplates and ClusterWorkflowTemplates as Workflows

This feature is available for `WorkflowTemplates` and `ClusterWorkflowTemplates`, and helps you, as a dev, iterate on
your `WorkflowTemplate` until it's ready to be deployed. Calling `create_as_workflow` on a `WorkflowTemplate` creates a
`Workflow` on the fly, submits it directly to the Argo cluster, and gives it a generated name, so you don't need to
submit the `WorkflowTemplate` itself first. This saves you from repeatedly deleting and resubmitting your
`WorkflowTemplate` just to run `argo submit --from WorkflowTemplate/my-wt` while iterating on it.

```py
with WorkflowTemplate(
    name="my-wt",
    namespace="my-namespace",
    workflows_service=ws,
) as wt:
    cowsay = Container(name="cowsay", image="docker/whalesay", command=["cowsay", "foo"])
    with Steps(name="steps"):
        cowsay()

wt.create_as_workflow(generate_name="my-wt-test-1-")  # submitted and given a generated name by Argo like "my-wt-test-1-abcde"
wt.create_as_workflow()  # submitted and given a generated name by Argo like "my-wtabcde"
wt.create_as_workflow()  # submitted and given a generated name by Argo like "my-wtvwxyz"
```

`generate_name` is an optional parameter if you want to control the exact prefix of the generated name, similar to the
regular `Workflow`; otherwise the name of the `WorkflowTemplate` is used verbatim as the `generate_name` prefix. The
submitted Workflow always uses `generate_name`, so you can call it multiple times in a row without naming conflicts.
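This relies on Kubernetes' `generateName` semantics: the server appends a short random suffix to the prefix, so
repeated submissions never collide. A toy illustration of why names stay unique (the suffix generation shown here is
only for intuition; the real suffix is chosen server-side by Kubernetes):

```py
import random
import string


def generate_workflow_name(prefix: str) -> str:
    # Kubernetes-style generateName: append a short random suffix to the prefix
    # (Kubernetes uses 5 characters from a restricted lowercase alphabet).
    suffix = "".join(random.choices(string.ascii_lowercase + string.digits, k=5))
    return prefix + suffix


print(generate_workflow_name("my-wt-test-1-"))  # e.g. "my-wt-test-1-abcde"
```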
20 changes: 0 additions & 20 deletions docs/walk-through/pydantic-support.md

This file was deleted.

@@ -1,6 +1,6 @@
# Advanced Template Features
# Template Features

This section exemplifies `template` features found in Argo, but are beyond the scope of the Walk Through.
This section signposts some `template` features found in Argo and their equivalents in Hera, which are beyond the scope of the Walk Through.

## Template-Level Lifecycle Hooks

