Commit 1de636b

Merge pull request #33287 [YAML] Document jinja templatization features.
2 parents e509e70 + 7df49a5 commit 1de636b

1 file changed

Lines changed: 43 additions & 0 deletions

website/www/site/content/en/documentation/sdks/yaml.md

@@ -708,6 +708,49 @@ options:
  streaming: true
```

## Jinja Templatization

It is common to want to run a single Beam pipeline in different contexts
and/or with different configurations.
When running a YAML pipeline using `apache_beam.yaml.main` or via gcloud,
the YAML file can be parameterized with externally provided variables using
the [Jinja variable syntax](https://jinja.palletsprojects.com/en/stable/templates/#variables).
The values are then passed via a `--jinja_variables` command line flag.

For example, one could start a pipeline with

```
pipeline:
  transforms:
    - type: ReadFromCsv
      config:
        path: {{input_pattern}}
```

and then run it with

```sh
python -m apache_beam.yaml.main \
    --yaml_pipeline_file=pipeline.yaml \
    --jinja_variables='{"input_pattern": "gs://path/to/this/runs/files*.csv"}'
```
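
When the variables come from a script rather than being typed by hand, the JSON value for `--jinja_variables` can be generated instead of hand-quoted. The following is a minimal sketch that assumes only the two flags shown above; the helper name `build_command` is hypothetical and not part of Beam.

```python
import json
import shlex


def build_command(pipeline_file, variables):
    """Build the argv list for running a templated YAML pipeline."""
    return [
        "python", "-m", "apache_beam.yaml.main",
        f"--yaml_pipeline_file={pipeline_file}",
        # json.dumps handles quoting and escaping inside the variable values.
        f"--jinja_variables={json.dumps(variables)}",
    ]


cmd = build_command(
    "pipeline.yaml",
    {"input_pattern": "gs://path/to/this/runs/files*.csv"},
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```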

Arbitrary [Jinja control structures](https://jinja.palletsprojects.com/en/stable/templates/#list-of-control-structures),
such as loops and conditionals, can be used as well, as long as the
rendered output is a valid Beam YAML pipeline.
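
As an illustration of a loop, the sketch below renders a hypothetical template locally with the `jinja2` package to preview the result. It calls `jinja2` directly rather than going through Beam, so Beam's own rendering may differ in details; the variable name `input_patterns` and the helper `render_pipeline` are made up for this example.

```python
import jinja2

# Hypothetical template: emits one ReadFromCsv transform per input pattern.
# The "-" in "{%-" strips the whitespace before each tag so no blank lines
# are left behind in the rendered YAML.
TEMPLATE = """\
pipeline:
  transforms:
{%- for pattern in input_patterns %}
    - type: ReadFromCsv
      config:
        path: {{ pattern }}
{%- endfor %}
"""


def render_pipeline(variables):
    """Render the template with the given variables and return YAML text."""
    return jinja2.Template(TEMPLATE).render(**variables)


rendered = render_pipeline(
    {"input_patterns": ["gs://bucket/a*.csv", "gs://bucket/b*.csv"]}
)
print(rendered)
```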

We also expose the [`datetime`](https://docs.python.org/3/library/datetime.html)
module as a variable by default, which can be particularly useful for reading
or writing dated sources and sinks, e.g.

```
- type: WriteToJson
  config:
    path: "gs://path/to/{{ datetime.datetime.now().strftime('%Y/%m/%d') }}/dated-output.json"
```

would write to files like `gs://path/to/2016/08/04/dated-output*.json`.
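
The date portion of that path is an ordinary `strftime` call, so it can be checked in plain Python. This is a small sketch that uses a fixed date in place of `now()` so the result is reproducible; `dated_path` is a hypothetical helper, not a Beam function.

```python
import datetime


def dated_path(now):
    """Mirror the template expression for a given datetime."""
    return f"gs://path/to/{now.strftime('%Y/%m/%d')}/dated-output.json"


print(dated_path(datetime.datetime(2016, 8, 4)))
```

In the template itself, `datetime.datetime.now()` is evaluated when the template is rendered, so the path reflects the time the pipeline definition was processed.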

## Other Resources

* [Example pipeline](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/yaml/examples)
