@@ -708,6 +708,49 @@ options:
708708 streaming: true
709709```
710710
711+ ## Jinja Templatization
712+
713+ It is common to want to run a single Beam pipeline in different contexts
714+ and/or with different configurations.
715+ When running a YAML pipeline using `apache_beam.yaml.main` or via gcloud,
716+ the YAML file can be parameterized with externally provided variables using
717+ the [Jinja variable syntax](https://jinja.palletsprojects.com/en/stable/templates/#variables).
718+ The values are then passed via a `--jinja_variables` command line flag.
719+
720+ For example, one could start a pipeline with
721+
722+ ```
723+ pipeline:
724+   transforms:
725+     - type: ReadFromCsv
726+       config:
727+         path: {{input_pattern}}
728+ ```
729+
730+ and then run it with
731+
732+ ```sh
733+ python -m apache_beam.yaml.main \
734+ --yaml_pipeline_file=pipeline.yaml \
735+ --jinja_variables='{"input_pattern": "gs://path/to/this/runs/files*.csv"}'
736+ ```
737+
738+ Arbitrary [Jinja control structures](https://jinja.palletsprojects.com/en/stable/templates/#list-of-control-structures),
739+ such as looping and conditionals, can be used as well, as long as the
740+ rendered output is a valid Beam YAML pipeline.
741+
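As a minimal sketch of a conditional, the template below includes a `Filter` step only when a variable is truthy. The variable name `apply_filter` and the column name `score` are hypothetical, chosen only for illustration. The sketch assumes `apply_filter` is always passed explicitly alongside `input_pattern`.

```yaml
pipeline:
  transforms:
    - type: ReadFromCsv
      config:
        path: {{input_pattern}}
{% if apply_filter %}
    # Emitted only when apply_filter is truthy (hypothetical variable).
    - type: Filter
      input: ReadFromCsv
      config:
        language: python
        keep: "score > 0"  # "score" is a hypothetical column name
{% endif %}
```

Passing `"apply_filter": true` in the `--jinja_variables` JSON would render the `Filter` step. Passing `false` would leave only the read step.
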
742+ We also expose the [`datetime`](https://docs.python.org/3/library/datetime.html)
743+ module as a variable by default, which can be particularly useful in reading
744+ or writing dated sources and sinks, e.g.
745+
746+ ```
747+ - type: WriteToJson
748+   config:
749+     path: "gs://path/to/{{ datetime.datetime.now().strftime('%Y/%m/%d') }}/dated-output.json"
750+ ```
751+
752+ would write to files like `gs://path/to/2016/08/04/dated-output*.json`.
753+
711754## Other Resources
712755
713756* [Example pipeline](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/yaml/examples)