Releases: pytask-dev/pytask

v0.4.2

08 Nov 11:57
7592664

Highlights

This release contains a new feature and some improvements for users.

  • 🚀 The new feature is the pytask.DataCatalog that allows users to manage dependencies and products in projects more easily. Read the tutorial to get started. 🚀
  • File changes are now detected by hashes instead of modification timestamps. It should prevent accidental executions when working with cloud storage providers like Dropbox or OneDrive and in many other situations. To save runtime, pytask uses a cache for the hashes when the modification timestamp has not changed.
  • Nodes now have signatures that separate how nodes are named and displayed from how nodes are identified internally. If you have written a custom node, please update it according to the how-to guide.
  • All of pytask's internal files are now stored in a .pytask folder in your project. The file .pytask.sqlite3 is moved to this location as well. Add .pytask to your .gitignore to prevent accidentally committing the folder.
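The hash-plus-cache idea from the second highlight can be sketched with the standard library alone. This is a minimal illustration and not pytask's implementation: the names `file_hash`, `has_changed`, and `_cache`, as well as the choice of SHA-256, are assumptions made for the example.

```python
import hashlib
from pathlib import Path

# Hypothetical in-memory cache: (path, modification time) -> hex digest.
_cache: dict[tuple[str, float], str] = {}


def file_hash(path: Path) -> str:
    """Return the SHA-256 digest of a file.

    The digest is only recomputed when the modification timestamp differs
    from the one seen before, which saves runtime for unchanged files.
    """
    key = (str(path), path.stat().st_mtime)
    if key not in _cache:
        _cache[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return _cache[key]


def has_changed(path: Path, stored_hash: str) -> bool:
    """A file counts as changed only if its content hash differs."""
    return file_hash(path) != stored_hash
```

With this scheme, a sync client that rewrites a file with identical content updates the timestamp but leaves the hash untouched, so nothing would be rerun.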

What's Changed

Full Changelog: v0.4.1...v0.4.2

v0.4.1

11 Oct 08:16
ce6a825

What's Changed

Of course, it's a mandatory bug fix release after a bigger release.

The product annotation Annotated[..., Product] did not work with multiple products.

Full Changelog: v0.4.0...v0.4.1

v0.4.0

07 Oct 18:18
b191559

News

pytask turned three years old in July, which is a suitable occasion to rethink pytask's design and blow the dust off some of its oldest components.

Here are the highlights of v0.4.0 🚀 ⭐

Highlights

New interfaces for products.

Every argument can be declared as a product with the new Product annotation. The path can be passed as a default value.

from pathlib import Path

from pytask import Product
from typing_extensions import Annotated


def task_hello_earth(path: Annotated[Path, Product] = Path("hello_earth.txt")):
    path.write_text("Hello, earth!")

More explanation can be found at https://tinyurl.com/yrezszr4.

It is also possible to use the return value of the task function as a product, which allows wrapping any third-party function as a task function. Read more about it here: https://tinyurl.com/pytask-return.

from pathlib import Path

from typing_extensions import Annotated


def task_hello_earth() -> Annotated[str, Path("hello_earth.txt")]:
    return "Hello, earth!"

Every task argument is a dependency

In older pytask versions, only paths were treated as task dependencies. That meant when you passed other arguments to the task, and they changed, it did not trigger a rerun of the task.

Now, every argument to a task can be a dependency, and you can hash them if they should trigger a rerun. It is explained in https://tinyurl.com/pytask-hash.

from pathlib import Path
from typing import Annotated

from pytask import Product
from pytask import PythonNode


def task_example(
    text: Annotated[str, PythonNode(value="Hello, World", hash=True)],
    path: Annotated[Path, Product] = Path("file.txt"),
) -> None:
    path.write_text(text)
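Why hashing an argument lets a build system notice a changed value can be sketched without pytask. The helper `stable_hash` below is hypothetical and is not how PythonNode is implemented; it only illustrates that a digest stored between runs differs when the value differs. It uses hashlib rather than the built-in hash() because Python salts string hashes per process by default, so hash() is not comparable across runs.

```python
import hashlib


def stable_hash(value: str) -> str:
    """Return a digest that is identical across interpreter runs."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Digest recorded after a previous run (hypothetical stored state).
stored = stable_hash("Hello, World")

# Same value: no rerun needed. Changed value: rerun.
needs_rerun_same = stable_hash("Hello, World") != stored
needs_rerun_changed = stable_hash("Hello, Earth") != stored
```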

A new functional interface

The functional interface for pytask has been reworked and accepts a list of task functions. You can use it within your terminal or a Jupyter notebook. Read this guide to learn more about it: https://tinyurl.com/pytask-functional.

from pathlib import Path
from typing import Annotated

from pytask import build


def create_text() -> Annotated[str, Path("hello_earth.txt")]:
    return "Hello, earth!"


session = build(tasks=[create_text])

Custom Nodes through Protocols

In the newest version, nodes (dependencies and products) and tasks follow protocols. It allows for customizations like PickleNodes that store any Python object as a pickle file and inject the object into the task when used as a dependency. It is explained in more detail in this guide: https://tinyurl.com/pytask-custom-nodes.
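The idea of protocol-based nodes can be sketched with typing.Protocol from the standard library. This is an illustration of structural typing only: `NodeLike` and `SimplePickleNode` are hypothetical names, and the protocol shown here is deliberately smaller than whatever pytask's real node protocol requires (see the linked guide for that).

```python
import pickle
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class NodeLike(Protocol):
    """Hypothetical protocol: anything with load() and save() qualifies."""

    def load(self) -> Any: ...

    def save(self, value: Any) -> None: ...


@dataclass
class SimplePickleNode:
    """Stores any picklable Python object in a file.

    The class does not inherit from NodeLike; providing the methods
    is enough to satisfy the protocol.
    """

    path: Path

    def load(self) -> Any:
        return pickle.loads(self.path.read_bytes())

    def save(self, value: Any) -> None:
        self.path.write_bytes(pickle.dumps(value))


with tempfile.TemporaryDirectory() as tmp:
    node = SimplePickleNode(Path(tmp) / "data.pkl")
    node.save({"answer": 42})
    restored = node.load()
    is_node = isinstance(node, NodeLike)
```

Because the check is structural, a node backed by a database or remote storage would fit the same protocol as long as it offers the same methods.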

Other notable changes

  • Python 3.12 is supported, and support for Python 3.7 is dropped.
  • @pytask.mark.depends_on and @pytask.mark.produces are deprecated. Better options for defining dependencies and products are explained in https://tinyurl.com/yrezszr4.
  • @pytask.mark.task is also deprecated and replaced by from pytask import task and @task.

What's Changed

Full Changelog: v0.3.2...v0.4.0

v0.4.0rc4

04 Oct 17:25
624ecdf
Pre-release

The last pre-release.

v0.4.0rc3

02 Oct 22:47
02bf5e4
Pre-release

A couple of new fixes, most notably a fix for the IDs of PythonNodes that should prevent rebuilds.

v0.4.0rc2

26 Sep 18:11
f3193ae
Pre-release

Another release candidate that fixes the installation via conda and adds full support for pytask-parallel.

v0.4.0rc1

22 Sep 10:20
Pre-release

This is the first release candidate for the v0.4.* release series.

The final release still requires some changes. For example, the documentation needs to be extended. But the essential parts are already there, and it is time to collect some final feedback! Let me know what you think and what needs to be improved. You can comment in the discussion for this release, #422.

To install the pre-release, use

$ pip install pytask --pre
$ conda install -c "conda-forge/label/pytask_rc" pytask

Now, let's take a look at the changes.

What's Changed

New

  • Dependencies and products of tasks have new interfaces that are explained in this tutorial.
  • You can now also declare products by using the return value of task functions. Follow this guide.
  • If you have inputs to task functions that should be hashed to detect any changes, follow this guide.
  • Before, only pathlib.Paths received special treatment as dependencies or products to task functions. Now, it is possible to define your own nodes that simplify, for example, loading pickle files as this guide explains. But many more extensions are possible, like defining data in an S3 bucket as a dependency or product.
  • The functional interface has been reworked and now accepts tasks directly, allowing you to execute pytask on the command line or in Jupyter notebooks. The documentation must still be written, but here is your starting point.

Removals

  • Python 3.7 is no longer supported.
  • @pytask.mark.parametrize is removed. Follow this tutorial instead.

Deprecations

  • @pytask.mark.depends_on and @pytask.mark.produces are deprecated and will be removed in v0.5.0.
  • @pytask.mark.task is deprecated. Use @pytask.task instead.
  • Paths defined as strings are deprecated and should be replaced with proper pathlib.Path objects.

Full list of changes

Full Changelog: v0.3.2...v0.4.0rc1

v0.3.2

07 Jun 09:40
39bfcdf

Highlights

This release contains the following highlights:

  • Previously, if you accidentally hit the save button on an unchanged task file, pytask would rerun the task although nothing had changed. Now, pytask does not rerun the task because it compares the hashes of task files in addition to the modification timestamps.
  • If you want to force tasks to rerun, there is now a --force flag. Take the function name or ID of the task and run pytask -k <task id> --force, and the task and any tasks it requires will be executed. Alternatively, delete a product of the task you want to rerun.
  • The import mechanism for task modules has been reworked, and errors resolved. Thanks to @NickCrews!

Additionally, the @pytask.mark.parametrize decorator is deprecated and will be removed in pytask v0.4. If you use the decorator, you will have two options:

  1. (Recommended) Upgrade your code to the new approach for repeating tasks described in this tutorial.
  2. Or, pin pytask to pytask<0.4 and silence the deprecation warning by setting silence_parametrize_deprecation = true in your pyproject.toml under [tool.pytask.ini_options].

What's Changed

New Contributors

Full Changelog: v0.3.1...v0.3.2

v0.3.1

24 Jan 23:39
a2cd949

What's Changed

  • Fix bug when passing no path on the command line. by @tobiasraabe in #337
  • [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #341
  • [automated] Update plugin list by @github-actions in #340

Full Changelog: v0.3.0...v0.3.1

v0.3.0

22 Jan 13:45
88326dd

Highlights

This release includes a breaking change due to internal refactorings. The change affects how command line options and the configuration file are loaded and validated. For users, the changes are subtle; the help pages of the commands have prettier options and default values.

Make sure to upgrade pytask and the plugins to v0.3 or pin the packages to <0.3.

There is some delay until the updates for pytask and its plugins are available. Be aware of errors when using mixed v0.2 and v0.3 installations.

The most significant benefit is for developers who want to add command line options and configuration values. The parsing can now be handled with proper click types, for example, EnumChoice to implement choice options. Defaults are attached to command line options and are automatically displayed in the help pages.

What's Changed

Full Changelog: v0.2.7...v0.3.0