|
954 | 954 | </ul>
|
955 | 955 | </nav>
|
956 | 956 |
|
| 957 | +</li> |
| 958 | + |
| 959 | + <li class="md-nav__item"> |
| 960 | + <a href="#contributions" class="md-nav__link"> |
| 961 | + <span class="md-ellipsis"> |
| 962 | + Contributions |
| 963 | + </span> |
| 964 | + </a> |
| 965 | + |
| 966 | + <nav class="md-nav" aria-label="Contributions"> |
| 967 | + <ul class="md-nav__list"> |
| 968 | + |
| 969 | + <li class="md-nav__item"> |
| 970 | + <a href="#function-write_levels" class="md-nav__link"> |
| 971 | + <span class="md-ellipsis"> |
| 972 | + Function write_levels() |
| 973 | + </span> |
| 974 | + </a> |
| 975 | + |
| 976 | +</li> |
| 977 | + |
| 978 | + </ul> |
| 979 | + </nav> |
| 980 | + |
957 | 981 | </li>
|
958 | 982 |
|
959 | 983 | </ul>
|
|
1548 | 1572 | </ul>
|
1549 | 1573 | </nav>
|
1550 | 1574 |
|
| 1575 | +</li> |
| 1576 | + |
| 1577 | + <li class="md-nav__item"> |
| 1578 | + <a href="#contributions" class="md-nav__link"> |
| 1579 | + <span class="md-ellipsis"> |
| 1580 | + Contributions |
| 1581 | + </span> |
| 1582 | + </a> |
| 1583 | + |
| 1584 | + <nav class="md-nav" aria-label="Contributions"> |
| 1585 | + <ul class="md-nav__list"> |
| 1586 | + |
| 1587 | + <li class="md-nav__item"> |
| 1588 | + <a href="#function-write_levels" class="md-nav__link"> |
| 1589 | + <span class="md-ellipsis"> |
| 1590 | + Function write_levels() |
| 1591 | + </span> |
| 1592 | + </a> |
| 1593 | + |
| 1594 | +</li> |
| 1595 | + |
| 1596 | + </ul> |
| 1597 | + </nav> |
| 1598 | + |
1551 | 1599 | </li>
|
1552 | 1600 |
|
1553 | 1601 | </ul>
|
@@ -3353,6 +3401,304 @@ <h3 id="zappend.api.ConfigLike" class="doc doc-heading">
|
3353 | 3401 | <p>Type for a zappend configuration-like object.</p>
|
3354 | 3402 | </div>
|
3355 | 3403 |
|
| 3404 | +</div><h2 id="contributions">Contributions</h2> |
| 3405 | + |
| 3406 | + |
| 3407 | +<div class="doc doc-object doc-module"> |
| 3408 | + |
| 3409 | + |
| 3410 | + |
| 3411 | + |
| 3412 | + <div class="doc doc-contents first"> |
| 3413 | + |
| 3414 | + <p>This module contributes to zappend's core functionality.</p> |
| 3415 | +<p>The function signatures in this module are less stable, and their implementations |
| 3416 | +are considered experimental. They may also rely on external packages. For more |
| 3417 | +information, please refer to the individual function documentation. |
| 3418 | +For these reasons, this module is excluded from the project's automatic |
| 3419 | +coverage analysis.</p> |
| 3420 | + |
| 3421 | + |
| 3422 | + |
| 3423 | + <div class="doc doc-children"> |
| 3424 | + |
| 3425 | + |
| 3426 | + |
| 3427 | + |
| 3428 | + |
| 3429 | + |
| 3430 | + |
| 3431 | + |
| 3432 | + |
| 3433 | + |
| 3434 | + |
| 3435 | + </div> |
| 3436 | + |
| 3437 | + </div> |
| 3438 | + |
| 3439 | +</div><h3 id="function-write_levels">Function <code>write_levels()</code></h3> |
| 3440 | + |
| 3441 | + |
| 3442 | +<div class="doc doc-object doc-function"> |
| 3443 | + |
| 3444 | + |
| 3445 | + |
| 3446 | + |
| 3447 | + <div class="doc doc-contents first"> |
| 3448 | + |
| 3449 | + <p>Write a dataset given by <code>source_ds</code> or <code>source_path</code> to <code>target_path</code> |
| 3450 | +using the |
| 3451 | +<a href="https://xcube.readthedocs.io/en/latest/mldatasets.html">multi-level dataset format</a> |
| 3452 | +as specified by |
| 3453 | +<a href="https://github.com/xcube-dev/xcube">xcube</a>.</p> |
| 3454 | +<p>It resembles the <code>store.write_data(dataset, "<name>.levels", ...)</code> method |
| 3455 | +provided by the xcube filesystem data stores ("file", "s3", "memory", etc.). |
| 3456 | +The zappend version may be used for potentially very large datasets in terms |
| 3457 | +of dimension sizes or for datasets with a very large number of chunks. |
| 3458 | +It is considerably slower than the xcube version (which basically uses |
| 3459 | +<code>xarray.to_zarr()</code> for each resolution level), but should run robustly with |
| 3460 | +stable memory consumption.</p> |
| 3461 | +<p>The function opens the source dataset and subdivides it into dataset slices |
| 3462 | +along the append dimension given by <code>append_dim</code>, which defaults |
| 3463 | +to <code>"time"</code>. The slice size in the append dimension is one. |
| 3464 | +Each slice is downsampled to the requested number of levels, and each resulting |
| 3465 | +slice level dataset is either used to create, or is appended to, the corresponding |
| 3466 | +level dataset of the target.</p> |
| 3467 | +<p>The target dataset's chunk size in the spatial x- and y-dimensions will |
| 3468 | +be the same as the specified (or derived) tile size. |
| 3469 | +The chunk size in the append dimension will be one. The chunking will be reflected as the |
| 3470 | +<code>variables</code> configuration parameter passed to each <code>zappend()</code> call. |
| 3471 | +If configuration parameter <code>variables</code> is also given as part of |
| 3472 | +<code>zappend_config</code>, it will be merged with the chunk definitions.</p> |
| 3473 | +<p><strong>Important notes:</strong></p> |
| 3474 | +<ul> |
| 3475 | +<li>This function depends on <code>xcube.core.gridmapping.GridMapping</code> and |
| 3476 | + <code>xcube.core.subsampling.subsample_dataset()</code> of the <code>xcube</code> package.</li> |
| 3477 | +<li><code>write_levels()</code> is not as robust as zappend itself. For example, |
| 3478 | + there may be inconsistent dataset levels if the processing |
| 3479 | + is interrupted while a level is appended.</li> |
| 3480 | +<li>There is a remaining issue with (coordinate) variables that |
| 3481 | + have a dimension that is not a dimension of any variable that has |
| 3482 | + one of the spatial dimensions, e.g., <code>time_bnds</code> with dimensions |
| 3483 | + <code>time</code> and <code>bnds</code>. Please exclude such variables using the parameter |
| 3484 | + <code>excluded_variables</code>.</li> |
| 3485 | +</ul> |
| 3486 | + |
| 3487 | + |
| 3488 | +<p><span class="doc-section-title">Parameters:</span></p> |
| 3489 | + <table> |
| 3490 | + <thead> |
| 3491 | + <tr> |
| 3492 | + <th>Name</th> |
| 3493 | + <th>Type</th> |
| 3494 | + <th>Description</th> |
| 3495 | + <th>Default</th> |
| 3496 | + </tr> |
| 3497 | + </thead> |
| 3498 | + <tbody> |
| 3499 | + <tr class="doc-section-item"> |
| 3500 | + <td><code>source_ds</code></td> |
| 3501 | + <td> |
| 3502 | + <code><span title="xarray.Dataset">Dataset</span> | None</code> |
| 3503 | + </td> |
| 3504 | + <td> |
| 3505 | + <div class="doc-md-description"> |
| 3506 | + <p>The source dataset. |
| 3507 | +Must be given in case <code>source_path</code> is not given.</p> |
| 3508 | + </div> |
| 3509 | + </td> |
| 3510 | + <td> |
| 3511 | + <code>None</code> |
| 3512 | + </td> |
| 3513 | + </tr> |
| 3514 | + <tr class="doc-section-item"> |
| 3515 | + <td><code>source_path</code></td> |
| 3516 | + <td> |
| 3517 | + <code>str | None</code> |
| 3518 | + </td> |
| 3519 | + <td> |
| 3520 | + <div class="doc-md-description"> |
| 3521 | + <p>The source dataset path. |
| 3522 | +If <code>source_ds</code> is provided and <code>link_level_zero</code> is true, |
| 3523 | +then <code>source_path</code> must also be provided in order |
| 3524 | +to determine the path of the level zero source.</p> |
| 3525 | + </div> |
| 3526 | + </td> |
| 3527 | + <td> |
| 3528 | + <code>None</code> |
| 3529 | + </td> |
| 3530 | + </tr> |
| 3531 | + <tr class="doc-section-item"> |
| 3532 | + <td><code>source_storage_options</code></td> |
| 3533 | + <td> |
| 3534 | + <code>dict[str, <span title="typing.Any">Any</span>] | None</code> |
| 3535 | + </td> |
| 3536 | + <td> |
| 3537 | + <div class="doc-md-description"> |
| 3538 | + <p>Storage options for the source |
| 3539 | +dataset's filesystem.</p> |
| 3540 | + </div> |
| 3541 | + </td> |
| 3542 | + <td> |
| 3543 | + <code>None</code> |
| 3544 | + </td> |
| 3545 | + </tr> |
| 3546 | + <tr class="doc-section-item"> |
| 3547 | + <td><code>source_append_offset</code></td> |
| 3548 | + <td> |
| 3549 | + <code>int | None</code> |
| 3550 | + </td> |
| 3551 | + <td> |
| 3552 | + <div class="doc-md-description"> |
| 3553 | + <p>Optional offset in the append dimension. |
| 3554 | +Only slices with indexes greater than or equal to the offset are |
| 3555 | +appended.</p> |
| 3556 | + </div> |
| 3557 | + </td> |
| 3558 | + <td> |
| 3559 | + <code>None</code> |
| 3560 | + </td> |
| 3561 | + </tr> |
| 3562 | + <tr class="doc-section-item"> |
| 3563 | + <td><code>target_path</code></td> |
| 3564 | + <td> |
| 3565 | + <code>str | None</code> |
| 3566 | + </td> |
| 3567 | + <td> |
| 3568 | + <div class="doc-md-description"> |
| 3569 | + <p>The target multi-level dataset path. |
| 3570 | +By convention, the filename extension should be <code>.levels</code>. |
| 3571 | +If not given, <code>target_dir</code> should be passed as part of the |
| 3572 | +<code>zappend_config</code>. (The name <code>target_path</code> is used here for |
| 3573 | +consistency with <code>source_path</code>.)</p> |
| 3574 | + </div> |
| 3575 | + </td> |
| 3576 | + <td> |
| 3577 | + <code>None</code> |
| 3578 | + </td> |
| 3579 | + </tr> |
| 3580 | + <tr class="doc-section-item"> |
| 3581 | + <td><code>num_levels</code></td> |
| 3582 | + <td> |
| 3583 | + <code>int | None</code> |
| 3584 | + </td> |
| 3585 | + <td> |
| 3586 | + <div class="doc-md-description"> |
| 3587 | + <p>Optional number of levels. |
| 3588 | +If not given, a reasonable number of levels is computed |
| 3589 | +from <code>tile_size</code>.</p> |
| 3590 | + </div> |
| 3591 | + </td> |
| 3592 | + <td> |
| 3593 | + <code>None</code> |
| 3594 | + </td> |
| 3595 | + </tr> |
| 3596 | + <tr class="doc-section-item"> |
| 3597 | + <td><code>tile_size</code></td> |
| 3598 | + <td> |
| 3599 | + <code>tuple[int, int] | None</code> |
| 3600 | + </td> |
| 3601 | + <td> |
| 3602 | + <div class="doc-md-description"> |
| 3603 | + <p>Optional tile size in the x- and y-dimension in pixels. |
| 3604 | +If not given, the tile size is computed from the source |
| 3605 | +dataset's chunk sizes in the x- and y-dimensions.</p> |
| 3606 | + </div> |
| 3607 | + </td> |
| 3608 | + <td> |
| 3609 | + <code>None</code> |
| 3610 | + </td> |
| 3611 | + </tr> |
| 3612 | + <tr class="doc-section-item"> |
| 3613 | + <td><code>xy_dim_names</code></td> |
| 3614 | + <td> |
| 3615 | + <code>tuple[str, str] | None</code> |
| 3616 | + </td> |
| 3617 | + <td> |
| 3618 | + <div class="doc-md-description"> |
| 3619 | + <p>Optional dimension names that identify the x- and y-dimensions. |
| 3620 | +If not given, derived from the source dataset's grid mapping, |
| 3621 | +if any.</p> |
| 3622 | + </div> |
| 3623 | + </td> |
| 3624 | + <td> |
| 3625 | + <code>None</code> |
| 3626 | + </td> |
| 3627 | + </tr> |
| 3628 | + <tr class="doc-section-item"> |
| 3629 | + <td><code>agg_methods</code></td> |
| 3630 | + <td> |
| 3631 | + <code>str | dict[str, <span title="typing.Any">Any</span>] | None</code> |
| 3632 | + </td> |
| 3633 | + <td> |
| 3634 | + <div class="doc-md-description"> |
| 3635 | + <p>An aggregation method for all data variables or a |
| 3636 | +mapping that provides the aggregation method for a variable |
| 3637 | +name. Possible aggregation methods are |
| 3638 | +<code>"first"</code>, <code>"min"</code>, <code>"max"</code>, <code>"mean"</code>, <code>"median"</code>.</p> |
| 3639 | + </div> |
| 3640 | + </td> |
| 3641 | + <td> |
| 3642 | + <code>None</code> |
| 3643 | + </td> |
| 3644 | + </tr> |
| 3645 | + <tr class="doc-section-item"> |
| 3646 | + <td><code>use_saved_levels</code></td> |
| 3647 | + <td> |
| 3648 | + <code>bool</code> |
| 3649 | + </td> |
| 3650 | + <td> |
| 3651 | + <div class="doc-md-description"> |
| 3652 | + <p>Whether a given, already written resolution level |
| 3653 | +serves as input to aggregation for the next level. If <code>False</code>, |
| 3654 | +the default, each resolution level other than zero is computed |
| 3655 | +from the source dataset. If <code>True</code>, the function may perform |
| 3656 | +significantly faster, but be aware that the aggregation |
| 3657 | +methods <code>"first"</code> and <code>"median"</code> will produce inaccurate results.</p> |
| 3658 | + </div> |
| 3659 | + </td> |
| 3660 | + <td> |
| 3661 | + <code>False</code> |
| 3662 | + </td> |
| 3663 | + </tr> |
| 3664 | + <tr class="doc-section-item"> |
| 3665 | + <td><code>link_level_zero</code></td> |
| 3666 | + <td> |
| 3667 | + <code>bool</code> |
| 3668 | + </td> |
| 3669 | + <td> |
| 3670 | + <div class="doc-md-description"> |
| 3671 | + <p>Whether to <em>not</em> write the level zero of the target |
| 3672 | +multi-level dataset and link it instead. In this case, a link |
| 3673 | +file <code>{target_path}/0.link</code> will be written. |
| 3674 | +If <code>False</code>, the default, a level dataset <code>{target_path}/0.zarr</code> |
| 3675 | +will be written instead.</p> |
| 3676 | + </div> |
| 3677 | + </td> |
| 3678 | + <td> |
| 3679 | + <code>False</code> |
| 3680 | + </td> |
| 3681 | + </tr> |
| 3682 | + <tr class="doc-section-item"> |
| 3683 | + <td><code>zappend_config</code></td> |
| 3684 | + <td> |
| 3685 | + </td> |
| 3686 | + <td> |
| 3687 | + <div class="doc-md-description"> |
| 3688 | + <p>Configuration passed to zappend as <code>zappend(slice, **zappend_config)</code> |
| 3689 | +for each slice in the append dimension. The zappend <code>config</code> |
| 3690 | +parameter is not supported.</p> |
| 3691 | + </div> |
| 3692 | + </td> |
| 3693 | + <td> |
| 3694 | + <code>{}</code> |
| 3695 | + </td> |
| 3696 | + </tr> |
| 3697 | + </tbody> |
| 3698 | + </table> |
| 3699 | + |
| 3700 | + </div> |
| 3701 | + |
3356 | 3702 | </div>
|
3357 | 3703 |
|
3358 | 3704 |
|
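The slicing loop described in the `write_levels()` docs above (slices of size one along the append dimension, each slice downsampled to every level and appended to that level's target) can be sketched in plain Python. This is only an illustration under stated assumptions: nested lists stand in for datasets, `subsample_first` and `write_levels_sketch` are hypothetical names, and a stride-based "first" aggregation is assumed. It is not the zappend or xcube implementation.

```python
def subsample_first(grid, step):
    """Keep every step-th row and column (a stand-in for "first" aggregation)."""
    return [row[::step] for row in grid[::step]]


def write_levels_sketch(source, num_levels, append_offset=0):
    """Hypothetical sketch: `source` is a list of 2D slices along the append dimension.

    Returns one list per level; each list plays the role of a level dataset
    that grows by one slice per iteration.
    """
    levels = [[] for _ in range(num_levels)]
    for index, slice_2d in enumerate(source):
        # Mirrors `source_append_offset`: skip slices below the offset.
        if index < append_offset:
            continue
        for level in range(num_levels):
            # Assumption: each level halves the resolution of the previous one.
            levels[level].append(subsample_first(slice_2d, 2 ** level))
    return levels


# Three time steps of a 4x4 grid with distinguishable values.
source = [
    [[t * 100 + y * 4 + x for x in range(4)] for y in range(4)]
    for t in range(3)
]
levels = write_levels_sketch(source, num_levels=3)
# (number of slices, rows, columns) per level
shapes = [(len(lv), len(lv[0]), len(lv[0][0])) for lv in levels]
print(shapes)
```

Because only one slice per level is held at a time, memory use in this sketch does not grow with the length of the append dimension, which is the property the docs claim for the real function.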
|
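The `use_saved_levels` description warns that `"median"` gives inaccurate results when each level is aggregated from the previously written one. A small pure-Python check makes the reason visible: a median of block medians is generally not the median of the full block, while a mean of equally sized block means is. The helper `aggregate` is a hypothetical stand-in, and the assumption that a level computed from the source uses blocks of size `2 ** level` is mine, not taken from the docs.

```python
from statistics import mean, median


def aggregate(grid, func, step):
    """Reduce a 2D grid by applying func to each step x step block."""
    return [
        [
            func([grid[y + dy][x + dx] for dy in range(step) for dx in range(step)])
            for x in range(0, len(grid[0]), step)
        ]
        for y in range(0, len(grid), step)
    ]


source = [
    [0, 0, 0, 0],
    [0, 10, 0, 10],
    [0, 0, 0, 0],
    [10, 10, 10, 10],
]

# Level 2 computed directly from the source (4x4 blocks).
direct_median = aggregate(source, median, 4)
direct_mean = aggregate(source, mean, 4)

# Level 2 computed from an already aggregated level 1 (2x2 blocks, twice).
chained_median = aggregate(aggregate(source, median, 2), median, 2)
chained_mean = aggregate(aggregate(source, mean, 2), mean, 2)

print("median:", direct_median, "vs", chained_median)
print("mean:  ", direct_mean, "vs", chained_mean)
```

For this grid the two median results differ while the two mean results agree, which matches the trade-off the parameter description states: chaining levels is faster but changes what some aggregation methods compute.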