feat: introducing the experimental package and refactoring test structure #433

Merged on Jan 22, 2025 (12 commits)
4 changes: 2 additions & 2 deletions tests/conftest.py → conftest.py
@@ -22,8 +22,8 @@
from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel
from nemo.collections.nlp.parts.nlp_overrides import NLPDDPStrategy
from nemo_aligner.models.nlp.gpt.megatron_gpt_ppo_actor import MegatronGPTActorModel
+from nemo_aligner.testing.utils import Utils
from nemo_aligner.utils.train_script_utils import init_distributed, resolve_and_create_trainer
-from tests.test_mcore_utilities import Utils

dir_path = os.path.dirname(os.path.abspath(__file__))
# TODO: This file exists because in cases where TRTLLM MPI communicators are involved,
@@ -67,7 +67,7 @@ def run_only_on_device_fixture(request, device):

@pytest.fixture
def init_model_parallel():
-    from tests.test_mcore_utilities import Utils
+    from nemo_aligner.testing.utils import Utils

    def initialize(*args, **kwargs):
        Utils.initialize_model_parallel(*args, **kwargs)
5 changes: 5 additions & 0 deletions docs/user-guide-experimental/README.md
@@ -0,0 +1,5 @@
# Experimental Docs

This directory contains documentation for features that are still experimental or under development and not yet ready for general use.

More context can be found in the [experimental/README.md](../../nemo_aligner/experimental/README.md) file.
50 changes: 50 additions & 0 deletions nemo_aligner/experimental/README.md
@@ -0,0 +1,50 @@
# Experimental Package

The `experimental` sub-package contains projects that are under active development and may not be fully stable.

## Experimental Project Directory Structure:

```
NeMo-Aligner/
├── docs/
│   ├── user-guide/
│   │   └── ppo.html
│   └── user-guide-experimental/    <----- experimental docs
│       └── new-thing.html
├── nemo_aligner/
│   ├── algorithms/
│   ├── data/
│   │   ├── datasets.py
│   │   └── tests/
│   │       └── datasets_test.py
│   └── experimental/               <----- experimental sub-package
│       └── <proj-name>/
│           ├── dataset.py          <----- experimental dataset
│           ├── new_algo.py         <----- experimental algo
│           ├── model.py            <----- experimental model
│           └── tests/
│               └── model_test.py   <----- experimental model test
└── tests/
    ├── functional/
    │   ├── dpo.sh
    │   └── test_cases/
    │       └── dpo-llama3
    └── functional_experimental/    <----- experimental functional tests (mirrors functional/ structure)
        ├── new_algo.sh
        └── test_cases/
            └── new_algo-llama3
```

The directories below exist to organize experimental projects (source code), tests, and documentation.

- [nemo_aligner/experimental/](../../nemo_aligner/experimental/): Main experimental sub-package containing projects under development
- [tests/functional_experimental/](../../tests/functional_experimental/): Functional tests for experimental projects
- [docs/user-guide-experimental/](../../docs/user-guide-experimental/): Documentation directory for experimental features and algorithms

The `experimental` sub-package follows a modular structure where each project has its own directory (sub-package) containing implementation and tests.
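
As a rough illustration of the per-project layout described above, the sketch below creates an empty project skeleton with its own `tests/` directory. The helper name `scaffold_experimental_project` is hypothetical and not part of NeMo-Aligner, and the `__init__.py` files are an assumption (that each project is an importable sub-package).

```python
from pathlib import Path


def scaffold_experimental_project(repo_root: Path, proj_name: str) -> Path:
    """Create nemo_aligner/experimental/<proj_name>/ with a tests/ directory.

    Hypothetical helper for illustration only; it is not part of NeMo-Aligner.
    """
    proj_dir = Path(repo_root) / "nemo_aligner" / "experimental" / proj_name
    tests_dir = proj_dir / "tests"
    # Creating tests/ with parents=True also creates the project directory.
    tests_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: projects are importable sub-packages, so add __init__.py files.
    for package_dir in (proj_dir, tests_dir):
        (package_dir / "__init__.py").touch()
    return proj_dir
```

Calling `scaffold_experimental_project(Path("."), "my_proj")` from the repository root would create `nemo_aligner/experimental/my_proj/` and `nemo_aligner/experimental/my_proj/tests/`; the implementation files (for example `model.py`) are left to the project author.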

## Guidelines for "experimental/" Projects

- **Scope**: Projects can include new model definitions, training loops, utilities, or unit tests.
- **Independence**: Projects should ideally be independent of one another. If a project depends on another project, that is a signal the shared code might benefit from being moved into core, with tests (and documentation, if applicable).
- **Testing**: Each project must include at least one functional test (see this [example](../../tests/functional/test_cases/dpo-llama3)).
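
The testing guideline could be checked mechanically. The sketch below lists experimental projects that have no matching functional test script. It assumes, purely for illustration, that the script is named `<proj-name>.sh` under `tests/functional_experimental/`; this README does not prescribe that naming, and the function name `projects_missing_functional_test` is hypothetical.

```python
from pathlib import Path
from typing import List


def projects_missing_functional_test(repo_root: Path) -> List[str]:
    """Return names of experimental projects without a functional test script.

    Assumption (not stated in this README): the functional test for a project
    lives at tests/functional_experimental/<proj-name>.sh.
    """
    root = Path(repo_root)
    experimental_dir = root / "nemo_aligner" / "experimental"
    functional_dir = root / "tests" / "functional_experimental"

    missing = []
    for entry in sorted(experimental_dir.iterdir()):
        # Skip plain files (README.md, __init__.py) and caches such as __pycache__.
        if not entry.is_dir() or entry.name.startswith("_"):
            continue
        if not (functional_dir / f"{entry.name}.sh").is_file():
            missing.append(entry.name)
    return missing
```

An empty return value means every project directory has a script under the assumed naming scheme; a different naming convention would only require changing the path that is built inside the loop.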
3 changes: 3 additions & 0 deletions tests/functional_experimental/README.md
@@ -0,0 +1,3 @@
# Experimental Functional Tests

More context can be found in the [experimental/README.md](../../nemo_aligner/experimental/README.md) file.
4 changes: 2 additions & 2 deletions tests/run_mpi_unit.sh
@@ -24,9 +24,9 @@ if [[ $NUM_GPUS_AVAILABLE -lt 2 ]]; then
fi

export PYTHONPATH=$(realpath ..):${PYTHONPATH:-}
-CUDA_VISIBLE_DEVICES=0,1 mpirun -np 2 --allow-run-as-root pytest .. -rA -s -x -vv --mpi $@ || true
+CUDA_VISIBLE_DEVICES=0,1 mpirun -np 2 --allow-run-as-root pytest ../nemo_aligner -rA -s -x -vv --mpi $@ || true

-if [[ -f PYTEST_SUCCESS ]]; then
+if [[ -f ../PYTEST_SUCCESS ]]; then
    echo SUCCESS
else
    echo FAILURE
4 changes: 2 additions & 2 deletions tests/run_unit.sh
@@ -24,9 +24,9 @@ if [[ $NUM_GPUS_AVAILABLE -lt 2 ]]; then
fi

export PYTHONPATH=$(realpath ..):${PYTHONPATH:-}
-CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 -m pytest .. -rA -s -x -vv $@ || true
+CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 -m pytest ../nemo_aligner -rA -s -x -vv $@ || true

-if [[ -f PYTEST_SUCCESS ]]; then
+if [[ -f ../PYTEST_SUCCESS ]]; then
    echo SUCCESS
else
    echo FAILURE