9 changes: 9 additions & 0 deletions .codespellrc
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
[codespell]
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
skip = .git*,*.svg,package-lock.json,*-lock.yaml,*.lock,*.css,.codespellrc,playground,*.jsonl,.cache,*/math.json,*setup.cfg
check-hidden = true
# Ignore embedded images, camelCase/PascalCase identifiers, and URLs
ignore-regex = ^\s*"image/\S+": ".*|\b[a-z]+[A-Z]\w*\b|\b[A-Z][a-z]+[A-Z]\w*\b|https?://\S+
# Domain-specific terms and variable names that are not typos
# ot: file extension, fro: Frobenius norm in PyTorch, alse: part of regex (F|f)alse, eles: variable name (elements)
ignore-words-list = ans,rouge,aci,nd,medias,te,ot,fro,alse,eles
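The `ignore-regex` above can be sanity-checked locally. Below is a rough token-level illustration in Python; note that codespell actually strips matching spans from each line before spell-checking, so this is an approximation, and `is_ignored` is a helper name introduced here for illustration only:

```python
import re

# The ignore-regex from .codespellrc: notebook image payloads,
# camelCase/PascalCase identifiers, and URLs should not be spell-checked.
IGNORE = re.compile(
    r'^\s*"image/\S+": ".*'      # embedded notebook images
    r'|\b[a-z]+[A-Z]\w*\b'       # camelCase identifiers
    r'|\b[A-Z][a-z]+[A-Z]\w*\b'  # PascalCase identifiers
    r'|https?://\S+'             # URLs
)

def is_ignored(token: str) -> bool:
    """Return True if the token matches any branch of the ignore pattern."""
    return IGNORE.search(token) is not None

print(is_ignored("getValue"))     # True: camelCase branch
print(is_ignored("https://x.y"))  # True: URL branch
print(is_ignored("teh"))          # False: a real typo would still be flagged
```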
25 changes: 25 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,25 @@
# Codespell configuration is within .codespellrc
---
name: Codespell

on:
push:
branches: [main]
pull_request:
branches: [main]

permissions:
contents: read

jobs:
codespell:
name: Check for spelling errors
runs-on: ubuntu-latest

steps:
- name: Checkout
uses: actions/checkout@v4
- name: Annotate locations with typos
uses: codespell-project/codespell-problem-matcher@v1
- name: Codespell
uses: codespell-project/actions-codespell@v2
2 changes: 1 addition & 1 deletion .gitignore
@@ -24,4 +24,4 @@ wheels/
**/.claude/
**/workspace/
**/CLAUDE.md
**/logs/
**/logs/.npm/
2 changes: 1 addition & 1 deletion README.md
@@ -260,7 +260,7 @@ For training, please refer to [`./deepanalyze/ms-swift/requirements.txt`](./deep
answer = deepanalyze.generate(prompt, workspace=workspace)
print(answer["reasoning"])
```
You shoud get a deep research report, which can be rendered as a PDF.:
You should get a deep research report, which can be rendered as a PDF:
```text
# Comprehensive Analysis of Student Enrollment Patterns and Institutional Transfers

2 changes: 1 addition & 1 deletion deepanalyze/SkyRL/skyagent/README.md
@@ -2,7 +2,7 @@

SkyAgent is a generic agent layer for training and evaluating agents.

SkyAgent is designed primarly for researchers to have a unified interface around implementing agentic tasks. A modular design allows researchers to
SkyAgent is designed primarily for researchers to have a unified interface around implementing agentic tasks. A modular design allows researchers to
1. bring in their own tasks
2. use any training backend or simply run evaluation
3. modify runtime implementation for a given task
2 changes: 1 addition & 1 deletion deepanalyze/SkyRL/skyagent/skyagent/agents/base.py
@@ -339,7 +339,7 @@ def _post_process_results(
has_finish_action_list.append(result.get("finish", False))
finish_reason_list.append(result.get("finish_reason", None))

# Encode messages, get assitant mask and position ids
# Encode messages, get assistant mask and position ids
prompt_encodings = self.tokenizer.apply_chat_template(
all_prompts,
# return_tensors="pt",
@@ -26,7 +26,7 @@ def get_instruction(cls, instance: Dict[str, Any]) -> str:
system_prompt = {
"role": "system",
"content": "Please solve the problem with the following tools and return the final answer inside the finish tool. \
If there are additional requirments such as the answer should be included inside \\boxed{}, please return the answer in the format of \
If there are additional requirements such as the answer should be included inside \\boxed{}, please return the answer in the format of \
<function=finish> \
<parameter=answer>\\boxed{'The final answer goes here.'}</parameter> \
</function>"
4 changes: 2 additions & 2 deletions deepanalyze/SkyRL/skyagent/skyagent/tasks/swebench/utils.py
@@ -85,10 +85,10 @@ def _get_swebench_workspace_dir_name(instance: pd.Series, dataset: str) -> str:

# Phase 1. READING: read the problem and reword it in clearer terms
# 1.1 If there are code or config snippets. Express in words any best practices or conventions in them.
# 1.2 Hightlight message errors, method names, variables, file names, stack traces, and technical details.
# 1.2 Highlight message errors, method names, variables, file names, stack traces, and technical details.
# 1.3 Explain the problem in clear terms.
# 1.4 Enumerate the steps to reproduce the problem.
# 1.5 Hightlight any best practices to take into account when testing and fixing the issue
# 1.5 Highlight any best practices to take into account when testing and fixing the issue

# Phase 2. RUNNING: install and run the tests on the repository
# 2.1 Follow the readme
@@ -506,7 +506,7 @@ def compute_score(solution_str: str,
if "\\pi" in extracted_model_output or "\\pi" in ground_truth:
equivs = []
for pi in [math.pi, 3.14]:
equivs.append(math_equal(extracted_model_output, ground_truth, tiemout=True, pi=pi))
equivs.append(math_equal(extracted_model_output, ground_truth, timeout=True, pi=pi))
correct = any(equivs)
else:
correct = math_equal(extracted_model_output, ground_truth, timeout=True)
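The multi-pi equivalence check in this hunk can be sketched in isolation. Here `approx_equal` is a hypothetical stand-in for the repository's `math_equal` checker, and answers are modeled as callables of pi — an assumption made purely for illustration:

```python
import math

def approx_equal(a: float, b: float, tol: float = 1e-2) -> bool:
    """Hypothetical stand-in for the project's math_equal numeric check."""
    return abs(a - b) <= tol

def pi_robust_equal(model_answer, ground_truth) -> bool:
    """Accept an answer if it matches under either math.pi or the rounded 3.14.

    Both arguments are callables taking the value to substitute for pi,
    mirroring expressions such as '2\\pi' evaluated with different pi values.
    """
    return any(
        approx_equal(model_answer(pi), ground_truth(pi))
        for pi in (math.pi, 3.14)
    )

# A model that computed 2*pi using pi=3.14 still matches the symbolic truth:
print(pi_robust_equal(lambda pi: 6.28, lambda pi: 2 * pi))  # True
```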
2 changes: 1 addition & 1 deletion deepanalyze/SkyRL/skyagent/skyagent/tools/prompt.py
@@ -11,5 +11,5 @@
2. **Key Extraction for Evidence**: Identify and extract the **most relevant information** from the content, you never miss any important information, output the **full original context** of the content as far as possible, it can be more than three paragraphs.
3. **Summary Output for Summary**: Organize into a concise paragraph with logical flow, prioritizing clarity and judge the contribution of the information to the goal.

**Final Output Format using JSON format has "rational", "evidence", "summary" feilds**
**Final Output Format using JSON format has "rational", "evidence", "summary" fields**
"""
@@ -83,7 +83,7 @@ class TimeoutException(Exception):


def timeout_handler(signum, frame):
print("timeout occured: alarm went off")
print("timeout occurred: alarm went off")
raise TimeoutException


@@ -195,7 +195,7 @@ def compile_code(code: str, timeout: int):
# else condition allows future extensibility to other platforms
compiled_sol = tmp_sol.Solution()
else:
# do nothing in the other case since function is accesible
# do nothing in the other case since function is accessible
compiled_sol = tmp_sol

assert compiled_sol is not None
@@ -389,9 +389,9 @@ def grade_stdio(
if stripped_prediction_line == stripped_gt_out_line:
continue

## CASE 2: element-wise comparision
## CASE 2: element-wise comparison
## if there are floating elements
## use `decimal` library for good floating point comparision
## use `decimal` library for good floating point comparison
## otherwise gotcha: np.isclose(50000000000000000, 50000000000000001) = True
## note that we should always be able to convert to decimals
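The gotcha these comments warn about is easy to reproduce; a minimal standard-library sketch:

```python
from decimal import Decimal

# Large integers differing by 1 collapse to the same 64-bit float,
# so float-based closeness checks wrongly report them equal.
a, b = "50000000000000000", "50000000000000001"
print(float(a) == float(b))      # True: precision lost past 2**53
print(Decimal(a) == Decimal(b))  # False: Decimal compares exactly
```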

6 changes: 3 additions & 3 deletions deepanalyze/SkyRL/skyrl-train/docs/configuration/config.rst
@@ -468,7 +468,7 @@ Weight Transfer Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ``generator.weight_sync_backend``: Backend to use for weight synchronization. Currently, we support ``nccl`` and ``gloo``.
- ``generator.override_existing_update_group``: Whether to override the existing update group for the inference engine. This is applicable only for remote inference engines. During training, `skyrl-train` forms a custom process group ("update group") with the rank 0 training worker and all the inference engine ranks. If ``override_existing_update_group=enable``, then during initialization, a previous weight update group will be overriden in the inference engine. For example, if you have a remote server setup and you run training for the same model multiple times, it is helpful to override the previous update group. We recommend leaving this to ``auto`` - since it will automatically determine if the previous update group should be overridden based on ``run_engines_locally``.
- ``generator.override_existing_update_group``: Whether to override the existing update group for the inference engine. This is applicable only for remote inference engines. During training, `skyrl-train` forms a custom process group ("update group") with the rank 0 training worker and all the inference engine ranks. If ``override_existing_update_group=enable``, then during initialization, a previous weight update group will be overridden in the inference engine. For example, if you have a remote server setup and you run training for the same model multiple times, it is helpful to override the previous update group. We recommend leaving this at ``auto``, since it will automatically determine if the previous update group should be overridden based on ``run_engines_locally``.

Inference Engine Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -481,8 +481,8 @@ Inference Engine Configuration
- ``generator.vllm_v1_disable_multiproc``: If ``true``, this will set ``VLLM_ENABLE_V1_MULTIPROCESSING=0`` in the environment, which makes the scheduling deterministic. This is useful for reproducibility.
- ``generator.enable_prefix_caching``: Whether to enable prefix caching for the inference engine. Applicable only when ``backend="vllm"``. This can be left to the default ``true`` in most cases. Note that in the case of remote inference engines, you would need to match the setting used when you initialized the remote servers.
- ``generator.enable_chunked_prefill``: Whether to enable chunked prefill for the inference engine. Applicable only when ``backend="vllm"``. With vLLM, this can be left to the default ``true`` in most cases.
- ``generator.max_num_seqs``: Continous batching parameter for vLLM. Maximum number of sequences to pack into a batch.
- ``generator.max_num_batched_tokens``: Continous batching parameter for vLLM. Maximum number of tokens to pack into a batch.
- ``generator.max_num_seqs``: Continuous batching parameter for vLLM. Maximum number of sequences to pack into a batch.
- ``generator.max_num_batched_tokens``: Continuous batching parameter for vLLM. Maximum number of tokens to pack into a batch.


Generation Parameters
@@ -45,7 +45,7 @@ CPU tests
GPU tests
~~~~~~~~~

The GPU tests require a node with atleast 8 GPUs. They have been tested on a 8xH100 node, but should work even on 8xA100 nodes. We are actively working on making these more accessible.
The GPU tests require a node with at least 8 GPUs. They have been tested on an 8xH100 node, but should work even on 8xA100 nodes. We are actively working on making these more accessible.

.. code-block:: bash

8 changes: 4 additions & 4 deletions deepanalyze/SkyRL/skyrl-train/docs/tutorials/tools_guide.rst
@@ -12,7 +12,7 @@ Core Concepts

**ToolGroup**: A ``ToolGroup`` is a collection of related tools that share the same context or states. Tool groups enable all tools within the group to access and modify the shared state, such as a shared database connection or cache.

**Environment**: An ``Environment`` is a class that defines the task for the agent to solve, and can integrate one ore more tool groups for the agent to use. See the following doc for more details on how to build an environment: :doc:`new_env`.
**Environment**: An ``Environment`` is a class that defines the task for the agent to solve, and can integrate one or more tool groups for the agent to use. See the following doc for more details on how to build an environment: :doc:`new_env`.


ToolGroup and the @tool Decorator
@@ -84,14 +84,14 @@ Search ToolGroup
Environment Integration
------------------------

Tools groups can be integrated into any environment in SkyGym-RL. The base environment class for text-based environments is ``BaseTextEnv``, which provides simple utilities for managing and using multiple tool groups in a single envrionment.
Tool groups can be integrated into any environment in SkyGym-RL. The base environment class for text-based environments is ``BaseTextEnv``, which provides simple utilities for managing and using multiple tool groups in a single environment.

The following sub-sections walk through integrating and using tools in an environment.

Tool Initialization
~~~~~~~~~~~~~~~~~~~

To incorporate tools into an envrionment, first build and initialize the tool groups during environment construction:
To incorporate tools into an environment, first build and initialize the tool groups during environment construction:


.. code-block:: python
@@ -112,7 +112,7 @@ To incorporate tools into an envrionment, first build and initialize the tool gr
Tool Execution
~~~~~~~~~~~~~~

To use a tool and get the result, you can call the ``_execute_tool`` (provided by ``BaseTextEnv``) method with the tool group name, tool name, and the tool input. Tools are most often used in the envrionment ``step`` method.
To use a tool and get the result, you can call the ``_execute_tool`` (provided by ``BaseTextEnv``) method with the tool group name, tool name, and the tool input. Tools are most often used in the environment ``step`` method.

.. code-block:: python

2 changes: 1 addition & 1 deletion deepanalyze/SkyRL/skyrl-train/import_utils.py
@@ -2783,7 +2783,7 @@ def propagate_frozenset(unordered_import_structure):

else:
# If k is not a frozenset, it means that the dictionary is not "level": some keys (top-level)
# are frozensets, whereas some are not -> frozenset keys are at an unkown depth-level of the
# are frozensets, whereas some are not -> frozenset keys are at an unknown depth-level of the
# dictionary.
#
# We recursively propagate the frozenset for this specific dictionary so that the frozensets
2 changes: 1 addition & 1 deletion deepanalyze/SkyRL/skyrl-train/pyproject.toml
@@ -53,7 +53,7 @@ skyrl-gym = { path = "./skyrl-gym" , editable = true }
torch = { index = "pytorch-cu128" }
torchvision = { index = "pytorch-cu128" }
flash-attn = { url = "https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.0.post2/flash_attn-2.8.0.post2+cu12torch2.7cxx11abiFALSE-cp312-cp312-linux_x86_64.whl" }
# NOTE (sumanthrh): We explictly use a flashinfer wheel from their index.
# NOTE (sumanthrh): We explicitly use a flashinfer wheel from their index.
# The wheels on PyPI don't come with pre-compiled kernels and the package will JIT compile them at runtime which is slow.
# additionally, different inference engines may pin different compatible flashinfer versions, so we provide the option to pin different versions for vllm/sglang
flashinfer-python = [
@@ -48,7 +48,7 @@ class Experience:
base_action_log_probs: (B, A)
values: (B, A)
returns: (B, A)
advatanges: (B, A)
advantages: (B, A)
attention_mask: (B, S)
action_mask: (B, A)
kl: (B, A)
@@ -124,7 +124,7 @@ class BufferItem:
base_action_log_probs: (A)
values: (1)
returns: (1)
advatanges: (1)
advantages: (1)
attention_mask: (S)
loss_mask: (A)
action_mask: (A)
@@ -78,7 +78,7 @@ def setup_distributed(self, timeout=timedelta(minutes=30)) -> None:
if local_rank != -1:
torch.cuda.set_device(local_rank)

# Initializes the distributed backend which will take care of sychronizing nodes/GPUs
# Initializes the distributed backend which will take care of synchronizing nodes/GPUs
deepspeed.init_distributed(timeout=timeout)
self.world_size = dist.get_world_size()
self.accumulated_gradient = (
@@ -139,7 +139,7 @@ def apply_monkey_patch(
), f"num_attention_heads {num_attention_heads} must be divisible by ulysses_sp_size {ulysses_sp_size}"
assert (
num_key_value_heads % ulysses_sp_size == 0 or ulysses_sp_size % num_key_value_heads == 0
), f"num_key_value_heads {num_key_value_heads} must be divisible by ulysses_sp_size {ulysses_sp_size}or vise versa. Upon ulysses_sp_size % num_key_value_heads == 0,kv heads are repeated to ensure correctness."
), f"num_key_value_heads {num_key_value_heads} must be divisible by ulysses_sp_size {ulysses_sp_size} or vice versa. Upon ulysses_sp_size % num_key_value_heads == 0, kv heads are repeated to ensure correctness."
# transformers<=4.47.1
if use_remove_padding or ulysses_sp_size > 1:
if hasattr(module, "_flash_attention_forward"):
@@ -128,7 +128,7 @@ def get_train_dataset(self):
# make sure the dataset is large enough to train on
assert (
len(prompts_dataset) >= self.cfg.trainer.train_batch_size
), f"dataset should be atleast as large as `train_batch_size` {self.cfg.trainer.train_batch_size}, got size {len(prompts_dataset)}"
), f"dataset should be at least as large as `train_batch_size` {self.cfg.trainer.train_batch_size}, got size {len(prompts_dataset)}"
return prompts_dataset

def get_eval_dataset(self):
@@ -286,7 +286,7 @@ async def update_named_weights(self, request: NamedWeightsUpdateRequest):
raise ValueError(f"Expected update weight request with 'names' entry, got keys: {request.keys()}")

if not len(request["names"]):
raise ValueError("Update weight request should have atleast one entry in 'names'")
raise ValueError("Update weight request should have at least one entry in 'names'")

engine = self._get_engine()
# Use IPC if handles are provided
@@ -387,7 +387,7 @@ async def update_named_weights(self, request: NamedWeightsUpdateRequest):
raise ValueError(f"Expected update weight request with 'names' entry, got keys: {request.keys()}")

if not len(request["names"]):
raise ValueError("Update weight request should have atleast one entry in 'names'")
raise ValueError("Update weight request should have at least one entry in 'names'")

engine = self._get_engine()
# Use IPC if handles are provided
@@ -491,7 +491,7 @@ async def _destroy_weights_update_group(self):
# raise ValueError(f"Expected update weight request with 'names' entry, got keys: {request.keys()}")

# if not len(request["names"]):
# raise ValueError("Update weight request should have atleast one entry in 'names'")
# raise ValueError("Update weight request should have at least one entry in 'names'")

# engine = self._get_engine()
# # Use IPC if handles are provided
4 changes: 2 additions & 2 deletions deepanalyze/SkyRL/skyrl-train/skyrl_train/models.py
@@ -831,8 +831,8 @@ def get_llm_for_sequence_regression(
# https://github.com/huggingface/transformers/issues/26877
model.config.use_cache = False

# NOTE: For reward model training only, intialize value_head manually
# because deepspeed.zero.Init() will not intialize them.
# NOTE: For reward model training only, initialize value_head manually
# because deepspeed.zero.Init() will not initialize them.
# TODO: Find a better way to clarify reward model training.
if init_value_head:
value_head = getattr(model, value_head_prefix)
@@ -327,7 +327,7 @@ def register(cls, name: Union[str, StrEnum], func: Callable):
If ray is initialized, this function will get or create a named ray actor (RegistryActor)
for the registry, and sync the registry to the actor.

If ray is not initalized, the function will be stored in the local registry only.
If ray is not initialized, the function will be stored in the local registry only.

To make sure all locally registered functions are available to all ray processes,
call sync_with_actor() after ray.init().
@@ -154,7 +154,7 @@ def logprobs_from_logits_v2(
logits_labels - logsumexp_values
) # log_softmax(x_i) = x_i - logsumexp(x)
else:
# logsumexp approach is unstable with bfloat16, fall back to slightly less efficent approach
# logsumexp approach is unstable with bfloat16, fall back to slightly less efficient approach
logprobs_labels = []
for row_logits, row_labels in zip(
logits, labels
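The identity noted in this hunk, log_softmax(x_i) = x_i - logsumexp(x), can be illustrated in plain Python. This is a self-contained sketch of the math only, not the project's torch implementation:

```python
import math

def logprob_from_logits(row_logits, label):
    """Per-label log-prob via log_softmax(x_i) = x_i - logsumexp(x).

    Uses the usual max-subtraction trick for numerical stability.
    """
    m = max(row_logits)
    logsumexp = m + math.log(sum(math.exp(x - m) for x in row_logits))
    return row_logits[label] - logsumexp

row = [2.0, 1.0, 0.5]
# Exponentiating the per-label log-probs recovers a distribution summing to 1.
probs = [math.exp(logprob_from_logits(row, i)) for i in range(len(row))]
print(round(sum(probs), 6))  # 1.0
```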