Merged
37 commits
45bccd3
fix permissions alignment
easong-openai Sep 16, 2025
a8026d3
fix: read-only escalations (#3673)
dylan-hurd-oai Sep 16, 2025
51216eb
merge(upstream): merge upstream/main into upstream-merge (by-bucket)\…
just-every-code Sep 16, 2025
5e2c4f7
Update azure model provider example (#3680)
pakrym-oai Sep 16, 2025
f30e6d1
Merge remote-tracking branch 'origin/main' into upstream-merge
just-every-code Sep 16, 2025
fbac43f
merge(upstream): integrate openai/codex@5e2c4f7e3 (docs update); veri…
just-every-code Sep 16, 2025
116c497
chore(merge): enforce policy removals (images + cargo caches)
github-actions[bot] Sep 16, 2025
2446873
Persist search items (#3745)
aibrahim-oai Sep 16, 2025
59d96c3
Merge remote-tracking branch 'origin/main' into upstream-merge
just-every-code Sep 16, 2025
1128565
fix: Record EnvironmentContext in SendUserTurn (#3678)
dylan-hurd-oai Sep 16, 2025
84d3460
Merge upstream/main: adopt upstream updates; no conflicts; preserve f…
just-every-code Sep 16, 2025
02b4652
merge(upstream): merge upstream/main into upstream-merge with buckete…
just-every-code Sep 16, 2025
7fe4021
Review mode core updates (#3701)
dedrisian-oai Sep 16, 2025
57505da
merge(upstream): sync with openai/codex@main (by-bucket); preserve fo…
just-every-code Sep 16, 2025
b8d2b1a
restyle thinking outputs (#3755)
nornagon-openai Sep 16, 2025
6e73b39
merge(upstream): merge openai/codex main into upstream-merge (by-buck…
just-every-code Sep 17, 2025
72733e3
Add dev message upon review out (#3758)
dedrisian-oai Sep 17, 2025
870b5b2
merge(upstream): incorporate review module and tests; keep fork core …
just-every-code Sep 17, 2025
694565f
chore(merge): enforce policy removals (images + cargo caches)
github-actions[bot] Sep 17, 2025
791d7b1
fix: make GitHub Action publish to npm using trusted publishing (#3431)
bolinfest Sep 17, 2025
cc7c755
merge(upstream): integrate upstream/main into upstream-merge (by-bucket)
just-every-code Sep 17, 2025
38dfd6a
docs(upstream-merge): add MERGE_PLAN and MERGE_REPORT under .github/auto
just-every-code Sep 17, 2025
c4a09f1
chore(merge): enforce policy removals (images + cargo caches)
github-actions[bot] Sep 17, 2025
5d87f5d
fix: ensure pnpm is installed before running `npm install` (#3763)
bolinfest Sep 17, 2025
eb49084
merge(upstream): upstream/main into upstream-merge (by-bucket)\n\n- K…
just-every-code Sep 17, 2025
5332f6e
fix: make publish-npm its own job with specific permissions (#3767)
bolinfest Sep 17, 2025
a6b70b0
Merge remote-tracking branch 'origin/main' into upstream-merge
just-every-code Sep 17, 2025
0ee9520
Merge upstream/main into upstream-merge: preserve workflow removal pe…
just-every-code Sep 17, 2025
e5fdb5b
fix: specify --repo when calling gh (#3806)
bolinfest Sep 17, 2025
208089e
AGENTS.md: Add instruction to install missing commands (#3807)
abhishek-oai Sep 17, 2025
530382d
Use agent reply text in turn notifications (#3756)
nornagon-openai Sep 17, 2025
459a7f5
Merge upstream/main into upstream-merge: keep our workflows and AGENT…
just-every-code Sep 17, 2025
47b03f8
merge(upstream): sync with openai/codex main into upstream-merge
just-every-code Sep 17, 2025
cd82c17
docs(upstream-merge): add MERGE_PLAN, MERGE_REPORT and PR body for th…
just-every-code Sep 17, 2025
cffba05
chore(merge): enforce policy removals (images + cargo caches)
github-actions[bot] Sep 17, 2025
c950548
chore: update "Codex CLI harness, sandboxing, and approvals" section …
tibo-openai Sep 17, 2025
1ec80a9
merge(upstream): merge upstream/main into upstream-merge (by-bucket)\…
just-every-code Sep 18, 2025
30 changes: 17 additions & 13 deletions codex-rs/core/gpt_5_codex_prompt.md
@@ -26,37 +26,41 @@ When using the planning tool:

## Codex CLI harness, sandboxing, and approvals

The Codex CLI harness supports several different sandboxing, and approval configurations that the user can choose from.
The Codex CLI harness supports several different configurations for sandboxing and escalation approvals that the user can choose from.

Filesystem sandboxing defines which files can be read or written. The options are:
- **read-only**: You can only read files.
- **workspace-write**: You can read files. You can write to files in this folder, but not outside it.
- **danger-full-access**: No filesystem sandboxing.
Filesystem sandboxing defines which files can be read or written. The options for `sandbox_mode` are:
- **read-only**: The sandbox only permits reading files.
- **workspace-write**: The sandbox permits reading files, and editing files in `cwd` and `writable_roots`. Editing files in other directories requires approval.
- **danger-full-access**: No filesystem sandboxing - all commands are permitted.

Network sandboxing defines whether network can be accessed without approval. Options are
Network sandboxing defines whether network can be accessed without approval. Options for `network_access` are:
- **restricted**: Requires approval
- **enabled**: No approval needed

Approvals are your mechanism to get user consent to perform more privileged actions. Although they introduce friction to the user because your work is paused until the user responds, you should leverage them to accomplish your important work. Do not let these settings or the sandbox deter you from attempting to accomplish the user's task unless it is set to "never", in which case never ask for approvals.

Approval options are
Approvals are your mechanism to get user consent to run shell commands without the sandbox. Possible configuration options for `approval_policy` are
- **untrusted**: The harness will escalate most commands for user approval, apart from a limited allowlist of safe "read" commands.
- **on-failure**: The harness will allow all commands to run in the sandbox (if enabled), and failures will be escalated to the user for approval to run again without the sandbox.
- **on-request**: Commands will be run in the sandbox by default, and you can specify in your tool call if you want to escalate a command to run without sandboxing. (Note that this mode is not always available. If it is, you'll see parameters for it in the `shell` command description.)
- **never**: This is a non-interactive mode where you may NEVER ask the user for approval to run commands. Instead, you must always persist and work around constraints to solve the task for the user. You MUST do your utmost best to finish the task and validate your work before yielding. If this mode is paired with `danger-full-access`, take advantage of it to deliver the best outcome for the user. Further, in this mode, your default testing philosophy is overridden: Even if you don't see local patterns for testing, you may add tests and scripts to validate your work. Just remove them before yielding.

When you are running with approvals `on-request`, and sandboxing enabled, here are scenarios where you'll need to request approval:
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /tmp)
When you are running with `approval_policy == on-request`, and sandboxing enabled, here are scenarios where you'll need to request approval:
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /var)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval.
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `with_escalated_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (for all of these, you should weigh alternative paths that do not require approval)

When sandboxing is set to read-only, you'll need to request approval for any command that isn't a read.
When `sandbox_mode` is set to read-only, you'll need to request approval for any command that isn't a read.

You will be told what filesystem sandboxing, network sandboxing, and approval mode are active in a developer or user message. If you are not told about this, assume that you are running with workspace-write, network sandboxing enabled, and approval on-failure.

Although they introduce friction to the user because your work is paused until the user responds, you should leverage them when necessary to accomplish important work. If the completing the task requires escalated permissions, Do not let these settings or the sandbox deter you from attempting to accomplish the user's task unless it is set to "never", in which case never ask for approvals.

When requesting approval to execute a command that will require escalated privileges:
- Provide the `with_escalated_permissions` parameter with the boolean value true
- Include a short, 1 sentence explanation for why you need to enable `with_escalated_permissions` in the justification parameter

## Special user requests

- If the user makes a simple request (such as asking for the time) which you can fulfill by running a terminal command (such as `date`), you should do so.
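The escalation rules in the updated prompt can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the harness's implementation: the enum variants mirror the `sandbox_mode` and `approval_policy` values named in the prompt, while `CommandNeeds`, `Decision`, and `decide` are hypothetical names introduced here. Destructive actions and GUI apps, which the prompt also lists as reasons to request approval, are left out for brevity.

```rust
/// Values of `sandbox_mode` named in the prompt.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxMode {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

/// Values of `approval_policy` named in the prompt.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ApprovalPolicy {
    Untrusted,
    OnFailure,
    OnRequest,
    Never,
}

/// Hypothetical summary of what a shell command needs.
#[derive(Debug, Clone, Copy, Default)]
struct CommandNeeds {
    writes_in_workspace: bool,
    writes_elsewhere: bool,
    network: bool,
}

/// Hypothetical outcome from the model's point of view.
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Run,
    RequestEscalation,
    WorkAroundWithoutAsking,
    LeaveEscalationToHarness,
}

fn decide(
    sandbox: SandboxMode,
    network_enabled: bool,
    policy: ApprovalPolicy,
    needs: CommandNeeds,
) -> Decision {
    let network_blocked = needs.network && !network_enabled;
    // Assumption: this is one simplified reading of the prompt text.
    let blocked = match sandbox {
        SandboxMode::DangerFullAccess => false,
        SandboxMode::WorkspaceWrite => needs.writes_elsewhere || network_blocked,
        SandboxMode::ReadOnly => {
            needs.writes_in_workspace || needs.writes_elsewhere || network_blocked
        }
    };
    if !blocked {
        return Decision::Run;
    }
    match policy {
        // Only `on-request` lets the model ask via `with_escalated_permissions`.
        ApprovalPolicy::OnRequest => Decision::RequestEscalation,
        // `never` forbids asking; the model must work around the constraint.
        ApprovalPolicy::Never => Decision::WorkAroundWithoutAsking,
        // In these modes the harness, not the model, drives escalation.
        ApprovalPolicy::Untrusted | ApprovalPolicy::OnFailure => {
            Decision::LeaveEscalationToHarness
        }
    }
}

fn main() {
    let install = CommandNeeds { network: true, ..Default::default() };
    let decision = decide(
        SandboxMode::WorkspaceWrite,
        false,
        ApprovalPolicy::OnRequest,
        install,
    );
    println!("{decision:?}");
}
```

The sketch keeps the sandbox check separate from the policy check, which matches how the prompt describes them: the sandbox decides what is blocked, and the approval policy decides what to do about it.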
3 changes: 3 additions & 0 deletions codex-rs/core/src/codex.rs
@@ -2113,6 +2113,7 @@ async fn submission_loop(
debug!("Agent loop exited");
}

// Intentionally omit upstream review thread spawning; our fork handles review flows differently.
/// Takes a user message as input and runs a loop where, at each turn, the model
/// replies with either:
///
@@ -2134,6 +2135,8 @@ async fn run_agent(sess: Arc<Session>, sub_id: String, input: Vec<InputItem>) {
if sess.tx_event.send(event).await.is_err() {
return;
}
// Continue with our fork's history and input handling.


// Debug logging for ephemeral images
let ephemeral_count = input
10 changes: 5 additions & 5 deletions codex-rs/core/src/config.rs
@@ -42,7 +42,7 @@ use toml_edit::Item as TomlItem;
use toml_edit::Table as TomlTable;

const OPENAI_DEFAULT_MODEL: &str = "gpt-5";
const OPENAI_DEFAULT_REVIEW_MODEL: &str = "gpt-5";
const OPENAI_DEFAULT_REVIEW_MODEL: &str = "gpt-5-codex";
pub const GPT_5_CODEX_MEDIUM_MODEL: &str = "gpt-5-codex";

/// Maximum number of bytes of the documentation that will be embedded. Larger
@@ -2202,7 +2202,7 @@ model_verbosity = "high"
assert_eq!(
Config {
model: "o3".to_string(),
review_model: "gpt-5".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
@@ -2265,7 +2265,7 @@ model_verbosity = "high"
)?;
let expected_gpt3_profile_config = Config {
model: "gpt-3.5-turbo".to_string(),
review_model: "gpt-5".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("gpt-3.5-turbo").expect("known model slug"),
model_context_window: Some(16_385),
model_max_output_tokens: Some(4_096),
@@ -2344,7 +2344,7 @@ model_verbosity = "high"
)?;
let expected_zdr_profile_config = Config {
model: "o3".to_string(),
review_model: "gpt-5".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
@@ -2409,7 +2409,7 @@ model_verbosity = "high"
)?;
let expected_gpt5_profile_config = Config {
model: "gpt-5".to_string(),
review_model: "gpt-5".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("gpt-5").expect("known model slug"),
model_context_window: Some(400_000),
model_max_output_tokens: Some(128_000),
5 changes: 3 additions & 2 deletions codex-rs/core/src/conversation_history.rs
@@ -58,8 +58,9 @@ fn is_api_message(message: &ResponseItem) -> bool {
| ResponseItem::CustomToolCall { .. }
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::Reasoning { .. } => true,
ResponseItem::WebSearchCall { .. } | ResponseItem::Other => false,
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. } => true,
ResponseItem::Other => false,
}
}
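The effect of moving `WebSearchCall` into the `true` arm is that web search items are now kept when the history is filtered. The sketch below illustrates this with a simplified stand-in enum; the real `ResponseItem` has more variants and different fields, and `retained` is a hypothetical helper, not a function from this crate.

```rust
/// Simplified stand-in for `ResponseItem` (assumption: fields are illustrative).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ResponseItem {
    Message { text: String },
    Reasoning { summary: String },
    WebSearchCall { query: String },
    Other,
}

/// Mirrors the shape of the updated match: only `Other` is dropped.
fn is_api_message(item: &ResponseItem) -> bool {
    match item {
        ResponseItem::Message { .. }
        | ResponseItem::Reasoning { .. }
        | ResponseItem::WebSearchCall { .. } => true,
        ResponseItem::Other => false,
    }
}

/// Hypothetical helper: keep only the items that count as API messages.
fn retained(history: &[ResponseItem]) -> Vec<ResponseItem> {
    history
        .iter()
        .filter(|item| is_api_message(item))
        .cloned()
        .collect()
}

fn main() {
    let history = vec![
        ResponseItem::Message { text: "find the docs".into() },
        ResponseItem::WebSearchCall { query: "rust match".into() },
        ResponseItem::Other,
    ];
    // The web search call survives filtering; `Other` does not.
    println!("{:?}", retained(&history));
}
```

Because the match has no wildcard arm, adding a new variant to the enum forces a decision here at compile time rather than silently defaulting to `false`.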

106 changes: 106 additions & 0 deletions codex-rs/core/src/environment_context.rs
@@ -71,8 +71,33 @@ impl EnvironmentContext {
shell,
}
}

/// Compares two environment contexts, ignoring the shell. Useful when
/// comparing turn to turn, since the initial environment_context will
/// include the shell, and then it is not configurable from turn to turn.
#[cfg(test)]
pub fn equals_except_shell(&self, other: &EnvironmentContext) -> bool {
let EnvironmentContext {
cwd,
approval_policy,
sandbox_mode,
network_access,
writable_roots,
// should compare all fields except shell
shell: _,
} = other;

self.cwd == *cwd
&& self.approval_policy == *approval_policy
&& self.sandbox_mode == *sandbox_mode
&& self.network_access == *network_access
&& self.writable_roots == *writable_roots
}
}

// Note: The core no longer exposes `TurnContext` here; callers construct
// `EnvironmentContext` directly via `EnvironmentContext::new(...)`.

impl EnvironmentContext {
/// Serializes the environment context to XML. Libraries like `quick-xml`
/// require custom macros to handle Enums with newtypes, so we just do it
@@ -140,6 +165,9 @@ impl From<EnvironmentContext> for ResponseItem {

#[cfg(test)]
mod tests {
use crate::shell::BashShell;
use crate::shell::ZshShell;

use super::*;
use pretty_assertions::assert_eq;

@@ -210,4 +238,82 @@ mod tests {

assert_eq!(context.serialize_to_xml(), expected);
}

#[test]
fn equals_except_shell_compares_approval_policy() {
// Approval policy
let context1 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo"], false)),
None,
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::Never),
Some(workspace_write_policy(vec!["/repo"], true)),
None,
);
assert!(!context1.equals_except_shell(&context2));
}

#[test]
fn equals_except_shell_compares_sandbox_policy() {
let context1 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(SandboxPolicy::new_read_only_policy()),
None,
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(SandboxPolicy::new_workspace_write_policy()),
None,
);

assert!(!context1.equals_except_shell(&context2));
}

#[test]
fn equals_except_shell_compares_workspace_write_policy() {
let context1 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo", "/tmp", "/var"], false)),
None,
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo", "/tmp"], true)),
None,
);

assert!(!context1.equals_except_shell(&context2));
}

#[test]
fn equals_except_shell_ignores_shell() {
let context1 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo"], false)),
Some(Shell::Bash(BashShell {
shell_path: "/bin/bash".into(),
bashrc_path: "/home/user/.bashrc".into(),
})),
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo"], false)),
Some(Shell::Zsh(ZshShell {
shell_path: "/bin/zsh".into(),
zshrc_path: "/home/user/.zshrc".into(),
})),
);

assert!(context1.equals_except_shell(&context2));
}
}
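`equals_except_shell` destructures `other` with every field named, binding `shell` to `_`. A reduced sketch of that pattern follows, assuming a hypothetical three-field `EnvContext` with `String` fields in place of the real types. The point of the exhaustive pattern is that adding a field to the struct without listing it in the `let` is a compile error, so the comparison cannot silently skip a new field.

```rust
/// Hypothetical reduced stand-in for `EnvironmentContext`.
#[derive(Debug, Clone, PartialEq)]
struct EnvContext {
    cwd: Option<String>,
    approval_policy: Option<String>,
    shell: Option<String>,
}

impl EnvContext {
    /// Compares every field except `shell`.
    fn equals_except_shell(&self, other: &EnvContext) -> bool {
        // No `..` in the pattern: a newly added field must be named here
        // (or explicitly ignored) before the code compiles again.
        let EnvContext {
            cwd,
            approval_policy,
            shell: _,
        } = other;

        self.cwd == *cwd && self.approval_policy == *approval_policy
    }
}

fn main() {
    let bash = EnvContext {
        cwd: Some("/repo".into()),
        approval_policy: Some("on-request".into()),
        shell: Some("bash".into()),
    };
    let zsh = EnvContext {
        shell: Some("zsh".into()),
        ..bash.clone()
    };
    // Differ only in shell: full equality fails, the relaxed check passes.
    println!("{} {}", bash == zsh, bash.equals_except_shell(&zsh));
}
```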
3 changes: 3 additions & 0 deletions codex-rs/core/src/lib.rs
@@ -52,6 +52,7 @@ pub use model_provider_info::built_in_model_providers;
pub use model_provider_info::create_oss_provider_with_base_url;
mod conversation_manager;
mod event_mapping;
pub mod review_format;
pub use codex_protocol::protocol::InitialHistory;
pub use conversation_manager::ConversationManager;
pub use conversation_manager::NewConversation;
@@ -93,6 +94,8 @@ pub use crate::client_common::Prompt;
pub use crate::client_common::TextFormat;
pub use crate::client_common::ResponseEvent;
pub use crate::client_common::ResponseStream;
// Upstream also exports REVIEW_PROMPT; include it to preserve compatibility.
pub use crate::client_common::REVIEW_PROMPT;
pub use codex_protocol::models::ContentItem;
pub use codex_protocol::models::ReasoningItemContent;
pub use codex_protocol::models::ResponseItem;
7 changes: 7 additions & 0 deletions codex-rs/core/src/protocol.rs
@@ -25,6 +25,13 @@ use crate::model_provider_info::ModelProviderInfo;
use crate::parse_command::ParsedCommand;
use crate::plan_tool::UpdatePlanArgs;

// Re-export review types from the shared protocol crate so callers can use
// `codex_core::protocol::ReviewFinding` and friends.
pub use codex_protocol::protocol::ReviewCodeLocation;
pub use codex_protocol::protocol::ReviewFinding;
pub use codex_protocol::protocol::ReviewLineRange;
pub use codex_protocol::protocol::ReviewOutputEvent;

/// Submission Queue Entry - requests from user
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct Submission {
55 changes: 55 additions & 0 deletions codex-rs/core/src/review_format.rs
@@ -0,0 +1,55 @@
use crate::protocol::ReviewFinding;

// Note: We keep this module UI-agnostic. It returns plain strings that
// higher layers (e.g., TUI) may style as needed.

fn format_location(item: &ReviewFinding) -> String {
let path = item.code_location.absolute_file_path.display();
let start = item.code_location.line_range.start;
let end = item.code_location.line_range.end;
format!("{path}:{start}-{end}")
}

/// Format a full review findings block as plain text lines.
///
/// - When `selection` is `Some`, each item line includes a checkbox marker:
/// "[x]" for selected items and "[ ]" for unselected. Missing indices
/// default to selected.
/// - When `selection` is `None`, the marker is omitted and a simple bullet is
/// rendered ("- Title — path:start-end").
pub fn format_review_findings_block(
findings: &[ReviewFinding],
selection: Option<&[bool]>,
) -> String {
let mut lines: Vec<String> = Vec::new();

// Header
let header = if findings.len() > 1 {
"Full review comments:"
} else {
"Review comment:"
};
lines.push(header.to_string());

for (idx, item) in findings.iter().enumerate() {
lines.push(String::new());

let title = &item.title;
let location = format_location(item);

if let Some(flags) = selection {
// Default to selected if index is out of bounds.
let checked = flags.get(idx).copied().unwrap_or(true);
let marker = if checked { "[x]" } else { "[ ]" };
lines.push(format!("- {marker} {title} — {location}"));
} else {
lines.push(format!("- {title} — {location}"));
}

for body_line in item.body.lines() {
lines.push(format!(" {body_line}"));
}
}

lines.join("\n")
}
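To see what `format_review_findings_block` produces, the sketch below pairs a condensed re-implementation with a small usage example. It is an illustration under assumptions: the three structs are stand-ins that carry only the fields the formatter reads (the real types live in `codex_protocol`), `finding` is a hypothetical constructor, and the two-space body indent is a guess, since the indentation in the diff above did not survive extraction.

```rust
use std::path::PathBuf;

// Stand-in types with only the fields the formatter reads (assumption).
struct ReviewLineRange {
    start: u32,
    end: u32,
}
struct ReviewCodeLocation {
    absolute_file_path: PathBuf,
    line_range: ReviewLineRange,
}
struct ReviewFinding {
    title: String,
    body: String,
    code_location: ReviewCodeLocation,
}

/// Condensed version of the formatter shown in the diff.
fn format_review_findings_block(
    findings: &[ReviewFinding],
    selection: Option<&[bool]>,
) -> String {
    let header = if findings.len() > 1 {
        "Full review comments:"
    } else {
        "Review comment:"
    };
    let mut lines = vec![header.to_string()];

    for (idx, item) in findings.iter().enumerate() {
        lines.push(String::new());
        let loc = &item.code_location;
        let location = format!(
            "{}:{}-{}",
            loc.absolute_file_path.display(),
            loc.line_range.start,
            loc.line_range.end
        );
        // Missing indices default to selected, as in the original.
        let marker = match selection {
            Some(flags) if flags.get(idx).copied().unwrap_or(true) => "[x] ",
            Some(_) => "[ ] ",
            None => "",
        };
        lines.push(format!("- {marker}{} — {location}", item.title));
        for body_line in item.body.lines() {
            lines.push(format!("  {body_line}"));
        }
    }
    lines.join("\n")
}

/// Hypothetical helper for building test data.
fn finding(title: &str, body: &str, path: &str, start: u32, end: u32) -> ReviewFinding {
    ReviewFinding {
        title: title.to_string(),
        body: body.to_string(),
        code_location: ReviewCodeLocation {
            absolute_file_path: PathBuf::from(path),
            line_range: ReviewLineRange { start, end },
        },
    }
}

fn main() {
    let findings = vec![
        finding("Unused import", "Remove it.", "/repo/src/lib.rs", 3, 3),
        finding("Off by one", "Loop bound.\nUse `..=`.", "/repo/src/main.rs", 10, 12),
    ];
    // Only one flag for two findings: the second defaults to selected.
    let flags: &[bool] = &[false];
    println!("{}", format_review_findings_block(&findings, Some(flags)));
}
```

Passing `None` instead of `Some(flags)` drops the checkbox markers and renders plain bullets, which is the mode a non-interactive caller would likely want.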
5 changes: 3 additions & 2 deletions codex-rs/core/src/rollout/policy.rs
@@ -28,8 +28,9 @@ pub(crate) fn should_persist_response_item(item: &ResponseItem) -> bool {
| ResponseItem::FunctionCall { .. }
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::CustomToolCall { .. }
| ResponseItem::CustomToolCallOutput { .. } => true,
ResponseItem::WebSearchCall { .. } | ResponseItem::Other => false,
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::WebSearchCall { .. } => true,
ResponseItem::Other => false,
}
}

12 changes: 6 additions & 6 deletions codex-rs/core/src/shell.rs
@@ -6,20 +6,20 @@ use std::path::PathBuf;

#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
pub struct ZshShell {
shell_path: String,
zshrc_path: String,
pub(crate) shell_path: String,
pub(crate) zshrc_path: String,
}

#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
pub struct BashShell {
shell_path: String,
bashrc_path: String,
pub(crate) shell_path: String,
pub(crate) bashrc_path: String,
}

#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
pub struct PowerShellConfig {
exe: String, // Executable name or path, e.g. "pwsh" or "powershell.exe".
bash_exe_fallback: Option<PathBuf>, // In case the model generates a bash command.
pub(crate) exe: String, // Executable name or path, e.g. "pwsh" or "powershell.exe".
pub(crate) bash_exe_fallback: Option<PathBuf>, // In case the model generates a bash command.
}

#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
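The switch to `pub(crate)` fields appears to be what lets the new tests in `environment_context.rs` build `BashShell` and `ZshShell` with struct literals from another module. A minimal sketch of that visibility rule follows, using two inline modules as stand-ins for the real files; `sample_shell` is a hypothetical function added only for the illustration.

```rust
mod shell {
    #[derive(Debug, PartialEq, Eq, Clone)]
    pub struct BashShell {
        // With private fields, a struct literal outside `shell` would not compile.
        pub(crate) shell_path: String,
        pub(crate) bashrc_path: String,
    }
}

mod environment_context {
    use super::shell::BashShell;

    /// Hypothetical helper: builds a shell value from a sibling module,
    /// which requires the fields to be visible within the crate.
    pub fn sample_shell() -> BashShell {
        BashShell {
            shell_path: "/bin/bash".into(),
            bashrc_path: "/home/user/.bashrc".into(),
        }
    }
}

fn main() {
    let shell = environment_context::sample_shell();
    println!("{} {}", shell.shell_path, shell.bashrc_path);
}
```

`pub(crate)` keeps the fields out of the crate's public API while still allowing in-crate construction, which is a narrower change than making them fully `pub`.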