feat: ingest Google Drive projects from inventory policy roots#1
feat: ingest Google Drive projects from inventory policy roots#1mschwar wants to merge 26 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a new inventory_policy discovery mode to build_ledger.py, enabling project ingestion based on external inventory artifacts and classification policies, primarily for cloud-drive surfaces like Google Drive. Key additions include logic for parsing inventory records, summarizing candidates, and building ledger entries from metadata. The update also improves Windows path handling in Markdown links and increases the Git subprocess timeout. Review feedback highlighted redundant fallback logic in candidate discovery, unreachable code in project hash generation, and a redundant integer cast.
| if not scoped_records: | ||
| scoped_records = [record for record in root_records if str(record.get("path", "")) == root_name] |
There was a problem hiding this comment.
This fallback logic appears to be redundant or potentially misleading. For the root candidate, normalized_inventory_path is equal to root_name, so the filter at line 433 already captures the root record. For child candidates, if scoped_records is empty, inheriting the root's records would result in incorrect metadata (such as last_touch_at) being associated with the child project. Since child candidates are only created if a record exists in the inventory (see line 465), scoped_records should already contain at least that record.
| or git_meta["normalized_remote_url"] | ||
| or f"google-drive:{candidate['inventory_path'].lower()}" | ||
| ) | ||
| project_hash = sha256_hex(project_key or candidate["path_key"]) |
There was a problem hiding this comment.
|
|
||
| obsidian_enabled = summary["has_obsidian"] | ||
| git_enabled = git_meta["git"] or summary["has_git"] | ||
| markdown_count = int(summary["markdown_count"]) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fda06a9187
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "WindowsPowerShell", | ||
| "project-ledger", | ||
| "repos" | ||
| "path": "/central/repos/apps", |
There was a problem hiding this comment.
Revert machine-specific absolute default roots
python build_ledger.py now fails out of the box on any machine that does not have this exact /central/... layout, because the default children/git_repos roots are absolute host-specific paths and discover_children immediately raises FileNotFoundError when the first root is missing. This breaks the documented default workflow and makes the repository non-runnable without local manual config edits.
Useful? React with 👍 / 👎.
| inventory_records = load_inventory_jsonl(root_cfg["_inventory_jsonl_path"]) | ||
| policy = read_json(root_cfg["_policy_path"]) |
There was a problem hiding this comment.
Validate inventory_policy required fields before use
discover_inventory_policy_candidates unconditionally indexes _inventory_jsonl_path and _policy_path, but collect_entries only sets those internal keys when the optional config fields are present. If an inventory_policy root has a missing/typoed inventory_jsonl or policy_path, the run crashes with a raw KeyError instead of a clear config validation error, making operator recovery harder.
Useful? React with 👍 / 👎.
Summary\n- Ingest Google Drive-backed projects from inventory policy roots\n- Expand ledger coverage and root/path handling\n- Update README, config, and tests for the new scan behavior\n\n## Verification\n- python build_ledger.py\n- python -m unittest tests.test_build_ledger\n\n## Notes\n- Branch is based on the current local main commit that is ahead of origin/main by 3 commits.