diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..9749a8e --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,116 @@ +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## Project Overview + +Open AD Kit provides containerized, microservice-based components for the [Autoware](https://github.com/autowarefoundation/autoware) autonomous driving stack. It packages Autoware into independent Docker images for modular deployment across cloud and edge (amd64/arm64). This is a SOAFEE Blueprint project under the Autoware Foundation. + +## Build Commands + +### Build container images (requires Docker buildx) + +```bash +# Build all component images (default target) for current platform, ROS Humble +./build.sh + +# Build options +./build.sh --platform linux/arm64 # Cross-build for ARM +./build.sh --platform jp62 # Jetson Linux 6.2 (arm64, CUDA always included) +./build.sh --ros-distro jazzy # Use ROS Jazzy (default: humble) +./build.sh --no-cuda # Skip CUDA image variants +./build.sh --target common # Build only base images (stage 1) +./build.sh --target components # Build components (stages 1+2, default) +./build.sh --target universe # Build everything (stages 1+2+3) +``` + +The build script first clones the Autoware repo and imports source via `vcs`, then runs `docker buildx bake` through three stages. + +### Setup runtime environment + +```bash +./setup.sh # Installs Docker, NVIDIA Container Toolkit +``` + +### Documentation (MkDocs) + +```bash +make prepare # Build the MkDocs dev container +make serve # Serve docs locally at localhost:8000 +make build # Build static site +make clean # Remove site/ directory +``` + +## Architecture + +### Three-stage image hierarchy + +The build produces layered Docker images defined in `components/docker-bake.hcl`: + +1. **Common** (`components/common/`): Base and devel images built on top of ROS. Each has a CUDA variant and a JP62 variant. 
+ - `common-base` / `common-base-cuda` / `common-base-jp62` — runtime base + - `common-devel` / `common-devel-cuda` / `common-devel-jp62` — build-time with Autoware dependencies +2. **Components** (7 independent images): Each built from `common-devel`, installed onto `common-base`. Each has its own Dockerfile under `components//`: + - `sensing-perception` (has CUDA variant), `localization-mapping`, `planning-control`, `vehicle-system`, `api`, `visualizer`, `simulator` +3. **Universe** (`components/universe/`): Merges all component install spaces into a single image. Has a CUDA variant. + +### Platform variants + +The image matrix spans `{platform} x {ros-distro}`: + +| Platform | Arch | CUDA | Base image | Dockerfile | +|----------------|-------------|----------|---------------------------------------------|----------------------------| +| `amd64` | linux/amd64 | optional | `ros:{distro}-ros-base-{ubuntu}` | `Dockerfile` | +| `arm64` | linux/arm64 | no | `ros:{distro}-ros-base-{ubuntu}` | `Dockerfile` | +| `amd64` + cuda | linux/amd64 | yes | `ros:{distro}-ros-base-{ubuntu}` | `Dockerfile` (cuda stages) | +| `jp62` | linux/arm64 | always | `nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel` | `Dockerfile.jp62` | + +JP62 images fulfill the same contract as CUDA images (`common-base-cuda` / `common-devel-cuda`), so downstream `Dockerfile.cuda` component files work unmodified by receiving JP62 images as their `COMMON_BASE_CUDA_IMAGE` / `COMMON_DEVEL_CUDA_IMAGE` args. 
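The contract described above comes down to which common images are passed as the two CUDA build args. A minimal sketch, assuming a hypothetical helper named `cuda_image_args` (it is not part of `build.sh`) and the local image name `openadkit-common` used in the JP62 README example:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build.sh): print the bake --set flags that
# select the CUDA-contract common images for a component build.
#   $1: image name prefix (build.sh keeps this in $image_common)
#   $2: "true" for JP62, anything else for amd64 + CUDA
cuda_image_args() {
    local image_common="$1" is_jp62="$2" suffix="cuda"
    if [ "$is_jp62" = "true" ]; then
        # JP62 images stand in for the CUDA images; only the tag suffix differs.
        suffix="jp62"
    fi
    printf '%s\n' \
        "--set *.args.COMMON_BASE_CUDA_IMAGE=${image_common}:base-${suffix}" \
        "--set *.args.COMMON_DEVEL_CUDA_IMAGE=${image_common}:devel-${suffix}"
}

# Usage: show the flags a JP62 component build would receive.
cuda_image_args "openadkit-common" "true"
```

The component `Dockerfile.cuda` files never see the difference: they only read the two args, so the platform choice stays in the build script.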
+ +**JP62-specific concerns** (`components/common/Dockerfile.jp62`): +- L4T base has no ROS — installed from apt (`ros-humble-desktop`) +- L4T OpenCV 4.8.0 replaced with Ubuntu 4.5.4 (apt pin in `components/common/jp62/opencv-preferences`) +- L4T CMake 3.14 replaced with CMake 3.28 from the Kitware APT repository (the Ubuntu system CMake 3.22 is not sufficient; details below) +- NVIDIA packages from L4T repos (not `ubuntu2204/sbsa`) — `setup-dev-env.sh` must use `--no-nvidia --no-cuda-drivers` to avoid conflicts +- `CUDAARCHS=87` (Orin) set to avoid native detection failures under QEMU +- spconv/cumm installed from pre-built Jetson ARM `.deb` packages +- `ros-humble-tensorrt-cmake-module` and `ros-humble-cudnn-cmake-module` installed explicitly (skipped by `--no-nvidia` ansible) +- colcon mixin index must be explicitly registered (not inherited from the `ros:` Docker base image, as on x86) +- CMake 3.28 from the Kitware APT repository is required: system cmake 3.22 has a `find_library()` bug where the `ament_cmake_export_libraries` template's `set(_lib "NOTFOUND")` pattern causes the search to be skipped. Additionally, the template reuses a shared `_lib` cache variable across packages, causing cross-package pollution ([ament_cmake#182](https://github.com/ament/ament_cmake/issues/182)). The first issue is fixed in CMake 3.24+, the second by a sed patch in the Dockerfile. The version is pinned to 3.28.x to stay below 4.0 +- JP62 images must be built on native Jetson (arm64); x86 cross-compilation via QEMU hits intermittent `find_library` failures in cmake subprocess calls +- Autoware 1.7.1 hardcodes `-gencode arch=compute_101` (Blackwell) in 14 CUDA CMakeLists.txt files; CUDA 12.6 on JP62 only supports up to `compute_90`. 
Run `components/common/jp62/patch-cuda-arch.sh autoware/src` after cloning sources to gate these behind `CUDA_VERSION >= 12.8` + +### Deployment samples + +- `deployments/samples/planning-simulation/` — planning stack with AWSIM simulator +- `deployments/samples/logging-simulation/` — end-to-end replay with rosbag +- `deployments/demos/zenoh-bridge/` — remote visualization via Zenoh bridge + +### Platform configurations + +- `platforms/autosd/` — Automotive-grade Linux (CentOS Stream AutoSD) + +### Tag scheme + +CI tags images as `{variant}-{platform}-{distro}[-{date}]`: +- `base-amd64-humble`, `devel-cuda-amd64-humble`, `base-jp62-humble` +- Component/universe: `sensing-perception-amd64-humble`, `universe-jp62-humble` + +Multi-arch manifests (amd64+arm64) strip the platform: `base-humble-{date}`. CUDA and JP62 images are single-arch. + +## CI/CD + +GitHub Actions workflows in `.github/workflows/`: + +- **build-all-images.yaml**: Main CI. Builds all images for amd64+arm64, humble+jazzy. Triggered on push to main, monthly schedule, or manual dispatch. Pushes to `ghcr.io`. +- **release-all-images.yaml**: Runs every 6 hours. Detects latest Autoware release tag and publishes versioned images. +- **deploy-docs.yaml**: Builds and deploys MkDocs site to GitHub Pages on push to main. +- **semantic-pull-request.yaml**: Enforces conventional commit style on PR titles. 
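The tag scheme described above (`{variant}-{platform}-{distro}[-{date}]`, with the platform dropped for multi-arch manifests) can be sketched as a small shell function. The name `image_tag` and the date value are illustrative assumptions; the actual CI workflows may compose tags differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose a tag as {variant}-{platform}-{distro}[-{date}].
# Pass an empty platform for a multi-arch manifest, which omits that field.
image_tag() {
    local variant="$1" platform="$2" distro="$3" date="${4:-}"
    local tag="$variant"
    if [ -n "$platform" ]; then
        tag="${tag}-${platform}"
    fi
    tag="${tag}-${distro}"
    if [ -n "$date" ]; then
        tag="${tag}-${date}"
    fi
    printf '%s\n' "$tag"
}

image_tag "devel-cuda" "amd64" "humble"      # prints devel-cuda-amd64-humble
image_tag "base" "" "humble" "20260413"      # prints base-humble-20260413 (date format is a placeholder)
```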
+ +## Key Conventions + +- Image registry: `ghcr.io/autowarefoundation/openadkit` +- ROS distributions: `humble` (Ubuntu Jammy) and `jazzy` (Ubuntu Noble) +- PR titles must follow [Conventional Commits](https://www.conventionalcommits.org/) (enforced by CI) +- The `autoware/` directory is git-ignored — it's cloned at build time by `build.sh` +- No unit test suite; validation happens through Docker build success in CI diff --git a/build.sh b/build.sh index b70bf51..b0765c7 100755 --- a/build.sh +++ b/build.sh @@ -9,7 +9,7 @@ print_help() { echo "Usage: build.sh [OPTIONS]" echo "Options:" echo " -h | --help Display this help message" - echo " --platform Specify the platform (linux/amd64 or linux/arm64, default: current platform)" + echo " --platform Specify the platform (linux/amd64, linux/arm64, or jp62; default: current platform)" echo " --ros-distro Specify ROS distribution (humble or jazzy, default: humble)" echo " --no-cuda Do not build CUDA images (default: false)" echo " --target Specify the target images to build (common, components, universe, default: components)" @@ -76,6 +76,12 @@ set_platform() { platform="linux/arm64" fi fi + + # JP62 (Jetson Linux 6.2) is always arm64 + if [ "$platform" = "jp62" ]; then + is_jp62=true + platform="linux/arm64" + fi } # Clone autoware repositories @@ -97,6 +103,14 @@ clone_repositories() { fi } +# Apply platform-specific source patches +apply_patches() { + if [ "$is_jp62" = "true" ]; then + echo "Applying JP62 patches to Autoware sources..." 
+ "$SCRIPT_DIR/components/common/jp62/patch-cuda-arch.sh" "$WORKSPACE_ROOT/autoware/src" + fi +} + # Build images build_images() { # https://github.com/docker/buildx/issues/484 @@ -114,6 +128,7 @@ build_images() { echo "Building images with:" echo " Target: $target" echo " Platform: $platform" + echo " JP62: $([ "$is_jp62" = "true" ] && echo "yes" || echo "no")" echo " ROS distro: $ros_distro" echo " Base image: $base_image" echo " CUDA: $([ "$option_no_cuda" = "true" ] && echo "disabled" || echo "enabled")" @@ -123,26 +138,38 @@ build_images() { # ========================================================================= # Stage 1: Common images # ========================================================================= - docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ - --set "*.context=$WORKSPACE_ROOT" \ - --set "*.ssh=default" \ - --set "*.platform=$platform" \ - --set "*.args.ROS_DISTRO=$ros_distro" \ - --set "*.args.BASE_IMAGE=$base_image" \ - --set "common-base.tags=${image_common}:base" \ - --set "common-devel.tags=${image_common}:devel" \ - common-base common-devel - - if [ "$option_no_cuda" != "true" ]; then + if [ "$is_jp62" = "true" ]; then + # JP62: build Jetson-specific common images (CUDA is always included) + docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ + --set "*.context=$WORKSPACE_ROOT" \ + --set "*.ssh=default" \ + --set "*.platform=$platform" \ + --set "*.args.ROS_DISTRO=$ros_distro" \ + --set "common-base-jp62.tags=${image_common}:base-jp62" \ + --set "common-devel-jp62.tags=${image_common}:devel-jp62" \ + common-base-jp62 common-devel-jp62 + else docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ --set "*.context=$WORKSPACE_ROOT" \ --set "*.ssh=default" \ --set "*.platform=$platform" \ --set "*.args.ROS_DISTRO=$ros_distro" \ --set "*.args.BASE_IMAGE=$base_image" \ - --set "common-base-cuda.tags=${image_common}:base-cuda" \ - --set 
"common-devel-cuda.tags=${image_common}:devel-cuda" \ - common-base-cuda common-devel-cuda + --set "common-base.tags=${image_common}:base" \ + --set "common-devel.tags=${image_common}:devel" \ + common-base common-devel + + if [ "$option_no_cuda" != "true" ]; then + docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ + --set "*.context=$WORKSPACE_ROOT" \ + --set "*.ssh=default" \ + --set "*.platform=$platform" \ + --set "*.args.ROS_DISTRO=$ros_distro" \ + --set "*.args.BASE_IMAGE=$base_image" \ + --set "common-base-cuda.tags=${image_common}:base-cuda" \ + --set "common-devel-cuda.tags=${image_common}:devel-cuda" \ + common-base-cuda common-devel-cuda + fi fi if [ "$target" = "common" ]; then @@ -153,32 +180,61 @@ build_images() { # ========================================================================= # Stage 2: Component images # ========================================================================= - docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ - --set "*.context=$WORKSPACE_ROOT" \ - --set "*.ssh=default" \ - --set "*.platform=$platform" \ - --set "*.args.ROS_DISTRO=$ros_distro" \ - --set "*.args.COMMON_BASE_IMAGE=${image_common}:base" \ - --set "*.args.COMMON_DEVEL_IMAGE=${image_common}:devel" \ - --set "sensing-perception.tags=${image_component}:sensing-perception" \ - --set "localization-mapping.tags=${image_component}:localization-mapping" \ - --set "planning-control.tags=${image_component}:planning-control" \ - --set "vehicle-system.tags=${image_component}:vehicle-system" \ - --set "api.tags=${image_component}:api" \ - --set "visualizer.tags=${image_component}:visualizer" \ - --set "simulator.tags=${image_component}:simulator" \ - sensing-perception localization-mapping planning-control vehicle-system api visualizer simulator - - if [ "$option_no_cuda" != "true" ]; then + if [ "$is_jp62" = "true" ]; then + # JP62: feed JP62 common images into the CUDA Dockerfiles. 
+ # The JP62 base/devel images fulfill the same contract as common-base-cuda / common-devel-cuda. docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ --set "*.context=$WORKSPACE_ROOT" \ --set "*.ssh=default" \ --set "*.platform=$platform" \ --set "*.args.ROS_DISTRO=$ros_distro" \ - --set "*.args.COMMON_BASE_CUDA_IMAGE=${image_common}:base-cuda" \ - --set "*.args.COMMON_DEVEL_CUDA_IMAGE=${image_common}:devel-cuda" \ - --set "sensing-perception-cuda.tags=${image_component}:sensing-perception-cuda" \ + --set "*.args.COMMON_BASE_CUDA_IMAGE=${image_common}:base-jp62" \ + --set "*.args.COMMON_DEVEL_CUDA_IMAGE=${image_common}:devel-jp62" \ + --set "sensing-perception-cuda.tags=${image_component}:sensing-perception-jp62" \ sensing-perception-cuda + + docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ + --set "*.context=$WORKSPACE_ROOT" \ + --set "*.ssh=default" \ + --set "*.platform=$platform" \ + --set "*.args.ROS_DISTRO=$ros_distro" \ + --set "*.args.COMMON_BASE_IMAGE=${image_common}:base-jp62" \ + --set "*.args.COMMON_DEVEL_IMAGE=${image_common}:devel-jp62" \ + --set "localization-mapping.tags=${image_component}:localization-mapping-jp62" \ + --set "planning-control.tags=${image_component}:planning-control-jp62" \ + --set "vehicle-system.tags=${image_component}:vehicle-system-jp62" \ + --set "api.tags=${image_component}:api-jp62" \ + --set "visualizer.tags=${image_component}:visualizer-jp62" \ + --set "simulator.tags=${image_component}:simulator-jp62" \ + localization-mapping planning-control vehicle-system api visualizer simulator + else + docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ + --set "*.context=$WORKSPACE_ROOT" \ + --set "*.ssh=default" \ + --set "*.platform=$platform" \ + --set "*.args.ROS_DISTRO=$ros_distro" \ + --set "*.args.COMMON_BASE_IMAGE=${image_common}:base" \ + --set "*.args.COMMON_DEVEL_IMAGE=${image_common}:devel" \ + --set 
"sensing-perception.tags=${image_component}:sensing-perception" \ + --set "localization-mapping.tags=${image_component}:localization-mapping" \ + --set "planning-control.tags=${image_component}:planning-control" \ + --set "vehicle-system.tags=${image_component}:vehicle-system" \ + --set "api.tags=${image_component}:api" \ + --set "visualizer.tags=${image_component}:visualizer" \ + --set "simulator.tags=${image_component}:simulator" \ + sensing-perception localization-mapping planning-control vehicle-system api visualizer simulator + + if [ "$option_no_cuda" != "true" ]; then + docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ + --set "*.context=$WORKSPACE_ROOT" \ + --set "*.ssh=default" \ + --set "*.platform=$platform" \ + --set "*.args.ROS_DISTRO=$ros_distro" \ + --set "*.args.COMMON_BASE_CUDA_IMAGE=${image_common}:base-cuda" \ + --set "*.args.COMMON_DEVEL_CUDA_IMAGE=${image_common}:devel-cuda" \ + --set "sensing-perception-cuda.tags=${image_component}:sensing-perception-cuda" \ + sensing-perception-cuda + fi fi if [ "$target" = "components" ]; then @@ -189,40 +245,59 @@ build_images() { # ========================================================================= # Stage 3: Universe images # ========================================================================= - docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ - --set "*.context=$WORKSPACE_ROOT" \ - --set "*.ssh=default" \ - --set "*.platform=$platform" \ - --set "*.args.ROS_DISTRO=$ros_distro" \ - --set "*.args.COMMON_BASE_IMAGE=${image_common}:base" \ - --set "*.args.COMMON_DEVEL_IMAGE=${image_common}:devel" \ - --set "*.args.SENSING_PERCEPTION_IMAGE=${image_component}:sensing-perception" \ - --set "*.args.LOCALIZATION_MAPPING_IMAGE=${image_component}:localization-mapping" \ - --set "*.args.PLANNING_CONTROL_IMAGE=${image_component}:planning-control" \ - --set "*.args.VEHICLE_SYSTEM_IMAGE=${image_component}:vehicle-system" \ - --set 
"*.args.API_IMAGE=${image_component}:api" \ - --set "*.args.VISUALIZER_IMAGE=${image_component}:visualizer" \ - --set "*.args.SIMULATOR_IMAGE=${image_component}:simulator" \ - --set "universe.tags=${image_component}:universe" \ - universe - - if [ "$option_no_cuda" != "true" ]; then + if [ "$is_jp62" = "true" ]; then + docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ + --set "*.context=$WORKSPACE_ROOT" \ + --set "*.ssh=default" \ + --set "*.platform=$platform" \ + --set "*.args.ROS_DISTRO=$ros_distro" \ + --set "*.args.COMMON_BASE_CUDA_IMAGE=${image_common}:base-jp62" \ + --set "*.args.COMMON_DEVEL_CUDA_IMAGE=${image_common}:devel-jp62" \ + --set "*.args.SENSING_PERCEPTION_CUDA_IMAGE=${image_component}:sensing-perception-jp62" \ + --set "*.args.LOCALIZATION_MAPPING_IMAGE=${image_component}:localization-mapping-jp62" \ + --set "*.args.PLANNING_CONTROL_IMAGE=${image_component}:planning-control-jp62" \ + --set "*.args.VEHICLE_SYSTEM_IMAGE=${image_component}:vehicle-system-jp62" \ + --set "*.args.API_IMAGE=${image_component}:api-jp62" \ + --set "*.args.VISUALIZER_IMAGE=${image_component}:visualizer-jp62" \ + --set "*.args.SIMULATOR_IMAGE=${image_component}:simulator-jp62" \ + --set "universe-cuda.tags=${image_component}:universe-jp62" \ + universe-cuda + else docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ --set "*.context=$WORKSPACE_ROOT" \ --set "*.ssh=default" \ --set "*.platform=$platform" \ --set "*.args.ROS_DISTRO=$ros_distro" \ - --set "*.args.COMMON_BASE_CUDA_IMAGE=${image_common}:base-cuda" \ - --set "*.args.COMMON_DEVEL_CUDA_IMAGE=${image_common}:devel-cuda" \ - --set "*.args.SENSING_PERCEPTION_CUDA_IMAGE=${image_component}:sensing-perception-cuda" \ + --set "*.args.COMMON_BASE_IMAGE=${image_common}:base" \ + --set "*.args.COMMON_DEVEL_IMAGE=${image_common}:devel" \ + --set "*.args.SENSING_PERCEPTION_IMAGE=${image_component}:sensing-perception" \ --set 
"*.args.LOCALIZATION_MAPPING_IMAGE=${image_component}:localization-mapping" \ --set "*.args.PLANNING_CONTROL_IMAGE=${image_component}:planning-control" \ --set "*.args.VEHICLE_SYSTEM_IMAGE=${image_component}:vehicle-system" \ --set "*.args.API_IMAGE=${image_component}:api" \ --set "*.args.VISUALIZER_IMAGE=${image_component}:visualizer" \ --set "*.args.SIMULATOR_IMAGE=${image_component}:simulator" \ - --set "universe-cuda.tags=${image_component}:universe-cuda" \ - universe-cuda + --set "universe.tags=${image_component}:universe" \ + universe + + if [ "$option_no_cuda" != "true" ]; then + docker buildx bake --allow=ssh --load --progress=plain -f "$bake_file" \ + --set "*.context=$WORKSPACE_ROOT" \ + --set "*.ssh=default" \ + --set "*.platform=$platform" \ + --set "*.args.ROS_DISTRO=$ros_distro" \ + --set "*.args.COMMON_BASE_CUDA_IMAGE=${image_common}:base-cuda" \ + --set "*.args.COMMON_DEVEL_CUDA_IMAGE=${image_common}:devel-cuda" \ + --set "*.args.SENSING_PERCEPTION_CUDA_IMAGE=${image_component}:sensing-perception-cuda" \ + --set "*.args.LOCALIZATION_MAPPING_IMAGE=${image_component}:localization-mapping" \ + --set "*.args.PLANNING_CONTROL_IMAGE=${image_component}:planning-control" \ + --set "*.args.VEHICLE_SYSTEM_IMAGE=${image_component}:vehicle-system" \ + --set "*.args.API_IMAGE=${image_component}:api" \ + --set "*.args.VISUALIZER_IMAGE=${image_component}:visualizer" \ + --set "*.args.SIMULATOR_IMAGE=${image_component}:simulator" \ + --set "universe-cuda.tags=${image_component}:universe-cuda" \ + universe-cuda + fi fi set +x @@ -239,5 +314,6 @@ set_ros_distro set_build_options set_platform clone_repositories +apply_patches build_images remove_dangling_images diff --git a/components/common/Dockerfile.jp62 b/components/common/Dockerfile.jp62 new file mode 100644 index 0000000..ea07914 --- /dev/null +++ b/components/common/Dockerfile.jp62 @@ -0,0 +1,251 @@ +# Dockerfile.jp62 - Builds common-base and common-devel images for Jetson Linux 6.2 +# +# JetPack 6.2 base: L4T 
r36.4.x with CUDA 12.6, cuDNN 9.3, TensorRT 10.3 +# Platform: linux/arm64 only (Jetson Orin) +# CUDA is always present — there is no non-CUDA JP62 variant. +# +# Image contract: +# common-base-jp62 ≈ common-base-cuda (ROS + CUDA runtime, ready for setup-dev-env.sh) +# common-devel-jp62 ≈ common-devel-cuda (+ Autoware dev tools + common packages built) +# +# Downstream component Dockerfile.cuda files consume these via: +# ARG COMMON_BASE_CUDA_IMAGE=...:base-jp62-humble +# ARG COMMON_DEVEL_CUDA_IMAGE=...:devel-jp62-humble + +ARG JP62_BASE_IMAGE=nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel +ARG ROS_DISTRO=humble + +# ============================================================================= +# Stage: jp62-setup +# L4T-specific bootstrapping that has no equivalent in the x86 path: +# locale, OpenCV downgrade, CMake upgrade, ROS 2 from scratch, L4T NVIDIA +# packages, CUDA env, spconv/cumm. +# ============================================================================= +FROM $JP62_BASE_IMAGE AS jp62-setup +SHELL ["/bin/bash", "-o", "pipefail", "-c"] +ARG ROS_DISTRO + +# --- Locale & timezone ------------------------------------------------------- +RUN apt-get update && apt-get install -y locales \ + && echo 'en_US.UTF-8 UTF-8' > /etc/locale.gen \ + && locale-gen \ + && update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 +ENV LANG=en_US.UTF-8 +RUN ln -snf /usr/share/zoneinfo/UTC /etc/localtime && echo UTC > /etc/timezone + +# --- APT sources ------------------------------------------------------------- +RUN apt-get install -y software-properties-common apt-transport-https \ + && add-apt-repository universe + +# AutonomouStuff repository (pacmod3 packages) +RUN echo "deb [trusted=yes] https://s3.amazonaws.com/autonomoustuff-repo/ jammy main" \ + | tee /etc/apt/sources.list.d/autonomoustuff-public.list > /dev/null + +# --- Replace L4T OpenCV 4.8.0 with Ubuntu OpenCV 4.5.4 ---------------------- +# L4T ships a non-APT OpenCV 4.8.0 that conflicts with ROS cv_bridge. 
+# We remove all traces of it and pin APT to the Ubuntu version. +RUN rm -rf /usr/lib/libopencv* /usr/lib/cmake/opencv4 /usr/include/opencv4 \ + /usr/local/lib/libopencv* /usr/local/lib/cmake/opencv4 /usr/local/include/opencv4 \ + /usr/share/OpenCV /usr/share/opencv4 \ + /opt/opencv* +COPY components/common/jp62/opencv-preferences /etc/apt/preferences.d/opencv-preferences +RUN apt-get update && apt-get install -y libopencv-dev + +# --- Install CMake >= 3.24 from Kitware APT repository ---------------------- +# L4T ships CMake 3.14 which is too old. Ubuntu 22.04 system cmake is 3.22, +# but 3.22 has a bug where find_library() fails when the result variable is +# pre-set to "NOTFOUND" via set(). The ament_cmake_export_libraries template +# relies on this pattern (set(_lib "NOTFOUND") before find_library(_lib ...)), +# causing "exports the library X which couldn't be found" during colcon builds. +# Fixed in CMake 3.24+. Pinned to 3.28.x to stay below 4.0 (which drops +# cmake_minimum_required < 3.5 compat needed by some Autoware dependencies). +# See: https://cmake.org/cmake/help/v3.24/command/find_library.html +RUN rm -f /usr/local/bin/cmake /usr/local/bin/ctest /usr/local/bin/cpack /usr/local/bin/ccmake +RUN apt-get update && apt-get install -y ca-certificates gpg wget \ + && wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null \ + | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null \ + && echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main" \ + | tee /etc/apt/sources.list.d/kitware.list >/dev/null \ + && apt-get update \ + && apt-get install -y cmake-data=3.28.* cmake=3.28.* + +# --- Install ROS 2 from apt ------------------------------------------------- +# L4T does not ship ROS; install following the official Ubuntu debs guide. 
+RUN apt-get install -y curl gnupg lsb-release \ + && curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key \ + -o /usr/share/keyrings/ros-archive-keyring.gpg \ + && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] \ + http://packages.ros.org/ros2/ubuntu $(. /etc/os-release && echo $UBUNTU_CODENAME) main" \ + | tee /etc/apt/sources.list.d/ros2.list > /dev/null \ + && apt-get update \ + && apt-get install -y ros-${ROS_DISTRO}-desktop ros-dev-tools + +# Fix ament_cmake_export_libraries cache variable pollution (ament_cmake#182). +# The ament template reuses a shared cache variable "_lib" across all packages' +# find_library() calls. When find_package(A) caches _lib=/path/to/libA.so, then +# find_package(B) does set(_lib "NOTFOUND") which sets a normal variable but +# doesn't clear the cache entry. Combined with cmake 3.22's behavior of skipping +# search when the variable is "already set", this causes cross-package failures. +# Fix: force-reset the cache entry and unset the normal variable before each +# find_library, ensuring a fresh search. Requires cmake >= 3.24 (Kitware PPA). +RUN find /opt/ros/${ROS_DISTRO}/share -name "ament_cmake_export_libraries-extras.cmake" \ + -exec sed -i 's/set(_lib "NOTFOUND")/set(_lib "_lib-NOTFOUND" CACHE FILEPATH "" FORCE)\n unset(_lib)/' {} + + +# rosdep update may fail on transient 503s from GitHub CDN; tolerate it here +# since the devel stage runs rosdep update again before resolving keys. 
+RUN rosdep init || true; rosdep update || true + +# --- NVIDIA L4T APT sources (r36.4 / JetPack 6.2) --------------------------- +RUN apt-key adv --fetch-key http://repo.download.nvidia.com/jetson/jetson-ota-public.asc \ + && echo "deb https://repo.download.nvidia.com/jetson/common r36.4 main" > /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \ + && echo "deb https://repo.download.nvidia.com/jetson/t234 r36.4 main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \ + && echo "deb https://repo.download.nvidia.com/jetson/ffmpeg r36.4 main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list \ + && mkdir -p /opt/nvidia/l4t-packages \ + && touch /opt/nvidia/l4t-packages/.nv-l4t-disable-boot-fw-update-in-preinstall \ + && apt-get update + +# L4T runtime packages: CUDA driver stubs, DLA compiler, cuDNN headers +RUN apt-get install -o DPkg::Options::="--force-confold" -y \ + nvidia-l4t-core \ + nvidia-l4t-cuda \ + nvidia-l4t-dla-compiler \ + libcudnn9-dev-cuda-12 + +# --- CUDA environment -------------------------------------------------------- +# CUDAARCHS=87 targets Orin (avoids "native" detection that fails under QEMU). 
+ENV CUDA_HOME=/usr/local/cuda +ENV CMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc +ENV PATH=/usr/local/cuda/bin:${PATH} +ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH} +ENV CUDAARCHS=87 + +# --- spconv / cumm (sparse convolution for BEVFusion / perception) ----------- +RUN wget -q https://github.com/autowarefoundation/spconv_cpp/releases/download/spconv_v2.3.8%2Bcumm_v0.5.3%2Bcu128/cumm_0.5.3_arm64-jetson.deb \ + && wget -q https://github.com/autowarefoundation/spconv_cpp/releases/download/spconv_v2.3.8%2Bcumm_v0.5.3%2Bcu128/spconv_2.3.8_arm64-jetson.deb \ + && apt-get install -y ./cumm_0.5.3_arm64-jetson.deb ./spconv_2.3.8_arm64-jetson.deb \ + && rm -f cumm_0.5.3_arm64-jetson.deb spconv_2.3.8_arm64-jetson.deb + +# ============================================================================= +# Stage: common-base-jp62 +# Base runtime image — contract-equivalent to common-base-cuda on x86. +# Provides: ROS, CUDA runtime, Autoware base deps via setup-dev-env.sh. +# ============================================================================= +FROM jp62-setup AS common-base-jp62 +SHELL ["/bin/bash", "-o", "pipefail", "-c"] +ARG ROS_DISTRO +ENV CCACHE_DIR="/root/.ccache" +WORKDIR /autoware + +# Copy Autoware setup scripts and openadkit helper scripts +COPY autoware/setup-dev-env.sh autoware/ansible-galaxy-requirements.yaml autoware/amd64.env autoware/amd64_jazzy.env autoware/arm64.env /autoware/ +COPY autoware/ansible/ /autoware/ansible/ +COPY components/common/scripts/ /scripts/ +RUN chmod -R +x /scripts/ + +# Disable suggested/recommended APT packages +RUN echo 'APT::Install-Recommends "false";' > /etc/apt/apt.conf.d/99-disable-extra-packages; \ + echo 'APT::Install-Suggests "false";' >> /etc/apt/apt.conf.d/99-disable-extra-packages + +# Install base APT packages and add GitHub to known hosts +RUN rm -f /etc/apt/apt.conf.d/docker-clean \ + && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache +RUN 
--mount=type=cache,target=/var/cache/apt,sharing=locked \ + apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install --no-install-recommends \ + gosu \ + ssh \ + && /scripts/cleanup_apt.sh \ + && mkdir -p ~/.ssh \ + && ssh-keyscan github.com >> ~/.ssh/known_hosts + +# Set up Autoware base environment. +# --no-nvidia --no-cuda-drivers: L4T already provides the full NVIDIA stack; +# the ansible nvidia/cuda-driver modules would conflict with the L4T packages. +RUN --mount=type=ssh \ + --mount=type=cache,target=/var/cache/apt,sharing=locked \ + ./setup-dev-env.sh -y --module base --no-nvidia --no-cuda-drivers --runtime openadkit --ros-distro "$ROS_DISTRO" \ + && pipx uninstall ansible \ + && /scripts/cleanup_apt.sh true \ + && echo "source /opt/ros/${ROS_DISTRO}/setup.bash" > /etc/bash.bashrc + +# ============================================================================= +# Stage: common-devel-jp62 +# Development image — contract-equivalent to common-devel-cuda on x86. +# Provides: everything in base + Autoware dev tools + universe common packages. +# ============================================================================= +FROM common-base-jp62 AS common-devel-jp62 +SHELL ["/bin/bash", "-o", "pipefail", "-c"] +ARG ROS_DISTRO + +# Set up full Autoware development environment. +# --no-nvidia --no-cuda-drivers: the CUDA ansible role installs from the +# ubuntu2204/sbsa repository, which conflicts with L4T's pre-installed CUDA. +# All NVIDIA/CUDA packages are already provided by the jp62-setup stage. 
+RUN --mount=type=ssh \ + --mount=type=cache,target=/var/cache/apt,sharing=locked \ + ./setup-dev-env.sh -y --module all --no-nvidia --no-cuda-drivers --ros-distro "$ROS_DISTRO" openadkit \ + && ./setup-dev-env.sh -y --module dev-tools --ros-distro "$ROS_DISTRO" openadkit \ + && pipx uninstall ansible \ + && apt-get update \ + && apt-get install -y --no-install-recommends \ + ros-${ROS_DISTRO}-tensorrt-cmake-module \ + ros-${ROS_DISTRO}-cudnn-cmake-module \ + && apt-mark manual ros-${ROS_DISTRO}-desktop ros-${ROS_DISTRO}-builtin-interfaces \ + && /scripts/cleanup_apt.sh \ + && echo "source /opt/ros/${ROS_DISTRO}/setup.bash" > /etc/bash.bashrc + +# Register the default colcon mixin index. +# The x86 path inherits this from the ros: base image; L4T does not have it. +RUN colcon mixin add default https://raw.githubusercontent.com/colcon/colcon-mixin-repository/master/index.yaml; \ + for i in 1 2 3; do colcon mixin update default && break || sleep 10; done + +# Install rosdep dependencies for autoware-universe common packages +RUN --mount=type=ssh \ + --mount=type=cache,target=/var/cache/apt,sharing=locked \ + --mount=type=bind,source=autoware/src/core,target=/autoware/src/core \ + --mount=type=bind,source=autoware/src/universe/autoware_universe/common,target=/autoware/src/universe/autoware_universe/common \ + --mount=type=bind,source=autoware/src/universe/autoware_universe/evaluator,target=/autoware/src/universe/autoware_universe/evaluator \ + --mount=type=bind,source=autoware/src/universe/external/eagleye,target=/autoware/src/universe/external/eagleye \ + --mount=type=bind,source=autoware/src/universe/external/glog,target=/autoware/src/universe/external/glog \ + --mount=type=bind,source=autoware/src/universe/external/llh_converter,target=/autoware/src/universe/external/llh_converter \ + --mount=type=bind,source=autoware/src/universe/external/managed_transform_buffer,target=/autoware/src/universe/external/managed_transform_buffer \ + 
--mount=type=bind,source=autoware/src/universe/external/morai_msgs,target=/autoware/src/universe/external/morai_msgs \ + --mount=type=bind,source=autoware/src/universe/external/muSSP,target=/autoware/src/universe/external/muSSP \ + --mount=type=bind,source=autoware/src/universe/external/pointcloud_to_laserscan,target=/autoware/src/universe/external/pointcloud_to_laserscan \ + --mount=type=bind,source=autoware/src/universe/external/rtklib_ros_bridge,target=/autoware/src/universe/external/rtklib_ros_bridge \ + --mount=type=bind,source=autoware/src/universe/external/tier4_autoware_msgs,target=/autoware/src/universe/external/tier4_autoware_msgs \ + --mount=type=bind,source=autoware/src/middleware/external,target=/autoware/src/middleware/external \ + apt-get update \ + && source /opt/ros/"$ROS_DISTRO"/setup.bash \ + && for i in 1 2 3; do rosdep update && break || sleep 10; done \ + && /scripts/resolve_rosdep_keys.sh /autoware/src "${ROS_DISTRO}" --dependency-types=exec > /tmp/rosdep-universe-common-exec-pkgs.txt \ + && /scripts/resolve_rosdep_keys.sh /autoware/src "${ROS_DISTRO}" > /tmp/rosdep-universe-common-pkgs.txt \ + && cat /tmp/rosdep-universe-common-pkgs.txt | xargs apt-get install -y --no-install-recommends \ + && dpkg -l | grep "^ii.*ros-${ROS_DISTRO}" | awk '{print $2}' | xargs apt-mark manual \ + && /scripts/cleanup_apt.sh + +# Intermediate target for debugging — buildable with --target common-devel-jp62-debug +FROM common-devel-jp62 AS common-devel-jp62-debug +RUN cmake --version && echo "Debug image ready" + +# Build autoware-universe common packages +FROM common-devel-jp62 AS common-devel-jp62-build +SHELL ["/bin/bash", "-o", "pipefail", "-c"] +ARG ROS_DISTRO +ENV CCACHE_DIR="/root/.ccache" +RUN --mount=type=cache,target="${CCACHE_DIR}" \ + --mount=type=bind,source=autoware/src/core,target=/autoware/src/core \ + --mount=type=bind,source=autoware/src/universe/autoware_universe/common,target=/autoware/src/universe/autoware_universe/common \ + 
--mount=type=bind,source=autoware/src/universe/autoware_universe/evaluator,target=/autoware/src/universe/autoware_universe/evaluator \ + --mount=type=bind,source=autoware/src/universe/external/eagleye,target=/autoware/src/universe/external/eagleye \ + --mount=type=bind,source=autoware/src/universe/external/glog,target=/autoware/src/universe/external/glog \ + --mount=type=bind,source=autoware/src/universe/external/llh_converter,target=/autoware/src/universe/external/llh_converter \ + --mount=type=bind,source=autoware/src/universe/external/managed_transform_buffer,target=/autoware/src/universe/external/managed_transform_buffer \ + --mount=type=bind,source=autoware/src/universe/external/morai_msgs,target=/autoware/src/universe/external/morai_msgs \ + --mount=type=bind,source=autoware/src/universe/external/muSSP,target=/autoware/src/universe/external/muSSP \ + --mount=type=bind,source=autoware/src/universe/external/pointcloud_to_laserscan,target=/autoware/src/universe/external/pointcloud_to_laserscan \ + --mount=type=bind,source=autoware/src/universe/external/rtklib_ros_bridge,target=/autoware/src/universe/external/rtklib_ros_bridge \ + --mount=type=bind,source=autoware/src/universe/external/tier4_autoware_msgs,target=/autoware/src/universe/external/tier4_autoware_msgs \ + --mount=type=bind,source=autoware/src/middleware/external,target=/autoware/src/middleware/external \ + --mount=type=bind,source=autoware/src/launcher,target=/autoware/src/launcher \ + source /opt/ros/"$ROS_DISTRO"/setup.bash \ + && /scripts/build_and_clean.sh "${CCACHE_DIR}" /opt/autoware diff --git a/components/common/jp62/README.md b/components/common/jp62/README.md new file mode 100644 index 0000000..4721205 --- /dev/null +++ b/components/common/jp62/README.md @@ -0,0 +1,179 @@ +# Jetson Linux 6.2 (JP62) Base Layer + +## Overview + +This directory contains the JP62-specific files for building Autoware common images on NVIDIA Jetson Orin (JetPack 6.2). 
The corresponding Dockerfile is at `components/common/Dockerfile.jp62`. + +JP62 images fulfill the same contract as x86 CUDA images (`common-base-cuda` / `common-devel-cuda`), so downstream component `Dockerfile.cuda` files work unmodified — they simply receive JP62 images as their `COMMON_BASE_CUDA_IMAGE` / `COMMON_DEVEL_CUDA_IMAGE` build args. + +## Files + +| File | Purpose | +|------|---------| +| `../Dockerfile.jp62` | Multi-stage Dockerfile: `jp62-setup` → `common-base-jp62` → `common-devel-jp62` | +| `opencv-preferences` | APT pin to prefer Ubuntu OpenCV 4.5.4 over L4T's 4.8.0 | +| `patch-cuda-arch.sh` | Patches Autoware CMakeLists.txt files to gate unsupported CUDA architectures (see below) | + +## Architecture + +``` +nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel (L4T base with CUDA 12.6, cuDNN 9.3, TensorRT 10.3) + └─ jp62-setup (locale, OpenCV swap, CMake upgrade, ROS 2, L4T NVIDIA pkgs, CUDA env, spconv/cumm) + └─ common-base-jp62 (Autoware scripts + setup-dev-env.sh --module base) + └─ common-devel-jp62 (setup-dev-env.sh --module all + dev-tools, rosdep, colcon build) +``` + +## How to build + +```bash +# Local build (requires Docker buildx + arm64 QEMU or native Jetson) +./build.sh --platform jp62 --target common + +# Or directly via docker buildx bake +docker buildx bake -f components/docker-bake.hcl \ + --set "*.context=." \ + --set "*.platform=linux/arm64" \ + --set "*.args.ROS_DISTRO=humble" \ + --set "common-base-jp62.tags=openadkit-common:base-jp62" \ + common-base-jp62 +``` + +## Progress + +### What works (validated 2026-04-13) + +- **jp62-setup** (15 steps): All pass. L4T base image bootstrapping, OpenCV 4.8→4.5.4 swap, CMake 3.14→3.22 upgrade, ROS 2 Humble desktop installation from apt, NVIDIA L4T package installation, CUDA environment configuration (CUDAARCHS=87), spconv/cumm Jetson ARM debs. +- **common-base-jp62** (9 steps): All pass. 
Autoware setup scripts copied, `setup-dev-env.sh --module base --no-nvidia --no-cuda-drivers` completes successfully via ansible. +- **common-devel-jp62** all pre-build steps pass: + - `setup-dev-env.sh --module all --no-nvidia --no-cuda-drivers` + `--module dev-tools`: Pass. + - colcon mixin registration: Pass (with retry for GitHub CDN flakiness). + - rosdep dependency resolution and install: Pass (with retry for GitHub CDN 503s). +- **colcon build**: Blocked on x86 by QEMU bug (see below). **Must be validated on native Jetson.** + +### Resolved: cmake 3.22 `find_library` bug (fixed with cmake 3.28) + +The colcon build step failed with cmake 3.22 (Ubuntu 22.04 system default): +``` +Package 'builtin_interfaces' exports the library +'builtin_interfaces__rosidl_generator_c' which couldn't be found +``` + +**Root cause (confirmed 2026-04-15):** cmake 3.22's `find_library()` fails when the result variable is pre-set to `"NOTFOUND"`. The `ament_cmake_export_libraries` template does exactly this: +```cmake +set(_lib "NOTFOUND") +find_library(_lib NAMES "${_library}" PATHS "..." NO_DEFAULT_PATH NO_CMAKE_FIND_ROOT_PATH) +``` + +On cmake 3.22, `find_library` sees `_lib` is "already set" and skips the search, even though the `.so` file exists on disk (confirmed via cmake `if(EXISTS)` and `ls` in the same cmake invocation). This is NOT a QEMU bug — cmake's `find_library` genuinely fails to search. + +**Evidence:** +1. cmake `if(EXISTS "/opt/ros/humble/lib/libbuiltin_interfaces__rosidl_generator_c.so")` → YES +2. `find_library(_lib ...)` in the same cmake run → `_lib-NOTFOUND` +3. Upgrading to cmake 3.28 from Kitware PPA → `find_library` succeeds, colcon build passes + +**Root cause detail:** The `ament_cmake_export_libraries-extras.cmake` template uses a shared cache variable name `_lib` across ALL packages. 
When `find_package(A)` processes A's export template and caches `_lib = /path/to/libA.so`, then `find_package(B)`'s template does `set(_lib "NOTFOUND")` + `find_library(_lib ...)`. The `set()` creates a normal variable but does NOT clear the cache entry. `find_library` sees the cache entry is "already set" and skips the search, leaving `_lib` pointing to A's library instead of B's. This is a known ament_cmake design flaw (see [ament_cmake#182](https://github.com/ament/ament_cmake/issues/182), [ament_cmake#365](https://github.com/ament/ament_cmake/issues/365)). + +**Fix:** Two-part: +1. Install cmake 3.28 from Kitware APT (3.24+ handles NOTFOUND re-search better). Pinned to 3.28.x: >= 3.24 for find_library fix, < 3.29 for FindPythonLibs compat, < 4.0 for cmake_minimum_required compat. +2. Patch all ament export templates to `unset(_lib CACHE)` before `find_library`, clearing the stale cache entry from previous packages. This is applied via a RUN step in the Dockerfile. + +**Also required for building:** +- Build against a pinned Autoware release tag (e.g., `1.7.1`), not `main`. Autoware `main` removed `.env` files referenced by the existing x86 `Dockerfile` COPY. The release workflow (`release-all-images.yaml`) already pins to semver tags. +- `apt-mark manual` for all ROS packages before `cleanup_apt.sh` to prevent `apt-get autoremove` from removing ROS libraries installed as dependencies of `ros-humble-desktop`. + +### Resolved: `nvcc fatal: Unsupported gpu architecture 'compute_101'` (Autoware 1.7.1) + +Autoware 1.7.1 hardcodes `CUDA_NVCC_FLAGS` with `-gencode arch=compute_101,code=sm_101` and `compute_120` in 14 CMakeLists.txt files across perception, sensing, and e2e packages. These architectures require CUDA 12.8+ (Blackwell), but the JP62 L4T base provides CUDA 12.6 which only supports up to `compute_90` (Hopper). 
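To see which files carry flags the installed toolkit cannot compile, the hardcoded `-gencode` entries can be listed with a plain text scan. The following is a minimal Python sketch and is not part of this repository: the function name `find_unsupported_gencode` and the `max_arch=90` default are illustrative assumptions, with the limit taken from the CUDA 12.6 ceiling stated above.

```python
import re
from pathlib import Path

# Matches e.g. "-gencode arch=compute_101,code=sm_101" and captures "101".
GENCODE_RE = re.compile(r"-gencode\s+arch=compute_(\d+)")


def find_unsupported_gencode(src_dir, max_arch=90):
    """Map each CMakeLists.txt under src_dir to the compute_* numbers above max_arch.

    Hypothetical helper for illustration only. It is a plain text scan, so it
    also reports flags that are already wrapped in a CUDA_VERSION check.
    """
    hits = {}
    for path in sorted(Path(src_dir).rglob("CMakeLists.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Collect each distinct architecture number that exceeds the limit.
        archs = sorted({int(m) for m in GENCODE_RE.findall(text) if int(m) > max_arch})
        if archs:
            hits[str(path)] = archs
    return hits
```

Calling it with `autoware/src` returns a dict keyed by file path. An empty dict means no flag above the limit was found.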
+ +The `CUDAARCHS=87` env var set in `Dockerfile.jp62` controls CMake's `CMAKE_CUDA_ARCHITECTURES`, but the affected packages use the legacy `find_package(CUDA)` / `cuda_add_library()` path with `CUDA_NVCC_FLAGS` directly — bypassing `CMAKE_CUDA_ARCHITECTURES` entirely. + +**Fix:** Gate the `compute_101`+ gencode flags behind `CUDA_VERSION VERSION_GREATER_EQUAL "12.8"` so they are only added when the toolkit actually supports them. The existing `compute_86/87/89` flags remain unconditional. + +Before (upstream): +```cmake +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_86,code=sm_86") +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_87,code=sm_87") +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_89,code=sm_89") +if(CUDA_VERSION VERSION_LESS "13.0") + list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_101,code=sm_101") +else() # CUDA 13.0 renamed SM101 to SM110 + list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_110,code=sm_110") +endif() +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_120,code=sm_120") +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_120,code=compute_120") +``` + +After (patched): +```cmake +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_86,code=sm_86") +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_87,code=sm_87") +list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_89,code=sm_89") +# Only add newer architectures if the CUDA toolkit actually supports them +if(CUDA_VERSION VERSION_GREATER_EQUAL "12.8") + if(CUDA_VERSION VERSION_LESS "13.0") + list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_101,code=sm_101") + else() # CUDA 13.0 renamed SM101 to SM110 + list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_110,code=sm_110") + endif() + list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_120,code=sm_120") + list(APPEND CUDA_NVCC_FLAGS "-gencode arch=compute_120,code=compute_120") +endif() +``` + +**Affected packages (14 files):** +- `universe/autoware_universe/e2e/autoware_tensorrt_vad` +- 
`universe/autoware_universe/perception/autoware_bevfusion` +- `universe/autoware_universe/perception/autoware_ground_segmentation_cuda` +- `universe/autoware_universe/perception/autoware_image_projection_based_fusion` +- `universe/autoware_universe/perception/autoware_lidar_centerpoint` +- `universe/autoware_universe/perception/autoware_lidar_frnet` +- `universe/autoware_universe/perception/autoware_lidar_transfusion` +- `universe/autoware_universe/perception/autoware_probabilistic_occupancy_grid_map` +- `universe/autoware_universe/perception/autoware_ptv3` +- `universe/autoware_universe/perception/autoware_tensorrt_classifier` +- `universe/autoware_universe/perception/autoware_tensorrt_plugins` +- `universe/autoware_universe/perception/autoware_tensorrt_yolox` +- `universe/autoware_universe/sensing/autoware_calibration_status_classifier` +- `universe/autoware_universe/sensing/autoware_cuda_pointcloud_preprocessor` + +**Applying the patch:** Run `components/common/jp62/patch-cuda-arch.sh` after cloning Autoware sources and before building: +```bash +./components/common/jp62/patch-cuda-arch.sh autoware/src +``` + +This patch is only needed for CUDA < 12.8 (i.e., JP62 with CUDA 12.6). On x86 with CUDA 12.8+ the upstream CMakeLists.txt files work as-is. + +### Remaining work (not yet implemented) + +1. **CI workflow** (`build-all-images.yaml`): Add JP62 to the build matrix — new `include:` entries for `jp62` platform in `build-common`, `build-components`, and `build-universe` jobs. +2. **Release workflow** (`release-all-images.yaml`): Same JP62 matrix additions. +3. **Manifest action** (`combine-multi-arch-images/action.yaml`): Handle `jp62` as single-arch (arm64), similar to how `*cuda*` is handled as single-arch (amd64). 
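As a rough illustration of item 3, the platform decision could look like the Python sketch below. It does not reproduce the existing `combine-multi-arch-images` action, whose code is not shown here. The function name `platforms_for_image`, the substring matching, and the assumption that all other images are published for both amd64 and arm64 are hypothetical.

```python
def platforms_for_image(image_name):
    """Return the platforms a manifest for image_name would cover.

    Hypothetical helper: the substring rules mirror the description above
    (jp62 -> arm64 only, cuda -> amd64 only), not the real action's code.
    """
    # Check jp62 first so a JP62 image is never treated as an amd64 CUDA image.
    if "jp62" in image_name:
        return ["linux/arm64"]
    if "cuda" in image_name:
        return ["linux/amd64"]
    # Assumption: every remaining image is multi-arch.
    return ["linux/amd64", "linux/arm64"]
```

Under these assumptions `common-base-jp62` maps to arm64 only and `common-base-cuda` to amd64 only, so neither would need a multi-arch manifest merge.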
+ +## Key design decisions + +### `--no-nvidia --no-cuda-drivers` for setup-dev-env.sh + +The Autoware ansible `cuda` role detects arm64 as `sbsa` architecture and installs CUDA packages from `developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/sbsa/`. These are server-grade ARM CUDA packages that **conflict** with L4T's pre-installed CUDA from `repo.download.nvidia.com/jetson/`. Using `--no-nvidia` skips both the `cuda` and `tensorrt` ansible roles entirely, relying on the L4T base image for the full NVIDIA stack. + +### ros-humble-desktop instead of ros-humble-ros-base + +The x86 path starts from the `ros:humble-ros-base-jammy` Docker image (built by OSRF), which includes all ROS message generation libraries as shared objects. Installing `ros-humble-ros-base` via apt on L4T does not produce an identical installation — some development `.so` files are treated as auto-removable. Using `ros-humble-desktop` (which the reference JP62 Dockerfile also uses) provides a superset that includes all required libraries. + +### colcon mixin explicit registration + +The official `ros:humble-ros-base-jammy` Docker image pre-configures the colcon mixin index. Since JP62 installs ROS from apt on a bare L4T image, the mixin index must be registered explicitly: +```dockerfile +RUN colcon mixin add default https://raw.githubusercontent.com/colcon/colcon-mixin-repository/master/index.yaml; \ + for i in 1 2 3; do colcon mixin update default && break || sleep 10; done +``` + +### rosdep update retry + +GitHub CDN `raw.githubusercontent.com` frequently returns HTTP 503 during Docker builds (parallel requests from buildkit). 
Both the `jp62-setup` stage and the devel rosdep step use retry logic: +```dockerfile +# Base stage: tolerate failure entirely +RUN rosdep init || true; rosdep update || true + +# Devel stage: retry up to 3 times +&& for i in 1 2 3; do rosdep update && break || sleep 10; done +``` diff --git a/components/common/jp62/opencv-preferences b/components/common/jp62/opencv-preferences new file mode 100644 index 0000000..854c601 --- /dev/null +++ b/components/common/jp62/opencv-preferences @@ -0,0 +1,12 @@ +# Prefer Ubuntu repository OpenCV packages over NVIDIA L4T +Package: libopencv* +Pin: release o=Ubuntu +Pin-Priority: 1001 + +Package: opencv-data +Pin: release o=Ubuntu +Pin-Priority: 1001 + +Package: python3-opencv +Pin: release o=Ubuntu +Pin-Priority: 1001 diff --git a/components/common/jp62/patch-cuda-arch.sh b/components/common/jp62/patch-cuda-arch.sh new file mode 100755 index 0000000..83befcb --- /dev/null +++ b/components/common/jp62/patch-cuda-arch.sh @@ -0,0 +1,126 @@ +#!/usr/bin/env bash +# patch-cuda-arch.sh — Gate CUDA compute_101+ gencode flags behind CUDA >= 12.8 +# +# Autoware 1.7.1 hardcodes -gencode arch=compute_101 and compute_120 in 14 +# CMakeLists.txt files. These architectures require CUDA 12.8+ (Blackwell), +# but JP62 provides CUDA 12.6 which only supports up to compute_90. +# +# This script wraps the compute_101/110/120 flags in a CUDA version check so +# they are only added when the toolkit supports them. +# +# Usage: +# ./components/common/jp62/patch-cuda-arch.sh +# +# Example: +# ./components/common/jp62/patch-cuda-arch.sh autoware/src + +set -euo pipefail + +SRC_DIR="${1:?Usage: $0 }" + +if [ ! -d "$SRC_DIR" ]; then + echo "Error: directory '$SRC_DIR' does not exist" >&2 + exit 1 +fi + +# Find all CMakeLists.txt files that reference compute_101 +FILES=$(grep -rl "compute_101" "$SRC_DIR" --include="CMakeLists.txt" || true) + +if [ -z "$FILES" ]; then + echo "No files with compute_101 found in $SRC_DIR — nothing to patch." 
+ exit 0 +fi + +PATCHED=0 +SKIPPED=0 + +for f in $FILES; do + # Skip already-patched files + if grep -q "VERSION_GREATER_EQUAL.*12.8" "$f" 2>/dev/null; then + echo "SKIP (already patched): $f" + SKIPPED=$((SKIPPED + 1)) + continue + fi + + python3 -c " +import re, sys + +with open('$f', 'r') as fh: + content = fh.read() + +# Match the block with quoted gencode args (2-space indent) +patterns = [ + # Pattern 1: 2-space indent, quoted args (most common) + (r' if\(CUDA_VERSION VERSION_LESS \"13\.0\"\)\n' + r' list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_101,code=sm_101\"\)\n' + r' else\(\) # CUDA 13\.0 renamed SM101 to SM110\n' + r' list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_110,code=sm_110\"\)\n' + r' endif\(\)\n' + r' list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=sm_120\"\)\n' + r' list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=compute_120\"\)', + ' # Only add newer architectures if the CUDA toolkit actually supports them\n' + ' if(CUDA_VERSION VERSION_GREATER_EQUAL \"12.8\")\n' + ' if(CUDA_VERSION VERSION_LESS \"13.0\")\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_101,code=sm_101\")\n' + ' else() # CUDA 13.0 renamed SM101 to SM110\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_110,code=sm_110\")\n' + ' endif()\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=sm_120\")\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=compute_120\")\n' + ' endif()'), + # Pattern 2: no indent, quoted args + (r'if\(CUDA_VERSION VERSION_LESS \"13\.0\"\)\n' + r' list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_101,code=sm_101\"\)\n' + r'else\(\) # CUDA 13\.0 renamed SM101 to SM110\n' + r' list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_110,code=sm_110\"\)\n' + r'endif\(\)\n' + r'list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=sm_120\"\)\n' + r'list\(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=compute_120\"\)', + '# Only add newer 
architectures if the CUDA toolkit actually supports them\n' + 'if(CUDA_VERSION VERSION_GREATER_EQUAL \"12.8\")\n' + ' if(CUDA_VERSION VERSION_LESS \"13.0\")\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_101,code=sm_101\")\n' + ' else() # CUDA 13.0 renamed SM101 to SM110\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_110,code=sm_110\")\n' + ' endif()\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=sm_120\")\n' + ' list(APPEND CUDA_NVCC_FLAGS \"-gencode arch=compute_120,code=compute_120\")\n' + 'endif()'), + # Pattern 3: 4-space indent, unquoted args + (r' if\(CUDA_VERSION VERSION_LESS \"13\.0\"\)\n' + r' list\(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_101,code=sm_101\)\n' + r' else\(\) # CUDA 13\.0 renamed SM101 to SM110\n' + r' list\(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_110,code=sm_110\)\n' + r' endif\(\)\n' + r' list\(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_120,code=sm_120\)\n' + r' list\(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_120,code=compute_120\)', + ' # Only add newer architectures if the CUDA toolkit actually supports them\n' + ' if(CUDA_VERSION VERSION_GREATER_EQUAL \"12.8\")\n' + ' if(CUDA_VERSION VERSION_LESS \"13.0\")\n' + ' list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_101,code=sm_101)\n' + ' else() # CUDA 13.0 renamed SM101 to SM110\n' + ' list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_110,code=sm_110)\n' + ' endif()\n' + ' list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_120,code=sm_120)\n' + ' list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_120,code=compute_120)\n' + ' endif()'), +] + +for old_pat, new_str in patterns: + result = re.sub(old_pat, new_str, content) + if result != content: + with open('$f', 'w') as fh: + fh.write(result) + print('PATCHED: $f') + sys.exit(0) + +print('NO MATCH: $f (may need manual patching)', file=sys.stderr) +sys.exit(1) +" && PATCHED=$((PATCHED + 1)) || { + echo "WARNING: Failed to patch $f — check manually" >&2 + } +done + +echo "" +echo 
"Done: $PATCHED patched, $SKIPPED skipped (already patched)." +echo "Total files with compute_101: $(echo "$FILES" | wc -l)" diff --git a/components/docker-bake.hcl b/components/docker-bake.hcl index 7f59400..830d2f5 100644 --- a/components/docker-bake.hcl +++ b/components/docker-bake.hcl @@ -18,6 +18,13 @@ group "common" { ] } +group "common-jp62" { + targets = [ + "common-base-jp62", + "common-devel-jp62" + ] +} + group "component" { targets = [ "sensing-perception", @@ -41,8 +48,10 @@ group "universe-all" { // For docker/metadata-action target "docker-metadata-action-common-base" {} target "docker-metadata-action-common-base-cuda" {} +target "docker-metadata-action-common-base-jp62" {} target "docker-metadata-action-common-devel" {} target "docker-metadata-action-common-devel-cuda" {} +target "docker-metadata-action-common-devel-jp62" {} target "docker-metadata-action-sensing-perception" {} target "docker-metadata-action-sensing-perception-cuda" {} target "docker-metadata-action-localization-mapping" {} @@ -78,6 +87,18 @@ target "common-devel-cuda" { target = "common-devel-cuda" } +target "common-base-jp62" { + inherits = ["docker-metadata-action-common-base-jp62"] + dockerfile = "components/common/Dockerfile.jp62" + target = "common-base-jp62" +} + +target "common-devel-jp62" { + inherits = ["docker-metadata-action-common-devel-jp62"] + dockerfile = "components/common/Dockerfile.jp62" + target = "common-devel-jp62-build" +} + target "sensing-perception" { inherits = ["docker-metadata-action-sensing-perception"] dockerfile = "components/sensing-perception/Dockerfile" diff --git a/deployments/samples/planning-simulation/README.md b/deployments/samples/planning-simulation/README.md index 958c6a6..34aeb32 100644 --- a/deployments/samples/planning-simulation/README.md +++ b/deployments/samples/planning-simulation/README.md @@ -21,6 +21,8 @@ unzip -d ~/autoware_map ~/autoware_map/sample-map-planning.zip ## Run the Deployment +### x86 (amd64) — using pre-built images 
from GHCR + 1. Start the deployment by running the following command: ```bash @@ -33,7 +35,7 @@ unzip -d ~/autoware_map ~/autoware_map/sample-map-planning.zip http://localhost:6080/vnc.html ``` - Use the default password `openadkit` to access the visualizer. **It can take a few seconds to visualizer to start.** + Use the default password `openadkit` to access the visualizer. **It can take a few seconds for the visualizer to start.** > If your machine is on a remote server, you can access the visualizer by using its accessible IP address: > @@ -43,8 +45,40 @@ unzip -d ~/autoware_map ~/autoware_map/sample-map-planning.zip 3. After you see the visualizer, you can start the autonomous driving simulation by following the [planning simulation instructions](https://autowarefoundation.github.io/autoware-documentation/main/demos/planning-sim/lane-driving/#2-set-an-initial-pose-for-the-ego-vehicle) in the Autoware documentation. +### Jetson (JP62) — using locally-built images + +The `docker-compose.jp62.yaml` override replaces all service images with locally-built JP62 images (`openadkit:universe-jp62` for components, `openadkit:visualizer-jp62` for the visualizer). It also adds `runtime: nvidia` for GPU access and sets `ROS_DISTRO=humble` (not baked into JP62 images since they are built from L4T, not the `ros:` Docker base). + +> **Prerequisites:** Build the JP62 images first with `./build.sh --platform jp62 --target universe` from the repo root. + +1. Start the deployment: + + ```bash + docker compose -f docker-compose.yaml -f docker-compose.jp62.yaml --env-file planning-simulation.env up -d + ``` + +2. Open the visualizer in a browser: + + ```bash + http://:6080/vnc.html + ``` + + Use the default password `openadkit`. + +3. Follow the [planning simulation instructions](https://autowarefoundation.github.io/autoware-documentation/main/demos/planning-sim/lane-driving/#2-set-an-initial-pose-for-the-ego-vehicle) to set an initial pose and goal in RViz. 
+ +> **Note:** Do not use `docker compose restart` — services share the `map` container's PID namespace (`pid: service:map`), so restarting breaks the namespace reference. Always use `down` followed by `up -d`. + ## Stop the Deployment +### x86 + ```bash docker compose --env-file planning-simulation.env down ``` + +### Jetson (JP62) + +```bash +docker compose -f docker-compose.yaml -f docker-compose.jp62.yaml --env-file planning-simulation.env down +``` diff --git a/deployments/samples/planning-simulation/docker-compose.jp62.yaml b/deployments/samples/planning-simulation/docker-compose.jp62.yaml new file mode 100644 index 0000000..c10baf1 --- /dev/null +++ b/deployments/samples/planning-simulation/docker-compose.jp62.yaml @@ -0,0 +1,37 @@ +# Override to use locally-built JP62 images for all services. +# Usage: docker compose -f docker-compose.yaml -f docker-compose.jp62.yaml --env-file planning-simulation.env up -d +# +# JP62 images are built from L4T (not the ros: Docker base), so ROS_DISTRO +# is not baked into the image. We set it here for the entrypoint script. +x-jp62-common: &jp62-common + image: openadkit:universe-jp62 + runtime: nvidia + environment: + - ROS_DISTRO=humble + - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp + - ROS_DOMAIN_ID=${ROS_DOMAIN_ID} + +services: + map: + <<: *jp62-common + planning: + <<: *jp62-common + vehicle: + <<: *jp62-common + system: + <<: *jp62-common + control: + <<: *jp62-common + simulator: + <<: *jp62-common + api: + <<: *jp62-common + visualizer: + image: openadkit:visualizer-jp62 + runtime: nvidia + environment: + - ROS_DISTRO=humble + - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp + - ROS_DOMAIN_ID=${ROS_DOMAIN_ID} + - RVIZ_CONFIG=${RVIZ_CONFIG} + - USE_SIM_TIME=${USE_SIM_TIME}