How to enable difix in sensorsim #7

@chengYi-xun

Excellent work. I ran the demo example successfully, but the sensorsim node does not appear to have difix enabled; it seems to be running with render only. I would like to enable difix in sensorsim. What should I do? I tried the following two steps, but neither worked:

  1. I checked the configuration file loaded by this node: I decompressed the *.usdz package and edited parsed_config.yaml, setting difix: reference to True (excerpt below). After repackaging and rerunning the example, nothing changed.
difix:
  name: cosmos-difix
  inference:
    enabled: true
    use_color_transfer: true
  progressive_distillation:
.....
  2. I commented out the "--no-enable-nrend" option in src/wizard/configs/base_config.yaml, but this resulted in the error shown below the config.

config:

services:
  # \/ sensor simulator for now is just NRE. We'll use it as an example on the structure of a container definition
  sensorsim:
    image: ???
    # \/ volumes lets us mount host (ORD/local) containers to the running container
    volumes:
      - "${scenes.database.scene_cache}:/mnt/nre-data"
      - "${defines.sensordata}/ego-hoods:/mnt/ego-hoods"
    # \/ environments lets you set environment variables inside the container
    environments:
      # this may not be necessary but at least on COLOSSUS by default pytorch had really stupid
      # configuration running to excessively parallelizing everything
      - OMP_NUM_THREADS=1
    # \/ command is like docker entrypoint + command combined
    command:
      - "/app/pycena_run.runfiles/nre_repo/scripts/pycena/runtime/entrypoint_3_11.sh"
      - "--port={port}" # {port} is a wizard-generated unique variable which enumerates {baseport}, {baseport+1}, ...
      - "--host=0.0.0.0" # the default container IP
      - "--artifact-glob=/mnt/nre-data/{sceneset}/**/*.usdz"
      - "--egocar-hood-dir=/mnt/ego-hoods"
      # - "--no-enable-nrend"
      - "--download-cache-dir /tmp/nre-cache-dir" # unused
      - "--cache-size=${defines.nre_cache_size}" # as a rule of thumb n_concurrent_rollouts + 1 allows to avoid premature evictions
      # - "--enable-timing" # uncomment to enable timing information on the sensor simulator side
    # \/ gpus is a list of GPUs to use for all service replicas, each replica using one (possibly shared GPU). See below
    gpus: [0]
    # \/ replicas_per_container indicates we want 6 identical containers started (for load balancing). With the `gpus` definition above we'll
    # have 1 replicas on each of the 6 GPUs but it's possible to e.g. have 12 replicas on 6 GPUs (to better utilize them) by setting replicas_per_container
    replicas_per_container: 1

error

sensorsim-0-1   | [NuRec::NRend][ERROR] ::: JIT: compiling preProcessParticles.9e9875ea512dc028.cu failed.
sensorsim-0-1   | [NuRec::NRend][ERROR] ::: GUTRenderer : cannot get cuda resource on the device 0.
sensorsim-0-1   | [2025-12-10 06:09:37,889][nre.grpc.serve][ERROR] Traceback (most recent call last):
sensorsim-0-1   |   File "nre/grpc/<unknown>", line 0, in render_rgb
sensorsim-0-1   |   File "nre/grpc/<unknown>", line 0, in wrapper
sensorsim-0-1   |   File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1   |   File "nre/grpc/<unknown>", line 0, in render_camera_request
sensorsim-0-1   |   File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1   |   File "nre/render/<unknown>", line 0, in _render_volume_from_ray_bundle
sensorsim-0-1   |   File "nre/render/<unknown>", line 0, in __call__
sensorsim-0-1   |   File "nre/models/<unknown>", line 0, in render_nrend_sensor_rays_with_poses
sensorsim-0-1   |   File "/app/pycena_run.runfiles/python_3_11_x86_64-unknown-linux-gnu/lib/python3.11/contextlib.py", line 81, in inner
sensorsim-0-1   |     return func(*args, **kwds)
sensorsim-0-1   |            ^^^^^^^^^^^^^^^^^^^
sensorsim-0-1   |   File "libs/nrend/<unknown>", line 0, in render
sensorsim-0-1   | AssertionError: NRenderer.render failed.
sensorsim-0-1   | 
sensorsim-0-1   | [2025-12-10 06:09:37,889][grpc._server][ERROR] Exception calling application: NRenderer.render failed.
sensorsim-0-1   | Traceback (most recent call last):
sensorsim-0-1   |   File "/app/pycena_run.runfiles/pip_deps_3_11_grpcio/site-packages/grpc/_server.py", line 609, in _call_behavior
runtime-0-1     | 06:09:37.899 DEBUG:   [_cygrpc] Loaded running loop: id(loop)=140553004067664
sensorsim-0-1   |     response_or_iterator = behavior(argument, context)
sensorsim-0-1   |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
driver-0-1      | [2025-12-10 06:09:37,900][__main__][INFO] - Closing session c8b25ee8-d58e-11f0-9c71-f18f730da2da
sensorsim-0-1   |   File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1   |   File "nre/grpc/<unknown>", line 0, in render_rgb
sensorsim-0-1   |   File "nre/grpc/<unknown>", line 0, in wrapper
runtime-0-1     | 06:09:37.900 DEBUG:   [_cygrpc] Loaded running loop: id(loop)=140553004067664
sensorsim-0-1   |   File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1   |   File "nre/grpc/<unknown>", line 0, in render_camera_request
sensorsim-0-1   |   File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1   |   File "nre/render/<unknown>", line 0, in _render_volume_from_ray_bundle
controller-0-1  | 06:09:37.901 INFO:    close_session for session_uuid: c8b25ee8-d58e-11f0-9c71-f18f730da2da
sensorsim-0-1   |   File "nre/render/<unknown>", line 0, in __call__
controller-0-1  | 06:09:37.901 ERROR:   Exception calling application: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'
sensorsim-0-1   |   File "nre/models/<unknown>", line 0, in render_nrend_sensor_rays_with_poses
controller-0-1  | Traceback (most recent call last):
sensorsim-0-1   |   File "/app/pycena_run.runfiles/python_3_11_x86_64-unknown-linux-gnu/lib/python3.11/contextlib.py", line 81, in inner
controller-0-1  |   File "/repo/.venv/lib/python3.11/site-packages/grpc/_server.py", line 608, in _call_behavior
sensorsim-0-1   |     return func(*args, **kwds)
controller-0-1  |     response_or_iterator = behavior(argument, context)
sensorsim-0-1   |            ^^^^^^^^^^^^^^^^^^^
controller-0-1  |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
sensorsim-0-1   |   File "libs/nrend/<unknown>", line 0, in render
controller-0-1  |   File "/repo/src/controller/alpasim_controller/server.py", line 63, in close_session
sensorsim-0-1   | AssertionError: NRenderer.render failed.
controller-0-1  |     self._backend.close_session(request)
controller-0-1  |   File "/repo/src/controller/alpasim_controller/system_manager.py", line 24, in close_session
controller-0-1  |     raise KeyError(f"Session {request.session_uuid} does not exist")
controller-0-1  | KeyError: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'
runtime-0-1     | 06:09:37.965 [W0] WARNING:    Rollout FAILED: job=3aaf3cfe08d14e1c818d883da587f4db scene=clipgt-05bb8212-63e1-40a8-b4fc-3142c0e94646 uuid=c8b25ee8-d58e-11f0-9c71-f18f730da2da error=<AioRpcError of RPC that terminated with:
runtime-0-1     |       status = StatusCode.UNKNOWN
runtime-0-1     |       details = "Exception calling application: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'"
runtime-0-1     |       debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Exception calling application: \'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist\'", grpc_status:2}"
runtime-0-1     | >
runtime-0-1     | Traceback (most recent call last):
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/loop.py", line 668, in run
runtime-0-1     |     await self._loop()
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/loop.py", line 794, in _loop
runtime-0-1     |     await asyncio.gather(*tasks)
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/loop.py", line 725, in _send_images
runtime-0-1     |     await asyncio.gather(
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/loop.py", line 437, in render_and_send_image
runtime-0-1     |     image = await self.sensorsim.render(
runtime-0-1     |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/services/sensorsim_service.py", line 336, in render
runtime-0-1     |     response: RGBRenderReturn = await profiled_rpc_call(
runtime-0-1     |                                 ^^^^^^^^^^^^^^^^^^^^^^^^
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/telemetry/rpc_wrapper.py", line 124, in profiled_rpc_call
runtime-0-1     |     result = await fut
runtime-0-1     |              ^^^^^^^^^
runtime-0-1     |   File "/repo/.venv/lib/python3.11/site-packages/grpc/aio/_call.py", line 328, in __await__
runtime-0-1     |     raise _create_rpc_error(
runtime-0-1     | grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
runtime-0-1     |       status = StatusCode.UNKNOWN
runtime-0-1     |       details = "Exception calling application: NRenderer.render failed."
runtime-0-1     |       debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Exception calling application: NRenderer.render failed.", grpc_status:2}"
runtime-0-1     | >
runtime-0-1     | 
runtime-0-1     | During handling of the above exception, another exception occurred:
runtime-0-1     | 
runtime-0-1     | Traceback (most recent call last):
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/dispatcher.py", line 220, in run_job
runtime-0-1     |     await rollout.bind(
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/loop.py", line 575, in run
runtime-0-1     |     async with contextlib.AsyncExitStack() as async_stack:
runtime-0-1     |   File "/root/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 745, in __aexit__
runtime-0-1     |     raise exc_details[1]
runtime-0-1     |   File "/root/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 728, in __aexit__
runtime-0-1     |     cb_suppress = await cb(*exc_details)
runtime-0-1     |                   ^^^^^^^^^^^^^^^^^^^^^^
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/services/service_base.py", line 140, in __aexit__
runtime-0-1     |     await self._cleanup_session(session_info=self.session_info)
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/services/controller_service.py", line 121, in _cleanup_session
runtime-0-1     |     await profiled_rpc_call(
runtime-0-1     |   File "/repo/src/runtime/alpasim_runtime/telemetry/rpc_wrapper.py", line 124, in profiled_rpc_call
runtime-0-1     |     result = await fut
runtime-0-1     |              ^^^^^^^^^
runtime-0-1     |   File "/repo/.venv/lib/python3.11/site-packages/grpc/aio/_call.py", line 328, in __await__
runtime-0-1     |     raise _create_rpc_error(
runtime-0-1     | grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
runtime-0-1     |       status = StatusCode.UNKNOWN
runtime-0-1     |       details = "Exception calling application: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'"
runtime-0-1     |       debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Exception calling application: \'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist\'", grpc_status:2}"
runtime-0-1     | >
runtime-0-1     | 
runtime-0-1     | 06:09:37.984 [W0] INFO:       Exporting GPU metrics for 1 GPU(s)
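
The log above interleaves lines from several containers (sensorsim, runtime, controller, driver), which makes the sensorsim traceback hard to follow. A minimal sketch for separating the lines by service, assuming every line has the `service | message` shape seen above; `split_by_service` is a hypothetical helper name, not part of this repository:

```python
from collections import defaultdict


def split_by_service(lines):
    """Group compose-style 'service | message' lines by service name.

    Lines without a '|' separator are skipped. The order of messages
    within each service is preserved.
    """
    groups = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        # Split on the first '|' only, so a '|' inside the message is kept.
        prefix, sep, message = line.partition("|")
        if not sep:
            continue
        # Drop the single space that follows the separator, keep indentation.
        if message.startswith(" "):
            message = message[1:]
        groups[prefix.strip()].append(message)
    return dict(groups)
```

For example, saving the log to a file and printing `"\n".join(split_by_service(open(path))["sensorsim-0-1"])` would show only the sensorsim lines, starting with the two `[NuRec::NRend][ERROR]` messages that precede the `AssertionError`.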

In addition, my machine has an NVIDIA GeForce RTX 4080 SUPER.

How should I properly enable difix in sensorsim?
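
For reference, the repackaging in step 1 can be sketched as below. This is only an illustration under assumptions: `rewrite_member` is a hypothetical helper, and the member name `parsed_config.yaml` is assumed to sit at the path passed in. USDZ packages are zip archives whose entries are stored uncompressed, so the sketch writes everything with `ZIP_STORED`; it does not reproduce the 64-byte data alignment that the USDZ format also specifies, and whether the sensorsim loader depends on either property is not something I have confirmed.

```python
import zipfile
from pathlib import Path


def rewrite_member(src: Path, dst: Path, member: str, new_bytes: bytes) -> None:
    """Copy the zip archive `src` to `dst`, replacing the contents of `member`.

    All entries are written uncompressed (ZIP_STORED). Raises KeyError if
    `member` is not present in the source archive.
    """
    replaced = False
    with zipfile.ZipFile(src, "r") as zin, \
            zipfile.ZipFile(dst, "w", compression=zipfile.ZIP_STORED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == member:
                data = new_bytes
                replaced = True
            # Keep the original entry name and order; store without compression.
            zout.writestr(info.filename, data, compress_type=zipfile.ZIP_STORED)
    if not replaced:
        raise KeyError(f"{member} not found in {src}")


def read_member(archive: Path, member: str) -> bytes:
    """Return the raw bytes of `member`, to verify the edit survived repacking."""
    with zipfile.ZipFile(archive, "r") as zf:
        return zf.read(member)
```

After calling `rewrite_member(Path("scene.usdz"), Path("scene_edited.usdz"), "parsed_config.yaml", edited_yaml_bytes)` (file names here are placeholders), `read_member` can confirm that the edited YAML is what actually ended up in the package that sensorsim loads.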

