Excellent work. I have already run the demo example successfully, but I noticed that the sensorsim node does not appear to have difix enabled; it seems to be running only with render. I would like to enable difix in sensorsim. What should I do? I tried the following two steps but was not successful:
- I checked the configuration file loaded by this node. I first decompressed the `*.usdz` package and modified `parsed_config.yaml` by setting `difix: reference` to `True`. After repackaging and rerunning the example, I found that nothing changed.
difix:
  name: cosmos-difix
  inference:
    enabled: true
    use_color_transfer: true
    progressive_distillation:
      .....
- I commented out the `"--no-enable-nrend"` option in `src/wizard/configs/base_config.yaml`, but this resulted in the following error.
config:
services:
  # \/ sensor simulator for now is just NRE. We'll use it as an example on the structure of a container definition
  sensorsim:
    image: ???
    # \/ volumes lets us mount host (ORD/local) containers to the running container
    volumes:
      - "${scenes.database.scene_cache}:/mnt/nre-data"
      - "${defines.sensordata}/ego-hoods:/mnt/ego-hoods"
    # \/ environments lets you set environment variables inside the container
    environments:
      # this may not be necessary but at least on COLOSSUS by default pytorch had really stupid
      # configuration running to excessively parallelizing everything
      - OMP_NUM_THREADS=1
    # \/ command is like docker entrypoint + command combined
    command:
      - "/app/pycena_run.runfiles/nre_repo/scripts/pycena/runtime/entrypoint_3_11.sh"
      - "--port={port}" # {port} is a wizard-generated unique variable which enumerates {baseport}, {baseport+1}, ...
      - "--host=0.0.0.0" # the default container IP
      - "--artifact-glob=/mnt/nre-data/{sceneset}/**/*.usdz"
      - "--egocar-hood-dir=/mnt/ego-hoods"
      # - "--no-enable-nrend"
      - "--download-cache-dir /tmp/nre-cache-dir" # unused
      - "--cache-size=${defines.nre_cache_size}" # as a rule of thumb n_concurrent_rollouts + 1 allows to avoid premature evictions
      # - "--enable-timing" # uncomment to enable timing information on the sensor simulator side
    # \/ gpus is a list of GPUs to use for all service replicas, each replica using one (possibly shared GPU). See below
    gpus: [0]
    # \/ replicas_per_container indicates we want 6 identical containers started (for load balancing). With the `gpus` definition above we'll
    # have 1 replicas on each of the 6 GPUs but it's possible to e.g. have 12 replicas on 6 GPUs (to better utilize them) by setting replicas_per_container
    replicas_per_container: 1
error
sensorsim-0-1 | [NuRec::NRend][ERROR] ::: JIT: compiling preProcessParticles.9e9875ea512dc028.cu failed.
sensorsim-0-1 | [NuRec::NRend][ERROR] ::: GUTRenderer : cannot get cuda resource on the device 0.
sensorsim-0-1 | [2025-12-10 06:09:37,889][nre.grpc.serve][ERROR] Traceback (most recent call last):
sensorsim-0-1 | File "nre/grpc/<unknown>", line 0, in render_rgb
sensorsim-0-1 | File "nre/grpc/<unknown>", line 0, in wrapper
sensorsim-0-1 | File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1 | File "nre/grpc/<unknown>", line 0, in render_camera_request
sensorsim-0-1 | File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1 | File "nre/render/<unknown>", line 0, in _render_volume_from_ray_bundle
sensorsim-0-1 | File "nre/render/<unknown>", line 0, in __call__
sensorsim-0-1 | File "nre/models/<unknown>", line 0, in render_nrend_sensor_rays_with_poses
sensorsim-0-1 | File "/app/pycena_run.runfiles/python_3_11_x86_64-unknown-linux-gnu/lib/python3.11/contextlib.py", line 81, in inner
sensorsim-0-1 | return func(*args, **kwds)
sensorsim-0-1 | ^^^^^^^^^^^^^^^^^^^
sensorsim-0-1 | File "libs/nrend/<unknown>", line 0, in render
sensorsim-0-1 | AssertionError: NRenderer.render failed.
sensorsim-0-1 |
sensorsim-0-1 | [2025-12-10 06:09:37,889][grpc._server][ERROR] Exception calling application: NRenderer.render failed.
sensorsim-0-1 | Traceback (most recent call last):
sensorsim-0-1 | File "/app/pycena_run.runfiles/pip_deps_3_11_grpcio/site-packages/grpc/_server.py", line 609, in _call_behavior
runtime-0-1 | 06:09:37.899 DEBUG: [_cygrpc] Loaded running loop: id(loop)=140553004067664
sensorsim-0-1 | response_or_iterator = behavior(argument, context)
sensorsim-0-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
driver-0-1 | [2025-12-10 06:09:37,900][__main__][INFO] - Closing session c8b25ee8-d58e-11f0-9c71-f18f730da2da
sensorsim-0-1 | File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1 | File "nre/grpc/<unknown>", line 0, in render_rgb
sensorsim-0-1 | File "nre/grpc/<unknown>", line 0, in wrapper
runtime-0-1 | 06:09:37.900 DEBUG: [_cygrpc] Loaded running loop: id(loop)=140553004067664
sensorsim-0-1 | File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1 | File "nre/grpc/<unknown>", line 0, in render_camera_request
sensorsim-0-1 | File "nre/utils/<unknown>", line 0, in wrapper
sensorsim-0-1 | File "nre/render/<unknown>", line 0, in _render_volume_from_ray_bundle
controller-0-1 | 06:09:37.901 INFO: close_session for session_uuid: c8b25ee8-d58e-11f0-9c71-f18f730da2da
sensorsim-0-1 | File "nre/render/<unknown>", line 0, in __call__
controller-0-1 | 06:09:37.901 ERROR: Exception calling application: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'
sensorsim-0-1 | File "nre/models/<unknown>", line 0, in render_nrend_sensor_rays_with_poses
controller-0-1 | Traceback (most recent call last):
sensorsim-0-1 | File "/app/pycena_run.runfiles/python_3_11_x86_64-unknown-linux-gnu/lib/python3.11/contextlib.py", line 81, in inner
controller-0-1 | File "/repo/.venv/lib/python3.11/site-packages/grpc/_server.py", line 608, in _call_behavior
sensorsim-0-1 | return func(*args, **kwds)
controller-0-1 | response_or_iterator = behavior(argument, context)
sensorsim-0-1 | ^^^^^^^^^^^^^^^^^^^
controller-0-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
sensorsim-0-1 | File "libs/nrend/<unknown>", line 0, in render
controller-0-1 | File "/repo/src/controller/alpasim_controller/server.py", line 63, in close_session
sensorsim-0-1 | AssertionError: NRenderer.render failed.
controller-0-1 | self._backend.close_session(request)
controller-0-1 | File "/repo/src/controller/alpasim_controller/system_manager.py", line 24, in close_session
controller-0-1 | raise KeyError(f"Session {request.session_uuid} does not exist")
controller-0-1 | KeyError: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'
runtime-0-1 | 06:09:37.965 [W0] WARNING: Rollout FAILED: job=3aaf3cfe08d14e1c818d883da587f4db scene=clipgt-05bb8212-63e1-40a8-b4fc-3142c0e94646 uuid=c8b25ee8-d58e-11f0-9c71-f18f730da2da error=<AioRpcError of RPC that terminated with:
runtime-0-1 | status = StatusCode.UNKNOWN
runtime-0-1 | details = "Exception calling application: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'"
runtime-0-1 | debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Exception calling application: \'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist\'", grpc_status:2}"
runtime-0-1 | >
runtime-0-1 | Traceback (most recent call last):
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/loop.py", line 668, in run
runtime-0-1 | await self._loop()
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/loop.py", line 794, in _loop
runtime-0-1 | await asyncio.gather(*tasks)
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/loop.py", line 725, in _send_images
runtime-0-1 | await asyncio.gather(
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/loop.py", line 437, in render_and_send_image
runtime-0-1 | image = await self.sensorsim.render(
runtime-0-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/services/sensorsim_service.py", line 336, in render
runtime-0-1 | response: RGBRenderReturn = await profiled_rpc_call(
runtime-0-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/telemetry/rpc_wrapper.py", line 124, in profiled_rpc_call
runtime-0-1 | result = await fut
runtime-0-1 | ^^^^^^^^^
runtime-0-1 | File "/repo/.venv/lib/python3.11/site-packages/grpc/aio/_call.py", line 328, in __await__
runtime-0-1 | raise _create_rpc_error(
runtime-0-1 | grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
runtime-0-1 | status = StatusCode.UNKNOWN
runtime-0-1 | details = "Exception calling application: NRenderer.render failed."
runtime-0-1 | debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Exception calling application: NRenderer.render failed.", grpc_status:2}"
runtime-0-1 | >
runtime-0-1 |
runtime-0-1 | During handling of the above exception, another exception occurred:
runtime-0-1 |
runtime-0-1 | Traceback (most recent call last):
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/dispatcher.py", line 220, in run_job
runtime-0-1 | await rollout.bind(
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/loop.py", line 575, in run
runtime-0-1 | async with contextlib.AsyncExitStack() as async_stack:
runtime-0-1 | File "/root/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 745, in __aexit__
runtime-0-1 | raise exc_details[1]
runtime-0-1 | File "/root/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 728, in __aexit__
runtime-0-1 | cb_suppress = await cb(*exc_details)
runtime-0-1 | ^^^^^^^^^^^^^^^^^^^^^^
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/services/service_base.py", line 140, in __aexit__
runtime-0-1 | await self._cleanup_session(session_info=self.session_info)
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/services/controller_service.py", line 121, in _cleanup_session
runtime-0-1 | await profiled_rpc_call(
runtime-0-1 | File "/repo/src/runtime/alpasim_runtime/telemetry/rpc_wrapper.py", line 124, in profiled_rpc_call
runtime-0-1 | result = await fut
runtime-0-1 | ^^^^^^^^^
runtime-0-1 | File "/repo/.venv/lib/python3.11/site-packages/grpc/aio/_call.py", line 328, in __await__
runtime-0-1 | raise _create_rpc_error(
runtime-0-1 | grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
runtime-0-1 | status = StatusCode.UNKNOWN
runtime-0-1 | details = "Exception calling application: 'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist'"
runtime-0-1 | debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Exception calling application: \'Session c8b25ee8-d58e-11f0-9c71-f18f730da2da does not exist\'", grpc_status:2}"
runtime-0-1 | >
runtime-0-1 |
runtime-0-1 | 06:09:37.984 [W0] INFO: Exporting GPU metrics for 1 GPU(s)
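The log above interleaves output from four containers (sensorsim, runtime, driver, controller), which makes the sensorsim failure harder to follow. Below is a minimal Python sketch for splitting such output by service and finding the first `[ERROR]` line per service. It assumes every relevant line has the `name | message` shape seen above. The function names `split_by_service` and `first_error` are made up for this illustration and are not part of the project.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Matches lines shaped like "sensorsim-0-1 | message" (assumed prefix format).
LINE_RE = re.compile(r"^(?P<service>[\w.-]+?)\s*\|\s?(?P<message>.*)$")


def split_by_service(log_text: str) -> dict[str, list[str]]:
    """Group interleaved log lines by the service name before the '|'."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:  # lines without a "name |" prefix are skipped
            grouped[match.group("service")].append(match.group("message"))
    return dict(grouped)


def first_error(messages: list[str]) -> str | None:
    """Return the first message that carries an [ERROR] tag, or None."""
    for message in messages:
        if "[ERROR]" in message:
            return message
    return None


if __name__ == "__main__":
    # Three lines copied from the log above, in their interleaved order.
    sample = "\n".join([
        "sensorsim-0-1 | [NuRec::NRend][ERROR] ::: GUTRenderer : cannot get cuda resource on the device 0.",
        "runtime-0-1 | 06:09:37.899 DEBUG: [_cygrpc] Loaded running loop: id(loop)=140553004067664",
        "sensorsim-0-1 | AssertionError: NRenderer.render failed.",
    ])
    for service, messages in split_by_service(sample).items():
        print(service, "->", first_error(messages))
```

Applied to the full log, the earliest sensorsim errors are the two `[NuRec::NRend][ERROR]` lines (the JIT compile failure and the CUDA resource failure). The `Session ... does not exist` errors from controller and runtime are logged only after the render call has failed, during session cleanup.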
In addition, my computer is equipped with an NVIDIA GeForce RTX 4080 SUPER.
How should I properly enable DiFiX in sensorsim?
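For reference, the first attempt (unpack the `*.usdz`, edit `parsed_config.yaml`, repackage) can be sketched in Python as below. This is only an illustration under stated assumptions: a `.usdz` file is a zip archive whose members are stored uncompressed, and the member path of `parsed_config.yaml` inside the archive is assumed. The helpers `rewrite_member` and `read_member` are hypothetical names. Python's `zipfile` does not add the 64-byte alignment padding that the USDZ specification calls for, and whether the sensorsim loader depends on that alignment is not known here.

```python
from __future__ import annotations

import zipfile
from pathlib import Path
from typing import Callable


def rewrite_member(src: Path, dst: Path, member: str,
                   edit: Callable[[str], str]) -> None:
    """Copy zip archive `src` to `dst`, passing one text member through `edit`."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", compression=zipfile.ZIP_STORED) as zout:
        if member not in zin.namelist():
            raise KeyError(f"{member!r} not found in {src}")
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == member:
                data = edit(data.decode("utf-8")).encode("utf-8")
            # ZIP_STORED keeps every member uncompressed, as USDZ expects.
            zout.writestr(info.filename, data)


def read_member(archive: Path, member: str) -> str:
    """Read one text member back, e.g. to confirm that an edit persisted."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member).decode("utf-8")


# Hypothetical usage (file names and the edited key are placeholders):
#   rewrite_member(Path("scene.usdz"), Path("scene_edited.usdz"),
#                  "parsed_config.yaml",
#                  lambda text: text.replace("old_value", "new_value"))
#   print(read_member(Path("scene_edited.usdz"), "parsed_config.yaml"))
```

Reading the member back with `read_member` is a quick way to rule out the case where the repackaged archive never contained the change. That is separate from whether sensorsim actually picked up the new file rather than a cached copy.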