optimize.docstring

Invoke the calibration routine

SYNOPSIS

    stats = mrcal.optimize( intrinsics_data,
                            rt_cam_ref,
                            rt_ref_frame, points,
                            observations_board, indices_frame_camintrinsics_camextrinsics,
                            observations_point, indices_point_camintrinsics_camextrinsics,

                            lensmodel,
                            imagersizes                       = imagersizes,
                            do_optimize_intrinsics_core       = True,
                            do_optimize_intrinsics_distortions= True,
                            calibration_object_spacing        = object_spacing,
                            point_min_range                   = 0.1,
                            point_max_range                   = 100.0,
                            do_apply_outlier_rejection        = True,
                            do_apply_regularization           = True,
                            verbose                           = False)

Please see the mrcal documentation at
https://mrcal.secretsauce.net/formulation.html for details.

This is a flexible implementation of a calibration system core that uses sparse
Jacobians, performs outlier rejection and reports some metrics back to the user.
Measurements from any number of cameras can beat used simultaneously, and this
routine is flexible-enough to solve structure-from-motion problems.

The input is a combination of observations of a calibration board and
observations of discrete points. The point observations MAY have a known
range.

The cameras and what they're observing is given in the arrays

- intrinsics_data
- rt_cam_ref
- rt_ref_frame
- points
- indices_frame_camintrinsics_camextrinsics
- indices_point_camintrinsics_camextrinsics

intrinsics_data contains the intrinsics for all the physical cameras present in
the problem. len(intrinsics_data) = Ncameras_intrinsics

rt_cam_ref contains all the camera poses present in the problem,
omitting any cameras that sit at the reference coordinate system.
len(rt_cam_ref) = Ncameras_extrinsics.

rt_ref_frame is all the poses of the calibration board in the problem, and
points is all the discrete points being observed in the problem.

indices_frame_camintrinsics_camextrinsics describes which board observations
were made by which camera, and where this camera was. Each board observation is
described by a tuple (iframe,icam_intrinsics,icam_extrinsics). The board at
rt_ref_frame[iframe] was observed by camera
intrinsics_data[icam_intrinsics], which was at
rt_cam_ref[icam_extrinsics]

indices_point_camintrinsics_camextrinsics is the same thing for discrete points.

If we're solving a vanilla calibration problem, we have stationary cameras
observing a moving target. By convention, camera 0 is at the reference
coordinate system. So

- Ncameras_intrinsics = Ncameras_extrinsics+1
- All entries in indices_frame_camintrinsics_camextrinsics have
  icam_intrinsics = icam_extrinsics+1
- rt_ref_frame, points describes the motion of the moving target we're
  observing

Conversely, in a structure-from-motion problem we have some small number of
moving cameras (often just 1) observing stationary target(s). We would have

- Ncameras_intrinsics is small: it's how many physical cameras we have
- Ncameras_extrinsics is large: it describes the motion of the cameras
- rt_ref_frame, points is small: it describes the non-moving world we're
  observing

Any combination of these extreme cases is allowed.

REQUIRED ARGUMENTS

- intrinsics: array of dims (Ncameras_intrinsics, Nintrinsics). The intrinsics
  of each physical camera. Each intrinsic vector is given as

    (focal_x, focal_y, center_pixel_x, center_pixel_y, distortion0, distortion1,
    ...)

  The focal lengths are given in pixels.

  On input this is a seed. On output the optimal data is returned. THIS ARRAY IS
  MODIFIED BY THIS CALL.

- rt_cam_ref: array of dims (Ncameras_extrinsics, 6). The pose of
  each camera observation. Each pose is given as 6 values: a Rodrigues rotation
  vector followed by a translation. This represents a transformation FROM the
  reference coord system TO the coord system of each camera.

  On input this is a seed. On output the optimal data is returned. THIS ARRAY IS
  MODIFIED BY THIS CALL.

  If we only have one camera, pass either None or np.zeros((0,6))

- rt_ref_frame: array of dims (Nframes, 6). The poses of the calibration
  object over time. Each pose is given as 6 values: a rodrigues rotation vector
  followed by a translation. This represents a transformation FROM the coord
  system of the calibration object TO the reference coord system. THIS IS
  DIFFERENT FROM THE CAMERA EXTRINSICS.

  On input this is a seed. On output the optimal data is returned. THIS ARRAY IS
  MODIFIED BY THIS CALL.

  If we don't have any frames, pass either None or np.zeros((0,6))

- points: array of dims (Npoints, 3). The estimated positions of discrete points
  we're observing. These positions are represented in the reference coord
  system. The initial Npoints-Npoints_fixed points are optimized by this
  routine. The final Npoints_fixed points are fixed. By default
  Npoints_fixed==0, and we optimize all the points.

  On input this is a seed. On output the optimal data is returned. THIS ARRAY IS
  MODIFIED BY THIS CALL.

- observations_board: array of dims (Nobservations_board,
                                     calibration_object_height_n,
                                     calibration_object_width_n,
                                     3).
  Each slice is an (x,y,weight) tuple where (x,y) are the observed pixel
  coordinates of the corners in the calibration object, and "weight" is the
  relative weight of this point observation. Most of the weights are expected to
  be 1.0, which implies that the noise on that observation has the nominal
  standard deviation of observed_pixel_uncertainty (in addition to the overall
  assumption of gaussian noise, independent on x,y). weight<0 indicates that
  this is an outlier. This is respected on input (even if
  !do_apply_outlier_rejection). New outliers are marked with weight<0 on output.
  Subpixel interpolation is assumed, so these contain 64-bit floating point
  values, like all the other data. The frame and camera that produced these
  observations are given in the indices_frame_camintrinsics_camextrinsics

  THIS ARRAY IS MODIFIED BY THIS CALL (to mark outliers)

- indices_frame_camintrinsics_camextrinsics: array of dims (Nobservations_board,
  3). For each observation these are an
  (iframe,icam_intrinsics,icam_extrinsics) tuple. icam_extrinsics == -1
  means this observation came from a camera in the reference coordinate system.
  iframe indexes the "rt_ref_frame" array, icam_intrinsics indexes the
  "intrinsics_data" array, icam_extrinsics indexes the "rt_cam_ref"
  array

  All of the indices are guaranteed to be monotonic. This array contains 32-bit
  integers.

- observations_point: array of dims (Nobservations_point, 3). Each slice is an
  (x,y,weight) tuple where (x,y) are the pixel coordinates of the observed
  point, and "weight" is the relative weight of this point observation. Most of
  the weights are expected to be 1.0, which implies that the noise on the
  observation is gaussian, independent on x,y, and has the nominal standard
  deviation of observed_pixel_uncertainty. weight<0 indicates that this is an
  outlier. This is respected on input (even if !do_apply_outlier_rejection). At
  this time, no new outliers are detected for point observations. Subpixel
  interpolation is assumed, so these contain 64-bit floating point values, like
  all the other data. The point index and camera that produced these
  observations are given in the indices_point_camera_points array.

- indices_point_camintrinsics_camextrinsics: array of dims (Nobservations_point,
  3). For each observation these are an
  (i_point,icam_intrinsics,icam_extrinsics) tuple. Analogous to
  indices_frame_camintrinsics_camextrinsics, but for observations of discrete
  points.

  The indices can appear in any order. No monotonicity is required. This array
  contains 32-bit integers.

- lensmodel: a string such as

  LENSMODEL_PINHOLE
  LENSMODEL_OPENCV4
  LENSMODEL_CAHVOR
  LENSMODEL_SPLINED_STEREOGRAPHIC_order=3_Nx=16_Ny=12_fov_x_deg=100

- imagersizes: integer array of dims (Ncameras_intrinsics,2)

OPTIONAL ARGUMENTS

- calobject_warp

  A numpy array of shape (2,) describing the non-flatness of the calibration
  board. If omitted or None, the board is assumed to be perfectly flat. And if
  do_optimize_calobject_warp then we optimize these parameters to find the
  best-fitting board shape.

- Npoints_fixed

  Specifies how many points at the end of the points array are fixed, and remain
  unaffected by the optimization. This is 0 by default, and we optimize all the
  points.

- do_optimize_intrinsics_core
- do_optimize_intrinsics_distortions
- do_optimize_extrinsics
- do_optimize_frames
- do_optimize_calobject_warp

  Indicate whether to optimize a specific set of variables. The intrinsics core
  is fx,fy,cx,cy. These all default to True so if we specify none of these, we
  will optimize ALL the variables.

- calibration_object_spacing: the width of each square in a calibration board.
  Can be omitted if we have no board observations, just points. The calibration
  object has shape (calibration_object_height_n,calibration_object_width_n),
  given by the dimensions of "observations_board"

- verbose: if True, write out all sorts of diagnostic data to STDERR. Defaults
  to False

- do_apply_outlier_rejection: if False, don't bother with detecting or rejecting
  outliers. The outliers we get on input (observations_board[...,2] < 0) are
  honered regardless. Defaults to True

- do_apply_regularization: if False, don't include regularization terms in the
  solver. Defaults to True

- point_min_range, point_max_range: Required ONLY if point observations are
  given. These are lower, upper bounds for the distance of a point observation
  to its observing camera. Each observation outside of this range is penalized.
  This helps the solver by guiding it away from unreasonable solutions.

We return a dict with various metrics describing the computation we just
performed