open_svo2
OpenSVO2: An open-source reverse-engineered interface for SVO2 files.
open_svo2.FrameFooter
Bases: Structure
Memory mapping for the SVO2 stereo frame footer (56 bytes, 12 fields).
Attributes:
| Name | Type | Description |
|---|---|---|
| `width` | | Image width in pixels. |
| `height` | | Image height in pixels. |
| `_unknown_magic` | | Magic number (0x5c002c00). |
| `_unknown_1` | | Unknown constant (1). |
| `_unknown_2` | | Unknown constant (2). |
| `_unknown_3` | | Unknown constant (-1). |
| `timestamp` | | Timestamp in nanoseconds. |
| `payload_size` | | Size of H.264/H.265 payload in bytes. |
| `frame_type` | | 3 for key-frame, 0 for i-frame. |
| `last_keyframe_index` | | Index of the last keyframe. |
| `frame_id` | | Sequential frame index. |
| `_unsure_keyframe_id` | | Possible keyframe ID. |
Source code in src/open_svo2/metadata.py
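As a rough illustration of how a footer like this can be mapped with `ctypes`, the sketch below defines a hypothetical `FooterSketch` structure and reads it from the tail of a message. The field widths (32-bit integers plus a 64-bit timestamp), the little-endian byte order, and the assumption that the footer occupies the last bytes of each frame message are guesses chosen to reproduce the documented 56-byte size; they are not taken from the library source.

```python
import ctypes


class FooterSketch(ctypes.LittleEndianStructure):
    """Hypothetical 12-field footer layout; all field widths are assumptions."""

    _fields_ = [
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
        ("_unknown_magic", ctypes.c_uint32),
        ("_unknown_1", ctypes.c_uint32),
        ("_unknown_2", ctypes.c_uint32),
        ("_unknown_3", ctypes.c_int32),  # signed, since the observed value is -1
        ("timestamp", ctypes.c_uint64),  # nanoseconds
        ("payload_size", ctypes.c_uint32),
        ("frame_type", ctypes.c_uint32),
        ("last_keyframe_index", ctypes.c_uint32),
        ("frame_id", ctypes.c_uint32),
        ("_unsure_keyframe_id", ctypes.c_uint32),
    ]


def parse_footer(message: bytes) -> FooterSketch:
    """Copy the trailing bytes of a message into a FooterSketch."""
    size = ctypes.sizeof(FooterSketch)
    if len(message) < size:
        raise ValueError(f"message has {len(message)} bytes, need at least {size}")
    return FooterSketch.from_buffer_copy(message[-size:])


# Synthetic round trip: a 4-byte fake payload followed by a footer.
footer = FooterSketch(
    width=1920,
    height=1080,
    _unknown_magic=0x5C002C00,
    _unknown_1=1,
    _unknown_2=2,
    _unknown_3=-1,
    timestamp=1_700_000_000_000_000_000,
    payload_size=4,
    frame_type=3,
    last_keyframe_index=0,
    frame_id=0,
)
message = b"\x00\x01\x02\x03" + bytes(footer)
parsed = parse_footer(message)
```

With these assumed widths the fields occupy 52 bytes, and the native 8-byte alignment of the 64-bit timestamp pads the structure to 56 bytes on common 64-bit platforms.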
open_svo2.Header
Bases: Structure
Memory mapping for the SVO2 binary header (128 bytes, 32 fields).
Field naming conventions
- Confirmed fields are named directly (e.g., `width`).
- Unconfirmed fields are prefixed with `_unsure_`.
- Likely correct fields are prefixed with `_likely_`.
Warning
The parsed transformation matrix does not match the stereo transformation values given by the ZED SDK. Its exact meaning, and its relationship to the SDK values, is currently unknown.
Attributes:
| Name | Type | Description |
|---|---|---|
| `width` | | Image width in pixels (for a single camera). |
| `height` | | Image height in pixels. |
| `serial_number` | | Camera serial number (e.g., 40735594). |
| `fps` | | Frames per second. |
| `_unsure_frame_counter` | | Possibly frame index or counter. |
| `_unsure_bit_depth` | | Bits per channel (typically 8). |
| `_unsure_exposure_mode` | | Exposure control mode. |
| `_likely_exposure_time` | | Likely exposure time (units unknown, observed: 1000). |
| `_likely_camera_model` | | Camera model/SKU (e.g., 2001 = ZED 2). |
| `_unsure_ts_sec` | | Timestamp seconds (often 0). |
| `_unsure_ts_nsec` | | Timestamp nanoseconds (often 0). |
| `_unsure_imu_status` | | IMU-related status flag. |
| `w_scale` | | Scale factor (typically 1.0). |
| `_likely_lens_id` | | Lens type identifier (observed: 5). |
| `_unsure_isp_gain` | | ISP gain setting. |
| `_unsure_isp_wb_r` | | White balance red channel. |
| `_unsure_isp_wb_b` | | White balance blue channel. |
| `_unsure_isp_gamma` | | Gamma correction value. |
| `_likely_sync_status` | | Sync status flag (1 = synced?). |
| `_unsure_padding` | | Padding or reserved field. |
Source code in src/open_svo2/metadata.py
from_base64
classmethod
Create a `Header` instance from base64-encoded data.
Source code in src/open_svo2/metadata.py
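The general pattern of building a `ctypes` structure from a base64 string can be sketched as follows. `MiniHeader` is a hypothetical four-field stand-in, not the real 128-byte layout; the 32-bit field widths and little-endian byte order are assumptions made only for this example.

```python
import base64
import ctypes


class MiniHeader(ctypes.LittleEndianStructure):
    """Hypothetical stand-in with the first four documented fields only."""

    _fields_ = [
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
        ("serial_number", ctypes.c_uint32),
        ("fps", ctypes.c_uint32),
    ]

    @classmethod
    def from_base64(cls, encoded: str) -> "MiniHeader":
        """Decode base64 text and copy the leading bytes into the structure."""
        raw = base64.b64decode(encoded)
        if len(raw) < ctypes.sizeof(cls):
            raise ValueError(
                f"decoded {len(raw)} bytes, need at least {ctypes.sizeof(cls)}"
            )
        return cls.from_buffer_copy(raw)


# Synthetic round trip: encode a structure, then decode it again.
encoded = base64.b64encode(bytes(MiniHeader(1920, 1080, 40735594, 30))).decode("ascii")
header = MiniHeader.from_base64(encoded)
```

Using `from_buffer_copy` keeps the resulting object independent of the decoded byte string, so the temporary buffer can be discarded.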
open_svo2.IMUData
dataclass
Zed IMU data.
Attributes:
| Name | Type | Description |
|---|---|---|
| `timestamp` | `float64` | IMU measurement timestamp in seconds. |
| `accel` | `Float32[ndarray, 3]` | Linear acceleration in m/s^2, in the Zed camera coordinate frame, without calibration. |
| `avel` | `Float32[ndarray, 3]` | Angular velocity in deg/s, in the Zed camera coordinate frame, without calibration. |
Source code in src/open_svo2/imu.py
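Because `avel` is reported in deg/s, code that expects rad/s (as many filters and integrators do) needs a unit conversion first. A minimal sketch, using a hypothetical `ImuSample` stand-in with plain tuples instead of the library's array-based dataclass:

```python
import math
from dataclasses import dataclass


@dataclass
class ImuSample:
    """Hypothetical stand-in for one IMU reading (not the library class)."""

    timestamp: float  # seconds
    accel: tuple  # m/s^2, uncalibrated
    avel: tuple  # deg/s, uncalibrated


def avel_rad_per_s(sample: ImuSample) -> tuple:
    """Convert the angular velocity of a sample from deg/s to rad/s."""
    return tuple(math.radians(w) for w in sample.avel)


sample = ImuSample(timestamp=0.005, accel=(0.0, 9.81, 0.0), avel=(180.0, 0.0, -90.0))
avel_rad = avel_rad_per_s(sample)
```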
from_raw_data
classmethod
Parse raw IMU data from the ZED SDK binary format.
Source code in src/open_svo2/imu.py
open_svo2.Intrinsics
dataclass
Camera intrinsic parameters using the Brown-Conrady (OpenCV) model.
This dataclass represents the calibration parameters for a single camera in OpenCV-compatible format, ready to be passed directly to OpenCV functions.
- The `camera_matrix` is the camera intrinsic matrix in the form `[[fx, 0, cx], [0, fy, cy], [0, 0, 1]]`, where fx/fy are the focal lengths in pixels and (cx, cy) is the principal point.
- The `dist_coeffs` array contains the distortion coefficients in the order (k1, k2, p1, p2, k3), following OpenCV's standard 5-parameter distortion model:
    - k1, k2, k3: Radial distortion coefficients (2nd, 4th, 6th order)
    - p1, p2: Tangential distortion coefficients
Attributes:
| Name | Type | Description |
|---|---|---|
| `camera_matrix` | `Float64[ndarray, '3 3']` | Camera intrinsic matrix in OpenCV format. |
| `dist_coeffs` | `Float64[ndarray, 5]` | Distortion coefficients in OpenCV order. |
Notes
- Compatible with cv2.undistort(), cv2.calibrateCamera(), etc.
- Distortion coefficients are dimensionless and resolution-independent
- Camera matrix scales linearly with image resolution
- The distortion model follows OpenCV convention (Brown-Conrady model)
Source code in src/open_svo2/intrinsics.py
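To make the parameter order concrete, the sketch below applies the 5-parameter Brown-Conrady model to project a 3D point in the camera frame to pixel coordinates, and rescales a camera matrix for a different resolution. It uses plain Python lists rather than NumPy arrays, and the helper names `project_point` and `scale_camera_matrix` are hypothetical; in practice an OpenCV routine such as `cv2.projectPoints` would normally be used instead.

```python
def project_point(point, camera_matrix, dist_coeffs):
    """Project a 3D camera-frame point to pixels with (k1, k2, p1, p2, k3)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    k1, k2, p1, p2, k3 = dist_coeffs
    fx, cx = camera_matrix[0][0], camera_matrix[0][2]
    fy, cy = camera_matrix[1][1], camera_matrix[1][2]

    # Normalized image coordinates.
    xn, yn = x / z, y / z
    r2 = xn * xn + yn * yn

    # Radial terms of 2nd, 4th and 6th order, then tangential terms.
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xd = xn * radial + 2.0 * p1 * xn * yn + p2 * (r2 + 2.0 * xn * xn)
    yd = yn * radial + p1 * (r2 + 2.0 * yn * yn) + 2.0 * p2 * xn * yn

    return fx * xd + cx, fy * yd + cy


def scale_camera_matrix(camera_matrix, factor):
    """Scale fx, fy, cx, cy by a resolution factor; the last row stays [0, 0, 1]."""
    scaled = [[value * factor for value in row] for row in camera_matrix]
    scaled[2][2] = 1.0
    return scaled


K = [[1000.0, 0.0, 960.0], [0.0, 1000.0, 540.0], [0.0, 0.0, 1.0]]
u_plain, v_plain = project_point((1.0, 0.0, 2.0), K, (0.0, 0.0, 0.0, 0.0, 0.0))
u_dist, v_dist = project_point((1.0, 0.0, 2.0), K, (0.1, 0.0, 0.0, 0.0, 0.0))
K_half = scale_camera_matrix(K, 0.5)
```

The distortion coefficients are left untouched when the resolution changes, since they act on normalized coordinates; the simple scaling shown here ignores any half-pixel offset convention.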
from_config
classmethod
Create Intrinsics from a parsed configuration dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `cfg` | `dict` | Zed SDK sensor configuration dictionary. | required |
Source code in src/open_svo2/intrinsics.py
open_svo2.Metadata
dataclass
SVO2 file metadata extracted from MCAP container.
Attributes:
| Name | Type | Description |
|---|---|---|
| `imu_frequency` | `float` | IMU sampling frequency in Hz (e.g., 200.0). |
| `zed_sdk_version` | `str` | Version of the ZED SDK used to create the file. |
| `calib_acc_matrix1` | `Float32[ndarray, '3 3']` | 3x3 float32 matrix for accelerometer calibration. |
| `calib_acc_matrix2` | `Float32[ndarray, '3 3']` | 3x3 float32 matrix for accelerometer calibration. |
| `calib_gyro_matrix1` | `Float32[ndarray, '3 3']` | 3x3 float32 matrix for gyroscope calibration. |
| `calib_gyro_matrix2` | `Float32[ndarray, '3 3']` | 3x3 float32 matrix for gyroscope calibration. |
| `header` | `Header` | Parsed SVO2 header (`Header`). |
| `version` | `str` | SVO2 file format version string (e.g., "2.0.3"). |
| `channels` | `dict[str, int]` | Mapping of topic names to channel IDs in the MCAP file. |
| `timestamps` | `dict[str, UInt64[ndarray, '?N']]` | Dictionary mapping topic names to arrays of uint64 timestamps (in nanoseconds since epoch) for each sensor reading. |
Source code in src/open_svo2/metadata.py
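One way to use the nanosecond `timestamps` arrays is to estimate a topic's average rate and compare it with a nominal value such as `imu_frequency`. The sketch below works on any sorted sequence of integer nanosecond timestamps; the helper names `estimate_rate_hz` and `rate_matches` and the 5% tolerance are hypothetical, and this is not a description of what `consistency_check` does.

```python
def estimate_rate_hz(timestamps_ns):
    """Average sample rate in Hz from sorted nanosecond timestamps."""
    if len(timestamps_ns) < 2:
        raise ValueError("need at least two timestamps")
    span_ns = int(timestamps_ns[-1]) - int(timestamps_ns[0])
    if span_ns <= 0:
        raise ValueError("timestamps must be increasing")
    # N samples span N - 1 intervals.
    return (len(timestamps_ns) - 1) * 1e9 / span_ns


def rate_matches(timestamps_ns, expected_hz, rel_tol=0.05):
    """Check whether the measured rate is within rel_tol of the expected rate."""
    measured = estimate_rate_hz(timestamps_ns)
    return abs(measured - expected_hz) <= rel_tol * expected_hz


# Synthetic 200 Hz stream: one sample every 5 ms.
imu_ts = [1_700_000_000_000_000_000 + i * 5_000_000 for i in range(201)]
imu_rate = estimate_rate_hz(imu_ts)
```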
consistency_check
Check parsed metadata for consistency.
Source code in src/open_svo2/metadata.py
from_mcap
classmethod
Extract metadata from the MCAP reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mcap` | `McapReader \| str` | file path to a svo2 mcap file or a | required |
Source code in src/open_svo2/metadata.py
open_svo2.StereoIntrinsics
dataclass
Stereo camera pair parameters.
Info
Zed uses a convention where the left camera is transformed relative to the right camera which is considered the reference frame.
Attributes:
| Name | Type | Description |
|---|---|---|
| `left` | `Intrinsics` | Intrinsics for the left camera. |
| `right` | `Intrinsics` | Intrinsics for the right camera. |
| `baseline` | `float` | Horizontal separation between cameras in mm. |
| `ty` | `float` | Translation offset in Y direction (vertical) in mm. |
| `tz` | `float` | Translation offset in Z direction (depth) in mm. |
| `cv` | `float` | Convergence angle in radians (angle at which optical axes converge). |
| `rx` | `float` | Rotation around X axis (pitch) in radians. |
| `rz` | `float` | Rotation around Z axis (roll) in radians. |
Source code in src/open_svo2/intrinsics.py
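Since `baseline` is stored in millimetres, unit handling matters when it is combined with pixel quantities. As one illustration, the standard pinhole relation for rectified stereo pairs, depth = fx * baseline / disparity, can be sketched as below. The helper name `depth_from_disparity_m` is hypothetical, and the relation assumes rectified images, which this page does not describe.

```python
def depth_from_disparity_m(disparity_px, fx_px, baseline_mm):
    """Depth in metres for a rectified stereo pair (pinhole approximation)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    baseline_m = baseline_mm / 1000.0  # mm -> m
    return fx_px * baseline_m / disparity_px


# Example values: 1000 px focal length, 120 mm baseline, 50 px disparity.
depth_m = depth_from_disparity_m(disparity_px=50.0, fx_px=1000.0, baseline_mm=120.0)
```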
as_dict
as_dict() -> dict
Convert StereoIntrinsics to a dictionary format.
Source code in src/open_svo2/intrinsics.py
from_config
classmethod
Parse Zed SDK sensor.conf contents.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `cfg` | `dict \| str` | Zed SDK sensor configuration dictionary or path to dictionary. | required |
| `mode` | `str \| None` | Camera mode (e.g., | `None` |
| `height` | `int \| None` | Image height in pixels, used to infer mode if mode is not provided. Must be one of {1200, 1080, 600} corresponding to modes {FHD1200, FHD, SVGA} respectively. | `None` |
Source code in src/open_svo2/intrinsics.py
infer_mode
staticmethod
Infer Zed camera mode from image height.
Source code in src/open_svo2/intrinsics.py
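The height-to-mode mapping documented for `from_config` ({1200, 1080, 600} to {FHD1200, FHD, SVGA}) can be sketched as a simple lookup. The function name `infer_mode_sketch` and the choice to raise `ValueError` for unsupported heights are assumptions; the library's actual behavior for other heights is not documented here.

```python
HEIGHT_TO_MODE = {1200: "FHD1200", 1080: "FHD", 600: "SVGA"}


def infer_mode_sketch(height: int) -> str:
    """Map an image height in pixels to a camera mode name."""
    try:
        return HEIGHT_TO_MODE[height]
    except KeyError:
        supported = sorted(HEIGHT_TO_MODE)
        raise ValueError(
            f"unsupported image height {height}; expected one of {supported}"
        ) from None


mode = infer_mode_sketch(1080)
```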
open_svo2.imu_from_svo2
Extract raw IMU data from SVO2 MCAP into a .npz file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mcap` | `McapReader \| str` | file path to a svo2 mcap file or a | required |
| `metadata` | `Metadata \| None` | Optional pre-parsed metadata. If not provided, it will be extracted from the MCAP reader. | `None` |
Returns:
| Type | Description |
|---|---|
| `dict[str, ndarray]` | A dictionary containing |
Source code in src/open_svo2/convert.py
open_svo2.mp4_from_svo2
mp4_from_svo2(
mcap: McapReader | str, output: str, metadata: Metadata | None = None
) -> UInt32[ndarray, N]
Extract video stream from SVO2 MCAP into a standard MP4 container.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mcap` | `McapReader \| str` | file path to a svo2 mcap file or a | required |
| `output` | `str` | file path to the output MP4 file. | required |
| `metadata` | `Metadata \| None` | Optional pre-parsed metadata. If not provided, it will be extracted from the MCAP reader. | `None` |
Returns:
| Type | Description |
|---|---|
| `UInt32[ndarray, N]` | Index of the last keyframe, as recorded by the frame footer. |
Source code in src/open_svo2/convert.py
open_svo2.raw_from_svo2
raw_from_svo2(
mcap: McapReader | str, output: str, metadata: Metadata | None = None
) -> tuple[UInt64[ndarray, N + 1], Float64[ndarray, N], UInt32[ndarray, N]]
Extract raw video frames from SVO2 MCAP into a binary file.
Raw frames are concatenated to the output file; this should be readable with ffmpeg.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mcap` | `McapReader \| str` | file path to a svo2 mcap file or a | required |
| `output` | `str` | file path to the output file. | required |
| `metadata` | `Metadata \| None` | Optional pre-parsed metadata. If not provided, it will be extracted from the MCAP reader. | `None` |
Returns:
| Type | Description |
|---|---|
| `UInt64[ndarray, N + 1]` | Byte offsets of frame boundaries. |
| `Float64[ndarray, N]` | Timestamps in seconds (Float64). |
| `UInt32[ndarray, N]` | Index of the last keyframe, as recorded by the frame footer. |
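The returned arrays can be combined to locate individual frames in the raw output. The sketch below uses synthetic data and plain lists; it assumes that frame `i` occupies bytes `offsets[i]` to `offsets[i + 1]`, that the last-keyframe values are frame indices into the same sequence, and that a keyframe lists itself as its own last keyframe. The helper names `frame_bytes` and `decode_window` are hypothetical.

```python
from itertools import accumulate


def frame_bytes(data, offsets, index):
    """Slice the bytes of one frame out of the concatenated stream."""
    return data[offsets[index]:offsets[index + 1]]


def decode_window(offsets, last_keyframe, index):
    """Byte range to feed a decoder so that frame `index` can be decoded.

    Decoding starts at the most recent keyframe and ends after the target frame.
    """
    start_frame = int(last_keyframe[index])
    return int(offsets[start_frame]), int(offsets[index + 1])


# Synthetic stream: two keyframes (indices 0 and 3) with dependent frames between.
frames = [b"KEY0", b"p1", b"p22", b"KEY3", b"p4"]
data = b"".join(frames)
offsets = [0] + list(accumulate(len(f) for f in frames))  # N + 1 boundaries
last_keyframe = [0, 0, 0, 3, 3]

third_frame = frame_bytes(data, offsets, 2)
window_for_2 = decode_window(offsets, last_keyframe, 2)
window_for_4 = decode_window(offsets, last_keyframe, 4)
```

Starting from the last keyframe matters because non-key frames in H.264/H.265 streams generally depend on earlier frames and cannot be decoded in isolation.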