roverd
Roverd: data format and data loading.
ADL-Compliant
The roverd package implements a fully Abstract Dataloader-compliant map-style data loader.
Thus, to use the dataloader in practice, in addition to writing custom ADL-compliant components, you can use generic ADL components:
- Nearest synchronization
- Window to load consecutive frames as a single sample
- TransformedDataset to get the roverd.Dataset into a PyTorch dataloader
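To make the idea of nearest-timestamp synchronization concrete, the following is a minimal standalone sketch. It is not the abstract_dataloader implementation: the function name nearest_indices and its return format are assumptions made for illustration, and tol is assumed to be a tolerance in the same units as the timestamps.

```python
from bisect import bisect_left


def nearest_indices(reference_ts, sensor_ts, tol=None):
    """Pair each reference timestamp with the closest sensor timestamp.

    Illustrative sketch only (hypothetical helper, not a library API).
    Both inputs must be sorted, and sensor_ts must be non-empty.
    Returns a list of (reference_index, sensor_index) pairs.
    """
    pairs = []
    for i, t in enumerate(reference_ts):
        # First sensor timestamp that is >= t; its left neighbor is < t.
        right = bisect_left(sensor_ts, t)
        candidates = [j for j in (right - 1, right) if 0 <= j < len(sensor_ts)]
        j = min(candidates, key=lambda k: abs(sensor_ts[k] - t))
        # Assumed behavior: drop samples with no match within the tolerance.
        if tol is None or abs(sensor_ts[j] - t) <= tol:
            pairs.append((i, j))
    return pairs


lidar_ts = [0.0, 0.1, 0.2, 0.5]
radar_ts = [0.01, 0.09, 0.21, 0.30]
print(nearest_indices(lidar_ts, radar_ts, tol=0.05))
# [(0, 0), (1, 1), (2, 2)]
```

The last lidar frame is dropped here because the closest radar frame is 0.2 away, which exceeds the tolerance.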
Fully Typed
The roverd dataloader is fully typed using generic dataclasses of jaxtyping arrays following the Abstract Dataloader's recommendations, and comes with a type library which describes the data types collected by the red-rover system.
To use roverd, you can either use the high-level interfaces to load a complete dataset consisting of multiple traces, or use lower-level APIs to load a single trace, a single sensor, or a single "channel" within a sensor.
- High-level APIs are generally preferred, and include descriptive types.
- Lower-level APIs should generally be avoided, but are required to modify the data; high-level APIs are intentionally read-only.
>>> from roverd import Dataset, sensors
>>> from abstract_dataloader import generic
>>> dataset = Dataset.from_config(
...     Dataset.find_traces("/data/grt"),
...     sync=generic.Nearest("lidar", tol=0.1),
...     sensors={"radar": sensors.XWRRadar, "lidar": sensors.OSLidar})
>>> dataset
Dataset(166 traces, n=1139028)
>>> dataset[42]
{'radar': XWRRadarIQ(...), 'lidar': OSLidarData(...)}
>>> from roverd import Trace, sensors
>>> from abstract_dataloader import generic
>>> trace = Trace.from_config(
...     "/data/grt/bike/point.back",
...     sync=generic.Nearest("lidar"),
...     sensors={"radar": sensors.XWRRadar, "lidar": sensors.OSLidar})
>>> trace['radar']
XWRRadar(/data/grt/bike/point.back/radar: [ts, iq, valid])
>>> trace[42]
{'radar': XWRRadarIQ(...), 'lidar': OSLidarData(...)}
>>> from roverd import Trace
>>> from abstract_dataloader import generic
>>> trace = Trace.from_config(
...     "/data/grt/bike/point.back", sync=generic.Nearest("lidar"))
>>> trace
Trace(/data/grt/bike/point.back, 12195x[radar, camera, lidar, imu])
>>> trace['radar']
DynamicSensor(/data/grt/bike/point.back/radar: [ts, iq, valid])
>>> trace[42]
{'radar': {...}, 'camera': {...}, 'lidar': {...}, ...}
>>> from roverd.sensors import XWRRadar
>>> radar = XWRRadar("/data/grt/bike/point.back/radar")
>>> radar
XWRRadar(/data/grt/bike/point.back/radar: [ts, iq, valid])
>>> len(radar)
24576
>>> radar[42]
XWRRadarIQ(
    iq=int16[1 64 3 4 512], timestamps=float64[1], valid=uint8[1],
    range_resolution=float32[1], doppler_resolution=float32[1])
roverd.Dataset
Bases: Dataset[TSample]
A dataset, consisting of multiple traces.
Type Parameters
- Sample: sample data type which this Dataset returns. As a convention, we suggest returning "batched" data by default, i.e. with a leading singleton axis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
traces | Sequence[Trace[TSample]] | traces which make up this dataset; must be | required |
Source code in format/src/roverd/trace.py
__getitem__
Fetch item from this dataset by global index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | int \| integer | sample index. | required |
Returns:
Type | Description |
---|---|
TSample | loaded sample. |
Raises:
Type | Description |
---|---|
IndexError | provided index is out of bounds. |
Source code in format/src/roverd/trace.py
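One common way to resolve a global index into a (trace, local index) pair is a cumulative-length lookup. The sketch below illustrates that technique on plain lists of trace lengths; the helper name locate is hypothetical, and treating negative indices as out of bounds is an assumption rather than documented behavior.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(index, trace_lengths):
    """Map a global sample index to (trace_number, index_within_trace).

    Hypothetical helper for illustration; not part of roverd.
    """
    total = sum(trace_lengths)
    # Assumption: negative indices are rejected rather than wrapped around.
    if index < 0 or index >= total:
        raise IndexError(f"index {index} out of bounds for {total} samples")

    # ends[i] is the total number of samples in traces 0..i.
    ends = list(accumulate(trace_lengths))
    # First trace whose cumulative end is strictly greater than the index.
    trace = bisect_right(ends, index)
    start = ends[trace] - trace_lengths[trace]
    return trace, index - start


print(locate(3, [3, 5, 2]))  # (1, 0): first sample of the second trace
print(locate(9, [3, 5, 2]))  # (2, 1): last sample of the third trace
```

In a real dataset the cumulative lengths would be computed once at construction time rather than on every lookup.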
find_traces
staticmethod
Walk a directory (or list of directories) to find all datasets.
Datasets are defined by directories containing a config.yaml file.
Warning
This method does not follow symlinks by default. If you have a circular symlink and follow_symlinks=True, this method will loop infinitely!
Parameters:
Name | Type | Description | Default |
---|---|---|---|
paths | str | a filepath (or list of filepaths). | () |
follow_symlinks | bool | whether to follow symlinks. | False |
Source code in format/src/roverd/trace.py
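The directory walk described above can be sketched with the standard library. This is an illustration of the general approach, not the roverd source: the function name find_config_dirs is hypothetical, and returning a sorted list is an assumption.

```python
import os


def find_config_dirs(*paths, follow_symlinks=False):
    """Return all directories under `paths` that contain a config.yaml file.

    Hypothetical helper for illustration; not part of roverd.
    """
    found = []
    for root in paths:
        # os.walk skips symlinked directories unless followlinks=True;
        # with followlinks=True, a circular symlink can recurse forever.
        for dirpath, _dirnames, filenames in os.walk(
            root, followlinks=follow_symlinks
        ):
            if "config.yaml" in filenames:
                found.append(dirpath)
    # Assumption: sort so that results are deterministic across filesystems.
    return sorted(found)
```

Calling find_config_dirs("/data/grt") would then yield every directory below /data/grt that holds a config.yaml, which is the same criterion the method above uses to define a dataset.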
from_config
classmethod
from_config(
paths: Sequence[str],
sync: Synchronization = Empty(),
sensors: Mapping[str, Callable[[str], Sensor] | str | None] | None = None,
include_virtual: bool = False,
workers: int = 0,
) -> Dataset
Create a dataset from a list of directories containing recordings.
Constructor arguments are forwarded to Trace.from_config.
Tip
Set workers=-1 to initialize all traces in parallel. This can greatly speed up initialization on highly distributed filesystems, e.g. blob stores!
Parameters:
Name | Type | Description | Default |
---|---|---|---|
paths | Sequence[str] | paths to trace directories. | required |
sync | Synchronization | synchronization protocol. | Empty() |
sensors | Mapping[str, Callable[[str], Sensor] \| str \| None] \| None | sensor types to use. | None |
include_virtual | bool | if | False |
workers | int | number of worker threads to use during initialization. If | 0 |
Source code in format/src/roverd/trace.py
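Parallel initialization with worker threads can be sketched as follows. This is a generic thread-pool pattern, not the roverd implementation: the helper init_all and its init_fn argument are hypothetical, and the interpretation of workers (0 for serial, negative for one thread per trace) is an assumption, since the parameter description above is incomplete.

```python
from concurrent.futures import ThreadPoolExecutor


def init_all(paths, init_fn, workers=0):
    """Call init_fn on every path, optionally using worker threads.

    Hypothetical helper for illustration; not part of roverd.
    """
    # Assumption: workers=0 means "initialize serially in this thread".
    if workers == 0:
        return [init_fn(p) for p in paths]

    # Assumption: a negative value means "one thread per trace".
    n_threads = len(paths) if workers < 0 else workers
    with ThreadPoolExecutor(max_workers=max(1, n_threads)) as pool:
        # Executor.map returns results in input order, so the output
        # stays aligned with `paths` even if threads finish out of order.
        return list(pool.map(init_fn, paths))


print(init_all(["a", "b", "c"], str.upper, workers=-1))
# ['A', 'B', 'C']
```

Threads suit this workload because opening a trace is dominated by filesystem latency rather than computation.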
roverd.Trace
Bases: Trace[TSample]
A single trace, containing multiple sensors.
Type Parameters
- Sample: sample data type which this Sensor returns. As a convention, we suggest returning "batched" data by default, i.e. with a leading singleton axis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sensors | Mapping[str, Sensor] | sensors which make up this trace. | required |
sync | Synchronization \| Mapping[str, Integer[ndarray, N]] \| None | synchronization protocol used to create global samples from asynchronous time series. If | None |
name | str | friendly name; should only be used for debugging and inspection. | 'trace' |
Source code in format/src/roverd/trace.py
datarate
cached
property
Total data rate, in bytes/sec.
Warning
The trace must be initialized with all sensors for this calculation to be correct.
filesize
cached
property
Total filesize, in bytes.
Warning
The trace must be initialized with all sensors for this calculation to be correct.
__getitem__
Get sample from synchronized index (or fetch a sensor by name).
Tip
For convenience, traces can be indexed by a str sensor name, returning that Sensor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | int \| integer \| str | sample index, or sensor name. | required |
Returns:
Type | Description |
---|---|
TSample \| Sensor | Loaded sample if |
Source code in format/src/roverd/trace.py
find_sensors
staticmethod
Find all (non-virtual) sensors in a given directory.
Source code in format/src/roverd/trace.py
from_config
classmethod
from_config(
path: str,
sync: Synchronization = Empty(),
sensors: Mapping[str, Callable[[str], Sensor] | str | None] | None = None,
include_virtual: bool = False,
name: str | None = None,
) -> Trace
Create a trace from a directory containing a single recording.
Sensor types can be specified by:
- None: use the DynamicSensor type.
- "auto": return a known sensor type if applicable; see roverd.sensors.
- Callable[[str], Sensor]: a sensor constructor, which has all non-path arguments closed on.
- Sensor: an already initialized sensor instance.
Info
Sensors can also be inferred automatically (sensors: None), in which case we find and load all sensors in the directory, excluding virtual sensors (those starting with _) unless include_virtual=True. Each sensor is then initialized as a DynamicSensor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | path to trace directory. | required |
sync | Synchronization | synchronization protocol. | Empty() |
sensors | Mapping[str, Callable[[str], Sensor] \| str \| None] \| None | sensor types to use. | None |
include_virtual | bool | if | False |
name | str \| None | friendly name; if not provided, defaults to the given | None |
Source code in format/src/roverd/trace.py
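The four ways of specifying a sensor listed above amount to a small dispatch, which the sketch below illustrates using stand-in classes. Everything here is hypothetical: GenericSensor, FakeRadar, KNOWN_SENSORS, and resolve_sensor are not roverd names, the name-keyed registry is only one possible way "auto" could work, and rejecting other strings is a simplification. The example also shows how functools.partial can "close on" non-path arguments.

```python
import os
from functools import partial


class GenericSensor:
    """Hypothetical stand-in for a dynamically typed sensor."""

    def __init__(self, path):
        self.path = path


class FakeRadar(GenericSensor):
    """Hypothetical typed sensor with one extra, non-path argument."""

    def __init__(self, path, scale=1.0):
        super().__init__(path)
        self.scale = scale


# Hypothetical registry of known sensor types, keyed by sensor name.
KNOWN_SENSORS = {"radar": FakeRadar}


def resolve_sensor(name, spec, trace_path):
    """Turn one entry of a `sensors` mapping into a sensor instance."""
    path = os.path.join(trace_path, name)
    if spec is None:
        return GenericSensor(path)
    if isinstance(spec, str):
        if spec != "auto":
            # Simplification: this sketch only understands "auto".
            raise ValueError(f"unsupported sensor spec: {spec!r}")
        # Assumption: fall back to the generic type for unknown names.
        return KNOWN_SENSORS.get(name, GenericSensor)(path)
    if callable(spec):
        # A constructor taking only the path; extra arguments are bound.
        return spec(path)
    # Otherwise, assume an already initialized sensor instance.
    return spec


radar = resolve_sensor("radar", partial(FakeRadar, scale=2.0), "trace")
print(type(radar).__name__, radar.scale)
# FakeRadar 2.0
```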
roverd.split
split(
dataset: Dataset[TSample] | Trace[TSample],
start: float = 0.0,
end: float = 0.0,
) -> Dataset[TSample] | Trace[TSample]
Get sub-trace or sub-dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset[TSample] \| Trace[TSample] | trace or dataset to split. | required |
start | float | start of the split, as a proportion of the trace length ( | 0.0 |
end | float | end of the split, as a proportion of the trace length ( | 0.0 |
Returns:
Type | Description |
---|---|
Dataset[TSample] \| Trace[TSample] | Trace or dataset with a contiguous subset of samples according to the start and end indices. |
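As a rough illustration of proportional splitting, the sketch below converts a start and end proportion into a contiguous index range. It is not the roverd.split implementation: the helper split_range is hypothetical, it takes an explicit end proportion rather than relying on the default shown in the signature above, and rounding by truncation is an assumption.

```python
def split_range(n, start=0.0, end=1.0):
    """Return the index range covering [start, end) of n samples.

    Hypothetical helper for illustration; not part of roverd.
    `start` and `end` are proportions of the total length in [0, 1].
    """
    if not 0.0 <= start <= end <= 1.0:
        raise ValueError("expected 0 <= start <= end <= 1")
    # Assumption: boundaries are truncated toward zero, so adjacent splits
    # that share a boundary proportion neither overlap nor leave gaps.
    return range(int(start * n), int(end * n))


first_half = split_range(10, 0.0, 0.5)
second_half = split_range(10, 0.5, 1.0)
print(list(first_half), list(second_half))
# [0, 1, 2, 3, 4] [5, 6, 7, 8, 9]
```

Taking contiguous blocks rather than random samples keeps temporally adjacent, highly correlated frames on the same side of a train/validation boundary.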