nrdk.tss
Time series metric statistics.
The high-level API is broken up into four steps:
- index: Index evaluation files using a regex pattern.
  Info
  If you are only interested in a subset of evaluation traces, you can filter them at this stage.
- experiments_from_index: Load data for each indexed result file, or a subset of experiments.
- stats_from_experiments: Compute statistics for each experiment. See NDStats and effective_sample_size for more details about how and what statistics are computed.
- dataframe_from_stats: Aggregate the statistics into a readable dataframe, ready to be plotted or exported.
Tip
We also provide dataframe_from_index, which combines the last three steps into a single function for convenience.
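The four steps chain together as sketched below. This relies only on the signatures documented on this page; the wrapper name summarize is hypothetical, and nrdk is imported inside the function so that defining it does not require the package.

```python
def summarize(path, pattern, key, timestamps=None, baseline=None):
    """Run the four documented steps in order and return the final dataframe."""
    from nrdk import tss  # requires nrdk to be installed

    # 1. Index evaluation files with a regex pattern.
    idx = tss.index(path, pattern)
    # 2. Load the metric (and optional timestamps) for each indexed file.
    y, t, common = tss.experiments_from_index(idx, key, timestamps=timestamps)
    # 3. Compute absolute and, if a baseline is given, relative statistics.
    names, abs_stats, rel_stats = tss.stats_from_experiments(y, t, baseline=baseline)
    # 4. Aggregate into a dataframe with one row per experiment.
    return tss.dataframe_from_stats(names, abs_stats, rel_stats, baseline=baseline)
```

Calling the steps separately is mainly useful when you want to inspect the intermediate values, for example the list of common traces returned by step 2.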
nrdk.tss.NestedValues
module-attribute
NestedValues = Sequence['NestedValues'] | LeafType
An arbitrarily nested sequence, parameterized by a leaf type.
The following are valid examples of NestedValues[Float[np.ndarray, "_N"]]:
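A minimal sketch of values that fit this definition, assuming NumPy float arrays as the leaf type. The helper count_leaves is hypothetical and only illustrates how the nesting can be walked; it is not part of nrdk.

```python
import numpy as np

# A bare leaf array is already a valid NestedValues.
leaf = np.array([0.1, 0.2, 0.3])
# One level of nesting, e.g. one array per trace.
per_trace = [np.array([0.1, 0.2]), np.array([0.3])]
# Two levels of nesting, e.g. traces split into segments of different lengths.
per_segment = [[np.array([0.1]), np.array([0.2, 0.3])], [np.array([0.4])]]


def count_leaves(values):
    """Count leaf arrays in an arbitrarily nested sequence (hypothetical helper)."""
    if isinstance(values, np.ndarray):
        return 1
    return sum(count_leaves(v) for v in values)
```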
nrdk.tss.dataframe_from_index
dataframe_from_index(
index: dict[str | None, dict[str | None, str]],
key: str,
timestamps: str | None = None,
experiments: Sequence[str | None] | None = None,
cut: float | None = None,
baseline: str | None = None,
workers: int = -1,
t_max: int | None = None,
) -> DataFrame
Load and calculate statistics from indexed experiment results.
See (1) dataframe_from_stats, (2) stats_from_experiments, and (3) experiments_from_index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | dict[str \| None, dict[str \| None, str]] | 2-level dictionary with experiment names, sequence/trace names, and paths to the result files; see | required |
key | str | name of the metric to load from the result files. | required |
timestamps | str \| None | name of the timestamps to load from the result files. | None |
experiments | Sequence[str \| None] \| None | list of experiment names to load from the index; loads all experiments if not specified. | None |
cut | float \| None | cut each time series when there is a gap in the timestamps larger than this value if provided; see | None |
baseline | str \| None | baseline experiment for relative statistics. | None |
workers | int | number of worker threads to use when loading. If | -1 |
t_max | int \| None | maximum time delay to consider when computing effective sample size; if | None |
Returns:
Type | Description |
---|---|
DataFrame | Dataframe with statistics for each experiment. |
Source code in src/nrdk/tss/api.py
nrdk.tss.dataframe_from_stats
dataframe_from_stats(
names: list[str],
abs: NDStats,
rel: NDStats | None = None,
baseline: str | None = None,
) -> DataFrame
Create a dataframe from (possibly un-aggregated) experiment statistics.
Returns a dataframe where each row is a different experiment.
- abs/(mean|std|stderr|zscore|n|ess): absolute statistics for the provided metric for each experiment.
- rel/(mean|std|stderr|zscore|n|ess): relative statistics for the provided metric for each experiment, relative to the baseline. If no baseline is provided, these columns are not included.
- pct/(mean|stderr): percent difference and standard error relative to the baseline, computed as 100 * <rel/mean>/<abs/mean> and 100 * <rel/stderr>/<abs/mean>.
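The pct columns are plain ratios of the other columns. Here is a small arithmetic sketch with made-up numbers that applies the two formulas exactly as written above. The function name pct_columns is hypothetical. The sketch divides by whatever abs/mean value the caller passes in, and does not settle whether that should be the row's own mean or the baseline's.

```python
def pct_columns(abs_mean, rel_mean, rel_stderr):
    """Apply 100 * <rel/mean>/<abs/mean> and 100 * <rel/stderr>/<abs/mean>."""
    pct_mean = 100.0 * rel_mean / abs_mean
    pct_stderr = 100.0 * rel_stderr / abs_mean
    return pct_mean, pct_stderr


# Made-up values: abs/mean = 2.0, rel/mean = 0.5, rel/stderr = 0.1.
pct_mean, pct_stderr = pct_columns(2.0, 0.5, 0.1)
```

With these inputs the result is a 25% difference with a 5% standard error.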
Parameters:
Name | Type | Description | Default |
---|---|---|---|
names | list[str] | names of the experiments corresponding to the leading axis in the input statistics. | required |
abs | NDStats | absolute statistics for the provided metric for each experiment. | required |
rel | NDStats \| None | optional relative statistics. | None |
baseline | str \| None | name of the experiment used as the baseline. | None |
Returns:
Type | Description |
---|---|
DataFrame | Dataframe with statistics for each experiment. |
Source code in src/nrdk/tss/api.py
nrdk.tss.experiments_from_index
experiments_from_index(
index: dict[str | None, dict[str | None, str]],
key: str,
timestamps: str | None = None,
experiments: Sequence[str | None] | str | None = None,
cut: float | None = None,
workers: int = -1,
) -> tuple[
Mapping[str, NestedValues[Num[ndarray, _N]]],
Mapping[str, NestedValues[Float64[ndarray, _N]]] | None,
list[str],
]
Load experiment results from indexed result files.
Each result file is expected to be a .npz file containing metric and metadata arrays; the keys for these arrays should be specified by key and timestamps, respectively.
- These arrays should all have the same leading axis length.
- The metric array should have only a single axis.
Warning
Only sequences which are present in all experiments will be loaded. Check the returned common list to make sure it matches what you expect!
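To preview which sequences would survive this filter, you can intersect the trace names across experiments yourself. The sketch below mirrors the behavior described in the warning and is not the library's own code; the function name common_traces and the paths are hypothetical.

```python
def common_traces(index):
    """Return the sorted trace names that appear under every experiment."""
    per_experiment = [set(traces) for traces in index.values()]
    if not per_experiment:
        return []
    # key=str keeps sorting well-defined even if a trace name is None.
    return sorted(set.intersection(*per_experiment), key=str)


index = {
    "baseline": {"seq_a": "baseline/seq_a.npz", "seq_b": "baseline/seq_b.npz"},
    "variant": {"seq_a": "variant/seq_a.npz", "seq_c": "variant/seq_c.npz"},
}
print(common_traces(index))  # ['seq_a']
```

Here seq_b and seq_c are each missing from one experiment, so only seq_a would be loaded.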
Tip
A timestamps key can optionally be provided.
- If not provided, the metrics are assumed to be at identical timestamps.
- If multiple timestamps are present, the last one is used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | dict[str \| None, dict[str \| None, str]] | 2-level dictionary with experiment names, sequence/trace names, and paths to the result files; see | required |
key | str | name of the metric to load from the result files. | required |
timestamps | str \| None | name of the timestamps to load from the result files. | None |
experiments | Sequence[str \| None] \| str \| None | list of experiment names to load from the index (or a regex filter); loads all experiments if not specified. | None |
cut | float \| None | cut each time series when there is a gap in the timestamps larger than this value if provided; see | None |
workers | int | number of worker threads to use when loading. If | -1 |
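The cut parameter splits a time series wherever consecutive timestamps are further apart than the given value. The sketch below shows one plausible reading of that rule on plain Python lists. The function name cut_at_gaps is hypothetical, and the library's exact behavior (strict versus non-strict comparison, handling of very short segments) is not stated on this page.

```python
def cut_at_gaps(t, y, cut):
    """Split paired timestamp/value lists wherever the timestamp gap exceeds cut."""
    if not t:
        return [], []
    t_parts, y_parts = [[t[0]]], [[y[0]]]
    for prev, ti, yi in zip(t, t[1:], y[1:]):
        if ti - prev > cut:
            # Gap too large: start a new segment.
            t_parts.append([])
            y_parts.append([])
        t_parts[-1].append(ti)
        y_parts[-1].append(yi)
    return t_parts, y_parts


t = [0.0, 0.1, 0.2, 5.0, 5.1]
y = [1, 2, 3, 4, 5]
t_parts, y_parts = cut_at_gaps(t, y, cut=1.0)
print(y_parts)  # [[1, 2, 3], [4, 5]]
```

Splitting this way adds one level of nesting, which is the kind of structure NestedValues is meant to hold.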
Returns:
Type | Description |
---|---|
Mapping[str, NestedValues[Num[ndarray, _N]]] | A dictionary of metric values (as a list of metric values by sequence). |
Mapping[str, NestedValues[Float64[ndarray, _N]]] \| None | A dictionary of timestamps (or |
list[str] | A list of the common sequence/trace names which correspond to the loaded metrics. |
Source code in src/nrdk/tss/api.py
nrdk.tss.index
index(
path: str, pattern: str | Pattern, follow_symlinks: bool = False
) -> dict[str | None, dict[str | None, str]]
Recursively find all evaluations matching the given pattern.
Tip
LLM chat bots are very good at writing simple regex patterns!
The pattern can have two groups, experiment and trace, which respectively indicate the name of the experiment and trace. If either group is omitted, it is set to None.
Example
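A sketch of what such a pattern might look like, using only the standard re module. It assumes that experiment and trace are regex named groups and that files sit under a hypothetical <experiment>/eval/<trace>/metrics.npz layout. How index itself applies the pattern (relative versus absolute paths, match versus search) is not specified here, so treat the loop as an illustration of the resulting two-level dictionary only.

```python
import re

pattern = re.compile(r"^(?P<experiment>[^/]+)/eval/(?P<trace>[^/]+)/metrics\.npz$")

# Hypothetical relative paths, as a recursive directory walk might produce them.
paths = [
    "baseline/eval/seq_a/metrics.npz",
    "baseline/eval/seq_b/metrics.npz",
    "variant/eval/seq_a/metrics.npz",
    "variant/notes.txt",
]

index = {}
for p in paths:
    m = pattern.match(p)
    if m is None:
        continue  # not an evaluation file
    groups = m.groupdict()
    # A group that is absent from the pattern yields None as the key.
    index.setdefault(groups.get("experiment"), {})[groups.get("trace")] = p

print(sorted(index))  # ['baseline', 'variant']
```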
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | directory to start searching from. | required |
pattern | str \| Pattern | regex pattern to match the evaluation directories. | required |
follow_symlinks | bool | whether to follow symbolic links. | False |
Returns:
Type | Description |
---|---|
dict[str \| None, dict[str \| None, str]] | A two-level dictionary, where the first level keys are the experiment names, the second level keys are the trace names, and the values are paths to the matching files. |
Source code in src/nrdk/tss/api.py
nrdk.tss.stats_from_experiments
stats_from_experiments(
y: Mapping[str, NestedValues[Num[ndarray, _N]]],
t: Mapping[str, NestedValues[Float64[ndarray, _N]]] | None = None,
baseline: str | None = None,
workers: int = -1,
t_max: int | None = None,
) -> tuple[list[str], NDStats, NDStats | None]
Calculate statistics from experiment results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y | Mapping[str, NestedValues[Num[ndarray, _N]]] | mapping of experiment names and metric values. | required |
t | Mapping[str, NestedValues[Float64[ndarray, _N]]] \| None | mapping of experiment names and timestamps. If not provided, the metrics are assumed to be at identical timestamps. | None |
baseline | str \| None | baseline experiment for relative statistics. | None |
workers | int | number of worker threads to use for computation. | -1 |
t_max | int \| None | maximum time delay to consider when computing effective sample size; if | None |
Returns:
Type | Description |
---|---|
list[str] | Names of each experiment corresponding to leading axis in the output statistics. |
NDStats | Absolute statistics for the provided metric. |
NDStats \| None | Relative statistics (difference relative to the specified baseline), if provided. |
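The t_max parameter bounds how many lags of autocorrelation enter the effective sample size. As an illustration of that idea, the sketch below uses one common autocorrelation-based estimator, N / (1 + 2 * sum of rho_k), truncated at the first non-positive autocorrelation. This is an assumption for illustration and not necessarily what effective_sample_size computes; the function names are hypothetical, and treating t_max=None as "use every available lag" is a choice of this sketch only.

```python
def autocorr(x, lag):
    """Biased sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def ess_sketch(x, t_max=None):
    """Estimate N / (1 + 2 * sum(rho_k)) over lags 1..t_max."""
    n = len(x)
    max_lag = n - 1 if t_max is None else min(t_max, n - 1)
    rho_sum = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorr(x, lag)
        if rho <= 0:
            break  # stop at the first non-positive autocorrelation
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


ramp = [float(i) for i in range(100)]         # strongly autocorrelated
alternating = [(-1.0) ** i for i in range(100)]  # negatively correlated at lag 1
```

Under this estimator the strongly autocorrelated ramp has far fewer effective samples than its length, and a smaller t_max ignores correlation at longer lags and therefore reports a larger value.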