Hypothesis

`module` `crystallize.experiments.hypothesis`

`function` `rank_by_p_value`

rank_by_p_value(result: dict) → float

A simple, picklable ranker function. Lower p-value is better.
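As a rough illustration of what a ranker of this shape could look like, the sketch below scores a verifier result by its p-value. The name `rank_by_p_value_sketch`, the `"p_value"` key, and the fallback of `1.0` for a missing value are assumptions made for this example, not details confirmed by this reference. Defining the function at module level (rather than as a lambda) is what keeps a ranker picklable.

```python
def rank_by_p_value_sketch(result: dict) -> float:
    """Hypothetical stand-in for a p-value ranker (not the library's code)."""
    # Assumed key name; a result without a p-value gets the worst score, 1.0.
    return float(result.get("p_value", 1.0))


# Lower scores rank first, so the 0.03 result would be preferred.
scores = [
    rank_by_p_value_sketch({"p_value": 0.03}),
    rank_by_p_value_sketch({"p_value": 0.2}),
    rank_by_p_value_sketch({}),
]
```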

`class` `Hypothesis`

Encapsulate a statistical test to compare baseline and treatment results.

`method` `Hypothesis.__init__`

__init__(
    verifier: Callable[[Mapping[str, Sequence[Any]], Mapping[str, Sequence[Any]]], Mapping[str, Any]],
    metrics: Optional[Union[str, Sequence[str], Sequence[Sequence[str]]]] = None,
    ranker: Optional[Callable[[Mapping[str, Any]], float]] = None,
    name: Optional[str] = None
) → None
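The `verifier` parameter takes two mappings of metric name to sequence of values (baseline first, then treatment) and returns a mapping of results. The following is a minimal sketch of a callable with that shape. The function `mean_difference_verifier` and its `"<metric>_diff"` output keys are hypothetical and chosen only for illustration; a real verifier would typically run a statistical test and report a p-value.

```python
from statistics import mean
from typing import Any, Dict, Mapping, Sequence


def mean_difference_verifier(
    baseline: Mapping[str, Sequence[Any]],
    treatment: Mapping[str, Sequence[Any]],
) -> Mapping[str, Any]:
    """Hypothetical verifier: per-metric difference of means."""
    out: Dict[str, Any] = {}
    for name in baseline:
        # Positive values mean the treatment mean exceeds the baseline mean.
        out[name + "_diff"] = mean(treatment[name]) - mean(baseline[name])
    return out


result = mean_difference_verifier(
    {"accuracy": [0.5, 0.75]},
    {"accuracy": [0.75, 1.0]},
)
```

Going by the signature above, a callable like this would be passed as `verifier`, with `metrics` naming the metric or metrics it should receive.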

`method` `Hypothesis.rank_treatments`

rank_treatments(verifier_results: Mapping[str, Any]) → Mapping[str, Any]

Rank treatments using the ranker score function.
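The reference does not spell out the shape of `verifier_results` or of the returned mapping. The sketch below shows one plausible reading, assuming the input maps treatment names to verifier outputs and the output reports the ordered scores plus the best treatment. The function name and the `"ranked"` and `"best"` keys are assumptions for illustration only.

```python
from typing import Any, Callable, Dict, Mapping


def rank_treatments_sketch(
    verifier_results: Mapping[str, Mapping[str, Any]],
    ranker: Callable[[Mapping[str, Any]], float],
) -> Dict[str, Any]:
    """Hypothetical ranking step: score each treatment, lowest score first."""
    scores = {name: ranker(res) for name, res in verifier_results.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1])
    best = ranked[0][0] if ranked else None
    return {"best": best, "ranked": ranked}


ranking = rank_treatments_sketch(
    {"treatment_a": {"p_value": 0.2}, "treatment_b": {"p_value": 0.01}},
    ranker=lambda res: res["p_value"],
)
```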

`method` `Hypothesis.verify`

verify(
    baseline_metrics: Mapping[str, Sequence[Any]],
    treatment_metrics: Mapping[str, Sequence[Any]]
) → Any

Evaluate the hypothesis using selected metric groups.

Args:

baseline_metrics: Aggregated metrics from baseline runs.
treatment_metrics: Aggregated metrics from a treatment.

Returns: The output of the verifier callable. When multiple metric groups are specified, the result is a list of outputs, one per group, in the same order as the groups.
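The single-output versus list-of-outputs behaviour can be sketched as follows. This is not the library's implementation: `normalize_groups` and `verify_sketch` are hypothetical helpers, and the interpretation of `metrics` (a string is one metric, a flat sequence is one group, a nested sequence is several groups) is an assumption drawn from the type annotation on `__init__`.

```python
from typing import Any, Callable, List, Mapping, Sequence, Union

MetricSpec = Union[str, Sequence[str], Sequence[Sequence[str]]]


def normalize_groups(metrics: MetricSpec) -> List[List[str]]:
    """Turn any accepted spec into a list of metric-name groups."""
    if isinstance(metrics, str):
        return [[metrics]]
    items = list(metrics)
    if all(isinstance(m, str) for m in items):
        return [items]  # a flat sequence is treated as a single group
    return [list(group) for group in items]


def verify_sketch(
    verifier: Callable[[Mapping[str, Sequence[Any]], Mapping[str, Sequence[Any]]], Any],
    metrics: MetricSpec,
    baseline: Mapping[str, Sequence[Any]],
    treatment: Mapping[str, Sequence[Any]],
) -> Any:
    """Call the verifier once per group, passing only that group's metrics."""
    outputs = [
        verifier({k: baseline[k] for k in g}, {k: treatment[k] for k in g})
        for g in normalize_groups(metrics)
    ]
    # One group yields a bare output; several groups yield a list in group order.
    return outputs[0] if len(outputs) == 1 else outputs


def seen(b: Mapping[str, Sequence[Any]], t: Mapping[str, Sequence[Any]]) -> dict:
    """Toy verifier that reports which metrics it was given."""
    return {"metrics": sorted(b)}


base = {"a": [1, 2], "b": [3, 4]}
treat = {"a": [2, 3], "b": [4, 5]}
single = verify_sketch(seen, "a", base, treat)
multi = verify_sketch(seen, [["a"], ["a", "b"]], base, treat)
```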
