# Parameter Optimization
Crystallize ships with a simple optimisation hook that lets you drive experiments from an external strategy (grid search, Bayesian optimisation, evolutionary algorithms, …). The pattern mirrors ask/tell: the optimiser proposes a treatment, Crystallize evaluates it, and the optimiser consumes the aggregated metric.
## 1. Implement a BaseOptimizer
```python
from crystallize.experiments.optimizers import BaseOptimizer, Objective
from crystallize import Treatment


class GridSearchOptimizer(BaseOptimizer):
    def __init__(self, deltas: list[float], objective: Objective):
        super().__init__(objective)
        self.deltas = deltas
        self._index = 0                  # next grid point to propose
        self._scores: list[float] = []   # one aggregated score per trial

    def ask(self) -> list[Treatment]:
        # Propose the next grid point as a single treatment.
        delta = self.deltas[self._index]
        return [Treatment(name=f"grid_{delta}", apply={"delta": delta})]

    def tell(self, objective_values: dict[str, float]) -> None:
        # Record the aggregated metric for the treatment just evaluated.
        self._scores.append(next(iter(objective_values.values())))
        self._index += 1

    def get_best_treatment(self) -> Treatment:
        # Lowest score wins (this example minimises).
        best_idx = min(range(len(self._scores)), key=self._scores.__getitem__)
        return Treatment(name="grid_best", apply={"delta": self.deltas[best_idx]})
```

Key points:
- `Objective.metric` names the metric to optimise (it must exist in `ctx.metrics`). The current helper averages the metric across replicates.
- `ask()` returns a list of treatments. The built-in extraction assumes exactly one treatment per trial.
- `tell()` receives a `{metric_name: aggregated_value}` mapping.
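For example, if a trial's replicates produced `sum` values of 6.0 and 8.0, the helper would hand the optimiser the averaged value (illustrative numbers, not output from a real run):

```python
optimizer.tell({"sum": 7.0})  # {metric_name: value averaged across replicates}
```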
## 2. Configure the Experiment
```python
# Assumed to be importable from the top-level package, like Treatment above;
# adjust the import path to match your Crystallize version.
from crystallize import Experiment, FrozenContext, Pipeline, data_source, pipeline_step


@data_source
def initial_data(ctx: FrozenContext) -> list[int]:
    return [1, 2, 3]


@pipeline_step()
def add_delta(data: list[int], ctx: FrozenContext, *, delta: float = 0.0) -> list[int]:
    return [x + delta for x in data]


@pipeline_step()
def record_sum(data: list[int], ctx: FrozenContext) -> list[int]:
    ctx.metrics.add("sum", sum(data))
    return data


experiment = Experiment(
    datasource=initial_data(),
    pipeline=Pipeline([add_delta(), record_sum()]),
)

optimizer = GridSearchOptimizer(
    deltas=[0.0, 1.0, 2.0],
    objective=Objective(metric="sum", direction="minimize"),
)
```

## 3. Run the Loop
```python
best_treatment = experiment.optimize(
    optimizer,
    num_trials=len(optimizer.deltas),
    replicates_per_trial=1,
)
print(best_treatment.name, best_treatment._apply_value)  # internal representation
```

`Experiment.optimize` (synchronous) calls the async helper under the hood:
1. Call `ask()` to obtain a treatment.
2. Run the experiment with that treatment (respecting `replicates_per_trial`).
3. Average the specified metric across replicates and hand it to `tell()`.
4. Repeat for `num_trials`.
5. Return `get_best_treatment()`.
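Written out by hand, the loop is roughly the sketch below. `run_trial` is a placeholder for however you evaluate the proposed treatment and average the named metric; it is not a Crystallize API:

```python
def run_trial(treatments: list[Treatment], replicates: int) -> float:
    # Placeholder: run the experiment with these treatments and return the
    # metric of interest averaged across `replicates` runs.
    raise NotImplementedError

num_trials = len(optimizer.deltas)
for _ in range(num_trials):                    # step 4: repeat num_trials times
    treatments = optimizer.ask()               # step 1: propose a treatment
    avg = run_trial(treatments, replicates=1)  # steps 2-3: evaluate and average
    optimizer.tell({"sum": avg})               # step 3: hand the aggregate to tell()
best = optimizer.get_best_treatment()          # step 5: best treatment seen so far
```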
You can also call `await experiment.aoptimize(...)` inside your own async code.
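For example (a sketch, assuming `aoptimize` accepts the same arguments as `optimize`):

```python
import asyncio


async def main() -> None:
    # Same loop as above, but awaitable from async application code.
    best = await experiment.aoptimize(
        optimizer,
        num_trials=len(optimizer.deltas),
        replicates_per_trial=1,
    )
    print(best.name)


asyncio.run(main())
```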
## 4. Tips
- Use `direction` to indicate how you interpret the metric. It is currently informational: implement the minimisation/maximisation logic yourself in `tell()` or when picking the best result (see the sketch after this list).
- You can access full metrics inside `tell()` by running the experiment manually and extracting whatever you need before calling `tell()`. The built-in helper only passes the average of the named metric.
- To evaluate multiple treatments per trial, extend `_extract_objective_from_result` or run the experiment yourself and call `optimizer.tell(...)` with whatever aggregation makes sense.
- Optimisation is orthogonal to the CLI. Run the optimisation loop in Python and then apply the returned treatment in production via `experiment.apply(best_treatment)`.