Extending Crystallize
Crystallize is intentionally minimal. Its architecture emphasizes clear data transformations and reproducible workflows. Rather than subclassing core classes, you extend functionality through a small set of well-defined extension points. This guide outlines those points and the guiding principles behind them.
Plugins: The Primary Extension Mechanism
Section titled “Plugins: The Primary Extension Mechanism”Plugins allow you to inject behavior around the execution of an experiment. Each plugin subclasses BasePlugin and implements any of the available lifecycle hooks:
init_hook(experiment): configure defaults when the experiment instance is created.before_run(experiment): run logic at the start ofrun()before replicates execute.before_replicate(experiment, ctx): invoked prior to each replicate’s pipeline.before_step(experiment, step): runs immediately before a step executes.after_step(experiment, step, data, ctx): observe the output of everyPipelineStep. Do not mutatedataorctxhere.after_run(experiment, result): called after the result object is assembled.run_experiment_loop(experiment, replicate_fn): optional hook to override how replicates are executed (e.g., parallelism).
Use plugins for cross-cutting operational concerns such as logging, notifications, artifact storage, or custom execution strategies. Keep data transformations inside pipeline steps; plugins should be side-effect oriented observers or coordinators.
Design Principle
Section titled “Design Principle”- Transform data with
PipelineStepimplementations. Steps are pure functions that takedataandctxand return new data. - Handle operational logic with plugins. Anything related to orchestration, metrics aggregation, saving outputs, or monitoring belongs in a plugin.
This separation keeps pipelines focused and testable while enabling rich customization of experiment behavior.
Other Extension Points
Section titled “Other Extension Points”While plugins are the main mechanism, you can also extend Crystallize by:
- Creating custom
DataSourceclasses to fetch or generate your input data. - Writing new
@verifierfunctions for hypotheses, providing statistical tests or custom ranking logic.
Together these extension points allow advanced users to tailor Crystallize to unique workflows without modifying the core library.