🔄 Reproducibility
==================

RexF is designed from the ground up to make your experiments reproducible. This guide covers the ways RexF helps you achieve reliable, repeatable results.

Automatic Reproducibility Tracking
----------------------------------

RexF automatically captures the information needed to reproduce your experiments.

Code Version Control
~~~~~~~~~~~~~~~~~~~~

Every experiment run records the state of your Git repository:

.. code-block:: python

    from rexf import experiment, run

    @experiment
    def my_experiment(param1=42):
        return {"result": param1 * 2}

    # RexF automatically captures:
    # - Git commit hash
    # - Repository status (clean/dirty)
    # - Branch name
    # - Uncommitted changes (if any)

    run_id = run.single(my_experiment, param1=100)

    # View captured git info
    # (named ``exp`` so the imported ``experiment`` decorator is not shadowed)
    exp = run.get_by_id(run_id)
    print(f"Git commit: {exp.git_commit}")
    print(f"Repository status: {exp.git_status}")

Environment Capture
~~~~~~~~~~~~~~~~~~~

Python environment details are recorded automatically:

.. code-block:: python

    # Automatically captured for each experiment:
    # - Python version
    # - Installed packages and versions
    # - Virtual environment info
    # - Operating system details

    exp = run.get_by_id(run_id)
    env_info = exp.environment

    print(f"Python version: {env_info['python_version']}")
    print(f"Platform: {env_info['platform']}")
    print(f"Installed packages: {len(env_info['packages'])} packages")

Random Seed Management
----------------------

RexF handles random seeds so that stochastic experiments can be repeated.

Automatic Seed Capture
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    import random
    import numpy as np
    from rexf import experiment, run

    @experiment
    def random_experiment(n_samples=1000):
        # RexF automatically captures seeds for:
        # - Python's random module
        # - NumPy random state
        # - Other supported libraries

        random_values = [random.random() for _ in range(n_samples)]
        numpy_values = np.random.rand(n_samples)

        return {
            "mean_random": sum(random_values) / len(random_values),
            "mean_numpy": np.mean(numpy_values)
        }

    # Seeds are automatically captured and stored
    run_id = run.single(random_experiment, n_samples=500)

Manual Seed Control
~~~~~~~~~~~~~~~~~~~

For explicit control over randomness, accept a seed as a parameter and set it yourself:

.. code-block:: python

    import random
    import numpy as np
    from rexf import experiment, run

    @experiment
    def seeded_experiment(data_size=1000, random_seed=None):
        if random_seed is not None:
            random.seed(random_seed)
            np.random.seed(random_seed)

        # Your experiment code using random numbers
        data = np.random.normal(0, 1, data_size)
        result = np.mean(data)

        return {"mean_value": result, "std_value": np.std(data)}

    # Reproducible runs with a fixed seed
    run_id1 = run.single(seeded_experiment, data_size=500, random_seed=42)
    run_id2 = run.single(seeded_experiment, data_size=500, random_seed=42)

    # Results will be identical
    exp1 = run.get_by_id(run_id1)
    exp2 = run.get_by_id(run_id2)
    assert exp1.metrics["mean_value"] == exp2.metrics["mean_value"]
    print("✅ Experiments are perfectly reproducible!")

Supported Random Libraries
~~~~~~~~~~~~~~~~~~~~~~~~~~

RexF automatically handles seeds for:

- **Python's random module**: ``random.seed()``
- **NumPy**: ``np.random.seed()``
- **PyTorch**: ``torch.manual_seed()`` (if available)
- **TensorFlow**: ``tf.random.set_seed()`` (if available)
- **Scikit-learn**: Via ``random_state`` parameters

.. code-block:: python

    import random
    import numpy as np
    from rexf import experiment

    @experiment
    def ml_experiment(random_seed=None):
        if random_seed is not None:
            # Set all seeds for reproducibility
            random.seed(random_seed)
            np.random.seed(random_seed)

            # If using PyTorch
            try:
                import torch
                torch.manual_seed(random_seed)
            except ImportError:
                pass

            # If using TensorFlow
            try:
                import tensorflow as tf
                tf.random.set_seed(random_seed)
            except ImportError:
                pass

        # Your ML experiment code (train_model() stands in for your own function)
        return {"accuracy": train_model()}

Parameter Tracking
------------------

All function parameters are captured automatically.

Default Parameter Values
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    @experiment
    def experiment_with_defaults(required_param, optional_param=42, another_param="default"):
        return {
            "result": required_param * optional_param,
            "param_length": len(str(another_param))
        }

    # All parameters are recorded, including defaults
    run_id = run.single(experiment_with_defaults, required_param=10)

    exp = run.get_by_id(run_id)
    print(exp.parameters)
    # Output: {"required_param": 10, "optional_param": 42, "another_param": "default"}

Complex Parameter Types
~~~~~~~~~~~~~~~~~~~~~~~

RexF handles various parameter types:

.. code-block:: python

    @experiment
    def complex_experiment(
        numeric_param=3.14,
        string_param="test",
        # Mutable defaults are safe here only because the function never modifies them
        list_param=[1, 2, 3],
        dict_param={"key": "value"},
        bool_param=True
    ):
        # All parameter types are properly serialized and stored
        return {"processed": True}

    run_id = run.single(
        complex_experiment,
        list_param=[10, 20, 30],
        dict_param={"model": "cnn", "layers": 5}
    )

Verification and Validation
---------------------------

You can verify reproducibility with a few small helpers built on the RexF API.

Reproducibility Testing
~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    def test_reproducibility(experiment_func, params, num_runs=3):
        """Test if an experiment is reproducible.

        The experiment function must accept a ``random_seed`` parameter.
        """
        results = []

        for _ in range(num_runs):
            # Use the same seed for all runs
            run_id = run.single(experiment_func, random_seed=42, **params)
            exp = run.get_by_id(run_id)
            results.append(exp.metrics)

        # Check if all results are identical
        first_result = results[0]
        is_reproducible = all(
            result == first_result for result in results[1:]
        )

        return is_reproducible, results

    # Test your experiment
    is_repro, results = test_reproducibility(
        seeded_experiment,
        {"data_size": 1000}
    )
    print(f"Experiment is reproducible: {is_repro}")

Environment Comparison
~~~~~~~~~~~~~~~~~~~~~~

Compare environments between experiments:

.. code-block:: python

    def compare_environments(run_id1, run_id2):
        """Compare environments between two experiments."""
        exp1 = run.get_by_id(run_id1)
        exp2 = run.get_by_id(run_id2)

        env1 = exp1.environment
        env2 = exp2.environment

        differences = {}

        # Compare Python versions
        if env1["python_version"] != env2["python_version"]:
            differences["python_version"] = (env1["python_version"], env2["python_version"])

        # Compare packages (name, version) pairs present in only one environment
        packages1 = set(env1["packages"].items())
        packages2 = set(env2["packages"].items())

        different_packages = packages1.symmetric_difference(packages2)
        if different_packages:
            differences["packages"] = different_packages

        return differences

    # Compare two experiments
    differences = compare_environments(run_id1, run_id2)
    if differences:
        print("Environment differences found:")
        for key, diff in differences.items():
            print(f"  {key}: {diff}")
    else:
        print("✅ Environments are identical")

Best Practices for Reproducibility
----------------------------------

Experiment Design
~~~~~~~~~~~~~~~~~

1. **Use explicit seeds** when reproducibility is critical:

   .. code-block:: python

       @experiment
       def reproducible_experiment(data_size=1000, random_seed=42):
           # Always accept and use a random_seed parameter
           if random_seed is not None:
               random.seed(random_seed)
               np.random.seed(random_seed)

           # Your experiment code (generate_results() is a placeholder)
           return {"result": generate_results()}

2. **Document stochastic processes**:

   .. code-block:: python

       @experiment
       def documented_experiment(iterations=1000):
           """
           Experiment with stochastic optimization.

           Note: This experiment uses random initialization and may
           produce different results on each run unless random_seed is set.
           """
           # Your stochastic code (optimize_randomly() is a placeholder)
           return {"final_value": optimize_randomly()}

3. **Separate deterministic and stochastic parts**:

   .. code-block:: python

       @experiment
       def hybrid_experiment(deterministic_param=10, random_seed=None):
           # Deterministic computation
           deterministic_result = deterministic_param ** 2

           # Stochastic computation (controlled by seed)
           if random_seed is not None:
               random.seed(random_seed)

           stochastic_result = random.random()

           return {
               "deterministic": deterministic_result,
               "stochastic": stochastic_result,
               "combined": deterministic_result + stochastic_result
           }

Version Control Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. **Commit code before important experiments**:

   .. code-block:: bash

       # Good practice: commit your experiment code
       git add experiment.py
       git commit -m "Add new hyperparameter search experiment"

       # Run experiments
       python run_experiments.py

2. **Tag important experiment runs**:

   .. code-block:: bash

       # Tag significant results
       git tag -a "v1.0-baseline" -m "Baseline experiment results"

3. **Use branches for experimental features**:

   .. code-block:: bash

       # Create branch for new experiment variants
       git checkout -b "experiment/new-optimization"

Data Dependencies
~~~~~~~~~~~~~~~~~

1. **Record data versions and sources**:

   .. code-block:: python

       @experiment
       def data_dependent_experiment(data_path="data/v1.0/dataset.csv"):
           # Record data version in results
           # (get_data_info() and process_data() are placeholders for your own code)
           data_info = get_data_info(data_path)

           # Your experiment
           results = process_data(data_path)

           return {
               **results,
               "data_version": data_info["version"],
               "data_checksum": data_info["checksum"],
               "data_size": data_info["size"]
           }

2. **Use content hashing for data integrity**:

   .. code-block:: python

       import hashlib

       def get_data_checksum(file_path):
           """Calculate the SHA-256 checksum of a data file."""
           hash_sha256 = hashlib.sha256()
           with open(file_path, "rb") as f:
               # Read in chunks so large files are not loaded into memory at once
               for chunk in iter(lambda: f.read(4096), b""):
                   hash_sha256.update(chunk)
           return hash_sha256.hexdigest()

       @experiment
       def checksum_verified_experiment(data_path="dataset.csv"):
           checksum = get_data_checksum(data_path)

           # Your experiment
           results = process_data(data_path)

           return {
               **results,
               "data_checksum": checksum
           }

Reproducing Experiments
-----------------------

From Experiment IDs
~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    def reproduce_experiment(original_run_id):
        """Reproduce an experiment from its run ID."""
        original = run.get_by_id(original_run_id)

        if not original:
            raise ValueError(f"Experiment {original_run_id} not found")

        # Get the original experiment function (you need to import it)
        experiment_name = original.experiment_name
        experiment_func = globals()[experiment_name]  # Or import appropriately

        # Reproduce with the same parameters
        new_run_id = run.single(experiment_func, **original.parameters)

        return new_run_id

    # Reproduce a specific experiment
    original_run_id = "your_run_id_here"
    reproduced_run_id = reproduce_experiment(original_run_id)

    # Compare results
    run.compare([original_run_id, reproduced_run_id])

From Saved Configurations
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    import json

    # Save experiment configuration
    def save_experiment_config(run_id, config_file):
        """Save experiment configuration for later reproduction."""
        exp = run.get_by_id(run_id)

        config = {
            "experiment_name": exp.experiment_name,
            "parameters": exp.parameters,
            "git_commit": exp.git_commit,
            "environment": exp.environment
        }

        with open(config_file, "w") as f:
            json.dump(config, f, indent=2)

    # Load and reproduce from configuration
    def reproduce_from_config(config_file):
        """Reproduce experiment from saved configuration."""
        with open(config_file, "r") as f:
            config = json.load(f)

        # Check environment compatibility
        current_env = get_current_environment()  # You'd implement this
        if current_env != config["environment"]:
            print("⚠️ Warning: Current environment differs from original")

        # Reproduce experiment
        experiment_func = globals()[config["experiment_name"]]
        return run.single(experiment_func, **config["parameters"])

Cross-Platform Reproducibility
------------------------------

Handling Platform Differences
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    import platform

    @experiment
    def platform_aware_experiment(param1=42):
        # Record platform information
        platform_info = {
            "system": platform.system(),
            "machine": platform.machine(),
            "processor": platform.processor()
        }

        # Adjust behavior for platform differences if needed
        # (the two *_specific_computation functions are placeholders)
        if platform.system() == "Windows":
            # Windows-specific handling
            result = windows_specific_computation(param1)
        else:
            # Unix-like systems
            result = unix_specific_computation(param1)

        return {
            "result": result,
            "platform": platform_info
        }

Numerical Precision
~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    import numpy as np

    @experiment
    def precision_controlled_experiment(data_size=1000, dtype="float64"):
        # Control numerical precision explicitly
        np_dtype = getattr(np, dtype)

        data = np.random.rand(data_size).astype(np_dtype)
        result = np.sum(data)

        return {
            "result": float(result),  # Convert to Python float for JSON
            "dtype_used": dtype,
            "precision_bits": np.finfo(np_dtype).bits
        }

Troubleshooting Reproducibility Issues
--------------------------------------

Common Issues and Solutions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. **Different random number sequences**:

   .. code-block:: python

       # Problem: Random sequences differ between runs
       # Solution: Always set and use seeds explicitly

       @experiment
       def fixed_random_experiment(n_samples=100, random_seed=42):
           # Set seeds at the beginning
           np.random.seed(random_seed)
           random.seed(random_seed)

           # Your random computations
           return {"result": np.random.rand(n_samples).mean()}

2. **Environment dependency issues**:

   .. code-block:: bash

       # Create reproducible environment with exact versions
       pip freeze > requirements.txt

       # Or use conda
       conda env export > environment.yml

3. **Floating point precision differences**:

   .. code-block:: python

       @experiment
       def precision_robust_experiment(tolerance=1e-10):
           # Use appropriate tolerances for comparisons
           # (complex_computation() is a placeholder)
           result = complex_computation()

           return {
               # Rounding reduces spurious differences but does not eliminate
               # them; compare stored results using the recorded tolerance
               "result": round(result, 10),
               "tolerance_used": tolerance
           }

Debugging Non-Reproducible Results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    def debug_reproducibility(experiment_func, params, num_runs=5):
        """Debug why an experiment isn't reproducible.

        The experiment function must accept a ``random_seed`` parameter.
        """
        results = []

        for i in range(num_runs):
            print(f"Run {i+1}...")

            # Use the same seed but capture more info
            run_id = run.single(
                experiment_func,
                random_seed=42,  # Same seed
                **params
            )

            exp = run.get_by_id(run_id)
            results.append({
                "run_id": run_id,
                "metrics": exp.metrics,
                "start_time": exp.start_time,
                "environment": exp.environment
            })

        # Analyze differences
        print("\nAnalyzing differences...")

        # Check metrics against the first run
        first_metrics = results[0]["metrics"]
        for i, result in enumerate(results[1:], 1):
            if result["metrics"] != first_metrics:
                print(f"Run {i+1} differs from run 1:")
                for key in first_metrics:
                    if key in result["metrics"]:
                        if first_metrics[key] != result["metrics"][key]:
                            print(f"  {key}: {first_metrics[key]} vs {result['metrics'][key]}")

        return results

    # Debug your experiment (seeded_experiment accepts random_seed; my_experiment does not)
    debug_results = debug_reproducibility(seeded_experiment, {"data_size": 100})

Next Steps
----------

- :doc:`advanced_features` - Advanced analysis techniques
- :doc:`tutorials/monte_carlo` - Complete reproducibility example
- :doc:`api/core` - Core API for experiment control
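
Appendix: Comparing Metrics Within a Tolerance
----------------------------------------------

The reproducibility checks in this guide compare metric dictionaries with ``==``, which can report a mismatch when two runs differ only by floating-point rounding. The sketch below shows one way to compare metrics within a tolerance using only the standard library. The ``metrics_match`` helper and its default tolerances are hypothetical and are not part of the RexF API; it assumes metrics are flat dictionaries of scalars.

.. code-block:: python

    import math

    def metrics_match(first, second, rel_tol=1e-9, abs_tol=1e-12):
        """Return True if two flat metric dicts agree within a tolerance.

        Hypothetical helper for illustration; not part of RexF.
        """
        # Different sets of metric names can never match
        if first.keys() != second.keys():
            return False

        for key, a in first.items():
            b = second[key]
            # bool is a subclass of int, so exclude it from numeric comparison
            both_numeric = (
                isinstance(a, (int, float)) and isinstance(b, (int, float))
                and not isinstance(a, bool) and not isinstance(b, bool)
            )
            if both_numeric:
                if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                    return False
            elif a != b:
                # Non-numeric values (strings, booleans) must match exactly
                return False
        return True

    baseline = {"mean_value": 0.1 + 0.2, "label": "run-a"}
    repeat = {"mean_value": 0.3, "label": "run-a"}

    print(baseline == repeat)               # False: exact comparison sees rounding error
    print(metrics_match(baseline, repeat))  # True: values agree within tolerance

A helper like this could replace the ``result == first_result`` check when small numerical drift across platforms or library versions is acceptable. Choose tolerances to suit the scale of your metrics.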