Example Project: Developing a Tool to Analyze Experiment Results and Generate Custom Reports
Outline
This comprehensive project will solidify your Python SDK knowledge by guiding you through building a real-world Opal tool. You'll integrate data fetching, data processing, statistical analysis, and report generation.
Project Goal: Build an Opal tool that takes Optimizely experiment IDs, fetches relevant data (simulated or from a mock API), performs custom statistical analysis (e.g., Bayesian A/B testing, frequentist A/B testing calculations), and generates a summarized report (e.g., JSON or a simple HTML snippet).
Scenario: A marketing team wants a quick way to get deeper insights into their Optimizely experiments, beyond the standard dashboard. They need a tool that can calculate custom metrics or perform specific statistical tests that aren't readily available out of the box.
Key Steps and Implementation Details
Project Setup:
- Create a new Python project and virtual environment.
- Install the necessary libraries: optimizely-opal.opal-tools-sdk, fastapi, uvicorn, pandas, numpy, scipy.
- Create a main.py for your FastAPI app and a src/tools/experiment_analyzer.py for your tool logic.
Data Input and Tool Definition:
- Define a Pydantic model for the tool's parameters. It should include:
  - experiment_id: str (the ID of the experiment).
  - control_data: a nested Pydantic model or dictionary with visitors: int and conversions: int.
  - variant_data: a list of nested Pydantic models or dictionaries, each with name: str, visitors: int, and conversions: int.
  - alpha: float (significance level, e.g., 0.05).
- Use the @tool decorator to define your analyze_experiment_results tool.
Simulate Data Fetching (or Mock API):
- For simplicity, you won't connect to a real Optimizely API in this exercise. Instead, your tool will receive the control_data and variant_data directly as parameters.
- In a real application, this step would involve making authenticated API calls to Optimizely's APIs (e.g., the Experimentation REST API) to retrieve raw experiment data; a rough sketch of what that could look like follows below.
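If you later wire this up to live data, the call might look roughly like the following sketch. This is illustrative only: the results endpoint path, the shape of the returned JSON, and the OPTIMIZELY_API_TOKEN environment variable are assumptions, so check Optimizely's current REST API documentation before relying on them.

import os
import requests

def fetch_experiment_results(experiment_id: str) -> dict:
    """Illustrative only: fetch raw results for an experiment from Optimizely's REST API."""
    # Assumption: a personal access token is available in this environment variable.
    token = os.environ["OPTIMIZELY_API_TOKEN"]
    # Assumption: the v2 Experimentation REST API exposes a results endpoint like this.
    url = f"https://api.optimizely.com/v2/experiments/{experiment_id}/results"
    response = requests.get(url, headers={"Authorization": f"Bearer {token}"}, timeout=30)
    response.raise_for_status()
    # The returned JSON will not match this project's VariantMetrics model directly,
    # so you would map it into control/variant visitor and conversion counts here.
    return response.json()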
Statistical Analysis:
Implement functions to perform common A/B testing calculations. A good starting point is a chi-squared test on a 2x2 contingency table of conversions versus non-conversions, which tells you whether a variant differs significantly from the control.
Formula for Chi-Squared (simplified for a 2x2 table; a short worked example follows this list):
- Expected conversions for control: (total_control_visitors * total_conversions) / total_visitors
- Expected non-conversions for control: (total_control_visitors * total_non_conversions) / total_visitors
- Calculate the expected counts for each variant in the same way.
- Chi-squared statistic: sum((observed - expected)^2 / expected) over all cells.
- Compare the chi-squared statistic to a critical value (from a chi-squared distribution table), or simply use scipy.stats.chi2_contingency.
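As a quick worked example (numbers chosen arbitrarily): with 1,000 control visitors and 100 conversions versus 1,000 variant visitors and 120 conversions, there are 2,000 visitors and 220 conversions in total. The expected conversions for each group are (1000 * 220) / 2000 = 110 and the expected non-conversions are 890. Summing (observed - expected)^2 / expected over the four cells gives roughly 0.91 + 0.11 + 0.91 + 0.11 ≈ 2.04.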
Example Chi-Squared Calculation (simplified):
from scipy.stats import chi2_contingency

def calculate_chi_squared(control_conversions, control_visitors, variant_conversions, variant_visitors):
    # Create a contingency table
    # Rows: Control, Variant
    # Columns: Conversions, Non-Conversions
    control_non_conversions = control_visitors - control_conversions
    variant_non_conversions = variant_visitors - variant_conversions
    contingency_table = [
        [control_conversions, control_non_conversions],
        [variant_conversions, variant_non_conversions]
    ]
    chi2, p_value, _, _ = chi2_contingency(contingency_table)
    return chi2, p_value

# Example usage:
# chi2, p_value = calculate_chi_squared(100, 1000, 120, 1000)
# print(f"Chi2: {chi2}, P-value: {p_value}")
Report Generation:
- The tool should return a structured JSON object containing:
  - experiment_id
  - control_results (conversions, visitors, conversion_rate)
  - variant_results (for each variant: name, conversions, visitors, conversion_rate, statistical_significance_vs_control, p_value_vs_control, uplift_vs_control)
  - overall_conclusion (e.g., "Variant X is the statistically significant winner" or "No significant difference found.")
- Consider adding a simple HTML report string as an output field for better readability in some contexts; a sketch of such a helper follows below.
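If you want that optional HTML field, a minimal helper might look like the sketch below. It assumes the results dictionary produced by the tool implementation in the next step; the function name and markup are purely illustrative.

def build_html_report(results: dict) -> str:
    """Illustrative helper: render the analysis results dict as a small HTML snippet."""
    rows = "".join(
        f"<tr><td>{v['name']}</td><td>{v['visitors']}</td><td>{v['conversions']}</td>"
        f"<td>{v['conversion_rate']:.2%}</td><td>{v['statistical_significance_vs_control']}</td></tr>"
        for v in results["variant_results"]
    )
    return (
        f"<h2>Experiment {results['experiment_id']}</h2>"
        f"<p>{results['overall_conclusion']}</p>"
        "<table><tr><th>Variant</th><th>Visitors</th><th>Conversions</th>"
        "<th>Conversion rate</th><th>Significance vs. control</th></tr>"
        f"{rows}</table>"
    )

The returned string could then be attached to the tool's output, e.g. results["html_report"] = build_html_report(results).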
Tool Implementation (src/tools/experiment_analyzer.py):
from optimizely_opal.opal_tools_sdk import tool
from pydantic import BaseModel, Field
from typing import List
from scipy.stats import chi2_contingency
import math

class VariantMetrics(BaseModel):
    name: str = Field(..., description="Name of the variant (e.g., 'Control', 'Variant A').")
    visitors: int = Field(..., description="Number of unique visitors to this variant.", ge=0)
    conversions: int = Field(..., description="Number of conversions for this variant.", ge=0)

class ExperimentAnalysisParams(BaseModel):
    experiment_id: str = Field(..., description="The ID of the Optimizely experiment.")
    control: VariantMetrics = Field(..., description="Metrics for the control group.")
    variants: List[VariantMetrics] = Field(..., description="List of metrics for each variant group.")
    alpha: float = Field(0.05, description="Significance level (alpha) for statistical tests.", ge=0.01, le=0.1)

@tool(name="analyze_experiment_results", description="Performs statistical analysis on Optimizely experiment data.")
async def analyze_experiment_results_tool(params: ExperimentAnalysisParams):
    """
    Analyzes experiment data, calculates conversion rates, uplift, and statistical significance.
    """
    results = {
        "experiment_id": params.experiment_id,
        "control_results": {
            "name": params.control.name,
            "visitors": params.control.visitors,
            "conversions": params.control.conversions,
            "conversion_rate": (params.control.conversions / params.control.visitors) if params.control.visitors > 0 else 0
        },
        "variant_results": [],
        "overall_conclusion": "Analysis completed."
    }

    if params.control.visitors == 0:
        results["overall_conclusion"] = "Control group has no visitors, cannot perform analysis."
        return results

    control_cr = results["control_results"]["conversion_rate"]
    overall_significant_difference = False

    for variant in params.variants:
        variant_cr = (variant.conversions / variant.visitors) if variant.visitors > 0 else 0
        uplift = ((variant_cr - control_cr) / control_cr) * 100 if control_cr > 0 else float('inf')

        statistical_significance = "N/A"
        p_value = None

        if variant.visitors > 0 and variant.conversions >= 0:  # Ensure valid data for chi-squared
            try:
                # Contingency table for Chi-Squared Test
                # Rows: Control, Variant
                # Columns: Conversions, Non-Conversions
                control_non_conversions = params.control.visitors - params.control.conversions
                variant_non_conversions = variant.visitors - variant.conversions
                contingency_table = [
                    [params.control.conversions, control_non_conversions],
                    [variant.conversions, variant_non_conversions]
                ]

                # Perform Chi-Squared test
                chi2, p_value, _, _ = chi2_contingency(contingency_table)

                if p_value < params.alpha:
                    statistical_significance = "Statistically Significant"
                    overall_significant_difference = True
                else:
                    statistical_significance = "Not Statistically Significant"
            except ValueError as e:
                statistical_significance = f"Error in Chi-Squared: {str(e)}"
                p_value = None
            except Exception as e:
                statistical_significance = f"Unexpected error in Chi-Squared: {str(e)}"
                p_value = None

        results["variant_results"].append({
            "name": variant.name,
            "visitors": variant.visitors,
            "conversions": variant.conversions,
            "conversion_rate": variant_cr,
            "uplift_vs_control_percent": round(uplift, 2) if math.isfinite(uplift) else "Infinity",
            "statistical_significance_vs_control": statistical_significance,
            "p_value_vs_control": round(p_value, 4) if p_value is not None else "N/A"
        })

    if overall_significant_difference:
        results["overall_conclusion"] = "One or more variants showed a statistically significant difference from control."
    else:
        results["overall_conclusion"] = "No statistically significant difference found between variants and control."

    return results

# Example of how to integrate this tool into your FastAPI app (main.py)
# from .tools.experiment_analyzer import analyze_experiment_results_tool
# opal_tools_service.register_tool(analyze_experiment_results_tool)
Testing:
- Run your FastAPI application locally (uvicorn main:app --reload).
- Use Postman or Insomnia to send POST requests to http://localhost:8000/tools/analyze_experiment_results with various ExperimentAnalysisParams JSON bodies (see the scripted example after this list).
- Test cases:
  - Control and variant data with clear differences (expect significance).
  - Control and variant data with small differences (expect no significance).
  - Edge cases: zero visitors, zero conversions.
  - Invalid alpha values.
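If you prefer to script the check instead of using Postman, a minimal sketch with the requests library is below. It assumes the tool is exposed at the /tools/analyze_experiment_results path on your local server, as described above, and the experiment_id value is arbitrary; adjust the URL and payload to match your setup.

import requests

payload = {
    "experiment_id": "exp_12345",
    "control": {"name": "Control", "visitors": 1000, "conversions": 100},
    "variants": [{"name": "Variant A", "visitors": 1000, "conversions": 120}],
    "alpha": 0.05,
}

# Assumes the FastAPI app is running locally via `uvicorn main:app --reload`.
response = requests.post(
    "http://localhost:8000/tools/analyze_experiment_results",
    json=payload,
    timeout=10,
)
response.raise_for_status()
print(response.json()["overall_conclusion"])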
By completing this project, you will have developed a robust, data-driven Opal tool using the Python SDK, capable of automating complex backend processes and integrating with diverse data sources. This demonstrates a powerful application of Opal tools for enhancing data analysis workflows within Optimizely.