Evaluator API
Synthetic data quality evaluation module, providing privacy risk measurement, data quality assessment, and machine learning utility analysis.
Class Architecture
classDiagram
class Evaluator {
EvaluatorConfig config
string method
__init__(method, **kwargs)
create()
eval(data) EvalResult
}
class EvaluatorConfig {
string method
dict params
string module_path
string class_name
}
class EvalResult {
float global
dict details
DataFrame report
}
%% Privacy Risk Evaluators
class Anonymeter {
int n_attacks
int n_cols
eval() EvalResult
}
class SinglingOutEvaluator {
eval() EvalResult
}
class LinkabilityEvaluator {
list aux_cols
eval() EvalResult
}
class InferenceEvaluator {
string secret
list aux_cols
eval() EvalResult
}
%% Data Quality Evaluators
class SDMetrics {
string report_type
eval() EvalResult
}
class DiagnosticReport {
eval() EvalResult
}
class QualityReport {
eval() EvalResult
}
%% ML Utility Evaluators
class MLUtility {
string task_type
string target
string experiment_design
string resampling
eval() EvalResult
}
class ClassificationUtility {
list metrics
eval() EvalResult
}
class RegressionUtility {
list metrics
eval() EvalResult
}
class ClusteringUtility {
int n_clusters
eval() EvalResult
}
%% Statistical Evaluator
class StatsEvaluator {
list stats_method
string compare_method
eval() EvalResult
}
%% Custom Evaluator
class CustomEvaluator {
string module_path
string class_name
eval() EvalResult
}
%% Input Data
class InputData {
DataFrame ori
DataFrame syn
DataFrame control
}
%% Relationships
Evaluator *-- EvaluatorConfig
Evaluator ..> EvalResult
%% Inheritance for Privacy
Anonymeter <|-- SinglingOutEvaluator
Anonymeter <|-- LinkabilityEvaluator
Anonymeter <|-- InferenceEvaluator
%% Inheritance for Quality
SDMetrics <|-- DiagnosticReport
SDMetrics <|-- QualityReport
%% Inheritance for ML Utility
MLUtility <|-- ClassificationUtility
MLUtility <|-- RegressionUtility
MLUtility <|-- ClusteringUtility
%% Dependencies
Evaluator ..> Anonymeter
Evaluator ..> SDMetrics
Evaluator ..> MLUtility
Evaluator ..> StatsEvaluator
Evaluator ..> CustomEvaluator
%% Data flow
InputData ..> Evaluator
%% Styling
style Evaluator fill:#e6f3ff,stroke:#4a90e2,stroke-width:3px
style EvaluatorConfig fill:#f3e6ff,stroke:#9966cc,stroke-width:2px
style EvalResult fill:#f3e6ff,stroke:#9966cc,stroke-width:2px
style Anonymeter fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style SinglingOutEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style LinkabilityEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style InferenceEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style SDMetrics fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style DiagnosticReport fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style QualityReport fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style MLUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style ClassificationUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style RegressionUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style ClusteringUtility fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style StatsEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style CustomEvaluator fill:#fff2e6,stroke:#ff9800,stroke-width:2px
style InputData fill:#e6ffe6,stroke:#66cc66,stroke-width:2px
Legend:
- Blue boxes: Main classes
- Orange boxes: Subclass implementations
- Light purple boxes: Configuration and data classes
- Light green boxes: Input data
- <|-- : Inheritance relationship
- *-- : Composition relationship
- ..> : Dependency relationship
- --> : Data flow
Basic Usage
from petsard import Evaluator
# Privacy risk assessment
evaluator = Evaluator('anonymeter-singlingout')
evaluator.create()
eval_result = evaluator.eval({
'ori': train_data,
'syn': synthetic_data,
'control': test_data
})
privacy_risk = eval_result['global']
# Data quality assessment
evaluator = Evaluator('sdmetrics-qualityreport')
evaluator.create()
eval_result = evaluator.eval({
'ori': train_data,
'syn': synthetic_data
})
quality_score = eval_result['global']
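# Statistical difference comparison (a sketch: per the Notes section,
# the 'stats' method compares only ori and syn, so no control set)
evaluator = Evaluator('stats')
evaluator.create()
eval_result = evaluator.eval({
    'ori': train_data,
    'syn': synthetic_data
})
stats_score = eval_result['global']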
# Machine learning utility assessment (new version)
evaluator = Evaluator('mlutility', task_type='classification', target='income')
evaluator.create()
eval_result = evaluator.eval({
'ori': train_data,
'syn': synthetic_data,
'control': test_data
})
ml_utility = eval_result['global']
Constructor (__init__)
Initialize an evaluator instance.
Syntax
def __init__(
    method: str,
    **kwargs
)
Parameters
method : str, required
- Evaluation method name
- Required parameter
- Supported methods:
- Privacy Risk Assessment:
  - 'anonymeter-singlingout': Singling out risk
  - 'anonymeter-linkability': Linkability risk
  - 'anonymeter-inference': Inference risk
- Data Quality Assessment:
  - 'sdmetrics-diagnosticreport': Data diagnostic report
  - 'sdmetrics-qualityreport': Data quality report
- Machine Learning Utility Assessment (Legacy):
  - 'mlutility-classification': Classification utility (multiple models)
  - 'mlutility-regression': Regression utility (multiple models)
  - 'mlutility-cluster': Clustering utility (K-means)
- Machine Learning Utility Assessment (New, Recommended):
  - 'mlutility': Unified interface (requires the task_type parameter)
- Statistical Assessment:
  - 'stats': Statistical difference comparison
- Default Method:
  - 'default': Uses sdmetrics-qualityreport
- Custom Method:
  - 'custom_method': Custom evaluator (see the sketch after this list)
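For the custom method, the evaluator class is loaded at runtime from the module and class named in kwargs (see the Custom Method Parameters below). A minimal sketch; the file name my_evaluator.py and class name MyEvaluator are hypothetical:
from petsard import Evaluator
# Hypothetical custom evaluator: 'my_evaluator.py' and 'MyEvaluator'
# are illustrative names, not part of PETsARD
evaluator = Evaluator(
    'custom_method',
    module_path='my_evaluator.py',
    class_name='MyEvaluator',
)
evaluator.create()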
kwargs : dict, optional
- Additional parameters for specific evaluators
- May include, depending on the evaluation method (see the sketch after this list):
- MLUtility Parameters:
  - task_type: Task type ('classification', 'regression', 'clustering')
  - target: Target column name
  - experiment_design: Experiment design approach
  - resampling: Imbalanced data handling method
- Anonymeter Parameters:
  - n_attacks: Number of attack attempts
  - n_cols: Number of columns per query
  - secret: Column to be inferred (inference risk)
  - aux_cols: Auxiliary information columns (linkability risk)
- Custom Method Parameters:
  - module_path: Custom module path
  - class_name: Custom class name
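As an illustration of method-specific kwargs, the sketch below configures the unified MLUtility interface and a linkability assessment; the attack count and column names are hypothetical, not PETsARD defaults:
from petsard import Evaluator
# Unified MLUtility interface: task_type selects the task family
evaluator = Evaluator('mlutility', task_type='regression', target='income')
# Linkability risk: aux_cols groups the columns an attacker is assumed
# to know into two fragments (hypothetical column names and attack count)
evaluator = Evaluator(
    'anonymeter-linkability',
    n_attacks=2000,
    aux_cols=[
        ['age', 'zip_code'],
        ['education', 'occupation'],
    ],
)
evaluator.create()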
Return Value
- Evaluator
- Initialized evaluator instance
Usage Examples
from petsard import Evaluator
# Default evaluation
evaluator = Evaluator('default')
evaluator.create()
eval_result = evaluator.eval({
'ori': original_data,
'syn': synthetic_data
})
Supported Evaluation Types
Please refer to the PETsARD YAML documentation for details.
Notes
- Method Selection: Choose an evaluation method suited to your needs; different methods focus on different aspects
- Data Requirements: Different evaluation methods require different input data combinations
  - Anonymeter and MLUtility: Require three datasets: ori, syn, and control
  - SDMetrics and Stats: Require only two datasets: ori and syn
- Best Practice: Use YAML configuration files rather than calling the Python API directly
- Method Call Order: Must call create() before calling eval() (see the sketch after these notes)
- MLUtility Version: Prefer the new unified MLUtility interface (with task_type) over the legacy separate interfaces
- Documentation Note: This documentation is for internal development team reference only; backward compatibility is not guaranteed
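Restating the call order and data requirements above in one self-contained sketch (the toy DataFrames are illustrative):
import pandas as pd
from petsard import Evaluator
# create() must precede eval(); SDMetrics methods need only ori and syn
evaluator = Evaluator('sdmetrics-qualityreport')
evaluator.create()
result = evaluator.eval({
    'ori': pd.DataFrame({'age': [25, 32, 47], 'income': [30000, 52000, 61000]}),
    'syn': pd.DataFrame({'age': [27, 30, 45], 'income': [29000, 50000, 64000]}),
})
print(result['global'])  # overall quality score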