LibraryProcess overview
Reference · livelib-process:data-science-ml--model-evaluation
model-evaluation overview
Model Evaluation and Validation Framework - Comprehensive model assessment across multiple dimensions including performance metrics, robustness testing, fairness analysis, explainability, and production readiness checks with iterative validation loops and quality gates.
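The fairness analysis mentioned above slices a test set by sensitive attributes and compares a metric per group. As a minimal sketch of that idea, assuming simple record objects with `actual`/`predicted` fields (the function names `groupAccuracy` and `maxDisparity` are illustrative, not part of this framework's API):

```javascript
// Hypothetical sketch: per-group accuracy disparity, the kind of check a
// fairness analysis over attributes like 'age_group' or 'gender' performs.
function groupAccuracy(records, attribute) {
  // records: [{ [attribute]: group, actual, predicted, ... }]
  const totals = {};
  for (const r of records) {
    const g = r[attribute];
    totals[g] = totals[g] || { correct: 0, n: 0 };
    totals[g].n += 1;
    if (r.actual === r.predicted) totals[g].correct += 1;
  }
  const acc = {};
  for (const [g, t] of Object.entries(totals)) acc[g] = t.correct / t.n;
  return acc;
}

// Largest gap in the metric between any two groups.
function maxDisparity(accByGroup) {
  const values = Object.values(accByGroup);
  return Math.max(...values) - Math.min(...values);
}

// Toy example: group 'a' is always right, group 'b' half the time.
const records = [
  { gender: 'a', actual: 1, predicted: 1 },
  { gender: 'a', actual: 0, predicted: 0 },
  { gender: 'b', actual: 1, predicted: 0 },
  { gender: 'b', actual: 0, predicted: 0 },
];
const acc = groupAccuracy(records, 'gender');
const gap = maxDisparity(acc); // 1.0 - 0.5 = 0.5
```

A real run would apply the same comparison to each attribute listed in `fairnessAttributes` and flag groups whose gap exceeds a tolerance.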
Attributes
displayName
model-evaluation
description
Model Evaluation and Validation Framework - Comprehensive model assessment across multiple dimensions
including performance metrics, robustness testing, fairness analysis, explainability, and production readiness checks
with iterative validation loops and quality gates.
libraryPath
library/specializations/data-science-ml/model-evaluation.js
specialization
data-science-ml
references
- Google ML Testing: https://developers.google.com/machine-learning/testing-debugging
- Model Cards: https://arxiv.org/abs/1810.03993
- Fairness Indicators: https://www.tensorflow.org/responsible_ai/fairness_indicators/guide
- SHAP (SHapley Additive exPlanations): https://github.com/slundberg/shap
- Model Validation Best Practices: https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
example
const result = await orchestrate('specializations/data-science-ml/model-evaluation', {
  modelPath: 'models/trained/churn-predictor-v2.pkl',
  testDataPath: 'data/test/churn_test.csv',
  modelType: 'classification',
  targetMetrics: { accuracy: 0.85, f1_score: 0.80, auc_roc: 0.88 },
  validationLevel: 'comprehensive',
  fairnessAttributes: ['age_group', 'gender', 'region'],
  explainabilityRequired: true
});
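The `targetMetrics` in the call above act as quality gates: each evaluated metric must meet its threshold before the model passes validation. A minimal sketch of that gate logic, assuming the evaluation exposes a flat metrics object (the result shape and the `checkQualityGate` helper are assumptions, not the documented return value of `orchestrate`):

```javascript
// Hypothetical sketch: compare evaluated metrics against targetMetrics
// thresholds and collect any failures, as a validation quality gate would.
function checkQualityGate(metrics, targetMetrics) {
  const failures = [];
  for (const [name, threshold] of Object.entries(targetMetrics)) {
    const actual = metrics[name];
    if (actual === undefined || actual < threshold) {
      failures.push({ metric: name, actual, required: threshold });
    }
  }
  return { passed: failures.length === 0, failures };
}

// Example using the same thresholds as the call above; here f1_score
// falls short, so the gate fails and reports exactly one failure.
const gate = checkQualityGate(
  { accuracy: 0.87, f1_score: 0.78, auc_roc: 0.90 },
  { accuracy: 0.85, f1_score: 0.80, auc_roc: 0.88 }
);
// gate.passed === false; gate.failures[0].metric === 'f1_score'
```

In an iterative validation loop, a failed gate would feed the failure list back into retraining or threshold review rather than promoting the model.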
usesAgents
- general-purpose
Outgoing edges
lib_applies_to_domain1
- domain:data-science · Domain · Data Science
lib_belongs_to_specialization1
- specialization:data-science-ml · Specialization
lib_implements_workflow1
- workflow:data-pipeline-deployment · Workflow · Data Pipeline Deployment
lib_involves_role1
- role:data-scientist · Role · Data Scientist
Incoming edges
None.