II.
Domain overview
Reference · live · domain:data-science
Data Science overview
Data Science applies statistical analysis, machine learning, and domain expertise to extract insights and build predictive models from data. It covers exploratory data analysis (pandas, R, Jupyter), feature engineering, classical ML (scikit-learn, XGBoost), deep learning (PyTorch, TensorFlow), experiment tracking, and the communication of findings through visualizations and reports. Data Science sits downstream of Data Engineering, which builds the pipelines that supply clean data, and upstream of MLOps, which puts models into production. Python and R are the dominant languages; SQL is essential for data access.
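The flow described above (SQL for data access, a quick exploratory summary, then a simple predictive model) can be sketched in a few lines. This is a minimal illustration, not a recommended stack: it uses only the Python standard library, with `sqlite3` standing in for a real warehouse and a hand-written least-squares fit standing in for pandas and scikit-learn. The `campaigns` table, its columns, and the numbers in `ROWS` are hypothetical and made up for the example.

```python
import sqlite3
from statistics import mean

# Hypothetical toy data: (ad_spend, revenue). Values are invented for illustration.
ROWS = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def load_rows(conn):
    """Data access: pull the modelling columns out of a SQL table."""
    query = "SELECT ad_spend, revenue FROM campaigns ORDER BY ad_spend"
    return conn.execute(query).fetchall()


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def run_pipeline():
    # An in-memory database stands in for the upstream data platform.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE campaigns (ad_spend REAL, revenue REAL)")
    conn.executemany("INSERT INTO campaigns VALUES (?, ?)", ROWS)
    rows = load_rows(conn)
    conn.close()

    xs = [row[0] for row in rows]
    ys = [row[1] for row in rows]

    # Exploratory step: a few summary statistics before any modelling.
    summary = {"n": len(rows), "mean_spend": mean(xs), "mean_revenue": mean(ys)}

    # Modelling step: fit revenue as a linear function of ad spend.
    slope, intercept = fit_line(xs, ys)
    return summary, slope, intercept


if __name__ == "__main__":
    summary, slope, intercept = run_pipeline()
    print(summary)
    print(f"revenue ~ {slope:.2f} * ad_spend + {intercept:.2f}")
```

In practice the same three steps would typically be a SQL query loaded into a pandas DataFrame, a notebook-based exploration, and a scikit-learn estimator; the sketch only shows how the stages hand data to one another.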
Attributes
displayName
Data Science
Outgoing edges
contains (9)
- specialization:data-engineering-analytics · Specialization
- topic:context-aware-retrieval · Topic · Context-Aware Retrieval
- topic:evidence-based-graph · Topic · Evidence-Based Graph
- topic:experiment-driven-development · Topic · Experiment-Driven Development
- topic:graph-clustering · Topic · Graph Clustering
- topic:graph-embedding · Topic · Graph Embedding
- topic:graph-provenance · Topic · Graph Provenance
- topic:sparse-retrieval · Topic · Sparse Retrieval
- topic:data-mesh · Topic · Data Mesh
Incoming edges
applies_to (77)
- benchmark:agentbench · Benchmark · AgentBench
- benchmark:ds1000 · Benchmark · DS-1000
- benchmark:mle-bench · Benchmark · MLE-bench
- language:julia · Language · Julia
- language:r-lang · Language · R
- skill-area:hyperparameter-tuning-experiment-management · SkillArea · Hyperparameter Tuning and Experiment Management
- skill-area:game-analytics-monetization · SkillArea · Game Analytics and Monetization
- skill-area:blockchain-analytics-explorer · SkillArea · Blockchain Analytics and Explorer Setup
- skill-area:machine-learning-frameworks · SkillArea · Machine Learning Frameworks
- skill-area:deep-learning-libraries · SkillArea · Deep Learning Libraries and Services
- skill-area:audio-processing · SkillArea · Audio Processing Libraries and Services
- skill-area:video-processing · SkillArea · Video Processing Libraries and Services
- skill-area:time-series-analysis · SkillArea · Time Series Analysis
- skill-area:visualization-testing · SkillArea · Visualization Testing
- skill-area:bias-fairness-analysis · SkillArea · Bias and Fairness Analysis
- skill-area:explainability-interpretation · SkillArea · Explainability and Interpretation
- skill-area:data-visualization · SkillArea · Data Visualization
- skill-area:natural-language-processing · SkillArea · Natural Language Processing
- skill-area:data-quality-testing · SkillArea · Data Quality Testing
- skill-area:analytics-tracking · SkillArea · Analytics and Tracking
- skill-area:data-preprocessing · SkillArea · Data Preprocessing
- skill-area:computer-vision · SkillArea · Computer Vision
- skill-area:model-validation-testing · SkillArea · Model Validation Testing
- skill-area:ab-testing-experimentation · SkillArea · A/B Testing and Experimentation
- skill-area:training-data-testing · SkillArea · Training Data Testing
- skill-area:bias-fairness-testing · SkillArea · Bias and Fairness Testing
- skill-area:model-explainability-testing · SkillArea · Model Explainability Testing
- skill-area:data-analysis · SkillArea · Data Analysis
- skill-area:statistical-analysis · SkillArea · Statistical Analysis
- skill-area:geospatial-data-analysis · SkillArea · Geospatial Data Analysis
- skill-area:statistical-testing · SkillArea · Statistical Testing
- skill-area:reproducibility-testing · SkillArea · Reproducibility Testing
- skill-area:machine-learning · SkillArea · Machine Learning
- skill-area:epidemiological-modeling · SkillArea · Epidemiological Modeling
- skill-area:employee-engagement-analytics · SkillArea · Employee Engagement Analytics
- skill-area:org-network-analysis · SkillArea · Organizational Network Analysis
- skill-area:synthetic-data-generation · SkillArea · Synthetic Data Generation
- skill-area:experiment-design-AB-testing · SkillArea · Experiment Design & A/B Testing
- skill-area:embedding-optimization · SkillArea · Embedding Optimization
- skill-area:a-b-testing · SkillArea · A/B Testing
- skill-area:product-analytics · SkillArea · Product Analytics
- stack-profile:data-lakehouse · StackProfile · Data Lakehouse Stack (Databricks, Spark, Delta Lake, dbt, Airflow)
- stack-profile:analytics-dashboard · StackProfile · Analytics Dashboard Stack (React, D3, Recharts, Python, FastAPI, Grafana)
- stack-profile:synthetic-data-generation · StackProfile · Synthetic Data Generation Stack (Python, PyTorch, FastAPI, PostgreSQL, S3)
- stack-profile:data-quality-governance · StackProfile · Data Quality / Governance Stack (Great Expectations, dbt, Airflow, PostgreSQL, Python)
- stack-profile:research-data-platform · StackProfile · Research Data Platform (Python, Jupyter, PostgreSQL, Boto3, FastAPI, React)
- stack-profile:ab-testing-platform · StackProfile · A/B Testing Platform (Python, PostgreSQL, Redis, React, FastAPI, Prometheus)
- stack-profile:julia-data-service · StackProfile · Julia Data Service (Julia, Python, PostgreSQL, Docker)
- stack-profile:python-ml-stack · StackProfile · Python ML Stack (NumPy, Pandas, scikit-learn)
- stack-profile:data-lake-stack · StackProfile · Data Lake Stack (Spark, Object Storage, Delta/Iceberg)
- topic:evidence-based-graph · Topic · Evidence-Based Graph
- topic:experiment-driven-development · Topic · Experiment-Driven Development
- topic:graph-provenance · Topic · Graph Provenance
- topic:embedding-pipeline · Topic · Embedding Pipeline
- topic:dense-retrieval · Topic · Dense Retrieval
- topic:sparse-retrieval · Topic · Sparse Retrieval
- topic:hybrid-retrieval · Topic · Hybrid Retrieval
- topic:re-ranking · Topic · Re-Ranking
- topic:graph-rag · Topic · Graph RAG
- topic:context-aware-retrieval · Topic · Context-Aware Retrieval
- topic:graph-embedding · Topic · Graph Embedding
- topic:graph-clustering · Topic · Graph Clustering
- skill:xlsx-handling · Skill · XLSX Handling
- skill:python-data-analysis · Skill · Python Data Analysis
- skill:csv-analysis · Skill · CSV Analysis
- role:data-scientist · Role · Data Scientist
- role:data-visualization-specialist · Role · Data Visualization Specialist
- role:staff-data-scientist · Role · Staff Data Scientist
- role:head-of-data · Role · Head of Data
- role:head-of-AI · Role · Head of AI
- role:business-intelligence-analyst · Role · Business Intelligence Analyst
- role:chief-data-officer · Role · Chief Data Officer
- term:eval-result · Term · EvalResult
- term:eval-run · Term · EvalRun
- term:large-language-model · Term · Large Language Model
- term:ontology-schema · Term · OntologySchema
- term:retrieval-augmented-generation · Term · Retrieval-Augmented Generation
applies_to_domain (44)
- workflow:ml-model-lifecycle · Workflow · ML Model Lifecycle
- workflow:crop-yield-forecasting · Workflow · Crop Yield Forecasting
- workflow:rag-pipeline-evaluation · Workflow · RAG Pipeline Evaluation
- workflow:synthetic-data-generation-pipeline · Workflow · Synthetic Data Generation Pipeline
- workflow:data-quality-scorecard-review · Workflow · Data Quality Scorecard Review
- workflow:self-serve-analytics-enablement · Workflow · Self-Serve Analytics Enablement
- workflow:privacy-impact-assessment · Workflow · Privacy Impact Assessment
- workflow:enterprise-data-platform-health-check · Workflow · Enterprise Data Platform Health Check
- workflow:ai-powered-product-feature-review · Workflow · AI-Powered Product Feature Review
- workflow:ad-hoc-analysis-request-cycle · Workflow · Ad-Hoc Analysis Request Cycle
- workflow:data-mesh-domain-ownership-review · Workflow · Data Mesh Domain Ownership Review
- workflow:cdc-pipeline-validation · Workflow · CDC Pipeline Validation
- workflow:data-catalog-maintenance · Workflow · Data Catalog Maintenance
- workflow:dbt-model-review · Workflow · dbt Model Review
- workflow:model-fairness-audit · Workflow · Model Fairness Audit
- workflow:model-explainability-review · Workflow · Model Explainability Review
- workflow:dataset-versioning-governance · Workflow · Dataset Versioning Governance
- workflow:data-pipeline-monitoring · Workflow · Data Pipeline Monitoring
- workflow:data-governance-review · Workflow · Data Governance Review
- workflow:data-warehouse-cost-optimization · Workflow · Data Warehouse Cost Optimization
- workflow:ml-model-versioning-governance · Workflow · ML Model Versioning Governance
- workflow:learning-outcomes-assessment · Workflow · Learning Outcomes Assessment
- workflow:renewable-energy-forecasting · Workflow · Renewable Energy Forecasting
- workflow:a-b-test-lifecycle · Workflow · A/B Test Lifecycle
- workflow:game-economy-balancing-review · Workflow · Game Economy Balancing Review
- workflow:game-analytics-instrumentation · Workflow · Game Analytics Instrumentation
- workflow:underwriting-model-validation · Workflow · Underwriting Model Validation
- workflow:hypothesis-driven-experiment · Workflow · Hypothesis-Driven Experiment
- workflow:model-training-cycle · Workflow · Model Training Cycle
- workflow:ab-experiment-lifecycle · Workflow · A/B Experiment Lifecycle
- workflow:feature-store-management · Workflow · Feature Store Management
- workflow:llm-eval-pipeline · Workflow · LLM Evaluation Pipeline
- workflow:hyperparameter-tuning-cycle · Workflow · Hyperparameter Tuning Cycle
- workflow:data-labeling-pipeline · Workflow · Data Labeling Pipeline
- workflow:model-card-maintenance · Workflow · Model Card Maintenance
- workflow:ml-experiment-tracking · Workflow · ML Experiment Tracking
- workflow:research-notebook-reproducibility · Workflow · Research Notebook Reproducibility
- workflow:experiment-reproducibility-review · Workflow · Experiment Reproducibility Review
- workflow:livestock-monitoring-pipeline · Workflow · Livestock Monitoring Pipeline
- workflow:fraud-detection-model-review · Workflow · Fraud Detection Model Review
- workflow:actuarial-model-validation · Workflow · Actuarial Model Validation
- workflow:legal-ai-bias-audit · Workflow · Legal AI Bias Audit
- workflow:data-pipeline-monitoring · Workflow · Data Pipeline Monitoring
- workflow:data-governance-review · Workflow · Data Governance Review
belongs_to_domain (1)
- topic:model-interpretability · Topic · Model Interpretability
deferred_for (1)
- deferred:evidence-experiments · DeferredNode · Evidence & Experiments
lib_applies_to_domain (80)
- tool-server:mcp-aws-bedrock · ToolServer · MCP AWS Bedrock
- tool-server:mcp-openai · ToolServer · OpenAI MCP Server
- tool-server:mcp-huggingface · ToolServer · Hugging Face MCP Server
- tool-server:mcp-replicate · ToolServer · Replicate MCP Server
- tool-server:mcp-jupyter · ToolServer · Jupyter MCP Server
- tool-server:mcp-neo4j · ToolServer · Neo4j MCP Server
- tool-server:mcp-pinecone · ToolServer · Pinecone MCP Server
- tool-server:mcp-weaviate · ToolServer · Weaviate MCP Server
- tool-server:mcp-qdrant · ToolServer · Qdrant MCP Server
- tool-server:mcp-haystack · ToolServer · Haystack MCP Server
- tool-server:mcp-qdrant · ToolServer · Qdrant MCP Server
- tool-server:mcp-weaviate · ToolServer · Weaviate MCP Server
- tool-server:mcp-pinecone · ToolServer · Pinecone MCP Server
- tool-server:mcp-chromadb · ToolServer · ChromaDB MCP Server
- tool-server:mcp-milvus · ToolServer · Milvus MCP Server
- lib-agent:data-science-ml--ab-test-analyst · LibraryAgent · ab-test-analyst
- lib-agent:data-science-ml--automl-orchestrator · LibraryAgent · automl-orchestrator
- lib-agent:data-science-ml--data-engineer · LibraryAgent · data-engineer
- lib-agent:data-science-ml--deployment-engineer · LibraryAgent · deployment-engineer
- lib-agent:data-science-ml--distributed-training-engineer · LibraryAgent · distributed-training-engineer
- lib-agent:data-science-ml--drift-detective · LibraryAgent · drift-detective
- lib-agent:data-science-ml--eda-analyst · LibraryAgent · eda-analyst
- lib-agent:data-science-ml--experiment-designer · LibraryAgent · experiment-designer
- lib-agent:data-science-ml--explainability-analyst · LibraryAgent · explainability-analyst
- lib-agent:data-science-ml--feature-engineer · LibraryAgent · feature-engineer
- lib-agent:data-science-ml--feature-store-engineer · LibraryAgent · feature-store-engineer
- lib-agent:data-science-ml--incident-responder · LibraryAgent · incident-responder
- lib-agent:data-science-ml--integration-tester · LibraryAgent · integration-tester
- lib-agent:data-science-ml--ml-architect · LibraryAgent · ml-architect
- lib-agent:data-science-ml--ml-requirements-analyst · LibraryAgent · ml-requirements-analyst
- lib-agent:data-science-ml--model-evaluator · LibraryAgent · model-evaluator
- lib-agent:data-science-ml--model-trainer · LibraryAgent · model-trainer
- lib-agent:data-science-ml--retraining-orchestrator · LibraryAgent · retraining-orchestrator
- lib-process:data-science-ml--ab-testing-ml · LibraryProcess · ab-testing-ml
- lib-process:data-science-ml--automl-pipeline · LibraryProcess · automl-pipeline
- lib-process:data-science-ml--data-collection-validation · LibraryProcess · data-collection-validation
- lib-process:data-science-ml--distributed-training · LibraryProcess · distributed-training
- lib-process:data-science-ml--eda-pipeline · LibraryProcess · eda-pipeline
- lib-process:data-science-ml--experiment-planning · LibraryProcess · experiment-planning
- lib-process:data-science-ml--feature-engineering · LibraryProcess · feature-engineering
- lib-process:data-science-ml--feature-store · LibraryProcess · feature-store
- lib-process:data-science-ml--ml-architecture-design · LibraryProcess · ml-architecture-design
- lib-process:data-science-ml--ml-integration-testing · LibraryProcess · ml-integration-testing
- lib-process:data-science-ml--ml-observability · LibraryProcess · ml-observability
- lib-process:data-science-ml--ml-project-scoping · LibraryProcess · ml-project-scoping
- lib-process:data-science-ml--model-deployment-canary · LibraryProcess · model-deployment-canary
- lib-process:data-science-ml--model-evaluation · LibraryProcess · model-evaluation
- lib-process:data-science-ml--model-interpretability · LibraryProcess · model-interpretability
- lib-process:data-science-ml--model-monitoring-drift · LibraryProcess · model-monitoring-drift
- lib-process:data-science-ml--model-retraining · LibraryProcess · model-retraining
- lib-process:data-science-ml--model-training-pipeline · LibraryProcess · model-training-pipeline
- lib-skill:data-science-ml--alibi-explainer · LibrarySkill · alibi-explainer
- lib-skill:data-science-ml--arize-observability · LibrarySkill · arize-observability
- lib-skill:data-science-ml--bentoml-model-packager · LibrarySkill · bentoml-model-packager
- lib-skill:data-science-ml--dvc-dataset-versioning · LibrarySkill · dvc-dataset-versioning
- lib-skill:data-science-ml--evidently-drift-detector · LibrarySkill · evidently-drift-detector
- lib-skill:data-science-ml--fairlearn-bias-detector · LibrarySkill · fairlearn-bias-detector
- lib-skill:data-science-ml--feast-feature-store · LibrarySkill · feast-feature-store
- lib-skill:data-science-ml--great-expectations-validator · LibrarySkill · great-expectations-validator
- lib-skill:data-science-ml--jupyter-notebook-executor · LibrarySkill · jupyter-notebook-executor
- lib-skill:data-science-ml--kubeflow-pipeline-executor · LibrarySkill · kubeflow-pipeline-executor
- lib-skill:data-science-ml--lime-explainer · LibrarySkill · lime-explainer
- lib-skill:data-science-ml--mlflow-experiment-tracker · LibrarySkill · mlflow-experiment-tracker
- lib-skill:data-science-ml--model-card-generator · LibrarySkill · model-card-generator
- lib-skill:data-science-ml--optuna-hyperparameter-tuner · LibrarySkill · optuna-hyperparameter-tuner
- lib-skill:data-science-ml--pandas-dataframe-analyzer · LibrarySkill · pandas-dataframe-analyzer
- lib-skill:data-science-ml--pytest-ml-tester · LibrarySkill · pytest-ml-tester
- lib-skill:data-science-ml--pytorch-trainer · LibrarySkill · pytorch-trainer
- lib-skill:data-science-ml--ray-distributed-trainer · LibrarySkill · ray-distributed-trainer
- lib-skill:data-science-ml--reproducibility-testing · LibrarySkill · reproducibility-testing
- lib-skill:data-science-ml--seldon-model-deployer · LibrarySkill · seldon-model-deployer
- lib-skill:data-science-ml--shap-explainer · LibrarySkill · shap-explainer
- lib-skill:data-science-ml--sklearn-model-trainer · LibrarySkill · sklearn-model-trainer
- lib-skill:data-science-ml--statistical-testing · LibrarySkill · statistical-testing
- lib-skill:data-science-ml--tensorflow-trainer · LibrarySkill · tensorflow-trainer
- lib-skill:data-science-ml--time-series-analysis · LibrarySkill · time-series-analysis
- lib-skill:data-science-ml--training-data-testing · LibrarySkill · training-data-testing
- lib-skill:data-science-ml--visualization-testing · LibrarySkill · visualization-testing
- lib-skill:data-science-ml--wandb-experiment-tracker · LibrarySkill · wandb-experiment-tracker
- lib-skill:data-science-ml--whylabs-monitor · LibrarySkill · whylabs-monitor
requires_skill (10)
- role:marketing-manager · Role · Marketing Manager
- role:data-scientist · Role · Data Scientist
- role:data-analyst · Role · Data Analyst
- role:bi-developer · Role · BI Developer
- role:research-scientist · Role · Research Scientist
- role:ml-engineer · Role · Machine Learning Engineer
- role:ux-researcher · Role · UX Researcher
- role:product-analyst · Role · Product Analyst
- role:growth-pm · Role · Growth Product Manager
- role:people-analytics-specialist · Role · People Analytics Specialist
specializes (2)
- specialization:data-science-ml · Specialization
- specialization:data-engineering-analytics · Specialization