1. Stabilize Before Optimizing
- Verify hardware and management-plane integrity first.
- Confirm firmware/software baseline consistency.
- Only then make performance-tuning decisions.
Training / NCP-ADS
Module study guide
Priority 6 of 6 · Domain 1 in exam order
Scope
This module contains expanded study notes, practical drills, and an exam-style question set.
Exam Framework
Exam Scope Coverage
This module is aligned to Domain 1 scope: exploratory data analysis, summary statistics, probability distributions, hypothesis testing, correlation/covariance, and model evaluation metrics.
EDA questions test whether you can quickly profile data quality, distribution shape, and data issues before modeling.
Drill: Take one medium-size tabular dataset and produce a one-page EDA summary with missing-data, distribution, and relationship sections.
The exam checks whether you can choose statistics that match data shape and noise profile.
Drill: For the same feature, compare mean/std vs median/IQR and explain which pair you trust and why.
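The drill above can be sketched with synthetic data. This is a hedged illustration (a hypothetical heavy-tailed `amounts` feature, not an exam dataset): classical statistics (mean/std) chase the tail, while robust statistics (median/IQR) describe the bulk.

```python
import numpy as np

# Hypothetical skewed feature: mostly small values plus a heavy right tail.
rng = np.random.default_rng(0)
amounts = np.concatenate([rng.exponential(50, 990), rng.exponential(5000, 10)])

mean, std = amounts.mean(), amounts.std()
median = np.median(amounts)
q1, q3 = np.percentile(amounts, [25, 75])
iqr = q3 - q1

print(f"mean={mean:.1f} std={std:.1f}")
print(f"median={median:.1f} iqr={iqr:.1f}")
# With heavy tails, mean/std are pulled up by the few extreme values,
# while median/IQR stay close to the bulk of the data.
```

For this shape, median/IQR is the pair to trust: a handful of extreme values moves the mean and inflates the standard deviation without changing where most observations sit.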
Distribution assumptions drive the validity of tests, confidence statements, and many modeling decisions.
Drill: Assess two features, state plausible distributions, and justify with visual and summary evidence.
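A minimal sketch of the assessment step, using two synthetic features (hypothetical, not from the module's datasets): skewness and excess kurtosis give quick numeric evidence to pair with histograms.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
normal_like = rng.normal(100, 15, 5000)   # plausible Gaussian feature
skewed = rng.lognormal(3, 1, 5000)        # plausible log-normal feature

for name, x in [("normal_like", normal_like), ("skewed", skewed)]:
    print(name,
          "skew=", round(stats.skew(x), 2),
          "excess_kurtosis=", round(stats.kurtosis(x), 2))
# Near-zero skew and excess kurtosis support a normal assumption; large
# positive skew suggests a log-normal or other heavy-tailed candidate.
```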
You may be asked to pick appropriate tests and correctly interpret p-values and decision thresholds.
Drill: Run one two-sample test and one contingency-table test, then interpret statistical and practical significance separately.
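One possible shape for this drill, on synthetic inputs (the groups and the contingency counts below are hypothetical): Welch's t-test for the two-sample comparison and a chi-square test for the table, reading each p-value separately from the effect magnitude.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Two-sample comparison: Welch's t-test tolerates unequal variances.
a = rng.normal(10.0, 2.0, 400)
b = rng.normal(10.8, 3.0, 400)
t_stat, t_p = stats.ttest_ind(a, b, equal_var=False)
print("welch t p=", t_p, "mean_delta=", b.mean() - a.mean())

# Hypothetical contingency table: group (rows) vs outcome (cols).
table = np.array([[120, 30],
                  [ 90, 60]])
chi2, chi_p, dof, _ = stats.chi2_contingency(table)
print("chi2 p=", chi_p, "dof=", dof)
# Interpret statistical significance (the p-values) separately from
# practical significance (mean_delta, the difference in proportions).
```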
Correlation questions are common and frequently mixed with causality traps.
Drill: Compute covariance and correlation matrices and list one misleading interpretation you intentionally avoid.
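A sketch of the matrices and the trap, on synthetic columns (hypothetical names): `x_scaled` is the same signal as `x` in different units, so its covariance entries balloon while its correlation stays at 1.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 1000
x = rng.normal(0, 1, n)
df = pd.DataFrame({
    "x": x,
    "x_scaled": x * 1000,                 # same signal, different units
    "y": 0.5 * x + rng.normal(0, 1, n),   # genuinely related column
})

print(df.cov().round(1))    # unit-dependent: x_scaled entries dominate
print(df.corr().round(2))   # unit-free: corr(x, x_scaled) is ~1.0
# Misleading interpretation to avoid: "large covariance means strong
# relationship" -- covariance magnitude mostly reflects the units.
```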
Metric selection errors cause wrong model conclusions even when pipelines run correctly.
Drill: Given one imbalanced classification output, choose a metric set and justify tradeoffs for a production-like threshold.
Concept Explanations
Prioritize decisions in this order: safety and hardware integrity, baseline consistency, controlled validation, then optimization.
Treat every key action as evidence-producing: command, output, timestamp, and expected vs observed behavior.
Use a fixed order when reading any dataset: integrity checks, distribution checks, relationship checks, then metric-aligned conclusions.
Select tests using data type and assumptions first, then interpret p-value and effect size separately.
In imbalanced classification, threshold-aware metrics and error-cost framing are more reliable than accuracy.
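This point can be demonstrated with a synthetic worst case (hypothetical 2% positive rate, not exam data): a model that never predicts the minority class still scores high accuracy.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

rng = np.random.default_rng(4)
# Hypothetical 2% positive rate; the "model" predicts negative for everything.
y_true = (rng.random(10_000) < 0.02).astype(int)
y_pred = np.zeros_like(y_true)

print("accuracy=", accuracy_score(y_true, y_pred))   # near the base rate
print("recall=", recall_score(y_true, y_pred, zero_division=0))
print("precision=", precision_score(y_true, y_pred, zero_division=0))
# Accuracy near the majority-class rate with zero recall is the classic
# signal that threshold-aware, minority-focused metrics are required.
```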
Scenario Playbooks
A binary anomaly model shows strong validation accuracy, but production users report many missed anomalies. You need to re-evaluate analysis and metric strategy.
Architecture Diagram
[Raw Events] -> [EDA + Label Audit] -> [Train/Validation Split]
                                               |
                                     [Threshold + Metrics]
                                               |
                                     [Production Alerts]
Response Flow
Success Signals
Threshold sweep for confusion matrix
python - <<'PY'
from sklearn.metrics import confusion_matrix
import numpy as np
# assumes y_true (labels) and y_score (model scores) are defined in this script
for t in [0.2, 0.4, 0.6, 0.8]:
    y_pred = (y_score >= t).astype(int)
    print(t, confusion_matrix(y_true, y_pred).ravel())
PY
Expected output (example)
0.2 [tn fp fn tp]
0.4 [tn fp fn tp]
...
A feature shows high correlation with the target in one slice. You must decide whether it is stable enough for production decisions.
Architecture Diagram
[Feature Table] -> [Correlation Matrix] -> [Segment Stability Check]
                                        -> [Confounder Review]
                                        -> [Decision Memo]
Response Flow
Success Signals
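The segment stability check in this playbook can be sketched as follows, on synthetic data (the segment labels and the injected instability are hypothetical): compare the overall correlation against per-segment correlations.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 3000
seg = rng.choice(["A", "B", "C"], size=n)
x = rng.normal(0, 1, n)
# Hypothetical instability: the feature tracks the target only in segment A.
y = np.where(seg == "A", 0.9 * x, 0.0) + rng.normal(0, 1, n)
df = pd.DataFrame({"segment": seg, "feature": x, "target": y})

print("overall corr=", round(df["feature"].corr(df["target"]), 2))
for name, g in df.groupby("segment"):
    print(name, "corr=", round(g["feature"].corr(g["target"]), 2))
# An overall correlation driven by a single segment is a stability red
# flag: run the confounder review before relying on it in production.
```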
CLI and Commands
Run a fast quality and distribution baseline before selecting any statistical test.
Schema/null/duplicate profile
python - <<'PY'
import pandas as pd
df = pd.read_parquet('dataset.parquet')
print(df.shape)
print(df.isna().mean().sort_values(ascending=False).head(10))
print('duplicates=', df.duplicated().sum())
PY
Expected output (example)
(1200000, 42)
feature_a 0.182
feature_b 0.104
duplicates= 3912
Distribution and tail check
python - <<'PY'
import pandas as pd
df = pd.read_parquet('dataset.parquet')
print(df['amount'].describe(percentiles=[0.5,0.9,0.99]))
PY
Expected output (example)
count ...
50% 42.1
90% 381.3
99% 3210.7
Compare threshold-dependent and threshold-independent metrics for imbalanced labels.
PR-AUC and ROC-AUC check
python - <<'PY'
from sklearn.metrics import average_precision_score, roc_auc_score
# assumes y_true, y_score are defined in this script
print('pr_auc=', average_precision_score(y_true, y_score))
print('roc_auc=', roc_auc_score(y_true, y_score))
PY
Expected output (example)
pr_auc= 0.41
roc_auc= 0.89
Thresholded precision/recall/F1
python - <<'PY'
from sklearn.metrics import precision_recall_fscore_support
# assumes y_true, y_score are defined in this script
for t in [0.3, 0.5, 0.7]:
    y_pred = (y_score >= t).astype(int)
    p, r, f, _ = precision_recall_fscore_support(y_true, y_pred, average='binary')
    print(t, round(p, 3), round(r, 3), round(f, 3))
PY
Expected output (example)
0.3 0.22 0.81 0.346
0.5 0.37 0.58 0.452
0.7 0.61 0.34 0.437
Common Problems
Symptoms
Likely Cause
Split discipline was applied late, allowing future or target-adjacent information into training features.
Remediation
Prevention: Require a split and leakage checklist before approving model comparisons.
Symptoms
Likely Cause
Metric strategy over-relied on accuracy and default threshold, which masked minority-class failures.
Remediation
Prevention: Define primary metric family by class balance before any model training begins.
Lab Walkthroughs
Select and defend the right statistical test for two-group comparison.
Prerequisites
Profile both groups for shape and variance behavior.
python - <<'PY'
import pandas as pd
from scipy import stats
df=pd.read_parquet('dataset.parquet')
a=df[df.grp==0]['metric']
b=df[df.grp==1]['metric']
print(stats.describe(a))
print(stats.describe(b))
PY
Expected: You identify skew/outlier behavior and decide if parametric assumptions are acceptable.
Run chosen test and record p-value plus effect size.
python - <<'PY'
import pandas as pd
from scipy import stats
# reload the groups: each heredoc runs in a fresh Python process
df = pd.read_parquet('dataset.parquet')
a = df[df.grp == 0]['metric']
b = df[df.grp == 1]['metric']
stat, p = stats.ttest_ind(a, b, equal_var=False)
print('p=', p)
print('mean_delta=', b.mean() - a.mean())
PY
Expected: Decision report includes both significance and practical magnitude.
Write final decision statement with assumptions and caveats.
Expected: Statement clearly separates statistical evidence from operational recommendation.
Success Criteria
Choose a deployment threshold using confusion-matrix tradeoffs.
Prerequisites
Sweep thresholds and capture precision, recall, and F1.
python - <<'PY'
from sklearn.metrics import precision_recall_fscore_support
# assumes y_true, y_score are defined in this script
for t in [x / 10 for x in range(1, 10)]:
    y_pred = (y_score >= t).astype(int)
    p, r, f, _ = precision_recall_fscore_support(y_true, y_pred, average='binary')
    print(t, p, r, f)
PY
Expected: You obtain a threshold table that reveals tradeoff knees.
Pick candidate threshold and verify confusion matrix impact.
python - <<'PY'
from sklearn.metrics import confusion_matrix
# assumes y_true, y_score are defined in this script
t = 0.4
y_pred = (y_score >= t).astype(int)
print(confusion_matrix(y_true, y_pred))
PY
Expected: Chosen threshold aligns with acceptable miss rate and alert load.
Publish threshold recommendation with rationale.
Expected: Recommendation includes metric evidence plus cost-based reasoning.
Success Criteria
Study Sprint
| Day | Focus | Output |
|---|---|---|
| 1 | Dataset inventory and schema-quality audit (types, nulls, duplicates). | EDA setup notebook and baseline data-quality report. |
| 2 | Univariate analysis with robust and classical statistics. | Feature summary sheet with mean/median/IQR comparisons. |
| 3 | Bivariate analysis and relationship mapping. | Correlation map plus caveats and confounder notes. |
| 4 | Distribution checks and sampling assumptions. | Distribution-assumption log for key variables. |
| 5 | Hypothesis testing practice (parametric and nonparametric choices). | Decision table: test selection, p-values, effect-size interpretation. |
| 6 | Classification metric deep dive on imbalanced labels. | Metric dashboard with confusion-matrix and PR/ROC analysis. |
| 7 | Regression metric and residual diagnostics. | Residual-analysis memo and metric tradeoff summary. |
| 8 | Anomaly and threshold calibration exercise. | Threshold strategy note with false-positive cost assumptions. |
| 9 | Timed mini-case combining EDA and evaluation decisions. | End-to-end case writeup with defensible conclusions. |
| 10 | Final revision and weak-area drills. | Exam-day cheat sheet for statistics and metrics decisions. |
Hands-on Labs
Each lab includes a collapsed execution sample with representative CLI usage and expected output.
Produce a repeatable EDA checklist for tabular datasets under time pressure.
Sample Command (EDA integrity quick pass)
python - <<'PY'
import pandas as pd
df = pd.read_parquet('dataset.parquet')
print(df.shape)
print(df.isna().mean().sort_values(ascending=False).head(10))
print('duplicates=', df.duplicated().sum())
PY
Expected output (example)
(1200000, 42)
feature_a 0.182
feature_b 0.104
duplicates= 3912
Practice selecting and interpreting hypothesis tests correctly.
Sample Command (Distribution and tail check)
python - <<'PY'
import pandas as pd
df = pd.read_parquet('dataset.parquet')
print(df['amount'].describe(percentiles=[0.5,0.9,0.99]))
PY
Expected output (example)
count ...
50% 42.1
90% 381.3
99% 3210.7
Avoid common interpretation errors in covariance/correlation analysis.
Sample Command (Classification metric decision runbook)
python - <<'PY'
from sklearn.metrics import average_precision_score, roc_auc_score
print('pr_auc=', average_precision_score(y_true, y_score))
print('roc_auc=', roc_auc_score(y_true, y_score))
PY
Expected output (example)
pr_auc= 0.41
roc_auc= 0.89
Align evaluation metrics to business and deployment constraints.
Sample Command (Classification metric decision runbook)
python - <<'PY'
from sklearn.metrics import precision_recall_fscore_support
for t in [0.3, 0.5, 0.7]:
    y_pred = (y_score >= t).astype(int)
    p, r, f, _ = precision_recall_fscore_support(y_true, y_pred, average='binary')
    print(t, round(p, 3), round(r, 3), round(f, 3))
PY
Expected output (example)
0.3 0.22 0.81 0.346
0.5 0.37 0.58 0.452
0.7 0.61 0.34 0.437
Exam Pitfalls
Practice Set
Attempt each question first, then open the answer and explanation.
Answer: A
Precision and recall focus on minority-class performance and error tradeoffs that accuracy can hide.
Answer: B
Covariance reflects joint variation in original units, while correlation normalizes for comparability.
Answer: A
Statistical significance is a decision under assumptions; it does not guarantee large real-world effect.
Answer: B
Median is less affected by extreme values and often better reflects central tendency in skewed data.
Answer: B
ROC-AUC evaluates separability over threshold ranges, not one fixed decision threshold.
Answer: B
Clear hypotheses and assumption checks prevent invalid test selection and interpretation.
Answer: C
Correlation measures association strength and direction, but causal claims require additional evidence.
Answer: B
Using both reveals average error and sensitivity to large misses.
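That split can be shown numerically with a synthetic regression output (the values are hypothetical): a few large misses move RMSE far more than MAE.

```python
import numpy as np

rng = np.random.default_rng(6)
y_true = rng.normal(100, 10, 500)
# Hypothetical predictions: small errors everywhere, plus 5 large misses.
errors = rng.normal(0, 2, 500)
errors[:5] += 60
y_pred = y_true + errors

mae = np.mean(np.abs(y_true - y_pred))
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print("mae=", round(mae, 2), "rmse=", round(rmse, 2))
# RMSE squares errors before averaging, so the handful of large misses
# dominates it; reporting both exposes outlier sensitivity.
```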
Answer: B
Single aggregate summaries can mask subgroup patterns and lead to weak conclusions.
Answer: B
Operational thresholds should reflect business cost asymmetry and validated performance.