Why Not Just Run p Separate t-Tests?
Before we compute anything, we need to build the right intuition. This module shows you the core problem that Hotelling's T² was designed to solve.
The Family-Wise Error Rate
Suppose we have p = 3 variables and test each independently at α = 0.05. What is the probability that we make at least one false rejection, even if H₀ is true for all variables?
If H₀ is true for all three variables, each test avoids a false rejection with probability 1 − 0.05 = 0.95, so the probability of no false rejection anywhere is 0.95³ ≈ 0.857, and P(at least one false rejection) = 1 − 0.95³ ≈ 0.143. That's a 14.3% error rate, nearly 3× our intended α. As p grows, this compounds. Hotelling's T² tests all variables simultaneously, controlling the overall error rate precisely at α.
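The compounding above is easy to verify numerically. A minimal sketch of the family-wise error rate formula 1 − (1 − α)ᵖ for independent tests:

```python
# Family-wise error rate for p independent tests, each at level alpha.
# If H0 is true everywhere, each test falsely rejects with probability alpha,
# so P(at least one false rejection) = 1 - (1 - alpha)**p.
def fwer(p, alpha=0.05):
    return 1 - (1 - alpha) ** p

for p in (1, 3, 10):
    print(f"p = {p:2d}: FWER = {fwer(p):.3f}")
# p = 3 gives 0.143; by p = 10 the rate already exceeds 0.40.
```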
Configure & Make Your Prediction
Set your experimental parameters. Then — before seeing any data — make a prediction. This activates your prior knowledge and creates a meaningful moment of comparison later.
Explore Your Dataset
Before computing, develop visual intuition about the data. The goal is to form an informed guess about whether Ȳ differs meaningfully from μ₀.
Scatter Plots: Observations vs. μ₀
The dashed horizontal line marks μ₀ᵢ. Does the cloud of points appear centered on μ₀?
Compute the Sample Mean Vector Ȳ
The sample mean is your best estimate of the true population mean. The key question is: how far does Ȳ sit from μ₀?
The sample mean ȳᵢ is the arithmetic average of all n observations for variable i. As n grows, ȳᵢ converges to the true μᵢ (Law of Large Numbers). It is an unbiased estimator: E[ȳᵢ] = μᵢ.
Suppose y₁ = [4.2, 3.8, 4.6]. Then:
ȳ₁ = (4.2 + 3.8 + 4.6) / 3 = 12.6 / 3 = 4.2
If μ₀₁ = 4.0, then ȳ₁ − μ₀₁ = 0.2. Is that large? Depends on the variance!
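The worked example can be checked in a couple of lines (same numbers as above, with μ₀₁ = 4.0):

```python
# Recompute the worked example: sample mean of variable 1 and its
# gap from the hypothesized mean mu_01.
y1 = [4.2, 3.8, 4.6]        # the three observations from the text
mu_01 = 4.0                 # hypothesized mean for variable 1
ybar1 = sum(y1) / len(y1)   # arithmetic average
print(f"ybar1 = {ybar1:.2f}, gap = {ybar1 - mu_01:.2f}")
# → ybar1 = 4.20, gap = 0.20
```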
Compute the Sample Covariance Matrix S
Variance tells us how spread out the data is — a key context for judging whether deviations from μ₀ are surprising. A large variance means the same gap is less evidence against H₀.
Dividing by (n−1) rather than n gives us an unbiased estimator of σᵢ². Using n would systematically underestimate the true population variance because we've already "used" the data to estimate ȳᵢ — consuming one degree of freedom.
S = diag(s₁, s₂, ..., sₚ) ← diagonal because variables are independent
y₁ = [4.2, 3.8, 4.6], ȳ₁ = 4.2
Deviations: (4.2−4.2)² = 0, (3.8−4.2)² = 0.16, (4.6−4.2)² = 0.16
s₁ = (0 + 0.16 + 0.16) / (3−1) = 0.32 / 2 = 0.16
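The same calculation in code, making the n−1 divisor explicit (this matches `numpy.var(y1, ddof=1)`):

```python
# Unbiased sample variance for variable 1: sum of squared deviations
# from the sample mean, divided by n - 1 (not n).
y1 = [4.2, 3.8, 4.6]
n = len(y1)
ybar1 = sum(y1) / n
s1 = sum((y - ybar1) ** 2 for y in y1) / (n - 1)
print(f"s1 = {s1:.2f}")
# → s1 = 0.16
```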
Compute the T² Statistic
T² combines everything: how far the sample mean deviates from μ₀, scaled by the uncertainty in each variable, amplified by sample size.
T² is proportional to the squared Mahalanobis distance from Ȳ to μ₀. Unlike Euclidean distance, it accounts for the "scale" of each variable via S⁻¹. A large T² means Ȳ is many standard deviations away from μ₀ — jointly.
For diagonal S: T² = n · Σᵢ (ȳᵢ − μ₀ᵢ)² / sᵢ
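The diagonal-S formula above is a one-line sum. A sketch with illustrative numbers (the means, variances, and μ₀ below are assumptions for demonstration, not the module's dataset):

```python
# Hotelling's T^2 in the diagonal-S case:
#   T^2 = n * sum_i (ybar_i - mu0_i)^2 / s_i
n    = 3
ybar = [4.2, 5.1, 3.9]     # sample means (illustrative)
mu0  = [4.0, 5.0, 4.0]     # hypothesized means (illustrative)
s    = [0.16, 0.25, 0.09]  # sample variances (illustrative)

t2 = n * sum((yb - m) ** 2 / v for yb, m, v in zip(ybar, mu0, s))
print(f"T^2 = {t2:.3f}")
```

Note how each squared gap is scaled by its own variance before summing: a 0.1 gap counts for more on a low-variance variable than on a high-variance one.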
Look Up the Critical Value T²p,α
The critical value is the threshold T² must exceed to reject H₀. It depends on the number of variables p, the degrees of freedom df = n−1, and your chosen α. The highlighted row and cell are your target.
Under H₀, T² follows a distribution related to the F-distribution. The critical value T²p,α is chosen so that P(T² > T²p,α | H₀) = α. Rejecting when T² > T²p,α means we have only an α chance of being wrong.
Degrees of freedom = n − 1 = —. Significance level α = —.
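The table lookup can be reproduced numerically. A sketch assuming SciPy is available, using the standard identity that under H₀, T² is distributed as p(n−1)/(n−p) times an F(p, n−p) variable:

```python
from scipy.stats import f  # F-distribution quantiles; assumes SciPy

def t2_critical(p, n, alpha=0.05):
    """Critical value of Hotelling's T^2 via the F-distribution:
    under H0, T^2 ~ (p*(n-1)/(n-p)) * F(p, n-p)."""
    return p * (n - 1) / (n - p) * f.ppf(1 - alpha, p, n - p)

print(f"T^2 critical (p=3, n=10, alpha=0.05) = {t2_critical(3, 10):.2f}")
```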
The Decision & Reflection
The numbers are in. Now comes the most important step: making sense of what they mean.
What if α were different?
The same T² statistic can lead to different conclusions depending on α. Observe how the decision changes:
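One way to see the flip is to hold a single observed T² fixed and sweep α. A sketch with an assumed observed value of 12.5 (hypothetical, chosen to land between typical critical values; assumes SciPy):

```python
from scipy.stats import f  # assumes SciPy is available

def t2_crit(p, n, alpha):
    # Hotelling T^2 critical value via the F-distribution relation
    return p * (n - 1) / (n - p) * f.ppf(1 - alpha, p, n - p)

t2_obs, p, n = 12.5, 3, 10   # hypothetical observed statistic
for alpha in (0.10, 0.05, 0.01):
    crit = t2_crit(p, n, alpha)
    decision = "reject H0" if t2_obs > crit else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: critical = {crit:6.2f} -> {decision}")
```

Smaller α pushes the critical value up, so the same evidence can cross the threshold at α = 0.10 yet fall short at α = 0.01.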
Where does T² fall in the null distribution?
The best way to consolidate statistical understanding is to explain what happened — without formulas.
Sensitivity Analysis:
Can You Flip the Decision?
The original decision was: —. Your challenge: change exactly one observation to reverse it. This reveals how influential individual data points can be.
Changing a single observation affects both ȳᵢ (the sample mean for variable i) and sᵢ (the variance). The effect on T² works through both: a shift in ȳᵢ changes the numerator (ȳᵢ − μ₀ᵢ)², while a change in sᵢ changes the denominator. These effects can compound or cancel.
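The double effect through ȳᵢ and sᵢ is concrete in the single-variable case. A sketch using the worked numbers from this module; the edit 3.8 → 6.0 is an illustrative choice, not part of the original dataset:

```python
# Single-variable T^2 = n * (ybar - mu0)^2 / s, where s uses ddof = 1.
def t2_one_var(y, mu0):
    n = len(y)
    ybar = sum(y) / n
    s = sum((v - ybar) ** 2 for v in y) / (n - 1)
    return n * (ybar - mu0) ** 2 / s

mu0 = 4.0
original = [4.2, 3.8, 4.6]
edited   = [4.2, 6.0, 4.6]   # one observation changed: 3.8 -> 6.0
print(f"T^2 before = {t2_one_var(original, mu0):.2f}")
print(f"T^2 after  = {t2_one_var(edited, mu0):.2f}")
```

Here the edit inflates both the numerator (the mean moves away from μ₀) and the denominator (the variance grows); in this case the numerator wins and T² rises, but a different edit could make the effects cancel.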