Lecture 4: Inferential Statistics

Note

This note was transcribed by Claude.

Overview

Lecture 4 (03.03.2026) focused on inferential statistics — how to determine whether observed differences between groups are statistically significant. The lecturer described this as the most difficult session in the course. The lecture built directly on Lecture 3, which covered descriptive statistics and the Shapiro-Wilk normality test.

Recap from Lecture 3

Students learned descriptive statistics and applied the Shapiro-Wilk normality test in Jamovi.
Key threshold: p = 0.05
- Shapiro-Wilk p-value > 0.05 = no significant departure from normality; treat the data as normally distributed (parametric)
- Shapiro-Wilk p-value < 0.05 = significant departure from normality; treat the data as non-normally distributed (non-parametric)

Core Concept: Why Inferential Statistics?

Running example: comparing Benfica’s expected goals (xG) against opposition xG across a season.

Descriptive statistics can show one group has a higher average than another.
However, a higher mean alone does not prove the difference is meaningful.
Inferential statistics tests whether the difference is statistically significant, i.e., unlikely to have arisen by chance alone if there were no real difference between the groups.
Significance threshold: p < 0.05.
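To see why a gap between two means is not enough on its own, the sketch below computes Student's t statistic by hand for two pairs of samples that share the same mean difference of 0.5 but differ in spread. The numbers are invented for illustration (they are not real xG data), and the function name `student_t` is an assumption of this sketch. In the course, Jamovi performs this calculation and also reports the p-value.

```python
import math
import statistics

def student_t(a, b):
    """Student's (pooled-variance) t statistic for two independent samples."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / std_err

# Invented values: in both pairs the means are 2.0 and 1.5.
tight_a = [1.9, 2.0, 2.1, 2.0, 2.0]
tight_b = [1.4, 1.5, 1.6, 1.5, 1.5]
noisy_a = [0.5, 3.5, 2.0, 1.0, 3.0]
noisy_b = [0.2, 2.8, 1.5, 0.5, 2.5]

t_tight = student_t(tight_a, tight_b)
t_noisy = student_t(noisy_a, noisy_b)
print(f"tight samples: t = {t_tight:.2f}")
print(f"noisy samples: t = {t_noisy:.2f}")
```

The mean difference is identical in both cases, but the t statistic is much larger for the tightly clustered samples. That is the sense in which the spread of the data, and not only the means, decides whether a difference is significant.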

Three Types of Inferential Analysis

Differences (comparing means): e.g., comparing Benfica xG vs. opposition xG. Main focus of this lecture.
Correlations: e.g., testing whether more passes are associated with higher xG. Coefficients range from -1 to 1: close to 1 = strong positive relationship, close to 0 = no linear relationship, close to -1 = strong negative (inverse) relationship.
Categories (frequencies): e.g., counting whether a team performs more out-swing or in-swing corners.
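For the correlation case, a minimal sketch of how a coefficient behaves at the two extremes is shown below. It uses Pearson's r, which is one common coefficient; the lecture did not say which coefficient is used, so that choice is an assumption. The function name `pearson_r` and all numbers are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either sequence is constant.
    return cov / math.sqrt(var_x * var_y)

# Invented numbers, for illustration only.
passes = [300, 400, 500, 600]
xg_rising = [0.8, 1.2, 1.6, 2.0]   # rises in step with passes
xg_falling = [2.0, 1.6, 1.2, 0.8]  # falls as passes rise

print(pearson_r(passes, xg_rising))   # close to 1: strong positive
print(pearson_r(passes, xg_falling))  # close to -1: strong negative
```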

The Statistical Decision Tree

The central framework of the lecture. Three questions guide which test to use:

Question 1: Same subject or different subjects?

Same subject (dependent/paired): Comparing the same entity under different conditions.
- Benfica at home vs. Benfica away
- A player’s performance in league vs. cup
- Benfica in Champions League vs. domestic league
Different subjects (independent): Comparing different entities.
- Benfica vs. Porto
- Benfica xG vs. opposition xG
- Messi vs. Ronaldo vs. Neymar

Question 2: How many groups?

Two groups = t-test variant
Three or more groups = ANOVA variant

Question 3: Parametric or non-parametric?

Determined by running the Shapiro-Wilk test on each group. Important rule: if comparing two groups and one is normal but the other is non-normal, treat the overall comparison as non-parametric.
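The rule above can be sketched as a small check on the Shapiro-Wilk p-values that Jamovi reports. The function names are hypothetical, and applying the "any non-normal group" rule to three or more groups is an assumption of this sketch; the lecture stated it for two groups.

```python
ALPHA = 0.05  # threshold used in the lecture

def is_normal(shapiro_p, alpha=ALPHA):
    """True when the Shapiro-Wilk p-value is above the threshold."""
    return shapiro_p > alpha

def comparison_is_parametric(shapiro_p_values, alpha=ALPHA):
    """Parametric only if every group passes the normality check."""
    return all(is_normal(p, alpha) for p in shapiro_p_values)

# Invented p-values, for illustration only.
print(comparison_is_parametric([0.21, 0.48]))  # both groups normal
print(comparison_is_parametric([0.21, 0.03]))  # one group non-normal
```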

Complete Decision Tree

Two Groups

Subject type	Distribution	Test
Different subjects (independent)	Parametric	Independent samples t-test (Student’s)
Different subjects (independent)	Non-parametric	Mann-Whitney U test
Same subject (dependent/paired)	Parametric	Paired samples t-test
Same subject (dependent/paired)	Non-parametric	Wilcoxon signed-rank test

Three or More Groups

Subject type	Distribution	Test
Different subjects (independent)	Parametric	One-way ANOVA
Different subjects (independent)	Non-parametric	Kruskal-Wallis H test
Same subject (dependent/paired)	Parametric	Repeated measures ANOVA
Same subject (dependent/paired)	Non-parametric	Friedman test

Examples Used to Illustrate

Scenario	Subject type	Groups	Example test
Benfica xG vs. opposition xG	Different	2	Mann-Whitney U (one normal, one non-normal)
Benfica home vs. away	Same	2	Paired t-test or Wilcoxon
Benfica win vs. draw vs. loss	Same	3	Repeated measures ANOVA or Friedman
Benfica vs. Porto vs. Sporting	Different	3	One-way ANOVA or Kruskal-Wallis H
Benfica home vs. Porto away	Different	2	Independent t-test or Mann-Whitney U
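The three questions and the two decision tables can be sketched as a single lookup. The function name `choose_test` and its parameters are hypothetical; the test names are taken from the tables above.

```python
# Key: (paired, three_or_more_groups, parametric) -> test name
TESTS = {
    (False, False, True): "Independent samples t-test (Student's)",
    (False, False, False): "Mann-Whitney U test",
    (True, False, True): "Paired samples t-test",
    (True, False, False): "Wilcoxon signed-rank test",
    (False, True, True): "One-way ANOVA",
    (False, True, False): "Kruskal-Wallis H test",
    (True, True, True): "Repeated measures ANOVA",
    (True, True, False): "Friedman test",
}

def choose_test(paired, n_groups, parametric):
    """Answer the three decision-tree questions and return the test name."""
    if n_groups < 2:
        raise ValueError("a comparison needs at least two groups")
    return TESTS[(paired, n_groups >= 3, parametric)]

# Benfica xG vs. opposition xG: different subjects, 2 groups, non-parametric.
print(choose_test(paired=False, n_groups=2, parametric=False))
# Benfica home vs. away, assuming both conditions pass the normality check.
print(choose_test(paired=True, n_groups=2, parametric=True))
```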

Data Organization in Excel

For same-subject (paired) comparisons

Data organized side by side in columns (e.g., Column A = home values, Column B = away values)
Each row = a matched pair (1st home game with 1st away game, etc.)
Number of observations in each column must be equal
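As a rough illustration of the paired layout, the sketch below builds the side-by-side rows and rejects columns of unequal length. The function name `make_pairs` and the possession values are invented for this example.

```python
def make_pairs(home, away):
    """Pair the i-th home value with the i-th away value (side-by-side layout)."""
    if len(home) != len(away):
        raise ValueError("paired columns must have the same number of observations")
    return list(zip(home, away))

# Invented possession percentages, for illustration only.
home = [58, 61, 55]
away = [52, 57, 54]

rows = make_pairs(home, away)
differences = [h - a for h, a in rows]  # paired tests work on these differences
print(rows)
print(differences)
```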

For different-subject (independent) comparisons

Data in a single column with all values stacked
A grouping variable column identifies which group each value belongs to (e.g., “Benfica” or “Opposition”)
In Jamovi, use the “Split by” function for descriptive statistics
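The stacked layout for independent groups can be sketched the same way: one value column plus one grouping column. The helper names `stack_groups` and `split_by` are hypothetical (the second only mimics what a split by group does conceptually), and the xG values are invented. Unlike the paired layout, the groups do not need the same number of observations.

```python
def stack_groups(groups):
    """Turn {label: values} into (group, value) rows: one grouping column, one value column."""
    return [(label, value) for label, values in groups.items() for value in values]

def split_by(rows):
    """Collect the stacked values back into one list per group label."""
    result = {}
    for label, value in rows:
        result.setdefault(label, []).append(value)
    return result

# Invented xG values, for illustration only.
stacked = stack_groups({"Benfica": [2.1, 1.8, 2.4], "Opposition": [0.9, 1.3]})
for row in stacked:
    print(row)
print(split_by(stacked))
```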

Practical data preparation steps

Start with full dataset in Excel
Filter to relevant competition (e.g., league only)
Filter to relevant team
Add column for condition variable (e.g., “Home” / “Away”)
Keep only the dependent variable column(s) needed
Reorganize into appropriate format
Remove all filters before importing to Jamovi
Delete extraneous columns
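In the course these steps are done by hand in Excel. Purely as an illustration of the same logic, the sketch below filters a small in-memory CSV, labels each row as Home or Away, and keeps only the dependent variable. The column names (`competition`, `team`, `venue_code`, `possession`), the function name `prepare`, and every value are assumptions invented for this example, not the structure of the actual course dataset.

```python
import csv
import io

# Invented data, for illustration only.
RAW = """competition,team,opponent,venue_code,possession
League,Benfica,Porto,H,56
League,Benfica,Braga,A,51
Cup,Benfica,Estoril,H,63
League,Porto,Benfica,H,44
League,Benfica,Sporting,H,58
League,Benfica,Vizela,A,60
"""

def prepare(raw_csv, competition, team, value_column):
    """Filter by competition and team, label the condition, keep one variable."""
    columns = {"Home": [], "Away": []}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        if row["competition"] != competition or row["team"] != team:
            continue  # drop other competitions and other teams
        condition = "Home" if row["venue_code"] == "H" else "Away"
        columns[condition].append(float(row[value_column]))
    return columns

prepared = prepare(RAW, competition="League", team="Benfica", value_column="possession")
print(prepared)
```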

Practical Exercise

15-minute timed exercise:

Task: Compare FC Porto’s ball possession at home vs. away (2023-24 Liga Portugal, 10 games each)
Dependent variable: Possession percentage
Independent variable: Match venue (home/away)
Analysis type: Same subject, two groups (paired)
Steps:
1. Organize Excel data (filter Porto league matches, label home/away, extract possession)
2. Import into Jamovi
3. Run Shapiro-Wilk normality test on both conditions
4. Select appropriate test based on normality results
5. Determine if there is a statistically significant difference
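If both conditions pass the normality check, step 4 leads to the paired samples t-test. A minimal sketch of the statistic behind that test is shown below, using only the Python standard library. The possession values are invented (they are not FC Porto's real figures) and the function name `paired_t` is hypothetical. The sketch stops at the t statistic and degrees of freedom; the p-value needed for step 5 is what Jamovi reports. If either condition fails the normality check, the Wilcoxon signed-rank test applies instead and this calculation is not the right one.

```python
import math
import statistics

def paired_t(first, second):
    """Paired samples t statistic and degrees of freedom."""
    if len(first) != len(second):
        raise ValueError("paired samples need the same number of observations")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1

# Invented possession percentages for 10 home and 10 away games.
home = [62, 58, 65, 60, 57, 63, 59, 61, 64, 56]
away = [55, 57, 60, 58, 52, 61, 54, 59, 60, 53]

t_stat, df = paired_t(home, away)
print(f"t = {t_stat:.2f}, df = {df}")
```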

Key Terminology

Term	Definition
Dependent variable	The variable being measured (e.g., possession, xG)
Independent variable	The condition or grouping factor (e.g., home/away)
Parametric data	Follows normal distribution (Shapiro-Wilk p > 0.05)
Non-parametric data	Does not follow normal distribution (Shapiro-Wilk p < 0.05)
Statistical significance	p < 0.05 on inferential test
Paired/Dependent	Comparing the same entity across conditions
Independent	Comparing different entities

Software and Tools

Tool	Purpose
Jamovi	Primary statistical software for all analyses
Microsoft Excel / Office 365	Data preparation before importing to Jamovi
Moodle	Lecture recordings and materials

What Comes Next

Next lecture will cover effect size — measuring the magnitude of a difference, beyond just whether it is significant.
Five analysis aims to complete across practical exercises.
