Effect Size, Power, and Sample Size in Psychology: How They Fit Together

Students often meet effect size, power, and sample size as three separate annoyances, usually while a deadline is already looking unpleasant. They make more sense when treated as parts of the same problem. Change one, and the others move with it.

Why these three keep showing up together

Effect size, power, and sample size are tied together because they all shape how likely your study is to detect the pattern you care about.

If you expect a large effect, you usually need fewer participants to detect it. If you expect a small effect, you usually need more. If you want higher power, you usually need more participants again. If you use a stricter alpha level, the sample often has to grow as well.

So when students ask, “How many participants do I need?”, the annoying but honest answer is usually, “That depends on what effect you expect, how much certainty you want, and how strict you plan to be.”

What effect size actually means

Effect size is about magnitude. It tells you how large a difference or relationship is, not just whether something might be statistically significant.

For t-tests, the effect size is often expressed as Cohen’s d. Rough conventions are commonly described like this:

  • d = 0.20 as small

  • d = 0.50 as medium

  • d = 0.80 as large

These are only rough guides. They are not sacred numbers handed down from the statistical mountain. In some areas of psychology, a “small” effect may still matter a great deal. In others, a supposedly “medium” effect may be wildly optimistic.

That optimism matters, because overestimating the effect size is one of the easiest ways to end up with an underpowered study and a deeply irritating result.

What statistical power means

Power is the probability that your study will detect an effect of a given size if that effect is really there.

A common target is .80, which means an 80% chance of detecting the effect under the assumptions you specified. Some researchers aim for .90, especially when missing a real effect would be costly. Higher power is attractive, but it costs sample size.

That is the trade-off in plain form. More certainty usually means more people, more time, and more admin. Statistics has never been especially sentimental about this.
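To make that trade-off concrete, here is a minimal sketch of approximate power for a two-sided independent-samples t-test as the per-group sample grows. This is my own illustration using the textbook normal approximation, not an exact calculation; exact noncentral-t tools give very slightly different numbers.

```python
# Approximate power of a two-sided independent-samples t-test via the
# normal approximation: power ~ Phi(d * sqrt(n/2) - z_{1 - alpha/2}).
# Illustrative sketch; exact noncentral-t calculations differ slightly.
from math import sqrt
from statistics import NormalDist

def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    z = NormalDist()                      # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    return z.cdf(d * sqrt(n_per_group / 2) - z_alpha)

# Power climbs as the per-group sample grows (d = 0.50, alpha = .05):
for n in (20, 40, 64, 100):
    print(f"n = {n:3d} per group -> power ~ {approx_power(0.50, n):.2f}")
```

Running it shows power crawling from roughly a third at 20 per group to around .80 near 64 per group, which is exactly the "more certainty means more people" point in numbers.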

What sample size is doing in all this

Sample size is the practical lever. It is the part you can usually change most easily, at least in principle.

If you hold alpha and power constant, the required sample size gets bigger as the expected effect gets smaller. This is why small effects are expensive. They are harder to detect cleanly, so the study needs more data to separate signal from noise.

That is also why many student projects quietly lean toward unrealistically large expected effects. Smaller samples look friendlier on paper. Reality tends to be less cooperative.

How they fit together

The relationship is easier to grasp if you think of it like this:

A small expected effect means you need more evidence to detect it.
More evidence usually means a larger sample.
If you also want high power, the sample grows again.
If you tighten alpha, the sample grows again.

So the broad pattern is not mysterious:

  • smaller effect sizes push sample size up

  • higher power pushes sample size up

  • stricter alpha pushes sample size up

This is why one fixed “correct sample size” almost never exists in the abstract. It depends on the assumptions you are making.

A simple example

Imagine you are planning an independent-samples t-test.

If you expect a medium effect, say d = 0.50, with power set at .80 and alpha at .05, the required sample will be much smaller than if you expect a small effect, say d = 0.20, with those same settings.

Nothing magical happened there. The expected effect just got harder to detect, so the study had to become larger to compensate.
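To put rough numbers on that, here is a small sketch using the standard normal approximation for the required per-group sample of a two-sided independent-samples t-test. It is my own illustration, not the blog's visualizer, and it slightly undershoots exact t-based tools such as G*Power, which give about 64 per group for d = 0.50 and about 394 per group for d = 0.20 at power .80 and alpha .05.

```python
# Rough per-group sample size for a two-sided independent-samples
# t-test, using the common normal approximation:
#     n per group ~ 2 * (z_{1 - alpha/2} + z_{1 - beta})^2 / d^2
# Slightly optimistic compared with exact noncentral-t calculations.
from math import ceil
from statistics import NormalDist

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * z_sum ** 2 / d ** 2)

for d in (0.20, 0.50, 0.80):
    print(f"d = {d:.2f}: about {n_per_group(d)} participants per group")
```

Halving the expected effect from 0.50 to 0.20 does not double the sample; it multiplies it by roughly six. That is the expense of small effects in one line of arithmetic.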

What students often get wrong

One common mistake is treating effect size conventions as if they automatically fit every topic. They do not.

Another is entering whatever effect size produces a manageable sample, then calling that “planning.” That is less planning and more statistical wishful thinking in a lab coat.

A third is forgetting that paired designs and independent designs are not interchangeable. Paired designs can sometimes need fewer observations because each participant contributes related measurements, which removes between-person variability from the comparison, but only when the design genuinely supports that structure.
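As a rough illustration of that point, a paired design's required number of pairs depends on the correlation r between the two measurements. The sketch below uses the common conversion d_z = d / sqrt(2(1 - r)), which assumes equal variances across conditions, together with the same normal approximation as before; it is a planning intuition, not an exact calculation.

```python
# Why paired designs can need fewer participants: the paired t-test runs
# on difference scores, so the relevant effect size is
#     d_z = d / sqrt(2 * (1 - r))
# where r is the correlation between the two measurements (assuming
# equal variances). Higher r means a larger d_z and fewer pairs.
from math import ceil, sqrt
from statistics import NormalDist

def n_pairs(d: float, r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    d_z = d / sqrt(2 * (1 - r))          # effect size on difference scores
    return ceil(z_sum ** 2 / d_z ** 2)

for r in (0.0, 0.5, 0.8):
    print(f"r = {r:.1f}: about {n_pairs(0.50, r)} pairs for d = 0.50")
```

When the measurements barely correlate, the paired design buys you nothing; when they correlate strongly, the required sample shrinks sharply. That is the "only when the design genuinely supports that structure" caveat made visible.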

It is also worth saying that post hoc power talk is often less helpful than students hope. Power is most useful during planning, before the data are collected, not as a decorative flourish added after the fact.

So what should you do in practice?

Start with the design you are actually using. Do not choose the design that gives the nicest sample size unless you have also discovered a way to choose reality.

Then use the best effect size estimate you can justify. That might come from prior studies, a meta-analysis, a pilot, or a reasoned convention if nothing better exists.

After that, choose your target power and alpha deliberately rather than just accepting whatever number wanders into the box.

Then look at the required sample and ask the awkward but useful question: is this study realistically doable?

If the answer is no, that is still useful information. It may mean you need a narrower question, a stronger design, a different outcome measure, or a more honest sense of what the project can actually support.

Use the visualizer below

The helper below lets you see how required sample size changes when effect size, power, and alpha shift. It is built for t-test planning and simple intuition-building, not for every design ever invented.


Simply Put

Effect size, power, and sample size are not three separate boxes to fill in because the software asked nicely. They are part of the same planning logic.

Small effects require bigger studies. Higher power usually requires bigger studies. Stricter thresholds usually require bigger studies. Once that clicks, sample size planning stops looking like random bureaucracy and starts looking like arithmetic with consequences.

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.

Field, A. P. (2018). Discovering statistics using IBM SPSS Statistics (5th ed.). Sage.

Lakens, D. (2022). Sample size justification. Collabra: Psychology, 8(1), Article 33267. https://doi.org/10.1525/collabra.33267


    JC Pass

    JC Pass, MSc, is a social and political psychology specialist and self-described psychological smuggler; someone who slips complex theory into places textbooks never reach. His essays use games, media, politics, grief, and culture as gateways into deeper insight, exploring how power, identity, and narrative shape behaviour. JC’s work is cited internationally in universities and peer-reviewed research, and he creates clear, practical resources that make psychology not only understandable, but alive, applied, and impossible to forget.
