Field note · December 31, 2025
How Teams Fail at Retention Testing (Without Realizing It)
I hear this all the time
"Our new build has a D1 of 35%."
And perhaps...
"That's 4% better than the previous build! We're headed in the right direction!"
Cool!
Just to be safe, though: how many installs were these tests based on?
Because if the answer is "in the hundreds,"
we are likely being fooled by statistical noise.
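To see how noise can swallow a "4 points better" claim, here is a rough sketch using the normal approximation for the difference between two retention rates. The 300 installs per build and the 31% baseline are hypothetical numbers picked for illustration, not client data.

```python
import math

def difference_interval(n_old, p_old, n_new, p_new, z=1.645):
    """Normal-approximation interval for (new retention - old retention).

    z=1.645 corresponds to a two-sided 90% confidence level.
    """
    diff = p_new - p_old
    # Standard error of the difference between two independent proportions
    se = math.sqrt(p_old * (1 - p_old) / n_old + p_new * (1 - p_new) / n_new)
    return diff - z * se, diff + z * se

# Hypothetical: 300 installs per build, D1 of 31% (old) vs 35% (new)
low, high = difference_interval(300, 0.31, 300, 0.35)
print(f"Difference: {low:+.1%} to {high:+.1%}")
```

Under these assumptions the interval spans zero, so the "improvement" could just as easily be no change at all, or even a small regression.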
An example from a client this year
- Spends $500 in the US on a retention test
- Gets 140 installs at $3.50 apiece
- Measures D1 Retention at 35%
Now, at 140 installs, a 35% D1 read has HUGE error bars.
At a 90% confidence level, that "35%" could really be
- as bad as 28%
- as good as 42%
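Those bounds can be reproduced with a quick back-of-envelope check. This is a minimal sketch that assumes the simple normal-approximation (Wald) interval; the function name is made up for illustration, and other interval methods (Wilson, for instance) will give slightly different numbers.

```python
import math

def wald_interval(installs, retention, z=1.645):
    """Normal-approximation confidence interval for a retention rate.

    z=1.645 corresponds to a two-sided 90% confidence level.
    """
    standard_error = math.sqrt(retention * (1 - retention) / installs)
    margin = z * standard_error
    return retention - margin, retention + margin

# The client example: 140 installs, measured D1 of 35%
low, high = wald_interval(140, 0.35)
print(f"{low:.1%} to {high:.1%}")
```

For 140 installs this works out to roughly 28% to 42%, a spread of about 13 percentage points around a single "35%" read.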
So, what did we actually learn from the test?
Not much. The $500 (and, critically, the time) was wasted.
What most teams consistently fail to do
The point of failure is forgetting to ask
"How precise do we need this test to be?"
And then
"Given that, how many installs are needed?"
Most teams
- Don't ask what error bar sizes support test goals
- Don't calculate installs needed
- Allocate an arbitrary budget, like "Spend $500"
- Overpay for expensive installs
- Get too few installs for test goals
- Fail to interpret results as a probability distribution.
What competent testing looks like
- Choose an acceptable error bar size (±X pp) based on the test goals
- Choose a confidence level (usually 90%)
- Calculate the required installs (use your favorite LLM)
- Buy the installs from the cheapest geo with valid signal
You buy traffic to meet the math, not your gut.
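If you'd rather not lean on an LLM for the math, the calculation is short. The sketch below assumes the standard normal-approximation sample-size formula, n = z² · p(1 − p) / E², and defaults to the worst case of p = 0.5 when you have no prior retention estimate. The function and parameter names are illustrative.

```python
import math

def required_installs(margin, z=1.645, expected_retention=0.5):
    """Installs needed so the error bar is no wider than +/- margin.

    margin is a fraction (0.03 means +/-3 percentage points).
    z=1.645 corresponds to a two-sided 90% confidence level.
    expected_retention=0.5 is the most conservative (largest n) assumption.
    """
    p = expected_retention
    return math.ceil(z**2 * p * (1 - p) / margin**2)

print(required_installs(0.03))                           # worst case
print(required_installs(0.03, expected_retention=0.35))  # if D1 is near 35%
```

With a ±3pp target at 90% confidence, the worst case comes out at 752 installs, and at 685 if you expect retention near 35%. Halving the error bar roughly quadruples the installs needed, which is why the precision question has to come before the budget.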
A simple example
- A team wants a Tier-1 D1R benchmark
- They aren't comparing to a prior build
- Given this, a ±3pp error seems acceptable (at 90% confidence).
- They calculate the required installs at 750.
- The team saves money by buying from Denmark, not the USA
- Total cost is $975.
- The test delivers exactly the precision that was requested.
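The geo choice is simple arithmetic once the install count is fixed. In this sketch the Denmark cost per install ($1.30) is implied by $975 / 750 installs, and the US figure ($3.50) is borrowed from the earlier client example. Both are assumptions for illustration, not quoted rates.

```python
# Assumed cost per install (CPI) in USD; see the caveats above.
cpi_by_geo = {"US": 3.50, "Denmark": 1.30}
installs_needed = 750

# Total spend required to hit the install target in each geo
cost_by_geo = {geo: cpi * installs_needed for geo, cpi in cpi_by_geo.items()}
cheapest_geo = min(cost_by_geo, key=cost_by_geo.get)

for geo, cost in cost_by_geo.items():
    print(f"{geo}: ${cost:,.0f}")
print("Cheapest:", cheapest_geo)
```

At those rates the same 750 installs would cost $2,625 in the US versus $975 in Denmark. The saving only counts if the cheaper geo's retention is a valid signal for the market you care about.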
Have you ever been fooled by retention tests?
Turbine helps mobile game & app publishers drive UA and product KPIs.