
A/B testing in Google Ads - how to run tests that produce reliable results

Adil Jain | Google Ads | 2026-07-13

The experiment feature in Google Ads allows you to run split tests at the campaign level, dividing traffic between a control and a variant and measuring the performance difference. Used correctly, it is one of the most powerful tools for making evidence-based decisions. Used incorrectly, it produces misleading results that lead to poor account choices.


A proper A/B test answers a specific question with statistically reliable evidence. "Does changing the bidding strategy from Target CPA to Maximise Conversions improve CPA?" is a testable question. "What makes performance better?" is not. The specificity of the question determines whether you will learn something actionable from the test.

Google Ads Experiments - the mechanics

Google Ads Experiments allows you to create a draft version of a campaign, modify one element, and then run the original and the modified version simultaneously with traffic split between them. The platform tracks performance separately for each variant and calculates whether the observed difference is statistically significant. This is the correct way to test - not comparing before-and-after periods, which are confounded by seasonality, competitive changes, and other external factors.
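If you want to pull the per-arm comparison into your own reporting, the sketch below shows one way to do it, assuming the official google-ads Python client. The customer ID and campaign IDs are placeholders - the experiment variant reports as its own campaign, so both arms can be queried side by side over the same date range.

```python
from google.ads.googleads.client import GoogleAdsClient

# Minimal sketch: compare the base campaign and its experiment variant.
# "google-ads.yaml", the customer ID, and both campaign IDs are placeholders.
client = GoogleAdsClient.load_from_storage("google-ads.yaml")
ga_service = client.get_service("GoogleAdsService")

# Filtering on segments.date without selecting it keeps the metrics
# aggregated over the whole range rather than split out per day.
query = """
    SELECT campaign.id, campaign.name,
           metrics.clicks, metrics.conversions, metrics.cost_micros
    FROM campaign
    WHERE campaign.id IN (1111111111, 2222222222)
      AND segments.date DURING LAST_14_DAYS
"""

for batch in ga_service.search_stream(customer_id="1234567890", query=query):
    for row in batch.results:
        # cost_micros is in millionths of the account currency
        cpa = row.metrics.cost_micros / 1e6 / max(row.metrics.conversions, 1)
        print(row.campaign.name, row.metrics.conversions, f"CPA: {cpa:.2f}")
```

The same comparison, including the significance calculation, is available natively in the Experiments page; pulling it via the API is only worthwhile if you consolidate reporting elsewhere.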

What to test and what not to test

The highest-value experiments to run are bidding strategy changes and match type expansions. These are structural changes with significant potential impact that are difficult to evaluate without proper experimental design. Questions like whether moving from Target CPA to Target ROAS improves efficiency, or whether expanding to broad match changes CPA, are ones experiments answer with actual data rather than opinion.

Ad copy testing is better done through RSA asset performance data than through campaign experiments. The RSA system tests multiple asset combinations continuously and reports on relative performance - this is more efficient than running separate campaigns for individual copy variants.
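As a sketch of what that asset-level data looks like programmatically, the query below pulls RSA asset performance labels from the ad_group_ad_asset_view report, reusing the ga_service and placeholder customer ID from the earlier example. The performance_label field is Google's own relative rating for each asset.

```python
# Sketch: RSA asset performance labels, reusing ga_service from above.
# performance_label is Google's relative rating per asset
# (PENDING, LEARNING, LOW, GOOD, BEST).
query = """
    SELECT ad_group.name,
           asset.text_asset.text,
           ad_group_ad_asset_view.field_type,
           ad_group_ad_asset_view.performance_label,
           metrics.impressions
    FROM ad_group_ad_asset_view
    WHERE segments.date DURING LAST_30_DAYS
    ORDER BY metrics.impressions DESC
"""

for batch in ga_service.search_stream(customer_id="1234567890", query=query):
    for row in batch.results:
        print(row.ad_group.name,
              row.ad_group_ad_asset_view.field_type.name,
              row.ad_group_ad_asset_view.performance_label.name,
              row.asset.text_asset.text)
```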

The sample size and time requirements

Statistical significance requires sufficient sample size. Google Ads Experiments shows a significance indicator - wait until it reports statistical significance before drawing conclusions. An experiment that runs for five days with one variant ahead, but without significance, is not conclusive. Run experiments for at least two full weeks, preferably four, and require 95 percent statistical significance before acting; that is the minimum standard for reliable results.
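If you want a back-of-envelope check outside the platform, a standard two-proportion z-test approximates the same calculation. The sketch below is plain Python with illustrative numbers - it is the textbook test, not Google's internal significance method, which is not publicly documented.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_*: conversions in each arm; n_*: clicks in each arm.
    Returns the p-value; p < 0.05 corresponds to 95 percent significance.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    return math.erfc(abs(z) / math.sqrt(2))

# Illustrative numbers: control 400 conversions from 10,000 clicks,
# variant 460 from 10,000.
p = two_proportion_z_test(400, 10_000, 460, 10_000)
print(f"p-value: {p:.4f}")  # ~0.037, significant at the 95 percent level
```

Running the example shows why short tests mislead: the same 15 percent lift on a tenth of the traffic (40 vs 46 conversions) produces a p-value far above 0.05.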

One variable at a time

Testing multiple changes simultaneously - changing bidding strategy, budget, and match type all at once - produces a result but not a learning. You know the combined package performed better or worse but you do not know which element drove the difference. Test one variable, learn from it, then test the next. This takes longer but produces accumulated knowledge about your specific account that compounds over time.
