Winsorizing Outliers

Why a few extreme values wreck an average, and the formula that fixes it · ACCTG 5150-090 Week 4

The problem

A handful of extreme values can drag a mean far away from what is typical. In the Compustat data you will use this week, a few pre-revenue firms have wildly negative gross margins, so a sector's average margin can read negative even though the typical firm is healthy. The fix analysts use is winsorizing: cap every value at the 1st and 99th percentiles, then recompute. Below, watch the mean move back toward the median as you tighten the caps.

Live demo — 15 gross margins (two are outliers)

Drill 1 — the cutoff

The 99th-percentile margin is the value below which 99% of the margins fall. If your margins are the range named margins, write the formula that returns the 99th percentile.

Excel has no PERCENTILEIFS, and you do not need one here:
=PERCENTILE.INC(margins, 0.99) — swap 0.01 for the 1st-percentile floor.

Drill 2 — the cap

The 1st-percentile floor is in $P$1 and the 99th-percentile cap is in $P$2. Write the formula that caps the margin in B2 at both tails (no lower than the floor, no higher than the cap).

=MAX(MIN(B2, $P$2), $P$1)MIN pulls anything above the cap down to it; MAX lifts anything below the floor up to it.