Monday, September 17, 2012

Forecasts At the Airport (a Case Against Knowing Uncertainty)

The departure board at London Gatwick airport currently lists all flights as being on time. Some of these are probably lies, and knowing lies at that. 

When planes are delayed or cancelled, the passengers are only let in on this once there is virtual certainty that the flight will be off schedule. Even then, the passengers are updated through a slow creep in the numbers. First, a delay of five minutes is announced, then twenty and then…


When compared to what actually happens, these “predictions” are clearly biased. The forecasts are always too optimistic. Planes never leave earlier than the advertised times.

Someone in the airport (e.g. in the flight control tower) has unbiased information about when the planes will likely take off but there is a good reason to not give that data directly to the public.

Imagine a parallel universe without bias on the departure board, where half the planes leave early and half leave late. Passengers would check the board and perhaps some would decide they have enough time to get another coffee. It would clearly be upsetting to come back “on time” and discover the plane had already left. People also get very mad when buses or trains run ahead of schedule.

A recent NYT article “The Weatherman Is Not a Moron” described the river-forecasting equivalent of such a catastrophic “missed the flight” scenario:

The Weather Service has struggled over the years with how much to let the public in on what it doesn’t exactly know. In April 1997, Grand Forks, N.D., was threatened by the flooding Red River … [The Weather Service] predicted that the Red would crest to 49 feet, close to the record…The waters, in fact, crested to 54 feet. It was well within the forecast’s margin of error, but enough to overcome the levees and spill more than two miles into the city… The Weather Service had explicitly avoided communicating the uncertainty in its forecast to the public, emphasizing only the 49-foot prediction. The forecasters later told researchers that they were afraid the public might lose confidence in the forecast if they had conveyed any uncertainty.

Times have changed….  

Since [the Grand Forks flood], the National Weather Service has come to recognize the importance of communicating the uncertainty in its forecasts as completely as possible… “No forecast is complete without some description of that uncertainty.” [said Max Mayfield] 

Still, just as there are biases in flight departure times, there are biases in weather forecasts:

In what may be the worst-kept secret in the business, numerous commercial weather forecasts are also biased toward forecasting more precipitation than will actually occur… For years, when the Weather Channel said there was a 20 percent chance of rain, it actually rained only about 5 percent of the time.

People don’t mind when a forecaster predicts rain and it turns out to be a nice day. But if it rains when it isn’t supposed to, they curse the weatherman for ruining their picnic. “If the forecast was objective, if it has zero bias in precipitation,” Bruce Rose, a former vice president for the Weather Channel, said, “we’d probably be in trouble.”

In flood forecasting, there seems to be a tolerance for false alarms. Occasionally crying wolf is not bad compared to letting the wolf in to make a big mess of everything (even if only once).

However, whose responsibility is it to make sure the wolf stays out? It is the airport’s responsibility to give good information to the passengers. It is the passenger’s responsibility to be at the gate on time. Similarly, the forecaster is not responsible for the flood damages, but he has a duty to provide good information to decision makers and the public. But don’t people with more information make better decisions? Is it fair to restrict what the public knows? Do more people catch their flights because they don’t know the whole story of what could happen?  

Max Mayfield said in the NYT article that “No forecast is complete without some description of… uncertainty.” Scientists (myself included) are falling over themselves to come up with new and better ways of quantifying forecast uncertainty; this is one of today’s most active research topics in hydrology (and meteorology and climate change). There are stacks of reports going back more than thirty years saying that forecasts that communicate uncertainty have more value than forecasts that only give one number (the river will reach 52 feet).

Some of the more informative alternatives include a credible range (there is a 90% chance that the river will reach between 48 and 55 feet) or the chance of a relevant threshold (there is a 35% chance of the river going above the levees) or an ensemble (any of the following scenarios could happen: 48 feet, 49 feet, 51 feet, 54 feet…).
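
To make those three alternatives concrete, here is a rough sketch of how each could be read off the same ensemble. The crest values and the levee height are invented for illustration (chosen only to echo the example numbers above), and the function names are my own, not anything an operational forecast system uses.

```python
from statistics import median


def exceedance_probability(members, threshold):
    """Fraction of ensemble members that rise above the threshold."""
    return sum(m > threshold for m in members) / len(members)


def central_range(members, coverage_pct=90):
    """Empirical central range: trim an equal number of members from each tail."""
    ordered = sorted(members)
    n = len(ordered)
    # Integer arithmetic avoids floating-point surprises when counting members.
    drop = n * (100 - coverage_pct) // 200
    return ordered[drop], ordered[n - 1 - drop]


# Hypothetical ensemble of river crests in feet (made up, not real forecast data).
crests_ft = [47, 48, 48, 49, 49, 49, 50, 50, 51, 51,
             51, 52, 52, 53, 53, 54, 54, 55, 55, 57]
levee_ft = 52  # hypothetical levee height

single_number = median(crests_ft)
low, high = central_range(crests_ft, coverage_pct=90)
chance_over_levee = exceedance_probability(crests_ft, levee_ft)

print(f"Single number: the river will reach {single_number:.0f} feet")
print(f"Range: 90% chance of a crest between {low} and {high} feet")
print(f"Threshold: {chance_over_levee:.0%} chance of topping the levee")
```

All three statements come from the same twenty scenarios; what changes is how much of the spread the user gets to see.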

Imagine the confusion and frustration if the airport departure board listed ensembles:

London to Paris, possible departure times include 8:30, 8:32, 8:37, 8:40, 8:55.

It may be entirely true that there is a “90% chance this flight will leave between 8:30 and 8:45” but is this enough information to help the user make a decision? Although ensemble forecasts are technically feasible, river forecasters often feel like they are shirking their duties by asking users to wade through a mass of possible scenarios. Yet, scientists bristle at the idea of giving users only one number. “You can’t pick one number, because that depends on the risk tolerance of the user and every user is different.” What is a conservative forecast for one user is risky for another.

For example, my return flight to London has a tight connection. The departure board answers the question “what is the earliest possible time that the plane could leave?” I could plan better if I knew the latest possible time the flight could leave (or, say, the chance of being delayed by more than 15 minutes). I could decide for myself whether that was an acceptable risk. I resent not knowing.

But I’m not going to get that information. Why not?

It is partly because most people are terribly inexperienced at thinking about chances in their daily lives. Answer this question: 

“I am 80% confident that the average distance between the centers of the earth and the moon is between ____ and ____ kilometers/miles.”

Try it. Write down your range. The answer is here.

So far, I have asked 40 scientists and operational forecasters to give their ranges. If people were good at quantifying their uncertainty then about 32 people (80%) would give ranges that contain the true distance. Instead, only 10 people (25%) have. This means that people (even those who predict for a living) give ranges that are too narrow and are overconfident in what they think they know.
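
For anyone who wants to check the arithmetic, or to ask whether 10 hits out of 40 could just be bad luck, here is a small sketch. It assumes each respondent is an independent trial with an 80% chance of capturing the true distance if they were well calibrated; that binomial framing is my simplification, not a formal analysis of the survey.

```python
from math import comb


def binomial_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


respondents = 40          # people asked so far
stated_confidence = 0.80  # confidence level each person was asked to use
hits = 10                 # ranges that actually contained the true distance

expected_hits = respondents * stated_confidence
hit_rate = hits / respondents
# Chance of seeing this few hits (or fewer) if everyone were well calibrated.
p_this_few = binomial_cdf(hits, respondents, stated_confidence)

print(f"Expected hits if calibrated: {expected_hits:.0f}")
print(f"Observed hit rate: {hit_rate:.0%}")
print(f"Chance of {hits} or fewer hits by luck alone: {p_this_few:.1e}")
```

Under that assumption the last number comes out vanishingly small, so chance alone is a poor explanation for the shortfall; the ranges really are too narrow.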

Therefore, even if they were armed with the information “there is a 75% chance of the flight leaving in 10 minutes”, passengers would be largely unprepared to come up with the rest of the information necessary to make a good decision (e.g. in those 10 minutes, there is an 80% chance I could successfully get coffee, 90% chance of getting to the bathroom, 40% chance of getting both coffee and bathroom, and so on).

That said, with practice and feedback, people get better at sizing up risks. Once they learn that they are overconfident, people start to widen the range of their guesses. However, it takes some discipline to look back at past forecasts. Also, the feedback loop rarely closes in practice, especially if it is something that people do infrequently (e.g. navigate an unfamiliar airport, protect against an unprecedented flood).

But I’ll start the process by collecting my first data point. My flight was supposed to leave at 8:20. We took off at 8:24. I probably wouldn’t have had time for another coffee.

1 comment:

  1. Tom - Greetings from the Las Vegas (Nev) airport, where I'm recalling this fascinating post as I deal with a subtly changing flight time estimate issue.
    My connection was originally scheduled to leave at 1:50 p.m. I'd signed up for the text notification service, and this morning around 8:00 a.m. Albuquerque time, before I had left, I received a text informing me the flight was delayed, and a current estimated departure time of 2:25 p.m. Note that this is 7 hours before flight time. Clearly they have some useful forecast information that at this point has given them some confidence.

    Then 90 minutes later, as we were driving to the Albuquerque airport, a second text arrived changing the departure time to 2:05 p.m. (15 minutes closer to the scheduled departure time). Upon my arrival in LV, the bank of computers had pushed the estimate back to 2:15 p.m. Now, approximately an hour before scheduled departure, they've moved it to 2:20 p.m.

    Based on the logic of your analysis here, I have assumed all along that this is the earliest possible departure time because, much like a Weather Channel customer more likely to be harmed by rainfall on the wet side of the probability distribution, I am far more likely to be harmed if I'm out grabbing that second sandwich when the plane leaves.

    I'll pop back in and let you know how this plays out.

    P.S. I'm very much enjoying Nate Silver's book