Rogue Nowcasts: 67% Chance Both are Wrong
Salil Mehta at Statistical Ideas commented on my recent post that compared the Atlanta Fed GDPNow model to the FRBNY Nowcast model.
Mehta notes that both models claim accuracy within one percentage point. However, that's no longer mathematically possible given the difference between the two models is 2.3 percentage points.
Rogue one: Faithful GDP nowcasts by Salil Mehta
There is a 2/3 chance that both competing Federal Reserve 2017 Q1 GDP nowcasts are wrong! That's an audacious prediction for the storied NY and Atlanta institutions (one of them led by my former big boss Timothy Geithner), and yet there is no way around the confusion they are currently in. This is also critically important, as one is showing a robust 3.2% growth reading while the other is at 0.9% (the second-lowest reading in nearly three years), which essentially indicates that we are descending towards recession.
Are we descending towards recession? While we don't forecast that, we certainly think there is only a single-digit probability of GDP growth above 3%. How could the NY Fed plausibly give such a madly high estimate (which, if true, would be the second highest in two years)? Yet there you have it: two extreme readings, and a 2.3% (3.2%-0.9%) chasm between them.
We show here that the Federal Reserve's conclusions are somewhat ridiculous, though they shouldn't be, since they impact the open market committee monetary decisions that the world looks to. And there are humbling lessons from these nascent, overfit Big Data models.
The chart here shows some basic information regarding the current GDP nowcasts. As we see via the two blue bars, we have the Atlanta nowcast on the left (the bar was as high as 3.4% earlier this year) and the NY nowcast on the right (the bar was recently as low as 1.5%). That's right, the two nowcasts passed each other and kept moving aggressively in opposite directions! The large swings in each are also doubtful, given each nowcast's advertised margin of error. For a good chronology of these nowcast reports, refer to MishTalk.
Each nowcast boasts a margin of error of just ~1%, and this clearly poses a problem, since the average of these two nowcasts (shown in orange at 2.1%) is outside both the Atlanta and the NY stated margins of error! As a supporting reference, we also show (in green) that the current 2016 Q4 GDP is nearby at 1.9%. Now we should ask some important questions about how we keep getting such strange nowcasts in the past year that both have operated.
The first thing to appreciate is that the nowcasts are supposed to have very tight errors that are uncorrelated with the variance in the actual GDP itself. Good nowcasts should also have errors independent of one another, although even with the NY and Atlanta Feds operating independently of one another, there is a good chance of some modeling similarities. We modestly assume this and derive, through the variance formula for a difference, $\mathrm{Var}_{\text{Atlanta}} + \mathrm{Var}_{\text{NY}} - 2\,\sigma_{\text{Atlanta}}\,\sigma_{\text{NY}}\,\rho_{\text{Atlanta,NY}}$, that the margin of error of the difference between the models is just less than 1% (silver vertical interval arrows in the chart above). This is a highly plausible, tight expected variance. Sample size is also trivial here, as we don't have the true expectation to model a limit from.
With this, the probability of seeing an inadvertent 2.3% difference between two correct Federal Reserve models is <5%. The alternative is that their publicized margin of error is awkwardly too low (to the point, as we'll show, that randomly guessing the GDP would be safer).
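The arithmetic behind the "<5%" figure can be sketched as follows. This is only an illustration under stated assumptions, not Mehta's actual calculation: it treats each ~1% margin of error as one standard deviation of a normally distributed error, and it uses a hypothetical correlation of 0.55, chosen only because it yields a margin for the difference that is "just less than 1%". The function and variable names are illustrative.

```python
import math

def diff_sd(sd_a, sd_b, rho):
    """Standard deviation of (A - B) when the two errors have correlation rho."""
    return math.sqrt(sd_a**2 + sd_b**2 - 2 * rho * sd_a * sd_b)

def two_sided_tail(gap, sd):
    """P(|X| >= gap) for X ~ Normal(0, sd)."""
    return math.erfc(gap / (sd * math.sqrt(2)))

ATLANTA, NY = 0.9, 3.2   # nowcasts quoted in the text, in percent
MARGIN = 1.0             # assumption: the ~1% margin is treated as one SD
RHO = 0.55               # hypothetical correlation, not given in the text

gap = NY - ATLANTA                      # the 2.3-point chasm
sd = diff_sd(MARGIN, MARGIN, RHO)       # margin of error of the difference
p = two_sided_tail(gap, sd)             # chance of a gap this large if both are right

print(f"margin of the difference: {sd:.2f}")
print(f"P(gap >= {gap:.1f}): {p:.3f}")
```

Under these assumptions the margin of the difference comes out just under 1 and the tail probability is below 5%. A lower assumed correlation widens the margin of the difference and raises the probability.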
We also have the probability that one of the nowcasts is correct, which by symmetry means applying one nowcast's stated margin of error to the other nowcast's value. Or one nowcast could be unintentionally correct, which would be like the <5% probability above. So, in total, the probability of one of the Federal Reserve nowcasts being correct and the other being wrong is ~10% (not the >½ that many may generously assume as they critique these divergent outputs).
That still leaves us with two other possible outcomes! We split the remaining 85% probability that both models are individually wrong into: (a) the average of the two models is still correct, and (b) even the average of both models is wrong. This leaves us with no choice but to conclude that there is a ~40% probability that the correct 2017 Q1 GDP is nowhere near either nowcast or the average of the two, and a >½ probability that the GDP is near the 2.1% average, which inappropriately happens to be well outside both nowcasts' margins of error. Between those two, we can safely claim that there is a 2/3 chance that both models are totally wrong (and merely a <5% chance they are still both right). There is perhaps a 30% chance one can smartly use both models in a deliberate way, though this is conditional on how one uses the information and not on the models' endorsed face value.
Now, a more practical reading of this is that the margins of error should be more than doubled (to 2.6%!), in which case no one would even use such nowcast models. However, the probability breakdown in such a scenario is this:
- <30% chance both models (with their current 2.3% chasm) are correct
- ~60% chance one of the models is wrong and one is correct
- ~10% chance both models are still wrong individually, though the average in rare cases is correct
In all cases, all three probabilities sum to 100%, as they should. They also lend a healthy sense of respect for the idea that incessantly observing each of these GDP nowcasts is commonly a waste of time, and that one will rarely gain insight from it other than through ex post luck. It's the same when the more senior open market committee models vainly attempt to forecast other macroeconomic variables. Sometimes simply looking at the most recent quarter's GDP (in this case ~2%) is as good a guess as any. So is giving a little more weight to the near-0 probability most assign to an outright contraction this quarter. As a business CEO, one would want to be prepared for anything at this point, which is what a genuine 2.6% margin of error about GDP implies.
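To make the two margin scenarios concrete, here is a minimal sketch that checks whether any GDP value is consistent with both nowcasts at once. It assumes each margin of error is a symmetric hard bound around the nowcast, which is a simplification; the helper names `interval` and `overlap` are hypothetical.

```python
def interval(center, margin):
    """Return the (low, high) range implied by a nowcast and its margin of error."""
    return (center - margin, center + margin)

def overlap(a, b):
    """Return the range common to intervals a and b, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None

ATLANTA, NY = 0.9, 3.2   # nowcasts quoted in the text, in percent

for margin in (1.0, 2.6):  # advertised (~1%) vs. widened (2.6%) margin
    common = overlap(interval(ATLANTA, margin), interval(NY, margin))
    print(f"margin {margin}: common range = {common}")
```

With the advertised ~1% margin the two ranges do not touch. With the widened 2.6% margin, readings from roughly 0.6% to 3.5% are consistent with both nowcasts, a range that includes the 2.1% average and last quarter's ~2%.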
End Salil Mehta
I have been following this divergence for some time. I expect model revisions following the next GDP report.
I last covered the divergence in GDPNow Forecast Dips to 0.9%: Divergence with Nowcast Hits 2.3 Percentage Points – Why?
Here are my charts.
Neither report had a significant movement on February 24. Let's start there for a closer look.
- On March 1, Construction spending took 0.7 percentage points off GDPNow but only 0.046 percentage points off Nowcast.
- On March 2, light vehicle sales took 0.3 percentage points off GDPNow. Nowcast did not factor in light vehicle sales.
- On March 6, the manufacturing report took 0.2 percentage points off GDPNow. The same inventory report (different name) added 0.031 percentage points to Nowcast.
- On March 7, the import-export trade report did nothing for GDPNow. The same report added a net 0.032 percentage points to Nowcast.
- On March 10, the jobs report subtracted 0.3 percentage points from GDPNow. The jobs report added 0.003 percentage points (essentially nothing) to Nowcast.
- March 1, 2, 6, and 7 reports subtracted 1.2 percentage points from GDPNow.
- March 1, 2, 6, and 7 reports added 0.017 percentage points to Nowcast.
- The big difference is in how the models treat (or don't treat at all) construction spending and light vehicle sales.
- The jobs report decline (an additional 0.3 percentage point decline for GDPNow) may be related to a model change.
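The totals in the list above can be checked with a short tally. The figures are the ones quoted in the bullets; the zero for Nowcast on March 2 reflects that Nowcast does not factor in light vehicle sales, and the variable names are illustrative.

```python
# (date, report, GDPNow change, Nowcast change), in percentage points
reports = [
    ("March 1", "construction spending", -0.7, -0.046),
    ("March 2", "light vehicle sales", -0.3, 0.0),   # not used by Nowcast
    ("March 6", "manufacturing", -0.2, 0.031),
    ("March 7", "import-export trade", 0.0, 0.032),
    ("March 10", "jobs", -0.3, 0.003),
]

# Totals for the March 1, 2, 6, and 7 reports only
first_four = reports[:4]
gdpnow_total = sum(r[2] for r in first_four)
nowcast_total = sum(r[3] for r in first_four)
print(f"GDPNow:  {gdpnow_total:+.3f}")
print(f"Nowcast: {nowcast_total:+.3f}")

# Rank every report by how much it widened the Nowcast-minus-GDPNow gap
ranked = sorted(reports, key=lambda r: r[3] - r[2], reverse=True)
for date, name, gdpnow, nowcast in ranked:
    print(f"{date:9} {name:22} gap change {nowcast - gdpnow:+.3f}")
```

The first four reports sum to -1.2 for GDPNow and +0.017 for Nowcast, matching the bullets, and construction spending is the single largest contributor to the divergence.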
I have a simple rule: Whenever one of the models seems ridiculously high, I place a mental bet against that outcome.
Last quarter I leaned towards the Nowcast forecast. This quarter I think GDPNow will be much closer.
Many recent economic reports have been weak. And failure to take into consideration vehicle sales when they have been holding up retail sales seems like a mistake.
However, construction spending, which took 0.7 percentage points off GDPNow, is so volatile and so frequently revised that it is hard to have much faith in that particular slide.
Finally, one additional problem with both models is they do not think. For a synopsis, please see Formulas Don't Think: Investigating Weather-Related GDP.