# Fermi Problem Solving for Data Scientists

How many new tires can be sold in the Philadelphia area just prior to its first snowstorm? How many people will die from the next pandemic that infects North America? What is the global revenue potential for a new medical app on the iPad Pro that helps first-time parents with their newborn child? These are relatively simple questions that data scientists are often asked to address.

As simple as these questions might seem, the real world is fraught with networks of complexity, and data scientists are often accused of overthinking solutions as they try to make sense of it. Even the simplest exploration, like determining the number of tires sold, can take on unbounded fidelity without proper problem scoping. This, in turn, can result in exponential growth of both the data involved and our uncertainty about that data.

It is important for the analyst to understand the solution in gross terms, to estimate it, without spending time and money on detailed analyses supported by countless models. One such type of estimation is called a Fermi problem, a framework designed to teach dimensional analysis that can be thought of as a “back-of-the-envelope calculation.” Fermi problems are often used in engineering and the sciences to scope the larger problem before attempting to build complex models that give more precise answers.

Michael Mitchell does an excellent job at TED-Ed talking about Fermi approaches when dealing with complex problems.

Interesting. Yes?

Moving on: while Fermi estimation has no formal calculus, Sherman Kent’s (CIA analyst) perspective on information lets one break the approach down into the following equation:

Fermi Estimation = things we know for certain (facts) + things we should know, but don’t (assumptions, which have ranges) + things we don’t know we don’t know (error term)

The first term is as close as one can come to a statement of indisputable fact. It describes something knowable and known with a high degree of certainty.

The second term is a judgment or estimate. It describes something which is knowable in terms of the human understanding but not precisely known by the man who is talking about it.

The third term is another judgment or estimate, this one made almost without any evidence, direct or indirect. It may be an estimate of something that no man alive can know or will ever know. As such, it truly represents the ultimate error in our knowledge.
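To make the three terms concrete, here is a minimal Python sketch applied to the tire question. Everything in it is an assumption for illustration: the `Range` helper, the `fermi_estimate` function, and every number are hypothetical placeholders, not researched values. Note also that it reads the “+” in the equation loosely as “combined with”; in practice the terms of a Fermi estimate are usually multiplied, so the sketch multiplies the known quantity by each assumed range and then widens the bounds by a catch-all error factor.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Range:
    """An assumption stated as a plausible low/high bound (hypothetical helper)."""
    low: float
    high: float

    def central(self) -> float:
        # Geometric mean: a common central value when bounds differ by a ratio.
        # Assumes both bounds are positive.
        return math.sqrt(self.low * self.high)


def fermi_estimate(
    fact: float,
    assumptions: Sequence[Range],
    error_factor: float = 1.0,
) -> Tuple[float, float, float]:
    """Return (low, central, high) for a multiplicative Fermi estimate.

    fact         -- the quantity known with high certainty (first term)
    assumptions  -- knowable-but-unknown factors, given as ranges (second term)
    error_factor -- catch-all widening for unknown unknowns (third term)
    """
    low = central = high = fact
    for a in assumptions:
        low *= a.low
        central *= a.central()
        high *= a.high
    # The error term only widens the bounds; it does not move the central value.
    return low / error_factor, central, high * error_factor


# Placeholder inputs only; none of these figures are researched.
metro_population = 6_000_000           # treated as the "fact"
assumptions = [
    Range(0.4, 0.7),                   # vehicles per person
    Range(0.01, 0.05),                 # share of vehicles re-tired before first snow
    Range(2, 4),                       # tires bought per purchase
]

low, central, high = fermi_estimate(metro_population, assumptions, error_factor=2.0)
print(f"tires sold: {low:,.0f} to {high:,.0f} (central {central:,.0f})")
```

The geometric mean is used for the central value because each assumption is a guess about scale rather than an exact figure; an arithmetic midpoint would be an equally defensible choice.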

The Fermi estimation approach, as you can see, provides an answer before one turns to more sophisticated modeling methods, as well as a useful check on their results. As long as the assumptions in the estimate are reasonable, Fermi estimation gives a quick and simple way to obtain a “frame of reference” for what the final answer might reasonably be.
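One way to use the estimate as a check is to ask how many orders of magnitude separate a detailed model's output from the Fermi figure. The sketch below is only an illustration: the function names, the one-order-of-magnitude tolerance, and the example values are all assumptions, not part of any standard method.

```python
import math


def orders_of_magnitude_apart(model_value: float, fermi_value: float) -> float:
    """Absolute base-10 log ratio between two positive quantities."""
    return abs(math.log10(model_value / fermi_value))


def is_plausible(model_value: float, fermi_value: float, tolerance: float = 1.0) -> bool:
    """True if the model result is within `tolerance` orders of magnitude
    of the Fermi estimate (the default tolerance is an arbitrary choice)."""
    return orders_of_magnitude_apart(model_value, fermi_value) <= tolerance


# Hypothetical values: a Fermi figure and two candidate model outputs.
fermi_value = 100_000
for model_value in (150_000, 5_000_000):
    verdict = "plausible" if is_plausible(model_value, fermi_value) else "investigate"
    print(f"model={model_value:,} vs fermi={fermi_value:,}: {verdict}")
```

A result flagged for investigation does not mean the model is wrong; it may be an assumption in the Fermi estimate that needs revisiting.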
