Heilmeier Catechism: Nine Questions To Develop A Meaningful Data Science Project


As director of ARPA in the 1970s, George H. Heilmeier developed a set of questions that he expected every proposal for a new research program to answer. No exceptions. He referred to them as the “Heilmeier Catechism,” and they are now the basis of how DARPA (Defense Advanced Research Projects Agency) and IARPA (Intelligence Advanced Research Projects Activity) operate. Today, it’s equally important to answer these questions for any individual data science project, both for yourself and for communicating to others what you hope to accomplish.

While there have been many variants on Heilmeier’s questions, I still prefer to use the original catechism to guide the development of my data science projects:

1. What are you trying to do? Articulate your objectives using absolutely no jargon.

2. How is it done today, and what are the limits of current practice?

3. What’s new in your approach and why do you think it will be successful?

4. Who cares?

5. If you’re successful, what difference will it make?

6. What are the risks and the payoffs?

7. How much will it cost?

8. How long will it take?

9. What are the midterm and final “exams” to check for success?

Each question is a critical link in the chain of success, but numbers 3 and 5 are most aligned with the way business leaders think. Data science is fraught with failure, by the very definition of science. As such, business leaders are still a bit (truthfully, a lot) suspicious of how data science teams do what they do and how their results would integrate into the larger enterprise in order to solve real business problems. Part of the data science sales cycle, covered by question 3, needs to address these concerns. For example, in the post “Objective-Based Data Monetization: A Enterprise Approach to Data Science (EDS),” I present a model for scaling out our results.

In terms of the difference a project makes (question 5), we need to be sure to cover the business as well as the technical differences. The business differences are the standard three: impact on revenue, margin (combined ratio for insurance), and market share. If there is no business value (data/big data economics), then your project is a sunk cost that somebody else will need to make up for.

Here is an example taken from a project proposed in the insurance industry. Brokers are third-party entities that sell insurance products on behalf of a company. They are not employees and are often under the governance of underwriters (employees who sell similar products). There are instances where brokers “shop” around, looking to get coverage for a prospect that might have above-average risk (e.g., files too many claims, operates in a high-risk business, etc.). They do this by manipulating answers to pre-bind questions (asked prior to issuing a policy) in order to create a product that will not necessarily need underwriter review and/or approval. This project is designed to help stop this practice, which would help improve the business’s financial fundamentals. Here is Heilmeier’s Catechism for the Pre-Bind Gaming Project:

1. What are you trying to do? Automate the identification of insurance brokers that use corporate policy pricing tools as a means to undersell through third party providers.

2. How is it done today? Corporate underwriters observe broker behaviors and pass judgement based on personal criteria.

3. What is new in your approach? Develop signature algorithms, based on the analysis of gamer/non-gamer pre-bind data, that can be implemented across enterprise product applications.

4. Who cares? Business executives – CEO, President, CMO, and CFO.

5. What difference will it make? In an insurance company that generates $350M in premiums at a combined ratio (margin) of 97%, addressing this problem could result in $12M to $32M of incremental revenue while improving the combined ratio to 95.5%.

6. What are the risks and payoffs? Risks: not having collected, or not having access to, relevant causal data reflecting the gamers’ patterns. Payoffs: improved revenue and combined ratios.

7. How much will it cost? Proof of concept (POC) will cost between $80K and $120K. Scaling the POC into the enterprise (implementing algorithms into 5 to 10 product applications) will cost between $500K and $700K.

8. How long will it take? Proof of concept (POC) will take between 8 and 10 weeks. Scaling the POC into the enterprise will take between 3 and 7 months.

9. What are the midterm and final checkpoints for success? The POC will act as the initial milestone, demonstrating that gaming behavior can be identified algorithmically with existing data.
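The figures in answers 5 and 7 can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration, not part of the original proposal. It assumes that underwriting profit equals premiums times (1 minus the combined ratio) and that the incremental revenue arrives as additional premium volume. The function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the Pre-Bind Gaming figures (answers 5 and 7).
# ASSUMPTIONS (not stated in the original proposal):
#   - underwriting profit = premiums * (1 - combined ratio)
#   - the incremental revenue is treated as additional premium volume
# All amounts are in millions of US dollars.


def underwriting_profit(premiums_musd: float, combined_ratio: float) -> float:
    """Return underwriting profit in $M for a premium volume and combined ratio."""
    return premiums_musd * (1.0 - combined_ratio)


# Figures quoted in the catechism answers.
BASE_PREMIUMS_MUSD = 350.0
BASE_COMBINED_RATIO = 0.97
TARGET_COMBINED_RATIO = 0.955
INCREMENTAL_REVENUE_MUSD = (12.0, 32.0)   # low and high estimates
POC_COST_MUSD = (0.08, 0.12)              # $80K to $120K
SCALING_COST_MUSD = (0.50, 0.70)          # $500K to $700K

# Profit today, before the project.
baseline_profit = underwriting_profit(BASE_PREMIUMS_MUSD, BASE_COMBINED_RATIO)

# Profit gain for the low and high revenue estimates at the improved ratio.
profit_gain = tuple(
    underwriting_profit(BASE_PREMIUMS_MUSD + extra, TARGET_COMBINED_RATIO)
    - baseline_profit
    for extra in INCREMENTAL_REVENUE_MUSD
)

# Total project cost: POC plus enterprise scaling, low and high.
total_cost = tuple(p + s for p, s in zip(POC_COST_MUSD, SCALING_COST_MUSD))

print(f"Baseline underwriting profit: ${baseline_profit:.2f}M")
print(f"Profit gain range: ${profit_gain[0]:.2f}M to ${profit_gain[1]:.2f}M")
print(f"Total cost range: ${total_cost[0]:.2f}M to ${total_cost[1]:.2f}M")
```

Under these assumptions, the gain in underwriting profit is roughly $5.8M to $6.7M against a total project cost of $0.58M to $0.82M. A different definition of margin, or a different treatment of the incremental revenue, would change these numbers.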

Regardless of whether you use Heilmeier’s questions or other research topic development methodologies (e.g., The Craft of Research), it is important to systematically address the who, what, when, where, and why of the project. While a firm methodology does not guarantee success, not addressing these nine questions is sure to put you on a risky path, one that will take work to get off of.


3 Replies to “Heilmeier Catechism: Nine Questions To Develop A Meaningful Data Science Project”

  1. Technically, this is not a catechism, defined generally as “a summary of religious doctrine often in the form of questions and answers”. Since the answers depend upon the individual project, and are not dictated by the ARPAs, this would be better referred to as the Heilmeier Questions.

  2. So asking “what is it?” and “who cares?” can be considered catechism? The inaccurate and presumptuous name aside, the Heilmeier questions barely touch upon other simple yet critical market research or SWOT analysis information needed to ensure that the developer won’t be late to market or that the competition is not poised to crush a market incursion. While I think the competitiveness of the private sector drives the due diligence to go beyond the Heilmeier questions before striking out on new product development, the incompleteness of these questions seems particularly dangerous in the hands of government R&D entities. Here, suboptimal processes survive because competition is largely absent and annual budgets are stable, making meaningful market research arbitrary. Because government organizations are graded on how completely they expend their annual budgets, and not on successful or timely product development and transition, organizations duplicate effort at taxpayer expense year after year. I’m sure the catechism was boiled down to these nine questions in an attempt to simplify it, but to be truly effective, each of these nine questions really needs to be broken out into many more that look both internally and externally to gain the intimate understanding of the operating environment required to put a new project on a firm footing. As written, this catechism won’t save anyone.

  3. So asking “what is it?” and “who cares?” is considered catechism? The presumptuous naming aside, these questions were probably boiled down to nine to be succinct, but they all really need to be broken out into several more in order to develop the intimate understanding of both the internal and external operating environments to ensure a new product developer is capable of succeeding vs. merely poised for success. The fierce competition of the private sector has likely engineered the weaknesses out of the Heilmeier questions through honed internal processes which drive those developers to fuller market research/SWOT analysis. This catechism, despite being developed by gov’t for gov’t use, seems particularly dangerous in the hands of government R&D entities. Here, competition is largely absent, budgets are stable, and success is tied to how completely those budgets are expended, period. These realities perpetuate suboptimal processes and de-emphasize the kind of thorough market research and risk assessment that might prevent massive waste and duplication of effort. These questions are a start, but this catechism won’t save anyone.
