Datalandia – Invasion of the Cattle Snatchers [VIDEO]

2013 07 28 11 24 21

It’s not every day that big ideas come to the little screen, but GE has done just that with Datalandia. This short video promo of a fictional land where menacing space aliens collide with brilliant machines and Big Data is a brilliant reminder that visualization and storytelling are important capabilities in the quest to find revelations in data.

R: The Video!

Alien 1979 sigourney weaver movie poster

In the last ten years, the open source R statistics language has exploded in popularity and functionality, emerging as the data scientist’s tool of choice. Today, R is used by over 2 million analysts worldwide, many having been introduced to its elegance and power in academia. Users around the world have embraced R to solve their most challenging problems in fields ranging from computational biology to quantitative finance, and to train their students in these same fields. The result has been an explosion of R analysts and applications, leading to enthusiastic adoption by premier analytics-driven companies like Google, Facebook, and the New York Times.



Enterprise Data Science (EDS) – Updated Framework Model


Companies continue to struggle with how to implement an organic and systematic approach to data science. As part of an ongoing trend to generate new revenues through enterprise data monetization, product and service owners have turned to internal business analytics teams for help, only to find that their individual efforts fall well short of business expectations. Enterprise Data Science (EDS), based on the proven techniques of the Cross Industry Standard Process for Data Mining (CRISP-DM), is designed to overcome most of the traditional limitations found in common business intelligence units.

The earlier post “Objective-Based Data Monetization: A Enterprise Approach to Data Science (EDS)” was an initial cut at describing the framework. It defines data monetization, hypothesis-driven assessments, the objective-based data science framework, and the differences between business intelligence and data science. While it was a good first cut, several refinements (below) have been made to better clarify each phase and the explicit interactions between phases.

Data Science Architecture Insurance Prebind Example

In addition to restructuring the EDS framework and its insurance pre-bind data (all the data that goes into quoting insurance policies) example, it was important to document the data science processes that come with an overall enterprise solution (below).

Data Science Process


Machine Learning Recommender Systems to Aid in Product Identification for Affordable Health Care Insurance Marketplaces

Recommender Systems

The Affordable Health Care Insurance Marketplaces (Exchanges) are massive interconnected governmental systems that might be destined not only to complicate the health insurance process, but also to confuse and bewilder individuals seeking personal and coverage information. But this destiny is not certain, especially if federal and state governments leverage the collective intelligence present in these highly distributed exchanges through a governmental variant of the Machine Learning Recommender System.

As envisioned, these marketplaces are a state-run set of government-regulated and quasi-standardized health care plans in the United States, from which individuals may purchase health insurance eligible for federal subsidies. Each marketplace will interconnect to and be serviced by the Federal Exchange Program System Data Services Hub (AKA Fed Data Hub), which will relay individual, program, and meta information between the exchanges and through several federal agencies (e.g., IRS, Citizenship and Immigration Services, Department of Homeland Security). It is in these hubs that vast amounts of information on users’ preferences, activities, and behaviors (AKA psychographics) are located, and this information can be the basis for predicting what health care users would like or need, based on their similarity to other users in the system.

Recommender systems are a type of information filtering system that seeks to predict the psychographic characteristics (e.g., a rating or preference) that a user would assign to an item (e.g., insurance policy, source of information, desired inquiry response, etc.) or social element (e.g., people or groups) they had not yet considered, using a model built from the characteristics of the item itself (content-based approaches) or the user’s social environment (collaborative filtering approaches). These systems can be used to promote (market) items that are highly likely to be of interest or value to end users. In the vast world of interrelated clusters of information governed by highly complex sets of rules (e.g., exchanges, fed hub, etc.), recommender systems can be the oracle through which valuable information is disseminated to the masses.
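To make the collaborative filtering idea concrete, here is a minimal sketch in Python. It is not how any exchange or the Fed Data Hub works; the user names, plan names, and 1-5 rating scale are all made up for illustration. It predicts how a user might rate a plan they have not considered by taking a similarity-weighted average of the ratings given by other users.

```python
from math import sqrt

# Hypothetical data: user -> {plan name: rating on a 1-5 scale}.
ratings = {
    "ann":  {"bronze": 5, "silver": 3, "gold": 1},
    "bob":  {"bronze": 4, "silver": 3},
    "cara": {"bronze": 1, "silver": 2, "gold": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, item, ratings):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user has rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


# Bob has not rated "gold"; his ratings resemble Ann's more than Cara's,
# so the prediction is pulled toward Ann's low rating.
print(predict_rating("bob", "gold", ratings))
```

A content-based approach would instead compare the attributes of the plans themselves (premium, deductible, covered services) with the attributes of plans the user already prefers.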


While the uses of machine learning recommender systems are bounded only by imagination, here are just a few use cases that could be uniquely valuable in the early days of building out this insurance ecosystem:

  • Educate users about valuable health plan benefits and minimize costs based on what other people with similar economic, lifestyle, and geographic characteristics have selected.
  • Present additional information and resources that are relevant to particular situations faced by others in similar positions.
  • Identify buying patterns and how they might under- or over-expose users to risk (e.g., lack of particular medical coverage for correlative care items).

The beauty of machine learning recommender systems in the Affordable Health Care Insurance Marketplace is that they improve with time. They learn from successful and unsuccessful recommendations, that is, from recommendations that users either act upon or ignore. Their strength is naturally derived from the weakness that often plagues most enterprise systems: unbounded growth. They will grow through the changes that will naturally occur as the uncertain rules are resolved during the ecosystem's adolescence.
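One simple way to picture this learning loop is a ranker that tracks how often each recommendation is acted upon. The sketch below is only an assumption about how such feedback could be used, not a description of any real exchange system; the class name, method names, and plan labels are hypothetical. It scores each item by a Laplace-smoothed acceptance rate, so items with no history start at a neutral 0.5.

```python
from collections import defaultdict


class FeedbackRanker:
    """Ranks items by a smoothed rate of accepted recommendations."""

    def __init__(self):
        self.shown = defaultdict(int)     # times each item was recommended
        self.accepted = defaultdict(int)  # times the user acted on it

    def record(self, item, acted_upon):
        """Log one recommendation and whether the user acted on it."""
        self.shown[item] += 1
        if acted_upon:
            self.accepted[item] += 1

    def score(self, item):
        """Laplace-smoothed acceptance rate: (accepted + 1) / (shown + 2)."""
        return (self.accepted[item] + 1) / (self.shown[item] + 2)

    def rank(self, items):
        """Return items ordered from most to least promising."""
        return sorted(items, key=self.score, reverse=True)


ranker = FeedbackRanker()
for _ in range(3):
    ranker.record("plan_a", acted_upon=True)   # users keep choosing plan_a
    ranker.record("plan_b", acted_upon=False)  # users keep ignoring plan_b

print(ranker.rank(["plan_b", "plan_c", "plan_a"]))
```

The more feedback the system records, the less the smoothing terms matter and the more the ranking reflects what users actually did.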

In the end, as our need for healthcare-related information grows, so will the collective ability of a machine learning recommender system. Its use in the evolving insurance exchanges could be the intellectual catalyst that gives users enough information stability to ride out any confusion or bewilderment that could result from the uncertainty arising out of any new governmental program.