UPDATE: U.S. On The Brink: Near-Depression Levels Losses In Wealth Expected


U.S. employers’ labor costs held at a five-year high through the third quarter of 2014. Economists believe this is being driven by a tightening labor market, which often puts pressure on companies to raise wages and salaries. According to the Bureau of Labor Statistics, wages and salaries, which make up about 70% of compensation costs, rose 0.7% over the last two quarters.


In the original “U.S. On The Brink: Near-Depression Levels Losses In Wealth Expected” article, the median wealth loss was projected to be 18% to 27% over the next 2 to 5 years, respectively. This was driven by a decline in the Wealth to Income index and a lower-than-expected rise in Median Income. Given this sustained change in wages and salaries, the following revised losses in wealth are based on an upward revision of the projected mean Median US Income:

[Table: revised projected losses in wealth (image not available)]

The revised analysis now shows a median wealth loss of 15% to 23% over the next 2 to 5 years, respectively. This means that a family with the median net wealth of $182K (Federal Reserve, 2013) is likely to see it fall to $154K by 2016 and $140K by 2019.
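The dollar figures follow from applying the cumulative percentage loss to the $182K base. A minimal sketch of that arithmetic is shown here; the function name is illustrative and not taken from the original analysis, and because the quoted percentages are rounded, the results land within about $1K of the figures in the text.

```python
def project_wealth(base_wealth, cumulative_loss_pct):
    """Return the wealth remaining after a cumulative percentage loss."""
    return base_wealth * (1 - cumulative_loss_pct / 100)

# Median net wealth cited in the text (Federal Reserve, 2013).
base = 182_000

# Revised cumulative losses quoted in the text: 15% by 2016, 23% by 2019.
revised_losses = {2016: 15, 2019: 23}

for year, loss_pct in revised_losses.items():
    remaining = project_wealth(base, loss_pct)
    print(f"{year}: ${remaining / 1000:.1f}K after a {loss_pct}% loss")
```

The same function applied to the original 18% and 27% projections gives roughly $149K and $133K.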



U.S. On The Brink: Near-Depression Levels Losses In Wealth Expected

The U.S. is on the brink of witnessing some of the largest economic losses in net wealth since the Great Depression. The US Wealth to Income index (reported in the Credit Suisse Global Wealth Report 2014) has exceeded its mean 3rd quartile for only the fourth time in history (see below). While the significance of this most recent event cannot be overstated, one can determine the actual economic impact likely to be seen with a bit of time series and probabilistic modeling.

[Figure: US Wealth to Income index over time (image not available)]

In order to quantify the impact on US wealth, we need to forecast the future US Wealth to Income index, along with the expected Median Income for the same period. Let’s start by looking at a few of the more interesting characteristics of the Wealth to Income index. A stationarity analysis (Augmented Dickey-Fuller test) of the index data indicates that we cannot reject the null hypothesis that the series is non-stationary (Dickey-Fuller = -2.3486, Lag order = 0, p-value = 0.4319). The series therefore needs differencing before it can be modeled, which is what Autoregressive Integrated Moving Average (ARIMA) time series modeling provides.
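To make the unit-root idea concrete, here is a minimal sketch of the simplest Dickey-Fuller regression (no constant, no trend, no lagged differences). This is a simplification and not the variant behind the statistic reported above, so it will not reproduce those numbers. The index values are made up for illustration, and the -1.95 cutoff is the approximate large-sample 5% critical value for this no-constant form.

```python
import math

def dickey_fuller_stat(series):
    """t-statistic of rho in the regression  dy[t] = rho * y[t-1] + e[t].

    Simplest Dickey-Fuller form: no constant, no trend, no lagged differences.
    """
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    residuals = [d - rho * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(r * r for r in residuals) / (len(diffs) - 1)
    return rho / math.sqrt(sigma2 / sxx)

# Approximate 5% critical value for the no-constant Dickey-Fuller regression.
CRITICAL_5PCT = -1.95

# Hypothetical index values, NOT the Credit Suisse data.
index = [4.9, 5.0, 5.2, 5.1, 5.4, 5.8, 6.1, 5.6, 5.3, 5.5, 5.9, 6.2]

stat = dickey_fuller_stat(index)
print("statistic:", round(stat, 3))
if stat < CRITICAL_5PCT:
    print("reject the unit-root null: series looks stationary")
else:
    print("cannot reject the unit-root null: difference the series first")
```

A statistic above the critical value (less negative) means non-stationarity cannot be rejected, which is the situation described in the text.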

ARIMA models are the most general class of models for forecasting a time series that can be made “stationary” by differencing (if necessary), perhaps in conjunction with nonlinear transformations such as logging or deflating (if necessary). An ARIMA model is classified as an “ARIMA(p,d,q)” model, where:

  • p is the number of autoregressive terms, 
  • d is the number of nonseasonal differences, and 
  • q is the number of lagged forecast errors in the prediction equation.
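The roles of p and d can be illustrated with a deliberately reduced case. The sketch below is an ARIMA(1,1,0) forecast written in plain Python: it differences the series once (d = 1), fits one autoregressive term by least squares (p = 1), and integrates the forecasted differences back to levels. It omits moving-average terms (q = 0), so it is not the ARIMA(1,1,2) or ARIMA(2,1,0) models used in this article, and all function names are illustrative.

```python
def difference(series):
    """First difference: the d = 1 step of ARIMA(p, 1, q)."""
    return [b - a for a, b in zip(series[:-1], series[1:])]

def fit_ar1(values):
    """Least-squares fit of  x[t] = c + phi * x[t-1]  (p = 1, q = 0).

    Assumes the values are not all identical (otherwise the variance is zero).
    """
    prev, curr = values[:-1], values[1:]
    n = len(prev)
    mean_prev, mean_curr = sum(prev) / n, sum(curr) / n
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    return mean_curr - phi * mean_prev, phi

def forecast_arima_110(series, steps):
    """Forecast the differences with AR(1), then sum them back into levels."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # next forecasted change
        level += last_diff                # integrate back to the level
        forecasts.append(level)
    return forecasts

# Hypothetical series, for illustration only.
history = [5.0, 5.2, 5.1, 5.4, 5.8, 6.1, 5.9, 6.2]
print([round(v, 3) for v in forecast_arima_110(history, steps=3)])
```

In practice a statistics package would also estimate the moving-average terms and the forecast intervals, which this sketch leaves out.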

Through experimental evaluation, the most appropriate model was found to be ARIMA(1,1,2). It was used to forecast 10 years ahead, and the forecast was appended to the original data series to produce the graph below. The graph shows the fitted mean, the forecasted mean, the upper and lower 95% confidence bounds, and the historical Wealth to Income data.

[Figure: Wealth to Income index, historical data with fitted and 10-year forecasted values (image not available)]

At first glance, one expects an equal likelihood of realizing either the forecasted upper or lower values. However, history can provide event-oriented insights that allow a more probabilistic approach to determining the most likely forecast. Given a certain threshold value of the Wealth to Income index, we can count the number of years it takes for the index to return to its pre-threshold level once the threshold is exceeded. For example, if we set a Wealth to Income index threshold of 5.5, the mean number of years spent above this threshold is 4.6 yrs, with a standard deviation (sd) of 2.198 and a standard error (se) of 0.98. In addition, the upper and lower 95% confidence levels are 6.52 and 2.68 yrs, respectively. Here is a complete table of years spent above a Wealth to Income threshold value:

[Table: years spent above each Wealth to Income threshold value (image not available)]
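The threshold statistics can be sketched as follows. The reported figures for the 5.5 threshold appear consistent with se = sd / sqrt(n) for about five episodes and with a normal-approximation interval of mean ± 1.96 × se (4.6 ± 1.96 × 0.98 gives roughly 2.68 and 6.52); the episode count is an inference, not something stated in the text. The index values below are hypothetical, and dropping an episode that is still in progress at the end of the series is an assumption of this sketch.

```python
import math
import statistics

def years_above(index_by_year, threshold):
    """Lengths of completed runs of consecutive years above the threshold.

    A run still in progress at the end of the series is dropped, because the
    index has not yet returned to its pre-threshold level.
    """
    runs, current = [], 0
    for value in index_by_year:
        if value > threshold:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    return runs

def summarize(runs, z=1.96):
    """Mean, sample sd, standard error, and normal-approximation 95% bounds."""
    mean = statistics.mean(runs)
    sd = statistics.stdev(runs)
    se = sd / math.sqrt(len(runs))
    return {"mean": mean, "sd": sd, "se": se,
            "lower": mean - z * se, "upper": mean + z * se}

# Hypothetical yearly index values, for illustration only.
index = [5.2, 5.6, 5.8, 5.4, 5.7, 5.9, 6.0, 5.3, 5.6, 5.1, 5.8, 6.1]
runs = years_above(index, threshold=5.5)
print(runs)
print(summarize(runs))
```

With only a handful of episodes, a t-based interval would be wider than the 1.96 multiplier suggests, so the bounds are best read as rough guides.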

With this new threshold data, one can see that the Wealth to Income index stays above the 6.0 level for only 1.08 to 4.42 yrs. Given that the current phase is already 2 yrs into the cycle, it is more likely that the Wealth to Income index will decline in the next 2 years. Thus, we can reject the upper bounds of the forecast model and accept the lower bounds (forecasted lower 95%) for modeling purposes.

A similar analysis to the one above was used to forecast the median US Income (see below). In this case, the ARIMA(2,1,0) model was experimentally found to best represent this time series. The median US income is projected to have low to moderate growth over the next ten years and does not show the significant volatility seen in the Wealth to Income index. Given some of the downward economic and regulatory pressures, the lower bounds (forecasted lower 95%) of the forecast will be used in the analysis.

[Figure: median US income, historical data with fitted and forecasted values (image not available)]

The last step in the analysis is to compute the cumulative percentage change (cumPercentWealthDiff) in wealth as a function of the forecasted Wealth to Income index and US Median Income. The table below shows the results of multiplying the respective values and differencing them over the periods in question.

[Table: cumulative percentage change in wealth (cumPercentWealthDiff) by period (image not available)]
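The calculation described above can be sketched in a few lines: implied wealth is the Wealth to Income index multiplied by median income, and the cumulative change is measured against the first period. The function name borrows the cumPercentWealthDiff label from the text, but measuring against the first period is an assumption about how the original column was built, and the input numbers are hypothetical rather than the article's forecasts.

```python
def cum_percent_wealth_diff(wealth_to_income, median_income):
    """Cumulative % change in implied wealth relative to the first period.

    Implied wealth for each period = Wealth to Income index * median income.
    """
    wealth = [w * m for w, m in zip(wealth_to_income, median_income)]
    base = wealth[0]
    return [100 * (w - base) / base for w in wealth]

# Hypothetical lower-bound forecasts, for illustration only.
index_forecast = [6.0, 5.7, 5.4, 5.2]
income_forecast = [51_000, 51_200, 51_500, 51_900]

for pct in cum_percent_wealth_diff(index_forecast, income_forecast):
    print(f"{pct:+.1f}%")
```

Because the index falls faster than income rises in these made-up inputs, the cumulative change is negative, which mirrors the mechanism behind the projected losses.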

The analysis shows a median wealth loss of 18% to 27% over the next 2 to 5 years, respectively. This means that a family with the median net wealth of $182K (Federal Reserve, 2013) is likely to see it fall to $150K by 2016 and $133K by 2019. By comparison, the Federal Reserve said that during the 2007-2010 recession the median net worth of families plunged by 39 percent in just three years, from $126,400 in 2007 to $77,300 in 2010. This analysis appears to be consistent with the reality seen over the last few years.

The cause-and-effect relationship behind this correlative model remains unclear. So, while some can probably find faults with this analysis (e.g., by assuming the Wealth to Income index continues to increase, as it did during the Depression), the final story seems likely to remain the same: a dramatic loss in wealth for the United States over the next few years. The only real question that now remains is identifying and implementing the best investment strategy to undertake given that we are on this brink. I hear there are great specials going on at MattressesAreUs.com.



Data Monetization: A Road Paved On Top Of Data Sets

The road to efficient data monetization is paved on top of effective data sets. No single source of data is comprehensive enough to be an all-encompassing source of transformational insights. It is only through the fusion of orthogonal data sets (independent subject areas) that true insights into those things we don’t know we don’t know (level three knowledge) can be revealed. While we have access to data of interest (ERPs, IT, etc.), where can we find other sources to aid in this third-level knowledge spelunking?

While data is everywhere, useful data sets are not. A Google search on terms like “open data sets” or “data sets in R” reveals thousands of sources. Over the years as a CTO and Data Scientist, I have collected a few hundred myself. In 2011, however, I came across the work of RevoJoe at Revolution Analytics, which more or less got me organized in this area. So here are a few data sets from the list that I maintain today:

Commercial Sources
Data MarketPlace: http://www.infochimps.com/marketplace

UMD: http://inforumweb.umd.edu/econdata/econdata.html
World bank: http://data.worldbank.org/indicator

CBOE Futures Exchange: http://cfe.cboe.com/Data/
Gapminder: http://www.gapminder.org
Google Finance: http://www.google.com/finance (R)
Google Trends: http://www.google.com/trends?q=google&ctab=0&geo=all&date=all&sort=0
St Louis Fed: http://research.stlouisfed.org/fred2/ (R)
NASDAQ: https://data.nasdaq.com/
OANDA: http://www.oanda.com/ (R)
Yahoo Finance: http://finance.yahoo.com/ (R)

Archived national government statistics: http://www.archive-it.org/
Australia: http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/3301.02009?OpenDocument
Canada: http://www.data.gc.ca/default.asp?lang=En&n=5BCD274E-1
Civic Commons: http://wiki.civiccommons.org/Initiatives
DataMarket: http://datamarket.com/
Datamob: http://datamob.org
Fed Stats: http://www.fedstats.gov/cgi-bin/A2Z.cgi
Guardian world governments: http://www.guardian.co.uk/world-government-data
List of cities/states by Simply Statistics: http://simplystatistics.org/2012/01/02/list-of-cities-states-with-open-data-help-me-find/
London, U.K. data: http://data.london.gov.uk/catalogue
New Zealand: http://www.stats.govt.nz/tools_and_services/tools/TableBuilder/tables-by…
NYC data: http://nycplatform.socrata.com/
Open Government Data (Hub): http://opengovernmentdata.org
Open Government Data – United States of America: http://www.data.gov
Open Government Data – United Kingdom: http://data.gov.uk
Open Government – France: http://www.data.gouv.fr
OECD: http://www.oecd.org/document/0,3746,en_2649_201185_46462759_1_1_1_1,00.html
San Francisco Data sets: http://datasf.org/
U.K. Government Data: http://data.gov.uk/data
United Nations: http://data.un.org/
U.S. Federal Government Agencies: http://www.data.gov/metric
US CDC Public Health datasets: http://www.cdc.gov/nchs/data_access/ftp_data.htm
The World Bank: http://wdronline.worldbank.org/

Machine Learning
Causality Workbench: http://www.causality.inf.ethz.ch/repository.php
Kaggle competition data: http://www.kaggle.com/
KDNuggets competition site: www.kdnuggets.com/datasets/
UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/
Machine Learning Data Set Repository: http://mldata.org/
Microsoft Research: http://research.microsoft.com/apps/dp/dl/downloads.aspx
Million songs: http://blog.echonest.com/post/3639160982/million-song-dataset
Social Networking: http://www.cs.cmu.edu/~jelsas/data/ancestry.com/
The Koblenz Network Collection: http://konect.uni-koblenz.de/

Datasets: http://www.reddit.com/r/datasets
Datasets: http://www.reddit.com/r/opendata/
Hilary Mason’s research data (Chief Data Scientist at Bit.ly): http://bitly.com/bundles/hmason/1
Kaggle Contests: http://www.kaggle.com/
R Datasets: http://vincentarelbundock.github.com/Rdatasets/datasets.html

Public Domain Collections
Data360: http://www.data360.org/index.aspx
Datamob.org: http://datamob.org/datasets
Factual: http://www.factual.com/topics/browse
Freebase: http://www.freebase.com/
Google: http://www.google.com/publicdata/directory
infochimps: http://www.infochimps.com/
numbrary: http://numbrary.com/
Sample R data sets: http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html (R)
SourceForge Research Data: http://www.nd.edu/~oss/Data/data.html
UFO Reports: http://www.nuforc.org/webreports.html
Wikileaks 911 pager intercepts: http://911.wikileaks.org/files/index.html
Stats4Stem.org: R data sets: http://www.stats4stem.org/data-sets.html (R)
The Washington Post List: http://www.washingtonpost.com/wp-srv/metro/data/datapost.html

Agricultural Experiments: http://www.inside-r.org/packages/cran/agridat/docs/agridat (R)
Climate data: http://www.cru.uea.ac.uk/cru/data/temperature/#datter and ftp://ftp.cmdl.noaa.gov/
Gene Expression Omnibus: http://www.ncbi.nlm.nih.gov/geo/
Geo Spatial Data: http://geodacenter.asu.edu/datalist/
Human Microbiome Project: http://www.hmpdacc.org/reference_genomes/reference_genomes.php
KDnuggets Datasets: http://www.kdnuggets.com/datasets/index.html
MIT Cancer Genomics Data: http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi
NASA: http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html
NIH Microarray data: ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE6532/ (R)
Protein structure: http://www.infobiotic.net/PSPbenchmarks/
Public Gene Data: http://www.pubgene.org/
Stanford Microarray Data: http://smd.stanford.edu//

Social Sciences
Analyze Survey Data for Free: http://www.asdfree.com/
General Social Survey: http://www3.norc.org/GSS+Website/
ICPSR: http://www.icpsr.umich.edu/icpsrweb/ICPSR/access/index.jsp
UCLA Social Sciences Archive: http://dataarchives.ss.ucla.edu/Home.DataPortals.htm
UPJOHN INST: http://www.upjohn.org/erdc/erdc.html

Time Series
Time Series data Library: http://robjhyndman.com/TSDL/

Carnegie Mellon University Enron email: http://www.cs.cmu.edu/~enron/
Carnegie Mellon University StatLab: http://lib.stat.cmu.edu/datasets/
Carnegie Mellon University JASA data archive: http://lib.stat.cmu.edu/jasadata/
CMU Statlib: http://lib.stat.cmu.edu/datasets/
Ohio State University Financial data: http://fisher.osu.edu/fin/osudata.htm
Stanford Large Network Data: http://snap.stanford.edu/data/
UC Berkeley: http://ucdata.berkeley.edu/
UCI Machine Learning: http://archive.ics.uci.edu/ml/
UCLA: http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data
UC Riverside Time Series: http://www.cs.ucr.edu/~eamonn/time_series_data/
University of Toronto: http://www.cs.toronto.edu/~delve/data/datasets.html