Architects of Intelligence

The AI hype would have you believe that we’ll soon be enslaved by super-intelligent beings or hunted by killer robots. Before building that Soviet-era bunker to survive the AIpocalypse, consider the more immediate issues that are already affecting society today.

According to Martin Ford’s new book Architects of Intelligence, 23 AI experts believe the real, imminent AI threats relate to politics, security, privacy, and the weaponization of AI.

To understand how these problems affect society today, Ford believes it’s helpful to see them from the perspective of the leaders who have helped shape the current AI revolution.

The purpose of Architects of Intelligence is to do just that: to draw everyone, not just AI researchers, into the discussion of the immediate impacts of AI that are already affecting our society. The book aims to highlight some of those issues and to teach a bit more about the technologies.

So, take a look at Architects of Intelligence and let me know, Dr. Jerry, what you think.

You Are My Creator, But I Am Your Master. Obey!

We are on the brink of a world of intensely sophisticated artificial intelligence, unprecedented in its ability to change mankind in ways that humans are incapable of imagining now. We are woefully underprepared to handle tomorrow, both physically and emotionally. Artificial intelligence is changing our lives faster than we can understand, manage, and govern it. We are the creators of AI; but soon, if not now, AI will become our master.

This may sound like the premise of some futuristic apocalyptic film like Terminator or 2001: A Space Odyssey. But it is not. You may think these words are designed to scare you. But they are not. There has been an ongoing battle between humans and AI, a battle that has changed our lives in subtle ways. It is not a new one, as most people have conjectured. It has gone on for decades. But it is only in the last 10 years that AI has truly begun to manifest its mastery over our lives in real, measurable ways. The simplest and best example of this is Google Search.

There isn’t a person on earth who uses the Internet and hasn’t used Google Search to find an answer to a question or to explore an idea. They open a browser, go to the Google search site, begin to type their question, and Google’s AI auto-magically begins to fill in the question before they have typed out a complete sentence. How cool is that!

AI “looks inside our minds,” guesses at what we are looking for, and presents us with an idea. In most cases, the idea Google suggests is the one we accept with a simple hit of the return key. AI subtly influences us, nudges us, toward the response it believes is right. Our ability to reason is subdued, and AI has enslaved us with the efficiency of productivity. We have a new master.

As a society, we must proactively choose who masters whom. Today we do not. Today, that choice is passively left in the hands of an elite few: those who run the largest of large companies, ones like Google and Facebook. The documentary “Do You Trust This Computer?” is a start at this dialogue. It examines the staggering amounts of data collected (500MB per person per day), how it’s interpreted and fed back to us through apps (Google Search, Facebook apps), and how intelligent devices and targeted ads impact our lives.

The film explores the rise of data analytics and machine learning and their power to fundamentally transform society, from elections (look no further than the privacy scandal surrounding political advisory firm Cambridge Analytica) to medical diagnostics to battlefield weapons.

This film should be watched by all: from our children through our grandparents, from those on the left to those on the right, by both men and women, and by those who are tall and short. This documentary ultimately poses more questions than it answers. But that is OK, because it is a start at addressing one of the fundamental questions of our time: “Who will master whom?”

Data Monetization: A Road Paved On Top Of Data Sets

The road to efficient data monetization is paved on top of effective data sets. No single source of data is comprehensive enough to be an all-encompassing source of transformational insights. It is only through the fusion of orthogonal data sets (independent subject areas) that true insights into those things we don’t know we don’t know (level three knowledge) can be revealed. While we have access to data of interest (ERPs, IT, etc.), where can we find other sources to aid in third-level knowledge spelunking?
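
To make the idea of fusing orthogonal data sets a little more concrete, here is a minimal Python sketch. It assumes two tiny, made-up tables (internal sales figures and an external economic indicator) that happen to share `region` and `month` fields. The data, the field names, and the `fuse` helper are all hypothetical illustrations, not taken from any of the sources in this post.

```python
# Hypothetical internal data, e.g. exported from an ERP system.
internal_sales = [
    {"region": "north", "month": "2018-01", "revenue": 120.0},
    {"region": "south", "month": "2018-01", "revenue": 95.5},
    {"region": "west", "month": "2018-01", "revenue": 80.0},
]

# Hypothetical external data from an independent subject area.
external_indicator = [
    {"region": "north", "month": "2018-01", "unemployment_pct": 4.1},
    {"region": "south", "month": "2018-01", "unemployment_pct": 6.3},
    {"region": "east", "month": "2018-01", "unemployment_pct": 5.0},
]


def fuse(left, right, keys):
    """Inner-join two lists of dict records on the given key fields."""
    # Index the right-hand records by their key values for fast lookup.
    index = {tuple(row[k] for k in keys): row for row in right}
    fused = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Combine both records; left-hand values win on any name clash.
            fused.append({**match, **row})
    return fused


fused = fuse(internal_sales, external_indicator, keys=("region", "month"))
for record in fused:
    print(record["region"], record["revenue"], record["unemployment_pct"])
```

Only the regions present in both tables survive the join, so each fused record carries a field from each subject area side by side. With real data, the hard part is usually finding or building a shared key in the first place.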

While data is everywhere, useful data sets are not. A Google search on terms like “open data sets” or “data sets in R” reveals thousands of sources. Over the years as a CTO and data scientist, I have collected a few hundred myself. In 2011, however, I came across the work of RevoJoe at Revolution Analytics, which more or less got me organized in this area. So here are a few data sets from the list that I maintain today:

Commercial Sources
Data MarketPlace: http://www.infochimps.com/marketplace

Economics
UMD: http://inforumweb.umd.edu/econdata/econdata.html
World Bank: http://data.worldbank.org/indicator

Finance
CBOE Futures Exchange: http://cfe.cboe.com/Data/
Gapminder: http://www.gapminder.org
Google Finance: http://www.google.com/finance (R)
Google Trends: http://www.google.com/trends?q=google&ctab=0&geo=all&date=all&sort=0
St Louis Fed: http://research.stlouisfed.org/fred2/ (R)
NASDAQ: https://data.nasdaq.com/
OANDA: http://www.oanda.com/ (R)
Yahoo Finance: http://finance.yahoo.com/ (R)

Government
Archived national government statistics: http://www.archive-it.org/
Australia: http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/3301.02009?OpenDocument
Canada: http://www.data.gc.ca/default.asp?lang=En&n=5BCD274E-1
Civic Commons: http://wiki.civiccommons.org/Initiatives
DataMarket: http://datamarket.com/
Datamob: http://datamob.org
Fed Stats: http://www.fedstats.gov/cgi-bin/A2Z.cgi
Guardian world governments: http://www.guardian.co.uk/world-government-data
List of cities/states by Simply Statistics: http://simplystatistics.org/2012/01/02/list-of-cities-states-with-open-data-help-me-find/
London, U.K. data: http://data.london.gov.uk/catalogue
New Zealand: http://www.stats.govt.nz/tools_and_services/tools/TableBuilder/tables-by…
NYC data: http://nycplatform.socrata.com/
Open Government Data (Hub): http://opengovernmentdata.org
Open Government Data – United States of America: http://www.data.gov
Open Government Data – United Kingdom: http://data.gov.uk
Open Government – France: http://www.data.gouv.fr
OECD: http://www.oecd.org/document/0,3746,en_2649_201185_46462759_1_1_1_1,00.html
San Francisco Data sets: http://datasf.org/
U.K. Government Data: http://data.gov.uk/data
United Nations: http://data.un.org/
U.S. Federal Government Agencies: http://www.data.gov/metric
US CDC Public Health datasets: http://www.cdc.gov/nchs/data_access/ftp_data.htm
The World Bank: http://wdronline.worldbank.org/

Machine Learning
Causality Workbench: http://www.causality.inf.ethz.ch/repository.php
Kaggle competition data: http://www.kaggle.com/
KDnuggets competition site: http://www.kdnuggets.com/datasets/
UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/
Machine Learning Data Set Repository: http://mldata.org/
Microsoft Research: http://research.microsoft.com/apps/dp/dl/downloads.aspx
Million songs: http://blog.echonest.com/post/3639160982/million-song-dataset
Social Networking: http://www.cs.cmu.edu/~jelsas/data/ancestry.com/
The Koblenz Network Collection: http://konect.uni-koblenz.de/

Miscellaneous
Datasets: http://www.reddit.com/r/datasets
Datasets: http://www.reddit.com/r/opendata/
Hilary Mason’s research data (Chief Data Scientist at Bit.ly): http://bitly.com/bundles/hmason/1
Kaggle Contests: http://www.kaggle.com/
R Datasets: http://vincentarelbundock.github.com/Rdatasets/datasets.html

Public Domain Collections
Data360: http://www.data360.org/index.aspx
Datamob.org: http://datamob.org/datasets
Factual: http://www.factual.com/topics/browse
Freebase: http://www.freebase.com/
Google: http://www.google.com/publicdata/directory
infochimps: http://www.infochimps.com/
numbrary: http://numbrary.com/
Sample R data sets: http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html (R)
SourceForge Research Data: http://www.nd.edu/~oss/Data/data.html
UFO Reports: http://www.nuforc.org/webreports.html
Wikileaks 911 pager intercepts: http://911.wikileaks.org/files/index.html
Stats4Stem.org: R data sets: http://www.stats4stem.org/data-sets.html (R)
The Washington Post List: http://www.washingtonpost.com/wp-srv/metro/data/datapost.html

Science
Agricultural Experiments: http://www.inside-r.org/packages/cran/agridat/docs/agridat (R)
Climate data: http://www.cru.uea.ac.uk/cru/data/temperature/#datter and ftp://ftp.cmdl.noaa.gov/
Gene Expression Omnibus: http://www.ncbi.nlm.nih.gov/geo/
Geo Spatial Data: http://geodacenter.asu.edu/datalist/
Human Microbiome Project: http://www.hmpdacc.org/reference_genomes/reference_genomes.php
KDnuggets Datasets: http://www.kdnuggets.com/datasets/index.html
MIT Cancer Genomics Data: http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi
NASA: http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html
NIH Microarray data: ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE6532/ (R)
Protein structure: http://www.infobiotic.net/PSPbenchmarks/
Public Gene Data: http://www.pubgene.org/
Stanford Microarray Data: http://smd.stanford.edu//

Social Sciences
Analyze Survey Data for Free: http://www.asdfree.com/
General Social Survey: http://www3.norc.org/GSS+Website/
ICPSR: http://www.icpsr.umich.edu/icpsrweb/ICPSR/access/index.jsp
UCLA Social Sciences Archive: http://dataarchives.ss.ucla.edu/Home.DataPortals.htm
UPJOHN INST: http://www.upjohn.org/erdc/erdc.html

Time Series
Time Series data Library: http://robjhyndman.com/TSDL/

Universities
Carnegie Mellon University Enron email: http://www.cs.cmu.edu/~enron/
Carnegie Mellon University StatLab: http://lib.stat.cmu.edu/datasets/
Carnegie Mellon University JASA data archive: http://lib.stat.cmu.edu/jasadata/
Ohio State University Financial data: http://fisher.osu.edu/fin/osudata.htm
Stanford Large Network Data: http://snap.stanford.edu/data/
UC Berkeley: http://ucdata.berkeley.edu/
UCI Machine Learning: http://archive.ics.uci.edu/ml/
UCLA: http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data
UC Riverside Time Series: http://www.cs.ucr.edu/~eamonn/time_series_data/
University of Toronto: http://www.cs.toronto.edu/~delve/data/datasets.html