Machine Learning

The success of companies to effectively monetize their information is dependent on how efficiently they can identify revelations in their data sources. While Enterprise Data Science (EDS) is one of the necessary methodologies needed to organically and systematically achieve this goal, it is but one of many such needed frameworks.

Machine Learning, a subdomain of artificial intelligence and a branch of statistical learning, is one such computational methodology composed of techniques and algorithms that enables computing devices to improve their recommendations based on effectiveness of previous experiences (learn). Machine learning is related to data mining (often confused with) and relies on techniques from statistics, probability, numerical analysis, and pattern recognition.

There is a wide variety of machine learning tasks, successful applications, and implementation frameworks.  Mahout, one of the more popular frameworks is a open source project based on Apache Hadoop. Mahout currently can be used for

  • Collaborative filtering (Recommendation systems – user based, item based)
  • Clustering
  • Classification

Varad Meru created and is sharing this introductory Mahout presentation; one that is an excellent source of basis information, as well as implementation details.

1 COMMENT

Leave a Reply