DSI 001: Integrating R and Hadoop with RHadoop

2013-08-25 22:00:19

This is the first in a series of screencasts designed to demonstrate practical aspects of data science. In this episode, I will show you how to integrate R, that awe-inspiring statistical processing environment, with Hadoop, the master of distributed data storage and processing. Once that is done, we will use the RHadoop environment to count the words in that massive classic, “Moby Dick.”

In this screencast, we are going to set up a Hadoop environment on Mac OS X; download, install, and configure Hadoop; download and install R and RStudio; download and load the RHadoop packages; configure R; and finally, create and execute a test MapReduce job. Let me show you exactly how all this works.
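To make the word-count job concrete, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It is not the script from the screencast, which uses R with the RHadoop packages, and it does not touch Hadoop at all; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) and the word-splitting rule are assumptions made only for illustration.

```python
import re
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    # Assumption: a "word" is a run of lowercase letters or apostrophes.
    return [(word, 1) for word in re.findall(r"[a-z']+", line.lower())]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in memory over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    # Made-up sample input; the real job reads the full text of the book.
    sample = ["Call me Ishmael.", "Call me whatever you like."]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run on many machines and the framework does the grouping in between; here everything happens in one process so the data flow is easy to follow.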

The scripts for this screencast will be posted over the next couple of days.



Categories: Data Science, R, RHadoop/Hadoop, Screencast


1 reply

  1. Hi,
    Nice video. I am trying to integrate RHadoop and I am following your example. Can you please post the script for DSI001? Thank you very much for the video.
