Archive for the ‘hadoop’ Category

Random thoughts

April 29th, 2009 Arun Manivannan

1) Read that the open source movement got a big boost during the 2000 slowdown, and researchers expect another one now. It is already evident in the growing number of participants and projects. A blessing in disguise, but let's keep those who were laid off in our prayers.

2) Planning to integrate IzPack into Snapman over the weekend. Successfully implemented the InfixToPostFix conversion and evaluation algorithm. I need more tricks in my bag (man, I really wish I had a good computer science education).

3) Had trouble running Azureus on Jaunty Jackalope (Ubuntu 9.04). Running it from the terminal showed that it was looking for OpenJDK 6, so I uninstalled gcj and installed OpenJDK. A noob fix, but it solved my problem.

4) Bug #1 in Ubuntu bugs, hahaha: https://launchpad.net/ubuntu/+bug/1 Yup, that's the real bug.

I think I seriously need to develop some expertise so that I can write something for my blog daily.

HBase Tutorial

October 26th, 2008 Arun Manivannan

These are some excellent links for learning HBase (or rather, for unlearning the RDBMS way of thinking):
http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable
http://jimbojw.com/wiki/index.php?title=Understanding_HBase_column-family_performance_options

I use Ubuntu (it saves me a lot of trouble setting up ssh and the like). For Ubuntu users, here is your link:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)

Categories: hadoop, hbase, tutorial

Hadoop and HBase – setting up as pseudo-distributed

October 26th, 2008 Arun Manivannan

For many months I have been away from blogging for personal reasons. By the grace of God, things have turned out perfectly well for me, so I am back to experimenting with some really cool stuff.

These instructions are an excerpt from

http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)

and the Hadoop and HBase API documentation.

1) Download Hadoop (hadoop-0.18.1)
2) Download HBase (hbase-0.18.0)
3) Set up the environment as pseudo-distributed, as follows.

a) Open the hadoop-env.sh file in the conf directory of Hadoop and change JAVA_HOME to point to the JDK on your own machine. Mine looks like this:

export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.07

b) Open the hbase-env.sh file in the conf directory of HBase and set JAVA_HOME the same way.

c) Edit hbase-site.xml so that the configuration element looks like this:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/hbase</value>
  </property>
</configuration>

d) Make hadoop-site.xml look like the following:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

e) Run the following command inside the bin directory of your Hadoop installation:
./start-all.sh

eg : arun@arun-laptop:~/softy/hadoop-0.18.1/bin$ ./start-all.sh

f) Run start-hbase.sh from the bin directory of HBase to start HBase:

eg : arun@arun-laptop:~/softy/hbase-0.18.0/bin$ ./start-hbase.sh
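
Steps a) and b) are easy to leave half done, and a missing JAVA_HOME tends to show up only once the start scripts run. Below is a minimal sketch of a check one could run before steps e) and f). The function name check_java_home is hypothetical (it is not part of Hadoop or HBase), and it assumes the value is written unquoted, as in my hadoop-env.sh above.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Hadoop or HBase.
# Succeeds only if FILE has an uncommented "export JAVA_HOME=..." line
# whose value is an existing directory. Assumes the value is unquoted.
check_java_home() {
    env_file=$1
    # Use the last matching line, since later exports override earlier ones.
    java_home=$(sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=//p' "$env_file" | tail -n 1)
    if [ -z "$java_home" ]; then
        echo "$env_file: JAVA_HOME is not set"
        return 1
    fi
    if [ ! -d "$java_home" ]; then
        echo "$env_file: JAVA_HOME=$java_home is not a directory"
        return 1
    fi
    echo "$env_file: JAVA_HOME=$java_home"
}
```

With the directory layout from this post, it could be called as check_java_home ~/softy/hadoop-0.18.1/conf/hadoop-env.sh and again for ~/softy/hbase-0.18.0/conf/hbase-env.sh.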

You can play with the shell if you want to:

arun@arun-laptop:~/softy/hbase-0.18.0/bin$ ./hbase shell

There is an example at http://hadoop.apache.org/hbase/docs/r0.18.0/api/index.html which I used to test my setup. The example needs a table 'myTable' with a column family 'myColumnFamily'.

Execute the following command in the HBase shell:

create 'myTable', 'myColumnFamily'

Here is the output I got when I ran the example program:

Found row: myRow with value: timestamp=1225023942810, value=columnQualifier1 value!

Let me know if I can help you with any setup problems. I didn't copy hadoop-site.xml to the HBase conf directory. Later I learnt that doing so is useful only in a “truly” distributed environment.

The above setup instructions were just derived from
a) http://hadoop.apache.org/core/docs/r0.18.1/api/index.html
and
b) http://hadoop.apache.org/hbase/docs/r0.18.0/api/index.html

Food for the bots: here is the list of exceptions I got while setting this up:

INFO  20:39:19 [org.apache.hadoop.hbase.client.HConnectionManager$TableServers (202):getMaster]: Attempt 0 of 10 failed with <java.io.IOException: Call failed on local exception>. Retrying after sleep of 5000

Exception in thread "main" org.apache.hadoop.hbase.MasterNotRunningException: localhost:60000

The log above just means that the client could not reach the HBase master at localhost:60000, because Hadoop and HBase had not been started at all. Check their logs for exceptions.
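
One quick way to see what is actually running is jps, which ships with the JDK and prints one "pid ClassName" line per Java process. The sketch below filters that output for the daemons I would expect on this single-machine setup. The function name missing_daemons is hypothetical, and the default list (NameNode, DataNode, HMaster) is my assumption for this configuration, so pass your own list if yours differs.

```shell
#!/bin/sh
# Hypothetical helper: reads `jps` output ("<pid> <ClassName>" per line)
# on stdin and reports every expected daemon that is absent.
# The default daemon list is an assumption for this pseudo-distributed setup.
missing_daemons() {
    expected=${1:-"NameNode DataNode HMaster"}
    # Keep only the class-name column of the jps output.
    running=$(awk '{print $2}')
    status=0
    for daemon in $expected; do
        # -x matches whole lines, so SecondaryNameNode does not count as NameNode.
        if ! printf '%s\n' "$running" | grep -qxF "$daemon"; then
            echo "not running: $daemon"
            status=1
        fi
    done
    return $status
}
```

Usage would be jps | missing_daemons, or jps | missing_daemons "NameNode DataNode JobTracker TaskTracker" to check a different set.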

Remember, Hadoop must be started first, followed by HBase.
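
To avoid getting the order wrong, the two start scripts can be wrapped in one small function. This is only a sketch: start_stack is a hypothetical name, and the two arguments are assumed to be the unpacked Hadoop and HBase directories. It also assumes that a non-zero exit status from start-all.sh means Hadoop did not come up; a zero status is not proof that every daemon stayed alive, so the logs are still worth a look.

```shell
#!/bin/sh
# Hypothetical wrapper around the two start scripts used in this post.
# $1: Hadoop directory (e.g. ~/softy/hadoop-0.18.1)
# $2: HBase directory  (e.g. ~/softy/hbase-0.18.0)
start_stack() {
    hadoop_home=$1
    hbase_home=$2
    # Hadoop goes first, because HBase keeps its data in HDFS.
    if ! "$hadoop_home/bin/start-all.sh"; then
        echo "Hadoop failed to start; not starting HBase"
        return 1
    fi
    "$hbase_home/bin/start-hbase.sh"
}
```

On my machine the call would be start_stack ~/softy/hadoop-0.18.1 ~/softy/hbase-0.18.0.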