Wednesday, June 17, 2009

HBase setup (0.19.3)

Before you begin:

Before you start configuring HBase, you need a running Hadoop cluster, which will serve as the storage layer for HBase. Please refer to the Hadoop cluster setup document before continuing.

On the HBaseMaster (master) machine:

1. In the file /etc/hosts, define the IP addresses of the namenode machine and all the datanode machines. Make sure you define the actual IP (e.g. 192.168.1.9) and not the localhost IP (e.g. 127.0.0.1) for all the machines, including the namenode; otherwise the datanodes will not be able to connect to the namenode machine.

    192.168.1.9    hbase-masterserver
    192.168.1.8    hbase-regionserver1
    192.168.1.7    hbase-regionserver2
    192.168.1.6    hadoop-nameserver

    Note: Check that the namenode machine name resolves to its actual IP and not the localhost IP, using "ping hadoop-nameserver".
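
The same check can be scripted for every name in /etc/hosts at once. The following is only a sketch: it assumes the example hostnames used above and that the getent utility is available, and the function names is_loopback and check_host are made up for this illustration.

```shell
#!/bin/sh
# Sketch: report whether each cluster hostname resolves to a real
# (non-loopback) address. Hostnames are the example names from /etc/hosts.

# Returns 0 if the given address is a loopback address.
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}

# Prints one status line for the given hostname.
check_host() {
    ip=$(getent hosts "$1" 2>/dev/null | awk 'NR==1 {print $1}')
    if [ -z "$ip" ]; then
        echo "$1: does not resolve"
    elif is_loopback "$ip"; then
        echo "$1: resolves to loopback address $ip (fix /etc/hosts)"
    else
        echo "$1: resolves to $ip"
    fi
}

for host in hbase-masterserver hbase-regionserver1 hbase-regionserver2 hadoop-nameserver; do
    check_host "$host"
done
```

Any line reporting a loopback address points at an /etc/hosts entry that maps the machine name to 127.x.x.x.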

2. Configure passwordless login from the masterserver to all regionserver machines. Refer to Configuring passwordless ssh access for instructions on how to set up passwordless ssh access.

3. Download hbase-0.19.3.tar.gz from the HBase website and unpack it to some path on your computer (we'll call the HBase installation root $HBASE_INSTALL_DIR from now on).

4. Edit the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and define the $JAVA_HOME.

    export JAVA_HOME=/usr/lib/jvm/java-6-sun
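
A wrong JAVA_HOME path is easy to miss until the daemons fail to start, so it can help to test the value first. This is a minimal sketch; java_home_ok is a hypothetical helper name, and the only thing it checks is that an executable bin/java exists under the directory.

```shell
#!/bin/sh
# Returns 0 if the directory passed in looks like a Java install,
# i.e. it contains an executable bin/java.
java_home_ok() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if java_home_ok "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks valid: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no executable bin/java: ${JAVA_HOME:-<unset>}"
fi
```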

5. Edit the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties. (These configurations are required on all the nodes in the cluster.)

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>
        <property>
            <name>hbase.master</name>
            <value>hbase-masterserver:60000</value>
            <description>The host and port that the HBase master runs at.
            A value of 'local' runs the master and a regionserver in
            a single process.
            </description>
        </property>

        <property>
            <name>hbase.rootdir</name>
            <value>hdfs://hadoop-nameserver:9000/hbase</value>
            <description>The directory shared by region servers.</description>
        </property>

        <property>
            <name>hbase.regionserver.class</name>
            <value>org.apache.hadoop.hbase.ipc.IndexedRegionInterface</value>
            <description>This configuration is required to enable indexing on
            hbase and to be able to create secondary indexes
            </description>
        </property>

        <property>
            <name>hbase.regionserver.impl</name>
            <value>
            org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer
            </value>
            <description>This configuration is required to enable indexing on
            hbase and to be able to create secondary indexes
            </description>
        </property>
    </configuration>
                
    Note: Remember to replace the masterserver and nameserver machine names with your real machine names here.
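
Since every node needs matching values, it can be useful to print what a given hbase-site.xml actually contains. The sketch below is line-oriented rather than a real XML parser: it assumes each <name> and its <value> sit on their own lines with the value on a single line, as hbase.master and hbase.rootdir do in the listing above (it will not read the multi-line hbase.regionserver.impl value). get_property is a hypothetical helper name.

```shell
#!/bin/sh
# Prints the <value> of the named property from a Hadoop-style XML config.
# Usage: get_property FILE PROPERTY_NAME
get_property() {
    awk -v want="$2" '
        index($0, "<name>" want "</name>") { found = 1; next }
        /<\/property>/ { found = 0 }
        found && /<value>.*<\/value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$1"
}

conf="${HBASE_INSTALL_DIR:-.}/conf/hbase-site.xml"
if [ -f "$conf" ]; then
    echo "hbase.master  = $(get_property "$conf" hbase.master)"
    echo "hbase.rootdir = $(get_property "$conf" hbase.rootdir)"
else
    echo "no hbase-site.xml at $conf"
fi
```

Running this on the masterserver and on each regionserver and comparing the output is a quick way to spot a node whose hbase.rootdir differs from the others.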

6. Edit $HBASE_INSTALL_DIR/conf/regionservers and add the regionserver machines, one hostname per line.

    hbase-regionserver1
    hbase-regionserver2
    hbase-masterserver

    Note: Add masterserver machine name only if you are running a regionserver on masterserver machine.
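
Because the start script reaches every machine in this file over ssh, the passwordless login from step 2 can be checked against the same list. This is a sketch under a few assumptions: list_regionservers is a hypothetical helper, treating "#" as a comment marker is a choice made for this example, and the ssh options BatchMode and ConnectTimeout are used so that a missing key fails quickly instead of prompting.

```shell
#!/bin/sh
# Prints the hostnames in a regionservers file, skipping blank lines
# and anything after a '#'.
list_regionservers() {
    sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

servers="${HBASE_INSTALL_DIR:-.}/conf/regionservers"
if [ -f "$servers" ]; then
    for host in $(list_regionservers "$servers"); do
        # BatchMode makes ssh fail instead of asking for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: passwordless ssh OK"
        else
            echo "$host: passwordless ssh FAILED"
        fi
    done
else
    echo "no regionservers file at $servers"
fi
```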

On HRegionServer (slave) machine:


1. In the file /etc/hosts, define the IP address of the masterserver machine. Make sure you define the actual IP (e.g. 192.168.1.9) and not the localhost IP (e.g. 127.0.0.1).

    192.168.1.9    hbase-masterserver

Note: Check to see if the masterserver machine ip is being resolved to actual ip not localhost ip using "ping hbase-masterserver".

2. Configure passwordless login from all regionserver machines to the masterserver machine. Refer to Configuring passwordless ssh access for instructions on how to set up passwordless ssh access.

3. Download hbase-0.19.3.tar.gz from the HBase website and unpack it to some path on your computer (we'll call the HBase installation root $HBASE_INSTALL_DIR from now on).

4. Edit the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and define the $JAVA_HOME.

    export JAVA_HOME=/usr/lib/jvm/java-6-sun

5. Edit the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties. (These configurations are required on all the nodes in the cluster.)

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>
        <property>
            <name>hbase.rootdir</name>
            <value>hdfs://rajeevks-lx:9000/hbase</value>
            <description>The directory shared by region servers.</description>
        </property>

        <property>
            <name>hbase.regionserver.class</name>
            <value>org.apache.hadoop.hbase.ipc.IndexedRegionInterface</value>
            <description>This configuration is required to enable indexing on
            hbase and to be able to create secondary indexes
            </description>
        </property>

        <property>
            <name>hbase.regionserver.impl</name>
            <value>
            org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer
            </value>
            <description>This configuration is required to enable indexing on
            hbase and to be able to create secondary indexes.
            </description>
        </property>
    </configuration>

Start and Stop hbase daemons:

You need to start/stop the daemons only on the masterserver machine; the scripts will start/stop the daemons on all the regionserver machines. Execute one of the following commands to start or stop HBase.

    $HBASE_INSTALL_DIR/bin/start-hbase.sh
    or
    $HBASE_INSTALL_DIR/bin/stop-hbase.sh
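
After starting, one way to confirm that the daemons came up is the JDK's jps tool, which lists running JVMs as "pid main-class". The sketch below assumes the daemons appear under the names HMaster and HRegionServer; has_daemon is a hypothetical helper that reads jps output on standard input.

```shell
#!/bin/sh
# Reads `jps` output on stdin; returns 0 if a JVM whose main-class
# name equals the argument is listed.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

if command -v jps >/dev/null 2>&1; then
    for daemon in HMaster HRegionServer; do
        if jps | has_daemon "$daemon"; then
            echo "$daemon is running on this machine"
        else
            echo "$daemon is not running on this machine"
        fi
    done
else
    echo "jps not found; it ships with the JDK in its bin directory"
fi
```

Run it on the masterserver to look for HMaster and on each regionserver to look for HRegionServer.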

4 comments :

Rajgopal V said...

In client hbase-site.xml configuration, What is hbase.rootdir ?

<value>hdfs://rajeevks-lx:9000/hbase</value>

Should it not be the same as the value of the hbase.rootdir in the master's hbase-site.xml file ?

Jerry said...

Hi,Rajeev,

Excuse me, i have a problem.
It will appear when i run my program for a moment(about 6 minutes). "Task attempt_201101261050_0100_m_000000_1 failed to report status for 601 seconds. Killing!" Because it takes a long time. But i don't know how to correct it. Could you help me? Thank you very much!

Spark

Pranay said...

Hi Rajeev,
I am having problem with hbase. i have configured hadoop and it is working fine. But after configuring hbase, the UI of HMaster is not launching and also not creating table in shell. Following are the details:
I am using hadoop 0.20.2 and HBase 0.90.4 on 2 nodes
1. umaster : namenode, sec namenode, jobtracker, HMaster
/etc/hosts file :
172.25.20.74 umaster
172.25.20.93 slavee

2. slavee : datanode, tasktracker, HRegionserver
/etc/hosts file :
172.25.20.74 umaster
172.25.20.93 slavee


UI for HMaster i.e umaster:60010 gives error like problem accessing master.jsp caused by exception: HRegionInfo was null or empty

And while creating table it gives error like retriesexhaustedexception and gives exception as java.io.IOException: HRegionInfo was null or empty in -ROOT-

Please give me the solution for this.

Anonymous said...

issue the command $jps and check if the HMaster is running or not.