Before you start configuring HBase, you need a running Hadoop cluster, which will provide the storage for HBase. Please refer to the Hadoop cluster setup document before continuing.
On the HBaseMaster (master) machine:
1. In the file /etc/hosts, define the IP addresses of the masterserver, all the regionserver machines, and the namenode machine. Make sure you define the actual IP (e.g. 192.168.1.9) and not the localhost IP (e.g. 127.0.0.1) for all the machines, including the namenode; otherwise the other machines will not be able to connect to the namenode machine.
192.168.1.9 hbase-masterserver
192.168.1.8 hbase-regionserver1
192.168.1.7 hbase-regionserver2
192.168.1.6 hadoop-nameserver
Note: Check that the namenode machine name resolves to the actual IP, not the localhost IP, using "ping hadoop-nameserver".
2. Configure passwordless login from the masterserver to all regionserver machines. Refer to "Configuring passwordless ssh access" for instructions on how to set up passwordless ssh access.
3. Download and unpack hbase-0.20.0.tar.gz from the HBase website to some path on your computer (we'll call the HBase installation root $HBASE_INSTALL_DIR from now on).
4. Edit the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and define JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
5. Edit the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties. (These settings are required on all the nodes in the cluster.)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.master</name>
<value>localhost:60000</value>
<description>The host and port that the HBase master runs at.
A value of 'local' runs the master and a regionserver in
a single process.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-nameserver:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed
Zookeeper true: fully-distributed with unmanaged Zookeeper
Quorum (see hbase-env.sh)
</description>
</property>
</configuration>
Note: Remember to replace the masterserver and regionserver machine names with the real machine names here.
6. Edit $HBASE_INSTALL_DIR/conf/regionservers and add the regionserver machine names, one per line.
hbase-regionserver1
hbase-regionserver2
hbase-masterserver
Note: Add masterserver machine name only if you are running a regionserver on masterserver machine.
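Steps 1 and 6 can be cross-checked with a short script. The following is only a sketch: the function names lookup_host and check_hosts are made up for this example, and it inspects /etc/hosts directly rather than going through the system resolver. It reads a list of machine names (one per line, as in conf/regionservers) and reports any name that is missing from /etc/hosts or mapped to a loopback address.

```shell
# Look up NAME in a hosts-format file and print its first address.
# Usage: lookup_host NAME [HOSTS_FILE]
lookup_host() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }                  # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "${2:-/etc/hosts}"
}

# Report every host in a list file (one name per line) that is missing
# from the hosts file or mapped to a loopback address.
# Returns non-zero if any host fails the check.
# Usage: check_hosts LIST_FILE [HOSTS_FILE]
check_hosts() {
    list=$1
    hosts_file=${2:-/etc/hosts}
    status=0
    while read -r name; do
        [ -n "$name" ] || continue           # skip blank lines
        ip=$(lookup_host "$name" "$hosts_file")
        case "$ip" in
            "")        echo "$name: not defined in $hosts_file"; status=1 ;;
            127.*|::1) echo "$name: maps to loopback address $ip"; status=1 ;;
            *)         echo "$name: $ip" ;;
        esac
    done < "$list"
    return $status
}
```

On the masterserver this could be run as check_hosts "$HBASE_INSTALL_DIR/conf/regionservers".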
On HRegionServer (slave) machine:
1. In the file /etc/hosts, define the IP address of the masterserver machine. Make sure you define the actual IP (e.g. 192.168.1.9) and not the localhost IP (e.g. 127.0.0.1).
192.168.1.9 hbase-masterserver
Note: Check that the masterserver machine name resolves to the actual IP, not the localhost IP, using "ping hbase-masterserver".
2. Configure passwordless login from all regionserver machines to the masterserver machine. Refer to "Configuring passwordless ssh access" for instructions on how to set up passwordless ssh access.
3. Download and unpack hbase-0.20.0.tar.gz from the HBase website to some path on your computer (we'll call the HBase installation root $HBASE_INSTALL_DIR from now on).
4. Edit the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and define JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
5. Edit the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties. (These settings are required on all the nodes in the cluster.)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.master</name>
<value>localhost:60000</value>
<description>The host and port that the HBase master runs at.
A value of 'local' runs the master and a regionserver in
a single process
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-nameserver:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed
Zookeeper true: fully-distributed with unmanaged Zookeeper
Quorum (see hbase-env.sh)
</description>
</property>
</configuration>
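Because the same hbase-site.xml is needed on every node, one option is to edit it once on the masterserver and copy it out. The sketch below is an illustration, not part of the original procedure: the function name push_site_config and the DRY_RUN switch are made up here, and it assumes that HBase is unpacked at the same path on every machine and that passwordless ssh from the masterserver to the regionservers already works. By default it only prints the scp commands it would run.

```shell
# Copy hbase-site.xml from this machine to each host in conf/regionservers.
# Usage: push_site_config CONF_DIR [LOCAL_NAME]
# LOCAL_NAME is the name of this machine in the regionservers file (if
# listed); it is skipped so the file is never copied onto itself.
# The scp commands are only printed unless DRY_RUN=0 is set.
push_site_config() {
    conf_dir=$1
    local_name=${2:-}
    while read -r host; do
        [ -n "$host" ] || continue               # skip blank lines
        [ "$host" != "$local_name" ] || continue # skip this machine
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp $conf_dir/hbase-site.xml $host:$conf_dir/"
        else
            # stdin is redirected so scp cannot consume the host list
            scp "$conf_dir/hbase-site.xml" "$host:$conf_dir/" < /dev/null
        fi
    done < "$conf_dir/regionservers"
}
```

For example, push_site_config "$HBASE_INSTALL_DIR/conf" hbase-masterserver prints one scp command per regionserver; running it again with DRY_RUN=0 performs the copies.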
Start and stop HBase daemons:
You need to start/stop the daemons only on the masterserver machine; doing so starts/stops the daemons on all regionserver machines. Execute one of the following commands to start or stop HBase.
$HBASE_INSTALL_DIR/bin/start-hbase.sh
or
$HBASE_INSTALL_DIR/bin/stop-hbase.sh
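One way to confirm that the daemons came up is to look for their JVM processes with jps, the process lister shipped with the JDK. The sketch below filters jps output for the HMaster and HRegionServer processes; the helper name hbase_daemons is made up for this example.

```shell
# Print the HBase daemons found in `jps` output read on stdin.
# jps prints one "<pid> <main class>" pair per line.
hbase_daemons() {
    awk '$2 == "HMaster" || $2 == "HRegionServer" { print $2 }'
}

# Typical use after start-hbase.sh (not run here):
#   jps | hbase_daemons
```

On the masterserver the output would be expected to include HMaster, and on each regionserver machine HRegionServer; an empty result suggests checking the files under the HBase logs directory.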