Thursday, June 18, 2009

Using HBase in Java (0.19.3)

Create an HBaseConfiguration object to connect to an HBase server. The configuration object needs to know where to read the HBase settings from, so add the hbase-site.xml file to it as a resource.
    
    HBaseConfiguration conf = new HBaseConfiguration();
    conf.addResource(new Path("/opt/hbase-0.19.3/conf/hbase-site.xml"));

Create an HTable object for a table in HBase. The HTable object is your connection to that table.

    HTable table = new HTable(conf, "test_table");

Create a BatchUpdate object for a row to collect the update operations (such as put and delete) for that row.

    BatchUpdate batchUpdate = new BatchUpdate("test_row1");
    batchUpdate.put("columnfamily1:column1", Bytes.toBytes("some value"));
    // Column names take the form family:qualifier.
    batchUpdate.delete("columnfamily1:column2");

Commit the changes to the table using the HTable#commit() method.

    table.commit(batchUpdate);
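
In this version of the API a column is addressed by a single string of the form family:qualifier; the put() and get() calls in this post pass names such as columnfamily1:column1. The following is a minimal sketch of how such a name breaks into its two parts, using only the JDK. ColumnName and its fields are hypothetical names made up for illustration, not HBase classes, and rejecting a name without a colon is an assumption of this sketch.

```java
// Hypothetical helper, not part of HBase: splits "family:qualifier" at the first colon.
class ColumnName {
    final String family;
    final String qualifier;

    private ColumnName(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    // Parses a column name; a name without a colon is rejected because it
    // does not say which column family it belongs to (assumption of this sketch).
    static ColumnName parse(String column) {
        int colon = column.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException(
                "Expected family:qualifier but got: " + column);
        }
        return new ColumnName(column.substring(0, colon), column.substring(colon + 1));
    }

    public static void main(String[] args) {
        ColumnName c = parse("columnfamily1:column1");
        System.out.println("family=" + c.family + " qualifier=" + c.qualifier);
    }
}
```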

To read one column value from a row, use the HTable#get() method.

    Cell cell = table.get("test_row1", "columnfamily1:column1");
    if (cell != null) {
        String valueStr = Bytes.toString(cell.getValue());
        System.out.println("test_row1:columnfamily1:column1 " + valueStr);
    }
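
HBase stores cell values as raw byte arrays, which is why strings are wrapped in Bytes.toBytes() when written and unwrapped with Bytes.toString() when read. Both helpers use UTF-8. The sketch below reproduces that round trip with only the JDK, so it can be run without a cluster. The class and method names (Utf8RoundTrip, toBytes, toStringValue) are made up for illustration and are not part of HBase.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the String helpers in org.apache.hadoop.hbase.util.Bytes.
class Utf8RoundTrip {

    // Encode a String as UTF-8 bytes, the form in which a value is handed to HBase.
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes back into a String; in this sketch null in gives null out.
    static String toStringValue(byte[] b) {
        return b == null ? null : new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = toBytes("some value");
        System.out.println(stored.length + " bytes -> " + toStringValue(stored));
    }
}
```

Values that are not strings (numbers, for example) need their own encoding; if they are stored as text, they come back as text and have to be parsed after reading.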

To read a whole row and then pick out individual columns, use the HTable#getRow() method.
 
    RowResult singleRow = table.getRow(Bytes.toBytes("test_row1"));
    Cell cell = singleRow.get(Bytes.toBytes("columnfamily1:column1"));
    if(cell!=null) {
        System.out.println(Bytes.toString(cell.getValue()));
    }

    cell = singleRow.get(Bytes.toBytes("columnfamily1:column2"));
    if(cell!=null) {
        System.out.println(Bytes.toString(cell.getValue()));
    }

To get multiple rows, use a Scanner and iterate through it to get the values.

    Scanner scanner = table.getScanner(
        new String[] { "columnfamily1:column1" });


    // First approach to iterate the scanner.

    RowResult rowResult = scanner.next();
    while (rowResult != null) {
        System.out.println("Found row: " + Bytes.toString(rowResult.getRow())
            + " with value: " +
            rowResult.get(Bytes.toBytes("columnfamily1:column1")));

        rowResult = scanner.next();
    }

    // The other approach is to use a foreach loop. Scanners are iterable!
    // Use it instead of the loop above, not after it: a scanner returns each row only once.
    for (RowResult result : scanner) {
        System.out.println("Found row: " + Bytes.toString(result.getRow())
            + " with value: " +
            result.get(Bytes.toBytes("columnfamily1:column1")));

    }

    scanner.close();
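
The two loops above are alternatives. A scanner hands out each row once, so if both loops run against the same scanner, the second one has nothing left to read. The sketch below shows the same behavior with a JDK-only stand-in. OneShotRows is a hypothetical class, not the HBase Scanner, and it assumes that the scanner's iterator shares its position with next().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a scanner: rows can be read once, either via
// next() or via the iterator, and both share the same position.
class OneShotRows implements Iterable<String> {
    private final Iterator<String> cursor;

    OneShotRows(List<String> rows) {
        this.cursor = rows.iterator();
    }

    // Returns the next row, or null when there are no more rows.
    String next() {
        return cursor.hasNext() ? cursor.next() : null;
    }

    @Override
    public Iterator<String> iterator() {
        return cursor; // same cursor, so rows already read are not repeated
    }

    // First approach: call next() until it returns null.
    static int countWithWhile(OneShotRows rows) {
        int count = 0;
        String row = rows.next();
        while (row != null) {
            count++;
            row = rows.next();
        }
        return count;
    }

    // Second approach: a foreach loop over the iterable.
    static int countWithForEach(OneShotRows rows) {
        int count = 0;
        for (String row : rows) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        OneShotRows scanner = new OneShotRows(Arrays.asList("row1", "row2", "row3"));
        System.out.println("while loop saw " + countWithWhile(scanner) + " rows");
        System.out.println("foreach loop saw " + countWithForEach(scanner) + " rows");
    }
}
```

To walk the rows a second time against a real table, open a new scanner with table.getScanner() rather than reusing the old one.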

Example Code:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scanner;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.io.Cell;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseExample {

    public static void main(String args[]) throws IOException {

        HBaseConfiguration conf = new HBaseConfiguration();
        conf.addResource(new Path("/opt/hbase-0.19.3/conf/hbase-site.xml"));

        HTable table = new HTable(conf, "test_table");

        BatchUpdate batchUpdate = new BatchUpdate("test_row1");
        batchUpdate.put("columnfamily1:column1", Bytes.toBytes("some value"));
        // Column names take the form family:qualifier.
        batchUpdate.delete("columnfamily1:column2");
        table.commit(batchUpdate);

        Cell cell = table.get("test_row1", "columnfamily1:column1");
        if (cell != null) {
            String valueStr = Bytes.toString(cell.getValue());
            System.out.println("test_row1:columnfamily1:column1 " + valueStr);
        }

        RowResult singleRow = table.getRow(Bytes.toBytes("test_row1"));
        // Reuse the variable declared above; declaring it again does not compile.
        cell = singleRow.get(Bytes.toBytes("columnfamily1:column1"));
        if(cell!=null) {
            System.out.println(Bytes.toString(cell.getValue()));
        }

        cell = singleRow.get(Bytes.toBytes("columnfamily1:column2"));
        if(cell!=null) {
            System.out.println(Bytes.toString(cell.getValue()));
        }

        Scanner scanner = table.getScanner(
            new String[] { "columnfamily1:column1" });

        //First approach to iterate a scanner
        RowResult rowResult = scanner.next();
        while (rowResult != null) {
            System.out.println("Found row: " + Bytes.toString(rowResult.getRow())
                + " with value: " +
                rowResult.get(Bytes.toBytes("columnfamily1:column1")));

            rowResult = scanner.next();
        }

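        // The first loop exhausted the scanner, so open a new one before
        // trying the other approach: a foreach loop. Scanners are iterable!
        scanner.close();
        scanner = table.getScanner(
            new String[] { "columnfamily1:column1" });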
        for (RowResult result : scanner) {
            // print out the row we found and the columns we were looking for
            System.out.println("Found row: " + Bytes.toString(result.getRow())
                + " with value: " +
                result.get(Bytes.toBytes("columnfamily1:column1")));

        }

        scanner.close();
        table.close();
    }
}

9 comments:

Anonymous said...

Can you please guide me on how to use HBase in Java programs? I tried running your example, but it gives this error:

11/05/02 17:44:40 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 0 time(s).
11/05/02 17:44:42 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 1 time(s).
11/05/02 17:44:44 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 2 time(s).
11/05/02 17:44:46 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 3 time(s).
11/05/02 17:44:48 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 4 time(s).
11/05/02 17:44:50 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 5 time(s).
11/05/02 17:44:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 6 time(s).
11/05/02 17:44:54 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 7 time(s).
11/05/02 17:44:56 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 8 time(s).
11/05/02 17:44:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 9 time(s).
11/05/02 17:44:59 INFO client.HConnectionManager$TableServers: Attempt 0 of 10 failed with . Retrying after sleep of 2000

Please let me know where I am going wrong, and please suggest a simple way to use HBase.

RamaKrishna said...

hi raj..
very good..post...
can u give where i need to use this
hbase in real time??
and what is difference between
RDBMS and HBase??

Thanks,
Ramky

Anonymous said...

Hi everyone,

For the problem "Retrying connect to server: localhost/127.0.0.1:60000. Already tried 0 time(s).", you can try the following configuration; it might help:

Configuration conf = HBaseConfiguration.create();

conf.addResource(new Path("$Hbase_Home/conf/core-site.xml"));
conf.addResource(new Path("$Hbase_Home/conf/hdfs-site.xml"));
conf.addResource(new Path("$Hbase_Home/conf/hbase-site.xml"));

regards
anoop

Anonymous said...

Hi,

It can't resolve:
org.apache.hadoop.hbase.client.Scanner
org.apache.hadoop.hbase.io.BatchUpdate;
org.apache.hadoop.hbase.io.Cell;
org.apache.hadoop.hbase.io.RowResult;

I have added all the external jar files.

Anonymous said...

Hi,
I have created a table in HBase with 12 columns in each row, and each column has 8 qualifiers. When I try to read a complete row, it returns the correct value for 1:1 in row 1 but returns null for 1:2.
It reads all the columns from 2 to 12 correctly.
Please help me solve this problem.
I am using this code for reading. It is inside a for loop that runs from 1 to 12.
train[0][i] = Double.parseDouble(Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("1"))));
train[1][i] = Double.parseDouble (Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("2"))));
train[2][i] = Double.parseDouble (Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("3"))));
System.out.println("train" + i + ": " + train[2][i]);
train[3][i] = Double.parseDouble (Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("4"))));
train[4][i] = Double.parseDouble (Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("5"))));
train[5][i] = Double.parseDouble (Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("6"))));


train[6][i] = Double.parseDouble (Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("7"))));
train[7][i] = Double.parseDouble (Bytes.toString (r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("8"))));

Anonymous said...

Most of BatchUpdate is deprecated

anju said...

Hi, I tried your program, but the classes and functions are deprecated. Could you upload the functions that correspond to the new API?

Alok said...

I tried the example code from your post, but I am getting an error. Please tell me where I am wrong.

thank you

I run
javac -classpath ~/packages/hbase-0.94.6/hbase-0.94.6.jar:~/packages/hbase-0.94.6/lib/*:~/packages/hadoop-1.1.2/*.jar:~/packages/hadoop-1.1.2/lib/*.jar HBaseExample.java -verbose

please look at the error I am getting at http://paste.ubuntu.com/5674111/

vikas singh said...

Hi, my HBase is running on Ubuntu on another machine, say at IP 192.168.1.45, and I am using Windows 8 at IP 192.168.42.55. I want to connect to HBase from Windows using Eclipse. Can anyone tell me how to connect and what code I should use?