FAQ Database Discussion Community


using hbase as key-value store, need to extract a value using java

java,hbase,key-value-coding,key-value-store
I am working on using HBase as a key-value store where we have a column family with one single value. The Java filter gets the row in less than a second, but retrieving the value takes 15 seconds. If someone could look into it and give me...

How do I export a table or multiple tables from Hbase shell to a text format?

hadoop,export,hbase,bigdata,software-engineering
I have a table in my HBase shell with huge amounts of data and I would like to export it to a text format on a local file system. Could anyone suggest how to do it? I would also like to know if I could export the Hbase table...

Spring data Hadoop, Hbase Rest API, HBase Java Client : which one would be the best to implement to handle communication between Android and HBase

android,spring,rest,hbase,spring-data-hadoop
Does anybody know what the best method is to communicate between an HBase database and Android? Basically I want to do the following from my Android app to an HBase table: i. Insert data into it. ii. Query the table and get data. iii. Update the table. I had done some research of my own....

Why are hbase memstore size and flushed data size not equal?

hbase,flush
I was monitoring hbase (0.94.18) data storage and found that the memstore size and the size of the flushed store data are not the same. When the memstore data size grows to 128 MB it is flushed to an HFile, but the store file size diff on disk is 36.8 MB. Compaction is turned off....

Not understanding the row-key in Python API of Hbase

python,hbase
I have a hbase table (customers) in the following form: hbase(main):004:0> scan 'customers' ROW COLUMN+CELL 4000001 column=customers_data:age, timestamp=1424123059769, value=55 4000001 column=customers_data:firstname, timestamp=1424123059769, value=Kristina 4000001 column=customers_data:lastname, timestamp=1424123059769, value=Chung 4000001 column=customers_data:profession, timestamp=1424123059769, value=Pilot I tried to extract these data using python API http://happybase.readthedocs.org/en/latest/: import happybase connection =...

How to unit test Java Hbase API

java,hadoop,mocking,hbase,storm
I am using the Java HBase API to get a value from Hbase. This is my code. public class GetViewFromHbaseBolt extends BaseBasicBolt { private HTable table; private String zkQuorum; private String zkClientPort; private String tableName; public GetViewFromHbaseBolt(String table, String zkQuorum, String zkClientPort) { this.tableName = table; this.zkQuorum = zkQuorum; this.zkClientPort...

running hadoop program with hbase stuck at htable declaration

java,apache,hadoop,mapreduce,hbase
I'm implementing a hadoop program which requires hbase. I'm using Hadoop 2.5.1 and HBase 0.20.6 (I first used 0.94.8, but after facing the problem I tried changing to 0.20.6 because the documentation of my original source code told me to; unfortunately that didn't solve the problem). After compiling the code using...

Hbase: Having just the first version of each cell

hadoop,hbase
I was wondering how I can configure HBase to store just the first version of each cell. Suppose the following HTable: row_key cf1:c1 timestamp ---------------------------------------- 1 x t1 After putting ("1","cf1:c2",t2) in the scenario of ColumnDescriptor.DEFAULT_VERSIONS = 2 the mentioned HTable becomes: row_key cf1:c1 timestamp ---------------------------------------- 1...

./bin/hbase shell command not working

hbase,nutch
I am integrating Nutch with HBase. While doing a dummy test of HBase by typing ./bin/hbase shell, I am getting the following error: ./bin/hbase: line 392: /etc/java-7-openjdk//bin/java: No such file or directory Thank you...

Best way to retrieve all the table records in Apache Gora 0.5

apache,hbase,gora
I know about query.setStartKey(startKey); query.setEndKey(endKey); Isn't there something similar to SELECT * FROM TABLE; in Apache Gora while creating queries, that would return the whole result set? EDIT: I executed the program without setting anything, but still the result set is null. Query<String, Obj> query = store.newQuery(); Result<String, Obj>...

Phoenix - No current connection - HRegion.mutateRowsWithLocks : java.lang.NoSuchMethodError

java,hadoop,hbase,phoenix
I am trying to run Phoenix on localhost and can't resolve the error (I can't find where mutateRowsWithLocks is). I would very much like to run SQL queries on HBase, so I hope someone will help me: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: SYSTEM.CATALOG: org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(Ljava/util/Collection;Ljava/util/Collection;)V ..... Caused by: java.lang.NoSuchMethodError:...

Nutch, NoSuchElementException error after removing table from Hbase

hbase,nutch,nosuchelementexception
I use Nutch for crawling some sites. At one point I decided to clear all crawling results and just removed the "webpage" table from the HBase store, using the hbase shell. After that Nutch throws an exception: java.util.NoSuchElementException at java.util.TreeMap.key(TreeMap.java:1221) at java.util.TreeMap.firstKey(TreeMap.java:285) at org.apache.gora.memory.store.MemStore.execute(MemStore.java:125) at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:73) at org.apache.gora.mapreduce.GoraRecordReader.executeQuery(GoraRecordReader.java:68) at...

Is the installation of Pig,Hive,Hbase,Oozie,Zookeeper same in Hadoop 2.0 as in Hadoop 1.0?

hadoop,hive,hbase,apache-pig,oozie
I recently installed hadoop v_2 with the YARN Configuration. I am planning to install Hadoop ecosystem stack such as Pig,Hive,Hbase,Oozie,Zookeeper etc. I would like to know if I should install the tools from the same link that I did for Hadoop 1.0 Configuration. If not, Could anyone please send me...

Why use multiple column families in HBase?

hadoop,hbase
Why use multiple column families in HBase, and what are the advantages of these tuples?

Hbase on hadoop not connecting in distributed mode

hadoop,hbase,bigdata,ubuntu-14.04,distributed
Hi, I am trying to set up HBase (hbase-0.98.12-hadoop2) on Hadoop (hadoop-2.7.0). Hadoop is running on localhost:560070 and running fine. My hbase-site.xml is as shown below: <configuration> <property> <name>hbase.rootdir</name> <value>hdfs://localhost:9000/hbase</value> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> <property> <name>hbase.zookeeper.quorum</name>...

unable to perform CRUD operations on hbase even though the hbase master and region server are up and running

hadoop,hbase
I am trying to run my HBase master from Ambari and it has been started. I even used the jps command to see whether the master is up, and I can see that it is, but when I try to create a table or list tables it is...

Rows with identical keys

hbase,bigdata
When I need to create an HBase row, I have to call the Put(row_key) method. Then, what happens if I call the Put() method again with the same row_key value? Will the existing row be updated, or will HBase create a new row? Is it possible to create 2 rows with identical keys?...
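HBase never holds two rows under one key: a second Put to the same row key writes a new cell version under that key, and a plain Get returns only the newest version. A toy Python model of that behavior (purely illustrative, not the HBase API):

```python
from collections import defaultdict

class ToyTable:
    """Toy model: row_key -> column -> list of (timestamp, value),
    newest first, mimicking how HBase versions cells under one key."""
    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, column, value, ts):
        # A put never creates a duplicate row; it prepends a new version.
        self.rows[row_key][column].insert(0, (ts, value))

    def get(self, row_key, column):
        # A plain get returns only the newest version, like HBase does.
        versions = self.rows[row_key][column]
        return versions[0][1] if versions else None

t = ToyTable()
t.put("row1", "cf:name", "Alice", ts=1)
t.put("row1", "cf:name", "Bob", ts=2)  # same row key: a new version, not a second row
print(len(t.rows))               # 1 -- still a single row
print(t.get("row1", "cf:name"))  # Bob
```

Older versions remain retrievable up to the column family's VERSIONS setting, which is why a Put with an identical key behaves as an update from the reader's point of view.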

How do I split my Hbase table (which is huge) into equal parts so that I can store it in a local file system?

hadoop,export,hbase,bigdata,software-engineering
I have an HBase table of size 53 GB that I want to store in my local file system. However, I have only two drives of size 30 GB each and I can't store the file completely on one drive. Could anyone please tell me how to split and store...

gremlin shell hangs after opening a connection

hadoop,hbase,zookeeper,titan,gremlin
My dev environment is Hadoop 2.6.0, HBase 0.98.10.1-hadoop2, Titan 0.5.3. I tried to open a connection with conf = new BaseConfiguration(); conf.setProperty("storage.backend","hbase"); conf.setProperty("storage.hostname","127.0.0.1"); conf.setProperty("storage.hbase.ext.hbase.zookeeper.property.clientPort","2181") conf.setProperty("storage.hbase.table","smart_titan") g = TitanFactory.open(conf); After that the shell doesn't release control. I verified the ZooKeeper logs; everything looks normal, session establishment and all. Any pointers on this...!!!...

Why do I need to keep hbase/lib folder in hdfs?

hadoop,hbase
I have a main cluster which has some data in Hbase, and I want to replicate it. I've already created a backup cluster and created snapshot of the table I want to replicate. I am trying to export the snapshot from source cluster to destination, but I am getting some...

sqoop-merge: can this command be used on an hbase import?

hbase,sqoop
I use sqoop to import data from SQL Server to hbase. Can I also use the sqoop-merge command to update data in hbase? Thanks...

Estimate row size HBase/HyperTable

hbase,hypertable
Is there a way to estimate row size if I know what kind of data I'll be storing (with compression in mind)? I'm looking at something like bson_id | string (max 200 chars) | int32 | int32 | int32 | bool | bool | DateTime | DateTime | DateTime |...
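A rough pre-compression estimate is possible because HBase stores every cell as a full KeyValue: length prefixes, the row key, column family, qualifier, timestamp and type byte are repeated per cell. A back-of-envelope calculator for the schema in the question; the family and qualifier names and the per-field byte widths are assumptions for illustration:

```python
def cell_size(row, cf, qualifier, value_len):
    """Uncompressed size of one HBase KeyValue, in bytes:
    4 (key length) + 4 (value length) + key + value, where the key is
    2 (row length) + row + 1 (family length) + family + qualifier
    + 8 (timestamp) + 1 (key type)."""
    key_len = 2 + len(row) + 1 + len(cf) + len(qualifier) + 8 + 1
    return 4 + 4 + key_len + value_len

# Hypothetical layout: one cell per field, 12-byte BSON ObjectId as row key,
# single-letter family to keep every repeated key small.
row = b"x" * 12
cf = b"d"
fields = {                      # qualifier -> max value size in bytes (assumed)
    b"name": 200,               # string, max 200 chars
    b"a": 4, b"b": 4, b"c": 4,  # three int32s
    b"f1": 1, b"f2": 1,         # two booleans
    b"t1": 8, b"t2": 8, b"t3": 8,  # three DateTimes as 8-byte longs
}
total = sum(cell_size(row, cf, q, v) for q, v in fields.items())
print(total)  # worst-case uncompressed bytes per row
```

Real on-disk size will be smaller after block compression (repeated row keys and families compress very well) and larger once HFile block and index overhead are amortized in, so treat this as an upper bound per row before compression.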

Hadoop ClassNotFoundException with class that isn't imported anymore

java,hadoop,mapreduce,hbase
I'm using Hadoop 2.5.1 with HBase 0.98.11 on Ubuntu 14.04. I was once using HBase; unfortunately it didn't work as expected. So, I decided to write the multioutput and filereader instead of using HBase. After commenting out all HBase-related lines of code (and no longer including them in the javac -cp)...

Exception in mapreduce code which is accessing Hbase table java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString

mapreduce,hbase
Hi, I am getting the following exception when running the map reduce program. The code has access to an Hbase table and does a Put operation. Exception in thread "main" java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString ...

Switch a disk containing cloudera hadoop / hdfs / hbase data

hadoop,hbase,database-migration,cloudera,disk-partitioning
We have a Cloudera 5 installation based on a single node on a single server. Before adding 2 additional nodes to the cluster, we want to increase the size of the partition using a fresh new disk. We have the following services installed: yarn with 1 NodeManager, 1 JobHistory and...

Difference between MapR-DB and Hbase

hadoop,hbase,mapr
I am a bit new to MapR but I am aware of HBase. I was going through one of the videos where I found that MapR-DB is a NoSQL DB in MapR, similar to HBase. In addition, HBase can also be run on MapR. I am confused...

Ganglia fails to communicate with Apache HBase

hadoop,hbase,monitoring,ganglia
I installed Ganglia to monitor the HBase cluster. I'm using ganglia-3.3.0. Hadoop version: hadoop-1.1.2 HBase version: hbase-0.94.8 My Hadoop cluster comprises 1 master node and 2 slave nodes. The Ganglia gmetad_server is configured on the master node. I changed the hbase/conf/hadoop-metrics.properties file: hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 hbase.period=10 hbase.servers=hostname_of_ganglia_server:8649 I started the service...

how to find out the number of regions for an hbase table?

hbase
I thought this would be easy, but I couldn't find any answer. Hopefully this can be done using command line tools, or a Python tool. Or at least, how do I find out how many HFiles there are? ...
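The region count is visible in the master web UI's table page, and the HBase REST gateway (if running) exposes it as GET /<table>/regions with Accept: application/json. The response shape below is an assumption based on the REST TableInfo model, so verify it against your version; the counting itself is trivial:

```python
import json

# A sample of what GET http://<rest-host>:8080/mytable/regions might return
# (shape assumed from the REST TableInfo model -- verify locally).
sample_response = json.dumps({
    "name": "mytable",
    "Region": [
        {"name": "mytable,,1434.abc.", "startKey": "", "endKey": "m"},
        {"name": "mytable,m,1435.def.", "startKey": "m", "endKey": ""},
    ],
})

def region_count(payload):
    """Count regions in a REST /regions JSON payload."""
    return len(json.loads(payload).get("Region", []))

print(region_count(sample_response))  # 2
```

For HFile counts rather than regions, listing the table's directory under the HBase root in HDFS (hadoop fs -ls -R) is a common low-tech approach.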

How do I access an HBase table in Hive & vice-versa?

hive,hbase,sqoop,apache-sqoop,apache-hive
As a developer, I've created an HBase table for our project by importing data from an existing MySQL table using a sqoop job. The problem is that our data analyst team are familiar with MySQL syntax, which implies they can query a Hive table easily. For them, I need to expose the HBase table in Hive. I...

Get column data from HBase via JRuby script

hadoop,hbase,jruby
I can get value for certain column in HBase table via hbase shell: hbase(main):002:0> scan 'some_table', {STARTROW => '7af02800f4c6478cde0f55e8bce34f4a2efa48f2', LIMIT => 1, COLUMNS => ['foo:bar']} ROW COLUMN+CELL 7af02800f4c6478cde0f55e8bce34f4a2efa48f2 column=foo:bar, timestamp=0, value=http://someurl.com/some/path 1 row(s) in 0.4430 seconds I would like to reproduce the same with jruby script. The following is my...

running Hadoop with HBase: org.apache.hadoop.hbase.client.HTable.(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String

apache,hadoop,mapreduce,hbase,yarn
I'm trying to make a mapreduce program on Hadoop using HBase. I'm using Hadoop 2.5.1 with HBase 0.98.10.1. The program can be compiled successfully and made into a jar file. But when I try to run the jar using "hadoop jar", the program shows an error saying: "org.apache.hadoop.hbase.client.HTable.(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String". Here is...

HBase reading slowing down after reading a few million records

hbase,apache-kafka
I have a batch job scheduled to load about 250 million records from an HBase table to a Kafka queue. The batch initially starts scanning or reading at about 1250 rows/sec, but after reading about 4 to 5 million records the read slows down to 90 rows/sec and stays there forever. I...

Hbase column family design importance

java,hbase
I am studying HBase but can't find the answer to one question. Let's consider the following situation: we have five physical (hardware) servers (0-4). The HMaster is installed on server 0 and four HRegionServers are installed on servers 1-4. And we have one very big table which we need...

MapReduce (Hadoop-2.6.0)+ HBase-1.0.1.1 class not found exception

eclipse,hadoop,mapreduce,hbase
I have written a Map-Reduce program to fetch data from an input file and output it to an HBase table, but I am not able to execute it. I am getting the following error: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration at beginners.VisitorSort.main(VisitorSort.java:123) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at...

How can I customize Sqoop Import serialization from Mysql to HBase?

mysql,serialization,import,hbase,sqoop
Currently, I have a MySQL table "email_history" as below: email_address updated_date modification [email protected] 2014-10-20 NEW:confidence::75|NEW:sources::cif [email protected] 2014-10-20 NEW:confidence::75|NEW:sources::cif|NEW:user::r.wagland The fields "email_address" and "modification" are VARCHAR and "updated_date" is DATE. When importing to HBase, the row key needs to be email_address concatenated with the date represented as a byte array. And the value needs to...

Wrong data coming out when curl from HBASE

rest,curl,hive,hbase
A table scan [scan 'mytable'] from the hbase shell (ssh putty) shows me correct values. But if I give a command from ssh, curl -H "Accept: application/json" http://localhost:54321/mytable/first/cf - it shows all the cells, but the data comes out as junk, e.g. "Zmlyc3Q=" instead of "first". Note: I am using a hortonworks...
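That output is not junk: the HBase REST gateway Base64-encodes row keys, column names and cell values in its JSON responses, because cell contents are arbitrary bytes. The client is expected to decode them, e.g. with the Python standard library:

```python
import base64

# HBase REST JSON carries cell bytes as Base64; "Zmlyc3Q=" is simply
# the bytes of the string "first" encoded that way.
raw = base64.b64decode("Zmlyc3Q=")
print(raw.decode("utf-8"))  # first
```

So a small post-processing step that walks the JSON and Base64-decodes each key/value field recovers the same data the shell scan shows.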

How does HBase mapReduce TableOutputFormat use Flush and WAL

hadoop,mapreduce,hbase
So, while writing to HBase from a MapReduce job that uses TableOutputFormat, how often does it write to HBase? I don't imagine it does a put command for every row. How do we control AutoFlush and the Write Ahead Log (WAL) when using it in MapReduce?...

update query in hive/hbase

hadoop,hive,hbase
I have already created a table in hbase using hive: hive> CREATE TABLE hbase_table_emp(id int, name string, role string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role") TBLPROPERTIES ("hbase.table.name" = "emp"); and created another table to load data on it : hive> create table testemp(id int, name string, role string)...

How to enable compression on an existing Hbase table?

compression,hbase
I have a very big HBase table apData, but it was not set as compressed when it was created. Right now it's 1.5 TB, so I want to enable compression on this table. I did the following: (1) disable apData (2) alter apData,{NAME=>'cf1',COMPRESSION=>'snappy'} (3) enable 'apData'. But when I use "desc apData" to see...

What does the HBase MILLIS_BETWEEN_NEXTS counter represent?

hadoop,mapreduce,hbase
I am running a map reduce job reading from HBase. There are some mappers that are much slower than others and the only significant difference in their counters is MILLIS_BETWEEN_NEXTS. I tried looking for an explanation of the metric but did not find anything. Do you know what this metric...

User used by Java Client API to access Hbase

java,hbase,hadoop2
I am learning HBase. I want to know how a Java client will communicate with HBase data. I can see there are config and HConnectionManager classes to communicate with HBase. I am curious to understand which user ID the client uses for this communication. For example, say I am running a hbaseTest.jar...

Squirrel Setup to connect to Phoenix - HBASE: Error java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NoSuchMethodError:

jdbc,hbase,phoenix,squirrel
I am a newbie to Hbase & phoenix. I am trying to connect to HBASE via Phoenix JDBC Driver using Squirrel Client. Somehow I seem to get a strange error where the runtime complains of a NoSuchMethod Exception. I have included the relevant client jar phoenix-4.4.0-HBase-1.0-client in the lib folder...

error while executing insert overwrite query in hive

hadoop,hive,hbase
I'm using hadoop 1.2, hbase 0.94.8 and hive 0.14. I'm trying to insert data into an hbase table using hive. I have already created the table: CREATE TABLE hbase_table_emp(id int, name string, role string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role") TBLPROPERTIES ("hbase.table.name" = "emp"); and load...

HBase Java client api not connecting

java,hbase
I'm writing a simple "hello world" application using the Java API for HBase. Here's my code: public static void main(String[] args) throws IOException { Configuration conf = HBaseConfiguration.create(); conf.set("hbase.zookeeper.quorum", "localhost"); conf.set("hbase.zookeeper.property.clientPort", "2181"); HTable table = new HTable(conf, "myTable"); Scan s = new Scan(); s.addColumn(Bytes.toBytes("a"), Bytes.toBytes("b")); ResultScanner scanner = table.getScanner(s); for (Result rr...

why is the HBase count operation so slow

cassandra,hbase
The command is: count 'tableName'. It's very slow to get the total row number of the whole table. My situation is: I have one master and two slaves, each node with 16 CPUs and 16 GB of memory. My table only has one column family with two columns: title and Content. The...

cannot start Hadoop daemons: Insufficient memory

java,ubuntu,hadoop,mapreduce,hbase
At first I was able to start the daemons and run jobs properly; then, out of nowhere, I can't start the daemons (start-dfs, start-yarn). After running the .sh the terminal waits forever (as in the picture http://imgur.com/Sr5I5aw). The only way to stop it is ctrl+c. The log hs_error_pidxxxx.log says something about insufficient...

Insert data into Hbase using Hive (JSON file)

json,hadoop,hive,hbase
I have already created a table in hbase using hive: hive> CREATE TABLE hbase_table_emp(id int, name string, role string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role") TBLPROPERTIES ("hbase.table.name" = "emp"); and created another table to load data on it : hive> create table testemp(id int, name string, role string)...

Example: how to represent a table from an RDBMS in HBase

hadoop,nosql,hbase
I read a few articles and watched videos on YouTube about HBase. I understood that HBase is the Hadoop database, and that it has a different architecture (column families etc.) compared to an RDBMS. But I am still not clear on how an RDBMS table would be represented in HBase. Let me know if there is...

Is it true that deleted key-values are removed in Hbase only during major compaction

hbase
According to this, hbase only removes duplicate or deleted key-values during major compaction. In a major compaction, deleted key/values are removed; the new file doesn't contain the tombstone markers, and all the duplicate key/values (replace value operations) are removed. Major compaction merges all HFiles into one big HFile while minor...
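A toy sketch of why tombstones survive minor compactions: a minor compaction merges only a subset of store files, so it must keep delete markers (an untouched file may still hold the shadowed value), while a major compaction sees every file and can safely drop both the tombstone and the cells it masks. Illustrative Python, not HBase internals:

```python
# Toy model of HBase store files: each "HFile" holds (row, ts, kind, value)
# cells, where kind is "put" or "delete" (a tombstone marker).
hfile1 = [("r1", 1, "put", "old"), ("r2", 1, "put", "keep")]
hfile2 = [("r1", 2, "delete", None)]  # tombstone shadowing r1's older put

def minor_compact(files):
    """Merge a SUBSET of files: tombstones must be kept, because a file
    outside this compaction may still contain the cells they shadow."""
    return [[c for f in files for c in f]]

def major_compact(files):
    """Merge ALL files: now it is safe to drop each tombstone and every
    cell it shadows, since no other file can resurrect the value."""
    cells = [c for f in files for c in f]
    tombstones = [(r, ts) for (r, ts, kind, _) in cells if kind == "delete"]
    kept = [(r, ts, kind, v) for (r, ts, kind, v) in cells
            if kind != "delete"
            and not any(r == dr and ts <= dts for (dr, dts) in tombstones)]
    return [kept]

print(major_compact([hfile1, hfile2]))  # only r2's cell survives
```

This is also why disk usage can stay flat after deletes until a major compaction runs: until then the data and its tombstones coexist.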

Storing and reading images from hbase

hbase
I want to store images in my HBase table, and I want to read the image file back from the HBase table as an image. Is it possible to do this in HBase? Image size may not exceed 10 MB. For each row there should be an image associated with it. Wondering how to do it. Need some...

Can I use Hbase snapshots for bulk loading

java,hadoop,hbase
I am wondering if I can use hbase snapshot output for bulk loading? I am trying to load data into another cluster and the org.apache.hadoop.hbase.snapshot.ExportSnapshot doesn't work for me as we have over 1Tb of data to transfer. So I was looking at Snapshots, and it looks like creating and...

Use case HBase on EMR

hadoop,amazon-web-services,hbase,storage,emr
I read the documentation on AWS, but one point is still unclear: is S3 the primary storage of an EMR cluster, or is the data in EC2 with S3 just a copy? In the doc: "HBase on Amazon EMR provides the ability to back up your HBase data...

Using Mockito to test Java Hbase API

unit-testing,junit,hbase,mockito
This is the method that I am testing. This method gets some bytes from an HBase database based on a specific id, in this case called dtmid. The reason why I want to return some specific values is because I realized that there is no way to know if...

NameError: uninitialized constant SingleColumnValueFilter

hbase,cloudera
I am trying to use an hbase filter with this code: hbase(main):001:0> scan 'students', { FILTER => SingleColumnValueFilter.new(Bytes.toBytes('account'),Bytes.toBytes('name'), CompareFilter::CompareOp.valueOf('EQUAL'),BinaryComparator.new(Bytes.toBytes('emp1')))} and this code gives an error like: NameError: uninitialized constant SingleColumnValueFilter Please let me know what I am doing wrong or what I need to do to get the filter result....

how to query hbase based on row keys

hbase
New to HBase: I am working with HBase now and there is something I haven't been able to figure out for a while. I was wondering if you can help me here. I have this HBase table with one column family (cf), certain columns and a row key. My row key contains certain...

java.io.IOException: Merging of credentials not supported in this version of hadoop

hadoop,hive,hbase
I am trying to access a table through Hive created in HBase. The below commands executed successfully. hbase(main):032:0> create 'hbasetohive', 'colFamily' 0 row(s) in 1.9540 seconds hbase(main):033:0> put 'hbasetohive', '1s', 'colFamily:val','1strowval' 0 row(s) in 0.1020 seconds hbase(main):034:0> scan 'hbasetohive' ROW COLUMN+CELL 1s column=colFamily:val, timestamp=1423936170125, value=1strowval 1 row(s) in 0.1170 seconds...

How do I determine the size of my HBase tables? Is there any command to do so?

hadoop,export,hbase,bigdata
I have multiple tables in my HBase shell that I would like to copy onto my file system. Some tables exceed 100 GB. However, I only have 55 GB of free space left in my local file system. Therefore, I would like to know the size of my hbase tables so that I...

reverse domain name row key, automatic splitting, and load balancing

hbase,load-balancing,sharding,primary-key-design
I'm designing an HBase schema with a row key that starts with the domain name reversed. E.g., com.example.www. Although there are many more domains that end in .com than say .org or .edu, I assume that I don't have to manage splitting myself, and I can rely on HBase's automatic...
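Reversing the domain labels groups all rows for one registrable domain together in the sorted keyspace, which is the point of the scheme. A small helper for deriving such keys (the function name and behavior are illustrative, not a standard API):

```python
def reversed_domain_key(hostname):
    """'www.example.com' -> 'com.example.www', so sibling hosts of one
    domain sort adjacently in HBase's lexicographic key order."""
    return ".".join(reversed(hostname.split(".")))

print(reversed_domain_key("www.example.com"))  # com.example.www
```

One consequence of the .com skew: automatic splitting will eventually balance the regions, but until enough splits have happened the region holding the com. prefix carries most of the write load, so some users pre-split on common TLD prefixes to shorten that warm-up period.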

Create External Hive Table Pointing to HBase Table

sql,hadoop,hive,hbase,impala
I have a table named "HISTORY" in HBase having column family "VDS" and the column names ROWKEY, ID, START_TIME, END_TIME, VALUE. I am using Cloudera Hadoop Distribution. I want to provide SQL interface to HBase table using Impala. In order to do this we have to create respective External Table...

What is the difference between JDBC and a Java API?

java,hbase
I'm learning about HBase, which is written in Java and therefore has a Java API. I assumed it also supported JDBC but it looks like it doesn't, and now I'm thinking I don't really understand what JDBC means. What is the difference? What can I do with a Java API...

Unable to access HBase from MapReduce code

java,hadoop,mapreduce,hbase,zookeeper
I am trying to use HDFS file as source and HBase as sink. My Hadoop cluster has following specification: master 192.168.4.65 slave1 192.168.4.176 slave2 192.168.4.175 slave3 192.168.4.57 slave4 192.168.4.146 The Zookeeper nodes are on following ip address: zks1 192.168.4.60 zks2 192.168.4.61 zks3 192.168.4.66 The HBase nodes are on following ip...

LongComparator does not work in HBase/BigTable

java,hbase,google-cloud-bigtable
I'm trying to build some filters to filter data from BigTable. I'm using the bigtable-hbase drivers and hbase drivers. Here are my dependencies from pom.xml: <dependency> <groupId>org.apache.hbase</groupId> <artifactId>hbase-common</artifactId> <version>${hbase.version}</version> </dependency> <dependency> <groupId>org.apache.hbase</groupId> <artifactId>hbase-protocol</artifactId> <version>${hbase.version}</version> </dependency>...

Solr Indexing in Storm topology vs Hbase NG Indexer

indexing,solr,hbase,storm
I am working on designing the data indexing feature into Solr. We are using a Storm topology and have an HBase bolt where it is adding data into HBase. The requirement is that whatever data we add into HBase needs to be indexed as well. The following are the options:...

gremlin hangs with single node hbase, titan-all 0.4.4

cassandra,hbase,analytics,zookeeper,titan
I have set up single-node Hadoop with HBase on top of it. I also set up Titan on it. But as soon as I start Gremlin and do TitanFactory.open(conf), it hangs and nothing happens. My titan-hbase.properties is as follows: storage.backend=hbase storage.hostname=127.0.0.1 storage.port=2181 cache.db-cache = true cache.db-cache-clean-wait = 20 cache.db-cache-time =...

Zookeeper quorum issue with external hbase client when running hbase on Amzon EMR

amazon-web-services,hbase,zookeeper,emr
I am running HBase on Amazon EMR. <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property><name>fs.hdfs.impl</name><value>emr.hbase.fs.BlockableFileSystem</value></property> <property><name>hbase.regionserver.handler.count</name><value>100</value></property>...

secure hbase application - kerberos authentication

security,hbase,kerberos,keytab
I am running an infinite loop for testing expiration of Kerberos credentials; I have the following code: UserGroupInformation.loginUserFromKeytab(user, keytablocn); Configuration config = HBaseConfiguration.create(); HConnection conn = HConnectionManager.createConnection(config); for (;;) { HTableInterface ht = conn.getTable(tableName); getAndPriintRow(rowkey); } I expect the kerberos credentials to expire about 10 hrs after the program starts...

ZooKeeper start successfully but doesn't work

java,hadoop,hbase,zookeeper,apache-kafka
I am trying to deploy a zookeeper ensemble in fully distributed mode using three nodes. After starting the server no entry comes under jps. On giving "zkServer.sh status" the output is: JMX enabled by default Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg Error contacting service. It is probably not running On giving the command...

D3 visualization from Hbase using REST api/json

rest,d3.js,base64,hbase
I have a standalone Vbox setup with hortonworks sandbox. And I have an HBASE table called 'mytable' and column family 'cf'. REST service is started and I can curl through ssh to 127.0.0.1 address to get the data in Base64 encoded format as JSON. Need some help on visualizing this...

Migrate java code from hbase 0.92 to 0.98.0-hadoop2

java,hbase
I have some code, written with hbase 0.92: /** * Writes the given scan into a Base64 encoded string. * * @param scan The scan to write out. * @return The scan saved in a Base64 encoded string. * @throws IOException When writing the scan fails. */ public static String...

Saving multiple versions in HBase cell

java,hadoop,mapreduce,hbase,zookeeper
I am new to HBase. I am trying to save multiple versions in a cell in HBase, but I am getting only the last saved value. I tried the following two commands to retrieve multiple saved versions: get 'Dummy1','abc', {COLUMN=>'backward:first', VERSIONS=>12} and scan 'Dummy1', {VERSIONS=>12} Both returned the output...

Fetch HBase Column in Bash Array Using Impala

bash,hadoop,hbase,impala
I have following data in HBase Table named HISTORY. ID VALUES 51 101 52 102 QUERY="SELECT VALUES FROM HISTORY"; How to apply the above query on HBase table to fetch data in bash array using Impala?...

HBase to Hive example with Scalding

scala,hadoop,hive,hbase,scalding
I'm trying to read data from HBase, process it, and then write to Hive. I'm new to both Scalding and Scala. I have looked into SpyGlass for reading from HBase. It works well: I can read the data and then write it to a file. val data =...

Running HBase in standalone mode but get hadoop “retrying connect to server” message?

hadoop,hbase
I'm trying to run HBase in standalone mode following this tutorial: http://hbase.apache.org/book.html#quickstart I get the following exception when I try to run create 'test', 'cf' in the HBase shell ERROR: org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing I've seen questions here regarding this error, but the solutions haven't worked for me. What...

How can I pre split in hbase

hadoop,hbase
I am storing data in hbase with 5 region servers. I am using the md5 hash of the url as my row key. Currently all the data is getting stored in one region server only. So I want to pre-split the regions so that data will go uniformly across all region servers,...
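With MD5-hex row keys the keyspace is uniformly distributed over 00…0 to ff…f, so evenly spaced split points can be computed up front and handed to HBase at table-creation time. A sketch of the split-point arithmetic only (the create-table call itself is omitted; verify the shell syntax noted below against your HBase version):

```python
import hashlib

def md5_row_key(url):
    """Row key scheme from the question: hex MD5 of the URL (32 chars)."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()

def hex_split_points(num_regions, width=32):
    """num_regions - 1 split keys, evenly spaced over the md5-hex keyspace,
    so each initial region covers an equal slice of the hash range."""
    space = 16 ** width
    return [format(space * i // num_regions, "0{}x".format(width))
            for i in range(1, num_regions)]

splits = hex_split_points(5)
print(splits)  # four keys: 333...3, 666...6, 999...9, ccc...c
```

In the hbase shell these values would feed a pre-split create, along the lines of create 'mytable', 'cf', SPLITS => ['3333...', '6666...', ...], after which uniformly hashed writes land on all five region servers from the start instead of waiting for automatic splits.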

DataStax Cassandra binding with Apache Cassandra

hbase,bigdata,amazon-dynamodb,cassandra-2.0,bigtable
I am trying to use DataStax Cassandra (Community Edition), but am not able to figure out the DataStax git repo for it. Can someone please help me figure out which release of Apache Cassandra is used by DataStax Cassandra (Community Edition)? Or does...

How to get all versions of an hbase cell in a spark newAPIHadoopRDD?

hadoop,hbase,apache-spark
I know that when you use the Get API you can set MAX_VERSION_COUNT to get all versions of a cell. But I didn't find any documentation on how to get all versions of a cell with a map operation on Spark's newAPIHadoopRDD. I've tried a naive result.getColumnCells() and it...

Importtsv command gives : Container exited with a non-zero exit code 1 error

hadoop,hbase,classpath,yarn
I am trying to load a tsv file into an existing hbase table. I am using the following command: /usr/local/hbase/bin$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:value '-Dtable_name.separator=\t' Table-name /hdfs-path-to-input-file But when I execute the above command, I get the following error Container id: container_1434304449478_0018_02_000001 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)...

Spark can't pickle method_descriptor

python,hbase,apache-spark,pickle,happybase
I get this weird error message 15/01/26 13:05:12 INFO spark.SparkContext: Created broadcast 0 from wholeTextFiles at NativeMethodAccessorImpl.java:-2 Traceback (most recent call last): File "/home/user/inverted-index.py", line 78, in <module> print sc.wholeTextFiles(data_dir).flatMap(update).top(10)#groupByKey().map(store) File "/home/user/spark2/python/pyspark/rdd.py", line 1045, in top return self.mapPartitions(topIterator).reduce(merge) File "/home/user/spark2/python/pyspark/rdd.py", line 715, in reduce vals =...

Unable to connect to HBase using Java (in Eclipse) in Cloudera VM

java,eclipse,hadoop,hbase
I am trying to connect to HBase using Java (in Eclipse) in a Cloudera VM, but I am getting the error below. I am able to run the same program from the command line (by converting my program into a jar). My Java program: `import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.HColumnDescriptor; import org.apache.hadoop.hbase.HTableDescriptor; import org.apache.hadoop.hbase.TableName; import org.apache.hadoop.hbase.client.*; import org.apache.hadoop.hbase.util.Bytes; //import...

POM entries for Hadoop (version 2.4) with HBase (version 0.94.18)

maven,hadoop,hbase,pom.xml
What are the POM entries for Hadoop (version 2.4) with HBase (version 0.94.18)? Do I have to use hadoop-core? If so, which version?...

How do I convert my HBase cluster to use Google Cloud Bigtable?

hbase,bigtable,google-cloud-bigtable
I'm currently running an HBase cluster on Google Cloud Platform and would like to switch it to Cloud Bigtable -- what should I do?

HBase Get values where rowkey in

hadoop,apache-spark,hbase,apache-spark-sql
How do I get all the values in HBase given rowkey values? val tableName = "myTable" val hConf = HBaseConfiguration.create() val hTable = new HTable(hConf, tableName) val theget = new Get(Bytes.toBytes("1001-A")) // rowkey values (1001-A, 1002-A, 2010-A, ...) val result = hTable.get(theget) val values = result.listCells() The code above only works...

Scan.addColumn or QualifierFilter to retrieve values

hbase
To retrieve the values of a specific column in HBase, should I use Scan.addColumn or a QualifierFilter? I am wondering which method gives better performance...

How to export data from hbase to SQL Server

sql-server,hbase,sqoop
How can I export data from HBase to SQL Server? Can I do it directly with some tool? I used Sqoop to move data from SQL Server into HBase, but how can I use sqoop-export to export data from HBase to SQL Server? Thanks...

Does Hadoop use HBase as an “auxiliar” between the map and the reduce step? [closed]

hadoop,mapreduce,hbase,mapper
Or does HBase not have anything to do with this process? I have read that HBase works on top of Hadoop, and I have seen some diagrams that show HBase as part of the MapReduce part of Hadoop, but I have not found anything concrete that answers my question...

Writing from webMethods to HBase

java,hbase,apache-pig
I am not from a Java background, so my question may be very easy, but I need clear steps on how to implement this. Existing project: webMethods connects to an Oracle database to fetch certain properties files and insert log information into some tables. Problem: many times the database goes down and...

`hbase.rootdir` configuration from job setup not honoured

java,hadoop,mapreduce,hbase
I was running MapReduce jobs on HDFS, on data present in HBase tables. While I was playing with configurations, I observed this: conf.set( "hbase.rootdir", "hdfs://" + hdfsRootNodeIp + ":" + hdfsRootPort + "/" + hbaseDirectoryName ); For the above code, I understand that hbaseDirectoryName should be the folder created...

Copy Hbase table to another with different queue for map reduce

hadoop,mapreduce,hbase
I run the CopyTable action on HBase: hbase -Dhbase.client.scanner.caching=100000 -Dmapred.map.tasks.speculative.execution=false org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=desc src but the MapReduce job is spawned on the default queue. How do I run this task on a different application queue?...
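CopyTable is an ordinary MapReduce job, so the target queue can be chosen the same way as for any other job, with the `mapreduce.job.queuename` property (`mapred.job.queue.name` on older Hadoop 1 setups). A sketch, keeping the asker's command and assuming a queue named `myqueue` exists on the cluster:

```shell
hbase -Dhbase.client.scanner.caching=100000 \
  -Dmapred.map.tasks.speculative.execution=false \
  -Dmapreduce.job.queuename=myqueue \
  org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=desc src
```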

Multiple FilterLists in HBase

nosql,hbase
Is it possible to have multiple FilterLists while performing a scan in HBase? If yes, how? By multiple FilterLists I do not mean multiple Filters.

Streaming from HBase using Spark not serializable

scala,hbase,apache-spark
I am trying to stream data from HBase using Spark. When I run the Scala script, this is the error I get: ERROR Executor: Exception in task 0.0 in stage 10.0 (TID 10) java.io.NotSerializableException: org.apache.hadoop.hbase.io.ImmutableBytesWritable At first I thought my data was formatted incorrectly, so I tried creating...

Export data from Hbase to hadoop

hadoop,export,hbase
I want to export HBase table data to Hadoop as a .txt file so that I can use other tools to import the .txt file into SQL Server. I tried hbase org.apache.hadoop.hbase.mapreduce.Export test /usr/hadoop/hadoop-2.2.0/test but it only gave me a folder, not a .txt file. Can anyone help? Thanks...
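The Export tool writes Hadoop SequenceFiles into a directory, which is why a folder of binary files appears rather than text. For a small table, a quick workaround (a sketch; `test` is the asker's table name, the output path is an assumption) is to pipe a shell scan into a local file:

```shell
echo "scan 'test'" | hbase shell > /tmp/test.txt
```

The output includes shell banners and formatting, so it usually needs post-processing; for large tables, a small MapReduce job or a Hive external table over HBase is a more robust way to produce delimited text.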

Google Cloud Bigtable coprocessor support

hbase,phoenix,google-cloud-bigtable
Google Cloud Bigtable doesn't support coprocessors: "Coprocessors are not supported. You cannot create classes that implement the interface org.apache.hadoop.hbase.coprocessor." https://cloud.google.com/bigtable/docs/hbase-differences I can understand that coprocessors require deployment of custom code (jars) on each tablet (RS) node. Still, Endpoint coprocessors are vital to HBase applications to ensure data locality in some...

Spark gives NullPointerException during InputSplit for HBase

scala,hadoop,mapreduce,hbase,apache-spark
I am using Spark 1.2.1, HBase 0.98.10 and Hadoop 2.6.0. I got a null pointer exception while retrieving data from HBase. Find the stack trace below. [sparkDriver-akka.actor.default-dispatcher-2] DEBUG NewHadoopRDD - Failed to use InputSplit#getLocationInfo. java.lang.NullPointerException: null at scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:114) ~[scala-library-2.10.4.jar:na] at scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:114) ~[scala-library-2.10.4.jar:na] at...

Uploading HFiles to HBase fails because of a method-not-found error

hadoop,mapreduce,hbase,hdfs
I am trying to upload HFiles to HBase using bulk load, but while doing so I encounter a method-not-found error. The command and logs are given below. Command: hadoop jar /usr/lib/hbase/lib/hbase-server-0.98.11-hadoop2.jar completebulkload /output NBAFinal2010 where /output is the HFiles output folder and NBAFinal2010 is a table in HBase. Logs: 15/05/05...