Namenode and Datanode are not starting in hadoop


Tags: ubuntu, hadoop, terminal

I installed Hadoop 2.6.0 on my laptop running Ubuntu 14.04 LTS. I started the Hadoop daemons by running start-all.sh, but when I type jps, only four processes are listed:

10545 SecondaryNameNode
10703 ResourceManager
11568 Jps
10831 NodeManager

Previously only the DataNode was not running, so I deleted the tmp folder and created it again. Now neither the NameNode nor the DataNode is running. I also checked whether ports 50070 and 50075 are being used by any other processes, but no process is using them:

tcp        0      0*               LISTEN      1000       52304       6129/java       
tcp        0      0 *               LISTEN      1000       70108       10545/java      
tcp        0      0 *               LISTEN      1000       50441       6129/java       
tcp6       0      0 :::8033                 :::*                    LISTEN      1000       70199       10703/java      
tcp6       0      0 :::8040                 :::*                    LISTEN      1000       74863       10831/java      
tcp6       0      0 :::8042                 :::*                    LISTEN      1000       71055       10831/java      
tcp6       0      0 :::46573                :::*                    LISTEN      1000       74854       10831/java      
tcp6       0      0 :::8088                 :::*                    LISTEN      1000       71049       10703/java      
tcp6       0      0 :::13562                :::*                    LISTEN      1000       71054       10831/java      
tcp6       0      0 :::8030                 :::*                    LISTEN      1000       72716       10703/java      
tcp6       0      0 :::8031                 :::*                    LISTEN      1000       72175       10703/java      
tcp6       0      0 :::8032                 :::*                    LISTEN      1000       72182       10703/java  
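For reference, the port check above can be reproduced with something like the following (the flags are the usual GNU/Linux netstat ones; `lsof -i :50070` is an alternative):

```shell
# Check whether anything is listening on the default NameNode (50070)
# and DataNode (50075) web UI ports in Hadoop 2.x
netstat -tlnp 2>/dev/null | grep -E ':(50070|50075) ' \
  || echo "ports 50070 and 50075 are free"
```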

This is what I have in my datanode logs:

STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = srimanth/
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/local/hadoop/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-annotations-2.6.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/hadoop/share/hadoop/common/lib/snappy-java-*.jar:/contrib/capacity-scheduler/*.jar:/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.7.0_65
2015-01-27 19:30:29,640 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2015-01-27 19:30:31,491 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-01-27 19:30:32,241 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-01-27 19:30:32,655 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-01-27 19:30:32,656 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2015-01-27 19:30:32,672 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is srimanth
2015-01-27 19:30:32,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting DataNode with maxLockedMemory = 0
2015-01-27 19:30:32,826 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /
2015-01-27 19:30:32,838 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2015-01-27 19:30:32,838 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Number threads for balancing is 5
2015-01-27 19:30:33,233 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-01-27 19:30:33,246 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.datanode is not defined
2015-01-27 19:30:33,284 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-01-27 19:30:33,291 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode
2015-01-27 19:30:33,291 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-01-27 19:30:33,292 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-01-27 19:30:33,346 INFO org.apache.hadoop.http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.datanode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2015-01-27 19:30:33,357 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 50075
2015-01-27 19:30:33,358 INFO org.mortbay.log: jetty-6.1.26
2015-01-27 19:30:34,395 INFO org.mortbay.log: Started [email protected]:50075
2015-01-27 19:30:34,443 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnUserName = srimanth
2015-01-27 19:30:34,443 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: supergroup = supergroup
2015-01-27 19:30:34,611 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-01-27 19:30:34,690 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 50020
2015-01-27 19:30:34,938 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /
2015-01-27 19:30:34,993 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
2015-01-27 19:30:35,078 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices: <default>
2015-01-27 19:30:35,119 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (Datanode Uuid unassigned) service to localhost/ starting to offer service
2015-01-27 19:30:35,139 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-01-27 19:30:35,139 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2015-01-27 19:30:36,112 INFO org.apache.hadoop.hdfs.server.common.Storage: DataNode version: -56 and NameNode layout version: -60
2015-01-27 19:30:36,187 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /usr/local/hadoop/hdfs/datanode/in_use.lock acquired by nodename [email protected]
2015-01-27 19:30:36,210 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/ Exiting. 
java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop/hdfs/datanode: namenode clusterID = CID-9748dc33-5035-4bcc-9b51-cb75e0a7eadc; datanode clusterID = CID-41e9d369-787a-4595-8827-6bb3277787e9
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:646)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:320)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:403)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:422)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1311)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1276)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:828)
    at java.lang.Thread.run(Thread.java:745)
2015-01-27 19:30:36,252 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to localhost/
2015-01-27 19:30:36,360 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2015-01-27 19:30:38,360 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2015-01-27 19:30:38,366 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2015-01-27 19:30:38,371 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
SHUTDOWN_MSG: Shutting down DataNode at srimanth/
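The FATAL line is the real failure: `java.io.IOException: Incompatible clusterIDs`. This typically appears after the NameNode has been reformatted (e.g. with `hdfs namenode -format`), which assigns it a fresh clusterID while the DataNode's storage directory still records the old one. You can confirm the mismatch by comparing the two VERSION files; the paths below are assumptions inferred from the lock message in the log, so adjust them to whatever your hdfs-site.xml points to:

```shell
# Compare the clusterID recorded on the NameNode and DataNode sides.
# Paths are assumptions based on the log above; adjust to your config.
grep clusterID /usr/local/hadoop/hdfs/namenode/current/VERSION
grep clusterID /usr/local/hadoop/hdfs/datanode/current/VERSION
# The DataNode refuses to register when these two values differ.
```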

I would appreciate some help. Thank you.


In your hdfs-site.xml file there should be a dfs.data.dir property (dfs.datanode.data.dir in current naming) that points to a local directory. Delete everything under that directory, but not the directory itself, then restart the daemons. Careful: if you have any data on HDFS, you will lose all of it.
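A minimal sketch of those recovery steps, assuming the DataNode storage path shown in the lock message of the log (substitute whatever your dfs.data.dir / dfs.datanode.data.dir property actually points to):

```shell
# Stop all Hadoop daemons first
stop-all.sh

# Wipe the CONTENTS of the DataNode storage directory, not the
# directory itself. WARNING: any HDFS block data on this node is lost.
rm -rf /usr/local/hadoop/hdfs/datanode/*

# Restart; the DataNode re-registers and adopts the NameNode's clusterID
start-all.sh
jps
```

After the restart, jps should list NameNode and DataNode alongside the four daemons that were already running.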

