flume,flume-ng,flume-twitter
I am using the configuration below to push Twitter feeds into HDFS using Flume, but I get the error "Expected timestamp in the Flume event headers, but it was null". twitter.conf: TwitterAgent.sources = Twitter TwitterAgent.channels = MemChannel TwitterAgent.sinks = HDFS TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource TwitterAgent.sources.Twitter.channels = MemChannel TwitterAgent.sources.Twitter.consumerKey = xxxxxxxxxxxxxxxxxxxxx TwitterAgent.sources.Twitter.consumerSecret = xxxxxxxxxxxxxxxxxxxxxxxx...
hadoop,bigdata,router,syslog,flume
I am trying to collect syslog messages from 10 devices (routers). I learned that I can use the syslog source, but I need clarification about the host and port in the properties: are they the local host and port on the machine where the Flume agent is running? Also, how do I redirect syslogs to...
java,elasticsearch,flume
I have a question about the TTL in the Elasticsearch sink of Apache Flume. I'm working on an Elasticsearch + Flume integration, using Elasticsearch version 1.4.1 and Flume version 1.5.2. Both are running locally on my machine. In Flume, my ElasticSearch sink is configured as follows: agent.sinks.elasticSearchSink.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink agent.sinks.elasticSearchSink.channel...
java,flume
I've just started learning Big Data, and I'm currently working with Flume. The common example I've encountered is processing tweets (the example from Cloudera) using some Java. Just for testing and simulation purposes, can I use my local file system as a Flume source? In particular, some...
hadoop,log4j,bigdata,log4j2,flume
I have the following Log4j2 configuration: <?xml version="1.0" encoding="UTF-8"?> <configuration> <appenders> <Console name="console" target="SYSTEM_OUT"> <PatternLayout pattern="%d %-5p - %m%n"/> </Console> <Flume name="flume" > <MarkerFilter marker="FLUME" onMatch="ACCEPT" onMismatch="DENY"/> <Agent host="IP_HERE" port="6999"/> </Flume> <File name="file" fileName="flume.log"> <MarkerFilter marker="FLUME" onMatch="ACCEPT" onMismatch="DENY"/> </File> </appenders>...
cloudera,flume,hortonworks-data-platform,flume-ng,flume-twitter
I am trying to refresh the .tmp file with additional events every 5 minutes. My source is slow, and it takes 30 minutes to accumulate a 128 MB file in my HDFS sink. Is there any property in the Flume HDFS sink with which I can control the refresh rate of the .tmp file...
hadoop,hive,flume
I am trying to configure Flume with Hive to save Flume output to a Hive table using the Hive sink type. I have a single-node cluster and use the MapR Hadoop distribution. Here is my flume.conf: agent1.sources = source1 agent1.channels = channel1 agent1.sinks = sink1 agent1.sources.source1.type = exec agent1.sources.source1.command = cat /home/andrey/flume_test.data agent1.sinks.sink1.type...
hadoop,hadoop-streaming,flume,hortonworks-data-platform,flume-ng
I am trying to load a CSV file (6 MB) into HDFS using Flume, with spooldir as the source and HDFS as the sink. Here's my configuration file: # Initialize agent's source, channel and sink agent.sources = TwitterExampleDir agent.channels = memoryChannel agent.sinks = flumeHDFS # Setting the source to spool directory where the...
twitter,flume
I am a beginner, so kindly bear with me. I need to download Twitter logs and would like to use Flume. However, I am not familiar with Java. Can Python be used with the Flume agent? Any links that I could refer to would be very helpful. Thanks!...
flume,flume-ng
I am new to Flume. My Flume agent has an HTTP server as its source, from which it receives zip files (compressed XML files) at regular intervals. These zip files are very small (less than 10 MB), and I want to write their extracted contents to the HDFS sink. Please share some ideas on how to do...
csv,hadoop,unicode,flume
I'm trying to put a CSV file into HDFS using Flume. The file also contains some Unicode characters. Once the file was in HDFS, I tried to view the content, but the records are not displayed properly. File content: Name age sal msg Abc 21 1200 Lukè éxample àpple Xyz 23 1400...
encryption,amazon-s3,flume
I am streaming some sensitive log data to Amazon S3 using flume. I can't figure out how to set the flag/configuration in flume so that S3 uses server-side encryption.
cassandra,nosql,flume,flume-ng
I am trying to find a template/sample of a Cassandra Flume sink. I have looked online, and the two projects I found on GitHub have outdated dependencies (JARs), and I can't find those artifacts anywhere :(. Thanks! Looking forward to any references. ...
hdfs,flume
I want to use Flume to transfer data from one HDFS directory into another HDFS directory, and I want to apply morphline processing during this transfer. For example: my source is "hdfs://localhost:8020/user/flume/data" and my sink is "hdfs://localhost:8020/user/morphline/". Is this possible with Flume? If yes, what is the type for the source...
hadoop,hdfs,flume,flume-ng
I'm trying to transfer a 700 MB log file from flume to HDFS. I have configured the flume agent as follows: ... tier1.channels.memory-channel.type = memory ... tier1.sinks.hdfs-sink.channel = memory-channel tier1.sinks.hdfs-sink.type = hdfs tier1.sinks.hdfs-sink.path = hdfs://*** tier1.sinks.hdfs-sink.fileType = DataStream tier1.sinks.hdfs-sink.rollSize = 0 The source is a spooldir, channel is memory and...
hadoop,streaming,message-queue,flume
I want to read data from IBM MQ and put it into HDFS. I looked into the JMS source of Flume, and it seems it can connect to IBM MQ, but I don't understand what “destinationType” and “destinationName” mean in the list of required properties. Can someone please explain? Also, how I should...
java,twitter4j,cloudera,flume
I am using TwitterSource for Flume from Cloudera. I want to get tweets by country with certain keywords. I'm not sure what to compare to when I want to get tweets from The Netherlands. I have the following which results in nothing being processed: public void onStatus(Status status) { if(status.getPlace().getCountry().equalsIgnoreCase("netherlands"))...
flume,flume-ng
I am trying to set up a data pipeline where application servers send log events (using log4j logging) to Flume over the network (using the Flume log4j appender), to an Avro source that the Flume agent is using. I tried the configuration below, but it only appends the IP of the host on which the agent is running...
hadoop,hive,flume
I am trying to learn Hive by following the Twitter data tutorial from the link below. https://github.com/cloudera/cdh-twitter-example/ I have successfully installed and configured Hadoop and Hive and tested loading a simple text file into a Hive table. All is working well so far. However, even though files existed in HDFS, the external table is showing...
solr,flume,avro,flume-ng
I am using the Morphline Solr Sink to store information in Solr. The problem I am facing is that the Flume agent never stops retrying failed requests, and the number of these can grow over time. This results in the Flume warning of MaxIO Workers being used and the system suffers with performance...
apache,storm,apache-kafka,flume
I have been reading a lot of articles that explain implementations of Apache Storm for ingesting data from either Apache Flume or Apache Kafka. My main question remains unanswered after reading several articles: what is the main benefit of using Apache Kafka or Apache Flume? Why not collect data...
hdfs,flume
I'm writing a number of CSV files from my local file system to HDFS using Flume. I want to know what would be the best configuration for Flume HDFS sink such that each file on local system will be copied exactly in HDFS as CSV. I want each CSV file...