Conditional Mapping in Talend

etl,talend
I have created a simple job in Talend that performs an inner join on the data from two Excel sheets and then writes the result to an output Excel sheet. This is best illustrated by the diagram below: The mapping used in tMap is: However the...

Using blank-line delimited records and colon-separated fields in awk

awk,etl
I'd like to be able to work with a file in awk where records are separated by a blank line and each field consists of a name followed by a colon, some optional whitespace to be ignored/discarded, followed by a value. E.g. Name: Smith, John Age: 42 Name: Jones, Mary...
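
The record layout described above (blank-line separated records, "name: value" fields) can be sketched in Python rather than awk. This is only an illustration, not an answer from the thread: the function name `parse_records` is hypothetical, and the sample contains just the values visible in the truncated excerpt.

```python
def parse_records(text):
    """Split text into blank-line separated records of 'name: value' fields."""
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so values may themselves contain colons.
        name, _, value = line.partition(":")
        # strip() discards the optional whitespace after the colon.
        current[name.strip()] = value.strip()
    if current:
        records.append(current)
    return records


# Sample built only from the values shown in the excerpt.
sample = "Name: Smith, John\nAge: 42\n\nName: Jones, Mary\n"
records = parse_records(sample)
```

In awk itself, the usual starting point for this kind of input is paragraph mode (setting `RS` to the empty string), which makes each blank-line separated block one record.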

Best architecture to convert JSON to SQL?

sql,json,etl
I was just wondering if anyone had any thoughts on converting a JSON-document database structure into SQL. It needs to be done for data integration/warehousing. The JSON fields are relatively static, but new 'fields' can spring up every 2-4 weeks. Due to the nature of this, and converting to...

Can I prompt user for Field Mappings in SSIS package?

sql-server,ssis,etl
I am trying to build a tool to facilitate some redundant importing of data into a SQL Server database. The flat text files we get are mostly static, but there is often about a 5-10% variance in field names and sometimes some extra fields added (in which we will...

Import custom text format without separators [closed]

c#,sql-server,regex,ssis,etl
I would like to import this .txt file format into a SQL Server table, or to convert each block of text to a pipe-separated line. Which tools or C# solution would you suggest to resolve this issue? Any suggestions would be appreciated. Thank you. ================= INPUT (.txt file) ================= ID: 37 Name: Josephy...

Comparing filenames in PDI

pentaho,etl,kettle,data-integration,pdi
I am trying to import a certain .CSV file into my database using PDI (Kettle). Normally this would be rather easy, as you could just link up a CSV file input step with a Table output step and be good to go. However, the problem is that I don't know...

Kettle hangs on post data using web service step

web-services,etl,kettle
I am using Kettle to bulk load data and I am facing an issue with the web service step. From what I can see, after a few thousand calls the web service becomes unresponsive: only the time counter increases, but no progress is made. I can see that all previous steps are finished and...

Concurrent statistics gathering on Oracle 11g partitioned table

oracle,oracle11g,etl,data-warehouse,table-statistics
I am developing a DWH on Oracle 11g. We have some big tables (250+ million rows), partitioned by value. Each partition is assigned to a different feeding source, and every partition is independent of the others, so they can be loaded and processed concurrently. Data distribution is very uneven, we...

How does ETL (database to database) fit into SOA?

database,architecture,soa,etl,decoupling
Let's imagine that our application needs to ETL (extract, transform, load) data from one relational database to another relational database. The simplest (and best-performing, IMHO) way is to create a link between the databases and write a simple stored procedure. In this case we use minimal technologies and components, all features are "out of...

Pull Text file to SQL server 2008 table

sql-server,sql-server-2008,etl
I want to pull a fixed-width text file into my SQL table (SQL Server 2008). The table contains more than 200 columns. I cannot split the file by column length each and every time I pull the text file. In SQL Server 2000, we had DTS packages. Is there...

DWH and ETL explained

etl,dimensional-modeling
In this post I am not asking for tutorials on how to do something. I am asking for your help: could someone explain to me in simple words what a DWH (data warehouse) is and what ETL is? Of course, I googled and searched YouTube a lot, and I found many articles,...

Pentaho Dimension lookup/update

csv,pentaho,etl,kettle
I have seen the Dimension Lookup/Update documentation here and a few other blogs, but I cannot seem to get a clear idea. I have a table with the following structure: Key Name Code Status IN Out Active. The key, name, code, status and active values come from a CSV file. I need...

Reducing data with DataStage

etl,datastage
I've been asked to reduce an existing data model using DataStage ETL. It's more of an exercise and a way to get to know this program, which I'm very new to. Of course, the data shall be reduced following some functional rules. Table : MEMBERSHIP (..,A,B,C) # where A,B,C...

OrientDB ETL edge lookup from query - how to access $input?

graph,etl,orient-db
I'm trying my darnedest to do an ETL import from a large dataset that I've been keeping in MongoDB. I've successfully imported the vertices, and I feel like I'm one little syntax misunderstanding away from importing the edges too. I am pretty sure that the error is in this transformer:...

How to return no matched row in Pentaho Data Integration (Kettle)?

java,pentaho,lookup,etl,kettle
I am looking for a way to perform an SSIS-style lookup in Pentaho Data Integration. I'll try to explain with an example: I have two tables, A and B. Here is the data in table A: 1 2 3 4 5. Here is the data in table B: 3 4 5 6...
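
One reading of the title is that the rows of A with no counterpart in B should be returned. The Python sketch below illustrates that reading only; it is not a Kettle step and not an answer from the thread. The function name `unmatched_rows` is hypothetical, and table B holds just the values visible before the excerpt is cut off.

```python
def unmatched_rows(left, right):
    """Return the rows of `left` that have no match in `right` (an anti-join)."""
    right_keys = set(right)  # set membership keeps each lookup cheap
    return [row for row in left if row not in right_keys]


table_a = [1, 2, 3, 4, 5]
table_b = [3, 4, 5, 6]  # only the values shown in the truncated excerpt
no_match = unmatched_rows(table_a, table_b)
```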

Datastage - run user defined sql query file using odbc connector

sql-server,etl,datastage
Using DataStage, I have to read a sequential file, which contains one sql statement, run that sql statement and output the results in a sequential file. This is what I've tried : Using an Oracle connector, I simply set the option to "Read Selected statement from File", I entered the...

Table loading on Simple model still writes to log

sql-server,sql-server-2012,etl
I have a database on SQL Server 2012 Enterprise with the recovery model set to 'Simple'. When data gets pushed into it and I check the resource monitor on the server, I see that MyDB_dat.mdf is written to at 20MB/sec, and MyDB_log.ldf at 30MB/sec. Both files are on separate disks. I drop...

Talend Job not letting me map tMap to another tMap Component

etl,talend
I have created a job in Talend as follows: I wish to map tMap_2 and tMap_3 to tMap_4, but for some reason it just isn't letting me. tMap_2 and tMap_3 show warning messages stating 'This component does not have enough "Row" type outputs'. But for each of these I...

Pentaho to convert tree structure data

pentaho,etl,kettle
I have a stream of data from a CSV. It is a flat structured database. E.g.: a,b,c,d a,b,c,e a,b,f This essentially transforms into: Node id,Nodename,parent id,level 100, a , 0 , 1 200, b , 100 , 2 300, c , 200 , 3 400, d , 300 , 4...
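
The flat-path to node-table transformation described above can be sketched in Python. This is an illustration under assumptions, not a Kettle transformation: the function name `paths_to_nodes` is hypothetical, and the rule that ids grow by 100 in first-seen order is inferred from the four node rows visible in the excerpt.

```python
def paths_to_nodes(paths, step=100):
    """Turn flat paths into (node id, node name, parent id, level) rows."""
    nodes = []
    ids = {}  # maps a path prefix, e.g. ("a", "b"), to its node id
    for path in paths:
        parent_id = 0  # the root level has parent id 0, as in the excerpt
        for level, name in enumerate(path, start=1):
            prefix = tuple(path[:level])
            if prefix not in ids:
                # Assumption: ids are handed out in first-seen order, 100 apart.
                ids[prefix] = (len(ids) + 1) * step
                nodes.append((ids[prefix], name, parent_id, level))
            parent_id = ids[prefix]
    return nodes


rows = [line.split(",") for line in ["a,b,c,d", "a,b,c,e", "a,b,f"]]
nodes = paths_to_nodes(rows)
```

Keying on the whole prefix rather than on the node name alone keeps two nodes with the same name under different parents distinct.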

SSIS package data flow tasks report

reporting-services,ssis,etl
We have many SSIS packages that move, import, and export large amounts of data. What is the best way to get alerts or notifications if the expected amount of data is not processed? Or, how can we get a daily report on how the different SSIS packages are functioning? Is there a way to...

How to scale concurrent ETL tasks up to an arbitrary number in SSIS?

concurrency,ssis,scale,etl
Problem (See Context below) How can I scale individual tasks (e.g. downloading and parsing) to an arbitrary number of concurrent executions (e.g. 500) in SSIS? Setup Description Our setup is that we have a list of feed urls we want to visit, get all items and insert them into the...

OCDM combined with ODI

oracle,etl,data-warehouse,oracle-data-integrator
ODI = ELT tool, OCDM = data warehouse. Is my understanding of the above correct? More information/explanation is welcome. Now my question is: is it possible to load into OCDM's pre-existing tables via ODI when the sources for ODI are in flat-file/XML format? If possible, how?...