I want to perform complex transformations on staging data and load the results into fact tables. Both the staging and fact tables reside in the same Teradata database. Which would be the better choice: Informatica or BTEQ scripts? The data volume is approximately 200,000 rows per day. I am thinking of using Informatica for the complex transformations and MultiLoad as the external loader (Informatica External Loader) to load the data. Please suggest.
The real question to ask is whether you can do the transformations entirely within the database (Informatica calls it "push down") or if the processing is such that you MUST export it to another server, transform, then load back to Teradata. Transforming within the database is usually much faster. If you absolutely can't do the transformations within the Teradata database, BTEQ alone won't solve the problem. On the other hand, if you already have Informatica as your standard ETL tool and can use "push down", there is no overwhelming benefit to using BTEQ instead.
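To make "transforming within the database" concrete: a staging-to-fact load can often be expressed as a single set-based INSERT … SELECT. This is only a minimal sketch; all table and column names below are hypothetical.

```sql
-- Runs entirely inside Teradata: no rows are exported to the
-- ETL server and loaded back, which is what makes it fast.
INSERT INTO fact_sales (sale_dt, product_id, store_id, qty, net_amt)
SELECT CAST(s.sale_ts AS DATE),
       p.product_id,
       s.store_id,
       SUM(s.qty),
       SUM(s.gross_amt - COALESCE(s.discount_amt, 0))
FROM   stg_sales s
JOIN   dim_product p
  ON   p.product_cd = s.product_cd
GROUP BY 1, 2, 3;
```

Whether Informatica generates this via pushdown or you submit it yourself via BTEQ, the work is done by the Teradata engine either way.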
This scenario looks like a perfect fit for Informatica PowerCenter’s Pushdown Optimization capability, where you can use the same design environment in PowerCenter to develop the transformations and have PowerCenter manage the “push” of processing into the Teradata database. Informatica introduced this specifically for the Teradata database and for scenarios where staging and production are in the same database instance, to reduce/eliminate the need to move data in and out of the database. You can find more information here: http://www.informatica.com/INFA_Resources/ds_pushdown_optimization_6675.pdf
One potential showstopper for any BTEQ-dependent job is connection sustainability, which will not be an issue in your case because both source and target are in the same database. I guess this is a test of your SQL capability to code the complex transformations, which might prove to be a better choice than Informatica.
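For reference, a minimal BTEQ job for this kind of in-database load might look like the sketch below. The logon details, table names, and transformation SQL are all hypothetical placeholders.

```sql
.LOGON tdpid/etl_user,etl_password;

/* Plain SQL submitted in-session; the daily ~200,000 rows
   never leave the database, so a dropped export/import
   connection is not a concern. */
INSERT INTO fact_daily_totals (load_dt, store_id, total_amt)
SELECT CURRENT_DATE, store_id, SUM(net_amt)
FROM   stg_sales
GROUP BY store_id;

/* Fail the batch job with a non-zero return code on error,
   so the scheduler can detect the failure. */
.IF ERRORCODE <> 0 THEN .QUIT 8;

.LOGOFF;
.QUIT 0;
```

The error-handling step matters in practice: without the `.IF ERRORCODE` check, BTEQ would continue past a failed statement and exit successfully.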
Ramakrishna Vedantam, Tata consultancy services, Hyderabad
At our company we use Informatica as the standard data integration platform. Currently, we're in the process of implementing an Enterprise Data Warehouse based on Teradata (using cLDM). From an architecture point of view we need to make a choice between ETL/ELT (PowerCenter with pushdown optimization) and scripting (BTEQ). In my opinion, the choice should be based on the following criteria:
- developer productivity
- performance & scalability
- manageability (change impact analysis, data lineage, metadata management)
It all boils down to total cost of ownership. I'm looking for best practices at larger companies (preferably telcos) where a clear choice between ETL/ELT and scripting has been made...
Hybrid ETLT is the most commonly used approach in the real world.
Use the appropriate combination of ETL and ELT based on transformation characteristics.
Informatica creates and defines metadata to drive transformations independent of the execution engine.
ETL versus ELT execution is selectable within Informatica and can be switched back and forth without redefining the transformation rules.
Why the New Data Integration Technique?
Processing efficiency to optimize the returns on the existing investments:
- Leverage database-specific optimization techniques.
- Fully utilize database capacity.
Increased IT productivity:
- Curtail data movements that cause maintenance difficulties and operational risks.
- Streamline debugging and performance tuning efforts.
- Address architectural needs by aligning physical environments with the enterprise-wide logical model.
Metadata-driven approach across the board:
- Rapidly perform impact assessment across source and target.
- Automatically generate data lineage and metadata maintenance for auditability and traceability.