A key component to any Business Intelligence (BI) Architecture is the Extract, Translate and Load (ETL) process. The ETL process is typically scheduled on a daily basis and is capable of data movement from legacy systems into a data warehouse. It is also used to facilitate the work of the database administrators who connect different branches of databases as well as integrate or change the existing databases.
There are several factors to consider in ETL tool in order to deliver the optimal value to customers:
- Data delivery and transformation capabilities.
- Data and metadata modeling capabilities.
- Data source and target support.
- Data governance capability.
- Runtime platform capabilities.
- Operations and administration capabilities.
- Service-enablements capability.
How ETL and ELT Differ
In early data warehouses, ETL was typically performed by custom-developed programs and scripts. From the past twenty years, ETL development has matured enough into configurable tools.
Extract, Transform, and Load (ETL) is a process that involves extracting data from outside sources, transforming it to fit operational needs (sometimes using staging tables), then loading it into the end target database or data warehouse.
Extract, Load, and Transform (ELT) is a process where data is extracted for the source, then loaded into a staging table in the database, transforming it where it sits in the database and then loading it into the target database or data warehouse.
A key factor of ELT is loading performance, an advantage it holds because it operates at the infrastructure level, using SQL and related procedural features of the relational database. In contrast, ETL works at the integration server level, using dedicated graphical application development facilities including transformation rules.
ELT is gaining cost-of-performance advantages over ETL in part because software licensing and development costs in DW/BI initiatives now far outweigh the costs of infrastructure. That balance is getting further skewed with the availability of cheaper multi-core (database) servers and related appliances. Conventionally BI environments are the biggest consumers of ETL tools and technologies because of the complex requirements for desperate data sources and data transformation needs.
There are several ETL tools available in the BI market but most popular and commonly tools used with Business Intelligence (BI) are Informatica Power Center and Oracle Data Integrator (ODI). They both have similar features but considerations have to be made to choose the right tool for the Business Intelligence environment.
Informatica Power Center is developed by Informatica (www.informatica.com) and resold through Oracle. Informatica is a well known ETL product in the market and is a foundation component of Oracle Business Intelligence Applications (OBIA). It is the most substantial size and resources on the market of data integration tools vendors. Also has the consistent track record, solid technology, straightforward learning curve, ability to address real-time data integration schemes. Unfortunately, Informatica is more costly than ODI and its server based licensing may be a burden to smaller BI projects and it adds another vendor to the mix. Also, Informatica often requires performance tuning for complex queries. Informatica is highly specialized in ETL and Data Integration and focuses on those topics, not on BI as a whole.
In July 2010, the 11gR1 release of Oracle Data Integrator was made available to the marketplace and it is Oracle's strategic data integration platform. Prior to the acquisition of Sunopsis in 2006, Oracle had focused exclusively on data integration needs for the Oracle Database. Since then it is firmly committed to serving comprehensive heterogeneous data integration needs, regardless of the combination of databases, applications, business intelligence products, middleware products, operating systems, and hardware in use among our customers. Oracle has quickly and aggressively invested in ODI to provide an easy-to-use and comprehensive approach for satisfying data integration requirements within Oracle software products.
ODI is less popular that Informatica, yet it is ideal for situations when much complex logic is required for large volumes of data. It is installed on target databases so performance is faster than Informatica. Limitations of ODI have to do with the use of knowledge modules for Loading, Integration, Journalize, Check and Reverse. Modification to these modules is time consuming.
- ODI is faster, simpler and cost saving.
- Runs on relational technologies and doesn’t require hardware as data sets grow.
- ODI processes used only during integration runs.
- Databases continually available for OLTP, BI, DW, etc.
- Common administration, monitoring and management.
It also looks like ODI is the best long term ETL solution. In the current release Oracle BI Apps 184.108.40.206.1 it has removed the ETL functionality offered by Informatica PowerCenter and replaced it by Oracle Data Integrator (ODI). And it removed the DAC thick-client used for ETL orchestration, and replaced it by a web interface, Oracle Business Applications Configuration Manager (OBACM). Click here for detailed architecture of ODI with Oracle BI Apps (220.127.116.11.1).
It is also said that in the second release of the BI Apps 18.104.22.168 will add support for Informatica PowerCenter, allowing customers to choose between ODI and PowerCenter for ETL, depending on their preferences.
When using Oracle BI Applications, ODI and Informatica are strong contenders but consideration of functionality, budget and other factors will lead to the right selection for your organization.
Sridhar Kasthuri is a Principal Consultant at KPI Partners who specializes in Oracle Business Intelligence Enterprise Edition (OBIEE) and the Oracle BI Applications. He is considered an expert in the area of business intelligence tool migrations. His resume includes MicroStrategy to OBIEE migrations, Brio to OBIEE migrations, and the world's largest conversion of Business Objects to OBIEE. Check out Sridhar's blog at KPIPartners.com.