The Objectives
Our client embarked on a mission-critical project to enhance data accessibility, reduce operational costs, and significantly improve data processing times. The project aimed to speed up data-informed decision-making by ingesting data from Oracle UCM, applying data processing rules, and making the results accessible to various applications, including Elasticsearch for item maintenance.
Challenge
- Inefficient Data Retrieval: Fetching data from Oracle UCM was cumbersome and time-consuming, causing delays in accessing critical information
- Data Fragmentation: Data was scattered across various sources, lacking a centralized repository, which hindered effective data management
- High Operational Costs: The existing infrastructure incurred significant operational expenses, putting pressure on the organization’s budget
- Lengthy Processing Time: Data transformation jobs took an exorbitant amount of time, resulting in sluggish decision-making processes
- Harmonized Reporting Challenge: Inconsistent financial reporting terminology and data interpretation across multiple ERP systems led to complexities and errors, hampering the reporting process
- Data Reconciliation Inefficiency: Manual reconciliation of data from various sources consumed substantial time and resources, resulting in operational inefficiencies
- Real-Time Information Delay: Reliance on outdated data delayed decision-making and hindered the organization's ability to respond promptly
Solution
To address these challenges, our client adopted a comprehensive solution that leveraged Databricks and other modern data processing technologies:
- Robust ETL Pipeline Development: A robust ETL pipeline was developed to streamline data ingestion from Oracle UCM. API calls were used to fetch data, which was then processed and transformed efficiently
- Structured Delta Lake: Processed data was organized into Delta Lake tables in Amazon S3
- Unity Catalog Implementation: Unity Catalog, a powerful feature of Databricks, was harnessed to expose datasets within the Databricks environment. This allowed precise user access control and facilitated data queries, eliminating the need for direct AWS access
- Enhanced Data Security: Data permissions were rigorously managed at table and row levels through Unity Catalog, ensuring robust data security and user access control
- Cost Optimization: Migrating from the previous Informatica-based solution to Databricks led to a substantial 40% reduction in infrastructure service costs. Additionally, annual sustaining costs for operations decreased to $20,000
- Performance Improvement: Data processing time was slashed by a remarkable 75%, from 12 hours to 3 hours, enabling faster insights and decision-making
- Elasticsearch Integration: OpenSearch was selected as the target system to ensure seamless integration with Elasticsearch via API endpoints, allowing data to be easily accessed through a web UI
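The flow described in the bullets above (fetch via API, apply processing rules, publish to a search index, grant access) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the client's actual implementation: the paging callback `fetch_page`, the field names, the index name `items`, the table name `main.items.item_master`, and the group `item_analysts` are all hypothetical. The only external conventions relied on are the newline-delimited JSON body of the OpenSearch `_bulk` API and the Unity Catalog `GRANT SELECT ON TABLE` statement.

```python
import json
from typing import Callable, Iterable, Iterator

Record = dict[str, object]


def fetch_all(fetch_page: Callable[[int], list[Record]]) -> Iterator[Record]:
    """Yield records page by page until the source returns an empty page.

    `fetch_page` stands in for the real Oracle UCM API call (hypothetical).
    """
    page = 0
    while True:
        batch = fetch_page(page)
        if not batch:
            return
        yield from batch
        page += 1


def normalize(record: Record) -> Record:
    """Example processing rule: lower-case field names and trim string values."""
    return {
        key.strip().lower(): value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def to_bulk_payload(records: Iterable[Record], index: str, id_field: str) -> str:
    """Build the newline-delimited JSON body used by the OpenSearch _bulk API."""
    lines = []
    for record in records:
        action = {"index": {"_index": index, "_id": str(record[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # The bulk body must end with a newline; an empty input yields an empty body.
    return "\n".join(lines) + "\n" if lines else ""


def grant_select_sql(table: str, principal: str) -> str:
    """Return a Unity Catalog statement granting read access on one table."""
    return f"GRANT SELECT ON TABLE {table} TO `{principal}`"


def demo_fetch_page(page: int) -> list[Record]:
    """In-memory stand-in for the source system (assumed sample data)."""
    pages = [
        [{"Item_ID ": 101, "Name": "  Widget "}],
        [{"Item_ID ": 102, "Name": "Gasket"}],
    ]
    return pages[page] if page < len(pages) else []


cleaned = [normalize(r) for r in fetch_all(demo_fetch_page)]
payload = to_bulk_payload(cleaned, index="items", id_field="item_id")
grant = grant_select_sql("main.items.item_master", "item_analysts")

# On Databricks, `cleaned` could instead be written to a Delta table, e.g.
# spark.createDataFrame(cleaned).write.format("delta").mode("append")
#     .saveAsTable("main.items.item_master")   # table name is an assumption
```

Passing the fetch step in as a callback keeps the sketch runnable without network access; in a real pipeline it would wrap the authenticated API call, and `payload` would be sent in a POST request to the cluster's `_bulk` endpoint.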
Impact
- 75% Reduction in Data Processing Time: Data processing time fell by 75%, from 12 hours to 3 hours, enabling faster decision-making and boosting overall operational agility
- 40% Infrastructure Savings: Infrastructure service costs fell by 40%, from $50,000 to $30,000 annually, a substantial gain in budget efficiency
- 60% Operational Optimization: By streamlining operations, KPI reduced annual operating costs to a lean $20,000, ensuring highly efficient system management
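As a quick sanity check, the headline percentages can be recomputed from the before and after values quoted in this case study. The helper name `percent_reduction` is illustrative only. The 60% figure is not rechecked here because the prior annual operating cost is not stated.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


# Processing time: 12 hours down to 3 hours.
processing_cut = percent_reduction(12, 3)

# Infrastructure service costs: $50,000 down to $30,000 per year.
infrastructure_cut = percent_reduction(50_000, 30_000)

print(f"Processing time reduced by {processing_cut:.0f}%")
print(f"Infrastructure costs reduced by {infrastructure_cut:.0f}%")
```

Both results agree with the 75% and 40% figures reported above.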