The cloud comes with many advantages and some challenges. Innovations in cloud technology allow businesses to operate more or less exclusively in the cloud, should they want to go that route. Extract-Transform-Load (ETL) technology integrates information from various sources so that it can be…
- Extracted from far-flung databases
- Transformed into clean data
- and Loaded into a data warehouse.
While this set of functions serves many purposes, it comes with its own set of challenges once the cloud becomes involved. The biggest of those challenges is security.
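To make the three steps listed above concrete, here is a minimal sketch in Python. It is an illustration only: in-memory SQLite databases stand in for the source system and the data warehouse, and the table names (`contacts`, `dim_contact`) and cleanup rules are assumptions made up for this example, not features of any particular ETL product.

```python
import sqlite3

def extract(conn):
    """Extract: read raw rows from the source database."""
    return conn.execute("SELECT name, phone FROM contacts").fetchall()

def transform(rows):
    """Transform: trim names, keep only phone digits, drop rows with no name."""
    clean = []
    for name, phone in rows:
        name = (name or "").strip()
        if not name:
            continue  # skip records that cannot be identified
        digits = "".join(ch for ch in (phone or "") if ch.isdigit())
        clean.append((name.title(), digits))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS dim_contact (name TEXT, phone TEXT)")
    conn.executemany("INSERT INTO dim_contact VALUES (?, ?)", rows)
    conn.commit()

# Hypothetical source system with messy data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
source.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("  ada lovelace ", "(555) 010-1234"), ("", "555-0000"), ("alan turing", "555.010.9999")],
)

# Hypothetical warehouse; a real one would be a separate server or cloud service.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(source)))
loaded = warehouse.execute("SELECT name, phone FROM dim_contact ORDER BY name").fetchall()
```

In a cloud deployment each of the three functions would talk to a remote system, which is exactly where the security questions discussed next come in.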
In the cloud or out?
A good question to ponder is where the ETL tool should sit in relation to the data warehouse. Some businesses host their data onsite (on-premises), while others keep it in the cloud. Likewise, the ETL tool can reside onsite or in the cloud. No matter where it’s located, it has to work well with the Business Intelligence (BI) platform, which, of course, may also exist onsite or in the cloud.
Assuming the data warehouse and the BI platform are both deployed in the cloud, the ETL tool will likely exist as a cloud service as well.
Security in the cloud
The job of securing the data does not have to be the responsibility of the business that uses the cloud service. It’s just one more thing that a business doesn’t have to focus on while it is focusing on…well, business. But that doesn’t mean the business shouldn’t ask how secure these data streams are. After all, depending on the nature of the business, the incoming data may come from highly proprietary sources. So it’s only logical to wonder whether that stream of data is prone to theft or manipulation.
The path from database to database over IP represents a nice target for would-be data thieves. By far the best methods to secure that data involve encryption and secure channels, the same methods that protect most sensitive data.
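As a rough illustration of what a "secure channel" means in practice, the sketch below uses Python's standard `ssl` module to build a TLS configuration that verifies the server's certificate and hostname and refuses protocol versions older than TLS 1.2. The helper names (`make_tls_context`, `open_secure_channel`) are hypothetical, and real ETL tools and database drivers usually expose equivalent settings through their own connection options.

```python
import socket
import ssl

def make_tls_context():
    """Build a client-side TLS context with certificate and hostname checks on."""
    ctx = ssl.create_default_context()  # verifies certificates by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return ctx

def open_secure_channel(host, port, ctx):
    """Open a TCP connection and wrap it in TLS (hypothetical helper)."""
    raw = socket.create_connection((host, port), timeout=10)
    try:
        # server_hostname enables hostname verification and SNI
        return ctx.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise

context = make_tls_context()
```

With a context like this, data sent between the source database and the warehouse is encrypted in transit, and a connection to a server presenting an invalid certificate fails instead of silently proceeding.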
Will encrypting and decrypting data destroy consistency and reliability?
ETL tools are not one-size-fits-all. The right tool depends entirely on the infrastructure that’s using it. ETL also involves a great deal of network sophistication. For example, at a very basic level, it can pull database information that fills a spreadsheet on the user’s end with a name, phone number and address. As the information it needs to pull grows complex enough to support enterprise-level BI, so too do the functions of the ETL tool.
A weak ETL tool or a bad setup can wreak havoc on the consistency and reliability of extracted and loaded data. An experienced systems implementation partner will put an application through its paces and make sure that it can handle the loads that it takes on from clients.
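One simple way to check that data survived the trip intact, whether or not it was encrypted and decrypted along the way, is to compare a fingerprint of the rows at the source with a fingerprint of the rows that were loaded. The sketch below is one possible approach, not a standard feature of any ETL tool: the function name `table_fingerprint` is hypothetical, and it assumes both sides represent values with the same Python types.

```python
import hashlib

def table_fingerprint(rows):
    """Return (row_count, digest) for row tuples, independent of row order."""
    # Hash each row separately, then sort so that row order does not matter.
    digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest() for row in rows
    )
    combined = hashlib.sha256("".join(digests).encode("ascii")).hexdigest()
    return len(digests), combined

source_rows = [("Ada", "5550101234"), ("Alan", "5550109999")]
loaded_rows = [("Alan", "5550109999"), ("Ada", "5550101234")]     # same rows, new order
corrupted_rows = [("Ada", "5550101234"), ("Alan", "5550109990")]  # one digit changed

matches = table_fingerprint(source_rows) == table_fingerprint(loaded_rows)
detects_change = table_fingerprint(source_rows) != table_fingerprint(corrupted_rows)
```

Running a comparison like this after each load gives an early warning when a tool or setup is dropping or altering records.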
Secure hosted services
Once again, if the ETL tool is hosted in the cloud, then many of the complexities and issues that come with using it are not something that a business or institution needs to worry about. If they trust their partners and know that those partners use safe and secure practices, then there is little left to worry about. ETL tools, depending on their engines, can handle both small and big data. Choosing a provider that implements ETL securely into their process is a step in the right direction for any organization.
Kusal Swarnakar is a founding partner at KPI Partners. As a leader in the consulting industry, he has led hundreds of successful business intelligence, ERP, CRM, data integration, and cloud-based analytical projects. Check out Kusal's blog at KPIPartners.com.