Request was initiated by Roald van den Berg
As reporting across multiple systems becomes more of a priority at NWU, more systems are being integrated with the operational data store (ODS). The current ETL process that feeds the ODS is custom-developed, outdated and difficult to manage, and it does not allow NWU to add further systems to the ODS easily. Creating and maintaining an ETL process is cumbersome: it has many interdependent elements, any of which can fail. Managing and verifying runs is equally cumbersome and takes up resources unnecessarily.
The current process caters only for Oracle and MySQL source databases. This limits the number of systems that can be integrated with the ODS and so impedes the strategy of an integrated reporting environment. The current process also does not cater for transactions that are reversed in the source systems, and transferring only modified records would require changes to some of the existing systems in use at NWU. Because we cannot transfer only modified records for some of our existing systems, the runtime of the ETL keeps increasing, and as the volume of data grows we will soon be unable to transfer all data to the ODS in time. An ETL tool might also allow real-time integration with the ODS, unlike the current process, where runs have to be scheduled for the evening.
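To illustrate the incremental-transfer problem described above, the following is a minimal sketch of snapshot comparison, one generic way to detect new, modified and reversed (deleted) records without changing the source systems. It is not the method used by the current ETL process or by any of the tools under evaluation. The function names `row_hash` and `diff_snapshots` and the sample data are hypothetical, and the sketch assumes each record has a primary key and that the previous snapshot is available to the ETL process.

```python
import hashlib


def row_hash(row):
    """Return a stable fingerprint of a row's values."""
    # Join the values with a separator unlikely to occur in the data.
    joined = "\x1f".join(str(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def diff_snapshots(previous, current):
    """Compare two {primary_key: row_tuple} snapshots.

    Returns three sorted lists of keys: inserted, updated, deleted.
    """
    prev_hashes = {key: row_hash(row) for key, row in previous.items()}
    curr_hashes = {key: row_hash(row) for key, row in current.items()}

    inserted = sorted(k for k in curr_hashes if k not in prev_hashes)
    # A key missing from the current snapshot was reversed or deleted at the source.
    deleted = sorted(k for k in prev_hashes if k not in curr_hashes)
    updated = sorted(
        k for k in curr_hashes
        if k in prev_hashes and curr_hashes[k] != prev_hashes[k]
    )
    return inserted, updated, deleted


# Hypothetical sample data: key -> (name, amount)
previous = {1: ("A", 100), 2: ("B", 200), 3: ("C", 300)}
current = {1: ("A", 100), 2: ("B", 250), 4: ("D", 400)}

inserted, updated, deleted = diff_snapshots(previous, current)
print(inserted, updated, deleted)  # prints [4] [2] [3]
```

This approach still reads every source row, so it reduces only the volume written to the ODS, not the volume read from the source. Avoiding the full read would typically need change tracking in the source database itself.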
The purpose of this project is to evaluate three different ETL tools in order to determine which would best suit NWU’s requirements and address all the issues we currently face with our existing process. During the first phase we investigated six different products; after consulting with all stakeholders, that list was reduced to the three that would best address the current issues. By doing a proof of concept (POC) on the shortlisted tools we will be able to choose the product that best satisfies NWU’s requirements. It will also help us choose a product that enables faster development and deployment.
After comparing the features of a number of ETL tools with NWU’s needs, we decided on the following tools:
Talend Data Integration, JBoss Data Virtualization and Pentaho Data Integration.
To complete the investigation successfully we will need an environment in which to deploy the tools and perform tests. The following environment is required:
1 Linux machine
8 GB Memory
60 GB System Disk
50 GB Install Disk
No documents at this time.
Nov 24 2015
Sep 30 2016
Overall Project Completion