Understanding ETL and Data Warehousing: Role of ETL Routines in Quality Data Warehouse Solutions

Book by Jaiteg Singh
Testing is essential in data warehouse systems because it addresses the correctness and validation of the data and information supplied for decision making. Given the distinctive characteristics of data warehouse testing and the complexity of data warehouse projects, this research reviews and revises the scope of automated testing in assuring quality data warehouse solutions. First, a data set generator was developed to produce synthetic but near-real data. Anomalies in the synthesized data were then classified with the help of a hand-coded Extraction, Transformation and Loading (ETL) routine. To ensure quality data for a data warehouse and to highlight the importance of ETL routines, a set of high-priority test cases was identified. Automated testing procedures were later embedded in the hand-coded ETL routine to ensure quality data. Statistical analysis revealed a major improvement in data quality after the introduction of automated testing procedures. Various data warehouse architectures were also analyzed to endorse a refined data warehouse architecture named Data Sharehouse.
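As a rough illustration of what embedding automated tests in a hand-coded ETL routine can look like, the following minimal Python sketch checks each extracted row before it is transformed and loaded. Everything in it is an assumption made for illustration: the field names, the sample rows, the anomaly labels, and the functions `classify_anomalies` and `run_etl` are hypothetical and are not taken from the book, whose actual test cases are not listed in this description.

```python
from collections import Counter

# Hypothetical source rows; field names and values are illustrative only.
SOURCE_ROWS = [
    {"id": 1, "name": "alice", "amount": "120.50"},
    {"id": 2, "name": "", "amount": "75.00"},        # missing name
    {"id": 3, "name": "carol", "amount": "-10.00"},  # negative amount
    {"id": 1, "name": "alice", "amount": "120.50"},  # duplicate id
    {"id": 4, "name": "dave", "amount": "abc"},      # non-numeric amount
]


def classify_anomalies(row, seen_ids):
    """Return the anomaly labels for one row (an empty list means clean)."""
    anomalies = []
    if row["id"] in seen_ids:
        anomalies.append("duplicate_key")
    if not row["name"].strip():
        anomalies.append("missing_value")
    try:
        if float(row["amount"]) < 0:
            anomalies.append("out_of_range")
    except ValueError:
        anomalies.append("invalid_type")
    return anomalies


def run_etl(rows):
    """Test each extracted row, then transform and load only the clean ones."""
    loaded, rejected, seen_ids = [], [], set()
    for row in rows:
        anomalies = classify_anomalies(row, seen_ids)
        if anomalies:
            # Anomalous rows are set aside with their labels instead of loaded.
            rejected.append((row, anomalies))
            continue
        seen_ids.add(row["id"])
        loaded.append({
            "id": row["id"],
            "name": row["name"].strip().title(),  # transformation step
            "amount": float(row["amount"]),
        })
    return loaded, rejected


loaded, rejected = run_etl(SOURCE_ROWS)
summary = Counter(label for _, labels in rejected for label in labels)
print(len(loaded), "rows loaded; anomalies:", dict(summary))
```

Keeping rejected rows together with their labels, rather than silently dropping them, is one way to make the effect of the checks measurable, for example by comparing anomaly counts before and after the tests are introduced.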