Wednesday 29 February 2012

Challenges of Data Warehouse Testing


• Data selection from multiple source systems, and the analysis that follows, pose a great challenge.
• The volume and complexity of the data.
• Inconsistent and redundant data in the data warehouse.
• Inconsistent and inaccurate reports.
• Non-availability of history data.
3 Testing Methodology
• Use of traceability to enable full test coverage of business requirements.
• In-depth review of test cases.
• Manipulation of test data to ensure full test coverage.

Fig 1 Testing Methodology (V-Model)
• Provision of appropriate tools to speed the process of test execution and evaluation.
• Regression testing.
4 Testing Types
The following types of testing are performed for data warehousing projects:
1. Unit Testing
2. Integration Testing
3. Technical Shakedown Testing
4. System Testing
5. Operational Readiness Testing
6. User Acceptance Testing
4.1 Unit Testing:
The objective of Unit Testing is to verify business transformation rules, error conditions, and field mappings at the staging and core levels.
Unit testing involves the following:
1. Check the mapping of fields present at the staging level.
2. Check for duplication of values generated by the sequence generator.
3. Check the correctness of surrogate keys, which uniquely identify rows in the database.
4. Check the data type constraints of the fields present at the staging and core levels.
5. Check the population of status and error messages into the target table.
6. Check that string columns are left- and right-trimmed.
7. Check that every mapping implements the process-abort mapplet, which is invoked if the number of records read from the source does not equal the trailer count.
8. Check that every object, transformation, source and target has proper metadata. Verify visually in the data warehouse designer tool that every transformation has a meaningful description.
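Several of these unit checks reduce to simple SQL against the staging schema. The sketch below uses Python with an in-memory SQLite database and invented names (stg_customer, cust_sk, cust_name) to illustrate check 3 (duplicate surrogate keys) and check 6 (untrimmed string columns); a real project would run the equivalent queries against its own staging tables.

```python
import sqlite3

# Hypothetical staging table; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_customer (cust_sk INTEGER, cust_name TEXT)")
conn.executemany(
    "INSERT INTO stg_customer VALUES (?, ?)",
    [(1, "Alice"), (2, " Bob "), (2, "Carol")],  # seeded defects: duplicate key, untrimmed name
)

# Check 3: surrogate keys must uniquely identify rows.
dup_keys = conn.execute(
    "SELECT cust_sk FROM stg_customer GROUP BY cust_sk HAVING COUNT(*) > 1"
).fetchall()

# Check 6: string columns must be left- and right-trimmed.
untrimmed = conn.execute(
    "SELECT cust_name FROM stg_customer WHERE cust_name <> TRIM(cust_name)"
).fetchall()

print(dup_keys)   # [(2,)]
print(untrimmed)  # [(' Bob ',)]
```

Both queries return empty result sets when the staging data is clean, which makes them easy to wrap in automated assertions.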
4.2 Integration Testing:
The objective of Integration Testing is to ensure that workflows are executed as scheduled with the correct dependencies.
Integration testing involves the following:
1. Check the execution of workflows at the following stages:
      Source to Staging A.
      Staging A to Staging B.
      Staging B to Core.
2. Check that target tables are populated with the correct number of records.
3. Record the performance of the schedule and analyse the performance results.
4. Verify that the dependencies among workflows between source to staging, staging to staging, and staging to core have been properly defined.
5. Check for error log messages in the appropriate file.
6. Verify that jobs start at their pre-defined start times. For example, if the start time for the first job has been configured as 10:00 AM and the Control-M group has been ordered at 7:00 AM, the first job will not start in Control-M until 10:00 AM.
7. Check the restarting of jobs in case of failures.
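Check 2 above is a reconciliation: every record read from the source must land either in the target or in a reject table. A minimal sketch with an in-memory SQLite database and assumed table names (src_orders, tgt_orders, rej_orders):

```python
import sqlite3

# Illustrative tables; the names are assumptions, not from a real workflow.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (id INTEGER, status TEXT);
    CREATE TABLE tgt_orders (id INTEGER, status TEXT);
    CREATE TABLE rej_orders (id INTEGER, reason TEXT);
""")
conn.executemany("INSERT INTO src_orders VALUES (?, ?)",
                 [(1, "OK"), (2, "OK"), (3, "BAD")])
conn.executemany("INSERT INTO tgt_orders VALUES (?, ?)",
                 [(1, "OK"), (2, "OK")])
conn.execute("INSERT INTO rej_orders VALUES (3, 'invalid status')")

def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

src, tgt, rej = count("src_orders"), count("tgt_orders"), count("rej_orders")

# Source count must equal loaded count plus rejected count.
assert src == tgt + rej, f"count mismatch: {src} source vs {tgt} loaded + {rej} rejected"
print("reconciled:", src, "=", tgt, "+", rej)
```

The same three counts, taken after each workflow stage (Source to Staging A, Staging A to Staging B, Staging B to Core), also expose the stage at which records go missing.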
4.3 Technical Shakedown Test:
Due to the complexity of integrating the various source systems and tools, several teething problems are expected with the environments. A Technical Shakedown Test will be conducted prior to commencing System Testing, Stress & Performance Testing, User Acceptance Testing and Operational Readiness Testing to ensure the following points are proven:
• Hardware is in place and has been configured correctly (including the Informatica architecture, source system connectivity and Business Objects).
• All software has been migrated to the testing environments correctly.
• All required connectivity between systems is in place.
• End-to-end transactions (both online and batch) execute without failure.
4.4 System Testing:
The objective of System Testing is to ensure that the required business functions are implemented correctly. This phase includes data verification, which tests the quality of the data populated into the target tables.
System Testing involves the following:
1. Check that the functionality of the system meets the business specifications.
2. Check the count of records in each source table, compare it with the number of records in the target table, and analyse the rejected records.
3. Check the end-to-end integration of systems and the connectivity of the infrastructure (e.g. that hardware and network configurations are correct).
4. Check that all transactions, database updates and data flows function accurately.
5. Validate the functionality of business reports.

Report validation covers the following areas:

Reporting functionality: Ability to report data as required by the business using Business Objects.
Report structure: Since the universe and reports have been migrated from a previous version of Business Objects, it is necessary to ensure that the upgraded reports replicate the structure/format and data requirements (unless a change or enhancement has been documented in the Requirement Traceability Matrix / Functional Design Document).
Enhancements: Enhancements such as report structure and prompt ordering that were in scope of the upgrade project will be tested.
Data accuracy: The data displayed in the reports/prompts matches the actual data in the data mart.
Performance: Ability of the system to perform certain functions within a prescribed time, and to meet the stated performance criteria according to agreed SLAs or specific non-functional requirements.
Security: The required level of security access is controlled and works properly, including domain security, profile security, data security, user ID and password control, and access procedures; the security system cannot be bypassed.
Usability: The system is usable as per the specified requirements.
User accessibility: The specified type of access to data is provided to users.
Connection parameters: Test the connection.
Data provider: Check for the right universe and for duplicate data.
Conditions/selection criteria: Test the selection criteria for the correct logic.
Object testing: Test the object definitions.
Context testing: Ensure each formula uses the correct input or output context.
Variable testing: Test each variable for its syntax and data type compatibility.
Formulas or calculations: Test each formula for its syntax and validate the data it returns.
Filters: Test that the data is filtered correctly.
Alerts: Check extreme limits and report alerts.
Sorting: Test the sorting order of section headers, fields and blocks.
Totals and subtotals: Validate the data results.
Universe structure: The integrity of the universe is maintained and there are no divergences in terms of joins, objects or prompts.
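The data accuracy check above can be automated by recomputing the report's aggregates directly against the data mart and comparing. A minimal sketch (SQLite stands in for the mart; the fact_sales table and the report totals are invented for illustration, and in practice the report figures would be read from a report export rather than hard-coded):

```python
import sqlite3

# Stand-in data mart with an invented fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                 [("North", 100.0), ("North", 50.0), ("South", 75.0)])

# Totals as displayed on the business report (hard-coded here for the sketch).
report_totals = {"North": 150.0, "South": 75.0}

# Recompute the same aggregate directly on the mart.
mart_totals = dict(conn.execute(
    "SELECT region, SUM(amount) FROM fact_sales GROUP BY region"
).fetchall())

assert report_totals == mart_totals, f"report {report_totals} != mart {mart_totals}"
print("data accuracy check passed")
```

Any discrepancy between the two dictionaries pinpoints the region whose report figure has drifted from the mart.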
4.5 User Acceptance Testing:
The objective of this testing is to ensure that the system meets the expectations of the business users. It aims to prove that the entire system operates effectively in a production environment and that it successfully supports the business processes from a user's perspective. Essentially, these tests run through "a day in the life of" the business users. The tests also cover functions that involve source system connectivity, job scheduling and business report functionality.
4.6 Operational Readiness Testing (ORT):
This is the final phase of testing, which focuses on verifying the deployment of the software and the operational readiness of the application. The main areas of testing in this phase include:

Deployment Test
1. Tests the deployment of the solution.
2. Tests the overall technical deployment checklist and timeframes.
3. Tests the security aspects of the system, including user authentication and authorization, and user access levels.

Operational and Business Acceptance Testing
1. Tests the operability of the system, including job control and scheduling.
2. Tests include normal, abnormal and fatal scenarios.
5 Test Data
Given the complexity of data warehouse projects, preparation of test data is a daunting task. The volume of data required for each level of testing is given below.
Unit Testing - This phase of testing will be performed with a small subset (20%) of production data for each source system.
Integration Testing - This phase of testing will be performed with a small subset of production data for each source system.
System Testing - In this phase a subset of live data will be used that is sufficient in volume to cover all required test conditions (normal, abnormal and fatal scenarios), but small enough that workflow execution time does not unduly impact the test schedule.
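The 20% subset mentioned for Unit Testing is best drawn with a seeded random sample, so that a defect found in one run can be reproduced in the next. A sketch, where the record list is a stand-in for a production extract:

```python
import random

production_rows = list(range(1000))    # stand-in for production records
random.seed(42)                        # fixed seed keeps the subset reproducible

sample_size = int(len(production_rows) * 0.20)   # 20% subset for unit testing
sample = random.sample(production_rows, sample_size)

assert len(sample) == 200
assert len(set(sample)) == len(sample)           # no record drawn twice
print(len(sample), "of", len(production_rows), "records sampled")
```

In practice the sample would be drawn per source system, and stratified where a plain random draw might miss the abnormal and fatal scenarios the test conditions require.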
6 Conclusion:
Data warehouse solutions are becoming almost ubiquitous as a supporting technology for the operational and strategic functions at most companies. Data warehouses play an integral role in business functions as diverse as
enterprise process management and monitoring, and
production of financial statements. The approach described here combines an understanding of the business rules applied to the data with the ability to develop and use testing procedures that check the accuracy of entire data sets. This level of testing rigor requires additional effort and more skilled resources. However, by employing this methodology, the team can be more confident, from day one of the implementation of the DW, in the quality of the data. This will build the confidence of the end-user community, and it will ultimately lead to a more effective implementation.



DWH/ETL Applications Testing Framework Tool

1.       Introduction 

Many DWH projects have failed in the recent past due to inadequate testing:
1.1. Lack of business-perspective know-how of the application.
1.2. End-to-end testing not carried out.
1.3. Report data tested by comparison with DWH tables alone, while those tables were not in sync with the source data.
1.4. Inadequate ETL knowledge in the testing team, etc.

Wikipedia defines a framework as "a basic conceptual structure used to solve a complex issue".

When I use the term, I mean any support, infrastructure, tool or "scaffolding" designed to make testing easier, and (often) automated.

The DWH Testing Framework tool is a comprehensive guide to designing the end-to-end testing of a DWH application.

                                                           

2.       DWH Testing Framework tool

This tool provides a structured framework for DWH testing. The tool:
*      Is a stand-alone application developed in HTML.
*      Lists the different types of testing, with details shown on click.
*      Also briefs on:
*      What the common misconceptions regarding DWH testing are.
*      How DWH testing differs from the traditional approach to testing from various perspectives.
*      Navigation through the tool is shown in the coming slides (taking Extraction testing as the example).

This framework can be applied to all data or to one specific business area (a spot check of a fact area). Besides these test areas, it is also important to pay attention to topics such as security, backup and recovery, performance and metadata.
DWH testing should be done in parallel with the DWH development life cycle (the V-model approach) in every test area.
Working with this tool will result in:
  • Increased confidence of employees in using the data warehouse to support their decisions.
  • Use of a structured data validation approach.
  • Reduced development and user-acceptance time.
  • Reduced debugging time and cost by detecting errors earlier in the process.

The DWH framework tool gives all possible scenarios and possible solutions/approaches to validate any DWH application, and can be used as a framework. The following types of testing apply to a DWH application, and all of them are covered in the tool.



2.1.    Extraction Testing

This testing checks the following:
•  The extraction is able to extract the required fields.
•  The extraction logic for each source system is working.
•  Extraction scripts are granted security access to the source systems.
•  The extract audit log is updated and time-stamped.
•  Source-to-extraction-destination movement is working in terms of completeness and accuracy.
•  Extraction completes within the expected window.

2.1.1.    Simple Scenario for Extraction:

Extraction – Complete table extraction from a core system with a robust DBMS.

2.1.2.    Complex Scenario for Extraction:

Extraction – Data extraction from an Excel sheet, involving filtering out customers that do not match the standard customer code.

2.1.3.    Boundary Scenario for Extraction:

Extraction – No data in the source system.

2.1.4.    Negative Scenario for Extraction:

Extraction – Wrong or unexpected data in the table (for example, a wrong customer ID format, or character fields in what should be numeric).
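The complex and negative extraction scenarios both come down to a format filter applied at extraction time. A sketch, assuming a hypothetical standard customer code of three uppercase letters followed by five digits (the format is an invented example, not a real standard):

```python
import re

# Assumed standard customer code format: AAA99999 (hypothetical).
CUSTOMER_CODE = re.compile(r"[A-Z]{3}\d{5}")

rows = ["ABC12345", "XYZ99999", "12345ABC", "AB1234"]
valid = [r for r in rows if CUSTOMER_CODE.fullmatch(r)]
rejected = [r for r in rows if not CUSTOMER_CODE.fullmatch(r)]

print(valid)     # ['ABC12345', 'XYZ99999']
print(rejected)  # ['12345ABC', 'AB1234']
```

The rejected rows would feed the extract audit log, so that the filtered-out customers remain traceable rather than silently dropped.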

2.2.    Transformation Testing

This testing checks the following:

•  Transformation scripts are transforming the data as per the expected logic.
•  The one-time transformation for historical snapshots is working.
•  Detailed and aggregated data sets are created and match each other.
•  The transformation audit log is updated and time-stamped.
•  There is no pilferage of data during the transformation process.
•  Transformation completes within the given window.

2.2.1.    Simple Scenario for Transformation:

Transformation – Creation of simple derived attributes (e.g. creating the complete bill amount from individual billing items) or creating aggregates.

2.2.2.    Complex Scenario for Transformation:

Transformation – De-duplication and integration.

2.2.3.    Boundary Scenario for Transformation:

Transformation – Creating a derived attribute with an input figure that is very large or very small (for example, a percentage sales revenue figure for sales of USD 10 out of total sales of USD one million).

2.2.4.    Negative Scenario for Transformation:

Transformation – Negative sales numbers, an age of 200 years, etc. This is important, as the transformation logic should handle not only what it is meant to do, but everything it could face.

2.3.    Loading Testing

This testing checks the following:
•  There is no pilferage during the loading process.
•  Any transformations during the loading process are working.
•  Movement of data sets from staging to the loading destination is working.
•  One-time historical snapshots are working.
•  Both incremental and total refresh are working.
•  Loading completes within the expected window.

2.3.1.    Simple Scenario for Loading:

Loading – Loading a dimension set with fewer attributes and without any transformation during loading.

2.3.2.    Complex Scenario for Loading:

Loading – Loading dimensions with a large set of attributes.

2.3.3.    Negative Scenario for Loading:

Loading – Wrong data sets, for example a data set for the dimension 'location' that has two columns missing, does not exist, or contains null values. There should be some fundamental checks that the loading system runs before it goes for bulk loading.
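The "fundamental checks" mentioned in the negative loading scenario can be sketched as a pre-load gate that verifies the column layout and rejects null values before bulk loading begins. The 'location' layout below is an assumption made for the example:

```python
EXPECTED_COLUMNS = ["loc_id", "city", "country"]   # assumed layout of 'location'

def preload_checks(header, rows):
    """Fundamental checks the loading system should run before bulk loading."""
    if header != EXPECTED_COLUMNS:
        return False, "unexpected column layout"
    for i, row in enumerate(rows):
        if len(row) != len(header):
            return False, f"row {i}: wrong column count"
        if any(value in (None, "") for value in row):
            return False, f"row {i}: null value"
    return True, "ok"

# A file with a null city should be stopped before it reaches the bulk loader.
ok, reason = preload_checks(EXPECTED_COLUMNS,
                            [("1", "Pune", "IN"), ("2", "", "IN")])
print(ok, reason)  # False row 1: null value
```

Returning a reason string rather than a bare boolean keeps the failure traceable in the loading error log, per the integration-testing check on error messages.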

2.4.    End User Browsing and OLAP Testing

This testing checks the following:
•  The business views and dashboards display the data as expected.
•  The scheduled reports are accurate and complete.
•  The scheduled reports and other batch operations, such as view refreshes, happen within the expected window.
•  'Analysis Functions' and 'Data Analysis' are working.
•  There is no pilferage of data between the source systems and the views.

2.4.1.    Simple Scenario for OLAP:

OLAP – Testing using 'Basic Functions'.

2.4.2.    Complex Scenario for OLAP:

OLAP – Testing the OLAP population process.

2.4.3.    Boundary Scenario for OLAP:

OLAP – No data in the source system.

2.4.4.    Negative Scenario for OLAP:

OLAP – Users entering wrong formulae.

2.5.    Downstream Flow Testing

This testing checks the following:
a) Data is extracted from the data warehouse and updated in the downstream systems/data marts.
b) There is no pilferage.

2.6.    Ad-hoc Query Testing

This testing checks the following:
a) Ad-hoc queries are created as per the expected functionality.
b) The response time of ad-hoc query output is as expected.

2.7.    One-Time Population Testing

This testing checks the following:
a) The one-time ETL for the production data is working.
b) The production reports and the data warehouse reports match.
c) The time taken for one-time processing is manageable within the conversion weekend.

2.8.    End-to-End Integrated Testing

This testing checks the following:
a) End-to-end data flow from the source system to the downstream system is complete and accurate.

3.       Salient features of the tool:

  • Provides SQL queries and the common steps involved in validating a data warehouse application for all phases of testing.
  • The scope of test case coverage in a data warehouse project can be enhanced.
  • Reduces repetitive effort and time by making use of the queries available in the tool.
  • Adherence to a common strategy for validating data warehouse applications can be sustained by using the tool.

4.       Conclusion

The tool reduces repetitive effort and thus helps sustain the productivity and quality of the end product, giving sustained confidence in the business application.
Moreover, as no comparable automation tool is available in the market, it can even be beneficial in winning more business.
