ETL Testing:
Testing is an important phase in the project life cycle. The first part of an ETL process involves extracting data from the source systems; in many cases this is the most challenging aspect of ETL, because extracting the data correctly sets the stage for how the subsequent processes will go.
Extract, transform and load (ETL) is a process in database usage and especially in data warehousing that involves:
- Extracting data from outside sources
- Transforming it to fit operational needs (which can include quality levels)
- Loading it into the end target (database or data warehouse)
Data warehousing and its Concepts:
What is Data warehouse?
A data warehouse is a centrally managed and integrated database containing data from the operational sources in an organization (such as SAP, CRM and ERP systems). It may also gather manual inputs from users determining criteria and parameters for grouping or classifying records.
The data warehouse database contains structured data for query analysis and can be accessed by users. The data warehouse can be created or updated at any time, with minimum disruption to the operational systems; this is ensured by the strategy implemented in the ETL process.
A source for the data warehouse is a data extract from the operational databases. The data is validated, cleansed, transformed and finally aggregated, and it then becomes ready to be loaded into the data warehouse.
A data warehouse is a dedicated database which contains detailed, stable, non-volatile and consistent data that can be analyzed in a time-variant manner.
Sometimes, where only a portion of detailed data is required, it may be worth considering using a data mart.
A data mart is generated from the data
warehouse and contains data focused on a given subject and data that is
frequently accessed or summarized.
Advantages of Data Warehouse:
- A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc.
- Inconsistencies are identified and resolved prior to loading of data in the data warehouse. This greatly simplifies reporting and analysis.
- Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time.
- Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
- Data warehouses enhance the value of operational business applications, notably customer relationship management (CRM) systems.
- Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals.
Disadvantages of Data Warehouse:
- Data warehouses are not the optimal environment for unstructured data.
- Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.
- Over their life, data warehouses can have high costs. Maintenance costs are high.
- Data warehouses can get outdated relatively quickly. There is a cost of delivering suboptimal information to the organization.
- There is often a fine line between data warehouses and operational systems. Duplicate, expensive functionality may be developed. Or, functionality may be developed in the data warehouse that, in retrospect, should have been developed in the operational systems and vice versa.
ETL Concept:
ETL is the automated and auditable data acquisition process from source systems that involves one or more sub-processes of data extraction, data transportation, data transformation, data consolidation, data integration, data loading and data cleaning.
E - Extracting data from source operational or archive systems, which are the primary source of data for the data warehouse.
T - Transforming the data, which may involve cleaning, filtering, validating and applying business rules.
L - Loading the data into the data warehouse or any other database or application that houses the data.
Extraction:
The first part of an ETL process involves extracting the data from the source systems. Most data warehousing projects consolidate data from different source systems, and each separate system may use a different data format. Common data source formats are relational databases and flat files, but they may include non-relational database structures such as Information Management System (IMS) or other data structures such as Virtual Storage Access Method (VSAM) or Indexed Sequential Access Method (ISAM), or even data fetched from outside sources through web spidering or screen-scraping. Extraction converts the data into a format suitable for transformation processing.
An intrinsic part of the extraction involves parsing the extracted data to check whether it meets an expected pattern or structure. If not, the data may be rejected entirely or in part.
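As a minimal sketch of such a parsing check, assuming a hypothetical staging table stg_orders that holds the extracted values as text, SQL Server's TRY_CONVERT (available from SQL Server 2012) can flag records that do not match the expected pattern:

SELECT order_id, order_date, amount
FROM stg_orders                                    -- hypothetical staging table with raw extracted text
WHERE TRY_CONVERT(date, order_date) IS NULL        -- the date does not parse into the expected format
   OR TRY_CONVERT(decimal(12,2), amount) IS NULL;  -- the amount is not a valid number

Rows returned by this query are candidates for rejection or correction before transformation.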
Transformation:
Transformation is the series of tasks that prepares the data for loading into the warehouse. Once the data is extracted, you have to worry about its format and structure, because it will not be in the format needed for the target; for example, the grain level or the data types might be different. The data cannot be used as it is, so rules and functions need to be applied to transform it.
One of the purposes of ETL is to consolidate the data in a central repository, or to bring it to one logical or physical place. Data can be consolidated from similar systems, different subject areas, etc.
ETL must support data integration for data coming from multiple sources and arriving at different times. This has to be a seamless operation; otherwise it risks overwriting existing data, creating duplicate data or, even worse, simply failing to load the data into the target.
Loading:
The loading process is critical to integration and consolidation. It decides the modality of how the data is added to the warehouse or simply rejected. Operations such as insertion, updating or deletion are executed at this step. What happens to the existing data? Should the old data be deleted because of the new information? Should the data be archived? Should the data be treated as an addition to the existing data?
Data therefore has to be loaded into the data warehouse with utmost care, and only a data auditing process can establish the confidence level for it. This auditing normally happens after the data has been loaded.
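A minimal sketch of such a post-load audit, assuming hypothetical source, warehouse and reject tables, is a simple count reconciliation:

SELECT
    (SELECT COUNT(*) FROM src.customer)     AS source_count,    -- hypothetical source table
    (SELECT COUNT(*) FROM dw.dim_customer)  AS loaded_count,    -- hypothetical warehouse table
    (SELECT COUNT(*) FROM dw.rej_customer)  AS rejected_count;  -- hypothetical reject table
-- Expectation: source_count = loaded_count + rejected_count

If the source count does not equal the loaded count plus the rejected count, records were silently dropped or duplicated somewhere in the load.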
List of ETL tools:
Below is the list of ETL Tools available in the market:
List of ETL Tools | ETL Vendors
Oracle Warehouse Builder (OWB) | Oracle
Data Integrator & Data Services | SAP Business Objects
IBM Information Server (Datastage) | IBM
SAS Data Integration Studio | SAS Institute
PowerCenter | Informatica
Elixir Repertoire | Elixir
Data Migrator | Information Builders
SQL Server Integration Services | Microsoft
Talend Open Studio | Talend
DataFlow Manager | Pitney Bowes Business Insight
Data Integrator | Pervasive
Open Text Integration Center | Open Text
Transformation Manager | ETL Solutions Ltd.
Data Manager/Decision Stream | IBM (Cognos)
Clover ETL | Javlin
ETL4ALL | IKAN
DB2 Warehouse Edition | IBM
Pentaho Data Integration | Pentaho
Adeptia Integration Server | Adeptia
ETL Testing:
Following are some common goals for testing an ETL
application:
Data completeness - To ensure that all
expected data is loaded.
Data quality - To ensure that the ETL
application correctly rejects, substitutes default values for, corrects and reports
invalid data.
Data transformation - This is meant for
ensuring that all data is correctly transformed according to business rules and
design specifications.
Performance and scalability- This is to ensure that the data
loads and queries perform within expected time frames and the technical
architecture is scalable.
Integration testing- It is to ensure that ETL process functions
well with other upstream and downstream applications.
User-acceptance testing - It ensures the solution
fulfills the users’ current expectations and also anticipates their future
expectations.
Regression testing - To keep the existing
functionality intact each time a new release of code is completed.
Basically, data warehouse testing is divided into two categories: 'back-end testing' and 'front-end testing'. The former applies where the source system data is compared to the end-result data in the loaded area; this is ETL testing. The latter refers to where the user checks the data by comparing their MIS with the data displayed by the end-user tools.
Data Validation:
Data completeness is one of the basic forms of data validation. It is needed to verify that all expected data is loaded into the data warehouse. This includes validating all records and fields and ensuring that the full contents of each field are loaded.
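As a hedged illustration, assuming a hypothetical source table src.customer and target table dw.dim_customer, such completeness checks can be expressed as a few simple queries:

-- 1. Record counts should reconcile (all names are hypothetical)
SELECT (SELECT COUNT(*) FROM src.customer)    AS source_count,
       (SELECT COUNT(*) FROM dw.dim_customer) AS target_count;

-- 2. Mandatory fields should not arrive as NULL
SELECT COUNT(*) AS missing_names
FROM dw.dim_customer
WHERE customer_name IS NULL;

-- 3. Full contents loaded: no value should be shorter in the target than in the source
SELECT s.customer_id
FROM src.customer s
JOIN dw.dim_customer t ON t.customer_id = s.customer_id
WHERE LEN(t.customer_name) < LEN(s.customer_name);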
Data Transformation:
Validating that the data is transformed correctly based on business rules can be one of the most complex parts of testing an ETL application with significant transformation logic. Another way of testing is to pick some sample records and compare them manually to validate the data transformation, but this method requires manual testing steps and testers who have a good amount of experience and understanding of the ETL logic.
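One way to reduce the manual effort is to express the business rule itself in SQL and compare it with the loaded result. The sketch below assumes a hypothetical rule that full_name in the target is the concatenation of first_name and last_name in the source:

-- Rows where the loaded value disagrees with the rule applied to the source (should return no rows)
SELECT s.customer_id,
       s.first_name + ' ' + s.last_name AS expected_full_name,   -- NULL handling omitted for brevity
       t.full_name                      AS loaded_full_name
FROM src.customer s
JOIN dw.dim_customer t ON t.customer_id = s.customer_id
WHERE t.full_name <> s.first_name + ' ' + s.last_name;

Any rows returned are transformation defects to investigate.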
Data Warehouse Testing Life Cycle:
Like any other piece of software a DW implementation undergoes the
natural cycle of Unit testing, System testing, Regression testing, Integration
testing and Acceptance testing.
Unit testing: Traditionally this has been the task of the developer. This is white-box testing to ensure the module or component is coded as per the agreed-upon design specifications. The developer should focus on the following:
a) That all inbound and outbound directory structures are created
properly with appropriate permissions and sufficient disk space. All tables
used during the ETL are present with necessary privileges.
b) The ETL routines give expected results:
i. All transformation logics work as designed from source
till target
ii. Boundary conditions are satisfied− e.g. check for date
fields with leap year dates
iii. Surrogate keys have been generated properly
iv. NULL values have been populated where expected
v. Rejects have occurred where expected and log for rejects
is created with sufficient details
vi. Error recovery methods behave as designed
vii. Auditing is done properly
c) That the data loaded into the target is complete:
i. All source data that is expected to get loaded into the target actually gets loaded - compare counts between source and target and use data profiling tools
ii. All fields are loaded with full contents - i.e. no data field is truncated while transforming
iii. No duplicates are loaded
iv. Aggregations take place in the target properly
v. Data integrity constraints are properly taken care of
(example queries for some of these checks are sketched after this list)
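For instance, the duplicate and surrogate-key checks above can be scripted as simple queries; the table and column names (dw.dim_customer, customer_id, customer_sk) are hypothetical:

-- Duplicate rows loaded for the same business key (should return no rows)
SELECT customer_id, COUNT(*) AS row_count
FROM dw.dim_customer
GROUP BY customer_id
HAVING COUNT(*) > 1;

-- Surrogate key populated for every row (should return 0)
SELECT COUNT(*) AS missing_surrogate_keys
FROM dw.dim_customer
WHERE customer_sk IS NULL;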
System testing: Generally the QA team owns
this responsibility. For them the design document is the bible and the entire
set of test cases is directly based upon it. Here we test for the functionality
of the application and mostly it is black-box. The major challenge here is
preparation of test data. An intelligently designed input dataset can bring out
the flaws in the application more quickly. Wherever possible use
production-like data. You may also use data generation tools or customized
tools of your own to create test data. We must test for all possible
combinations of input and specifically check out the errors and exceptions. An
unbiased approach is required to ensure maximum efficiency. Knowledge of the
business process is an added advantage since we must be able to interpret the
results functionally and not just code-wise.
The QA team must test for:
i. Data completeness - match source-to-target counts in terms of business. Also, the load windows, the refresh period for the DW and the views created should be signed off by users.
ii. Data aggregations - match aggregated data against staging tables.
iii. Granularity of data is as per specifications.
iv. Error logs and audit tables are generated and populated properly.
v. Notifications to IT and/or business are generated in the proper format.
Regression testing: A DW application is not a one-time solution. It is possibly the best example of an incremental design, where requirements are enhanced and refined quite often based on business needs and feedback. In such a situation it is very critical to test that the existing functionality of a DW application is not broken whenever an enhancement is made to it. Generally this is done by running all functional tests for the existing code whenever a new piece of code is introduced. However, a better strategy could be to preserve earlier test input data and result sets and to run them again; the new results can then be compared against the older ones to ensure proper functionality, as sketched below.
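If the earlier result sets are preserved as baseline tables, the comparison can again be done with set operators; the table names below (qa.dim_customer_baseline, dw.dim_customer) are hypothetical:

-- Any difference between the preserved baseline and the current run (should return no rows)
(SELECT * FROM qa.dim_customer_baseline
 EXCEPT
 SELECT * FROM dw.dim_customer)
UNION ALL
(SELECT * FROM dw.dim_customer
 EXCEPT
 SELECT * FROM qa.dim_customer_baseline);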
Integration testing: This is done to ensure that the application developed works from an end-to-end perspective. Here we must consider the compatibility of the DW application with upstream and downstream flows. We need to ensure data integrity across the flow. Our test strategy should include testing for:
i. Sequence of jobs to be executed with job dependencies and
scheduling
ii. Re-startability of jobs in case of failures
iii. Generation of error logs
iv. Cleanup scripts for the environment including database
This activity is a combined responsibility and participation of
experts from all related applications is a must in order to avoid
misinterpretation of results.
Acceptance testing: This is the most critical part, because here the actual users validate your output datasets. They are the best judges to ensure that the application works as they expect. However, business users may not have proper ETL knowledge, so the development and test teams should be ready to answer questions about the ETL process as it relates to data population. The test team must have sufficient business knowledge to translate the results in terms of business. Also, the load windows, the refresh period for the DW and the views created should be signed off by users.
Performance testing: In addition to the above tests, a DW must necessarily go through another phase called performance testing. Any DW application is designed to be scalable and robust; therefore, when it goes into the production environment, it should not cause performance problems. Here we must test the system with huge volumes of data and ensure that the load window is met even under such volumes. This phase should involve the DBA team, ETL experts and others who can review and validate your code for optimization.
Summary:
Testing a DW application should be done with a sense of utmost responsibility. A bug in a DW traced at a later stage results in unpredictable losses, and the task is even more difficult in the absence of any single end-to-end testing tool. So the strategies for testing should be methodically developed, refined and streamlined. This is especially true since the requirements of a DW are often changing dynamically. Under such circumstances, repeated discussions with the development team and users are of utmost importance to the test team. Another area of concern is test coverage; this has to be reviewed multiple times to ensure completeness of testing. Always remember, a DW tester must go the extra mile to ensure near defect-free solutions.
Defect Severity determines the defect's effect on the application, whereas Defect Priority determines the urgency of repairing the defect.
Severity is given by testers and priority by developers.
1. High Severity & Low Priority: For example, an application generates banking-related reports weekly, monthly, quarterly and yearly by doing some calculations. If there is a fault while calculating the yearly report, this is a high-severity fault but low priority, because it can be fixed in the next release as a change request.
2. High Severity & High Priority: In the above example, if there is a fault while calculating the weekly report, this is a high-severity and high-priority fault, because it will block the functionality of the application within a week. It should be fixed urgently.
3. Low Severity & High Priority: If there is a spelling mistake or content issue on the homepage of a website which gets daily hits in lakhs, the fault does not affect the website or other functionality, but considering the status and popularity of the website in the competitive market, it is a high-priority fault.
4. Low Severity & Low Priority: If there is a spelling mistake on pages which get very few hits throughout the month on a website, the fault can be considered low severity and low priority.
ETL Testing:
ETL basically stands for Extract, Transform, Load, which simply implies the process where you extract data from source tables, transform it into the desired format based on certain rules and finally load it into target tables. There are numerous tools that help you with the ETL process, Informatica and Control-M being a few notable ones.
So ETL testing implies testing this entire process using a tool or at table level with the help of test cases and a rules mapping document.
In ETL testing, the following are validated:
1) Data file loads from the source system onto the source tables.
2) The ETL job that is designed to extract data from the source tables and then move it to the staging tables (the transform process).
3) Data validation within the staging tables, to check that all mapping rules / transformation rules are followed.
4) Data validation within the target tables, to ensure data is present in the required format and there is no data loss from source to target tables.
Challenges of Data Warehouse Testing
• Data selection from multiple source systems and the analysis that follows pose a great challenge.
• Volume and complexity of the data.
• Inconsistent and redundant data in a data warehouse.
• Inconsistent and inaccurate reports.
• Non-availability of history data.
3 Testing Methodology
• Use of traceability to enable full test coverage of business requirements.
• In-depth review of test cases.
• Manipulation of test data to ensure full test coverage.
Fig 1: Testing Methodology (V-Model)
• Provision of appropriate tools to speed the process of test execution and evaluation.
• Regression testing.
4 Testing Types
The following are the types of testing performed for data warehousing projects:
1. Unit Testing
2. Integration Testing
3. Technical Shakedown Testing
4. System Testing
5. Operational Readiness Testing
6. User Acceptance Testing
4.1 Unit Testing
The objective of unit testing is to test the business transformation rules, error conditions and mapping of fields at the staging and core levels.
Unit testing involves the following:
1. Check the mapping of fields present at the staging level.
2. Check for duplication of values generated using the sequence generator.
3. Check for the correctness of surrogate keys, which uniquely identify rows in the database.
4. Check the data type constraints of the fields present at the staging and core levels.
5. Check the population of status and error messages into the target table.
6. Check that string columns are left- and right-trimmed (see the sample query after this list).
7. Check that every mapping implements the process-abort mapplet, which is invoked if the number of records read from the source is not equal to the trailer count.
8. Check that every object, transformation, source and target has proper metadata. Check visually in the data warehouse designer tool that every transformation has a meaningful description.
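As an example, the trimming check in point 6 can be written as a single query; the staging table and column names are hypothetical:

-- String values that still carry leading or trailing spaces (should return no rows)
SELECT customer_id, customer_name
FROM stg.customer
WHERE customer_name <> LTRIM(RTRIM(customer_name));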
4.2 Integration Testing
The objective of integration testing is to ensure that workflows are executed as scheduled with the correct dependencies.
Integration testing involves the following:
1. Check the execution of workflows at the following stages:
   Source to Staging A.
   Staging A to Staging B.
   Staging B to Core.
2. Check that the target tables are populated with the correct number of records (a sample reconciliation query is sketched after this list).
3. Performance of the schedule is recorded and analysis is performed on the performance result.
4. Verify that the dependencies among workflows between source to staging, staging to staging and staging to core have been properly defined.
5. Check for error log messages in the appropriate file.
6. Verify that the start jobs start at the pre-defined starting time. For example, if the start time for the first job has been configured to be 10:00 AM and the Control-M group has been ordered at 7 AM, the first job should not start in Control-M until 10:00 AM.
7. Check the restarting of jobs in case of failures.
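A minimal sketch of the record-count check in point 2, reconciling one load batch across hypothetical Staging A, Staging B and Core tables (the batch_id column is also an assumption):

-- Record counts at each stage for one load batch
SELECT 'Staging A' AS stage, COUNT(*) AS row_count FROM stg_a.customer WHERE batch_id = 1001
UNION ALL
SELECT 'Staging B', COUNT(*) FROM stg_b.customer WHERE batch_id = 1001
UNION ALL
SELECT 'Core', COUNT(*) FROM core.dim_customer WHERE batch_id = 1001;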
4.3 Technical Shakedown Test
Due to the complexity of integrating the various source systems and tools, several teething problems are expected with the environments. A technical shakedown test will be conducted prior to commencing system testing, stress and performance testing, user acceptance testing and operational readiness testing, to ensure the following points are proven:
• Hardware is in place and has been configured correctly (including the Informatica architecture, source system connectivity and Business Objects).
• All software has been migrated to the testing environments correctly.
• All required connectivity between systems is in place.
• End-to-end transactions (both online and batch) have been executed and do not fall over.
4.4 System Testing
The objective of system testing is to ensure that the required business functions are implemented correctly. This phase includes data verification, which tests the quality of the data populated into the target tables.
System testing involves the following:
1. Check that the functionality of the system meets the business specifications.
2. Check the count of records in the source table and compare it with the number of records in the target table, followed by analysis of rejected records.
3. Check the end-to-end integration of systems and connectivity of the infrastructure (e.g. that hardware and network configurations are correct).
4. Check that all transactions, database updates and data flows function accurately.
5. Validate business reports functionality.
Business report validation covers the following aspects:
Test Aspect | Description
Reporting functionality | Ability to report data as required by the business using Business Objects.
Report structure | Since the universe and reports have been migrated from a previous version of Business Objects, it is necessary to ensure that the upgraded reports replicate the structure/format and data requirements (unless a change/enhancement has been documented in the Requirement Traceability Matrix / Functional Design Document).
Enhancements | Enhancements such as reports' structure and prompt ordering which were in scope of the upgrade project will be tested.
Data accuracy | The data displayed in the reports/prompts matches the actual data in the data mart.
Performance | Ability of the system to perform certain functions within a prescribed time; the system meets the stated performance criteria according to agreed SLAs or specific non-functional requirements.
Security | The required level of security access is controlled and works properly, including domain security, profile security, data security, user ID and password control, and access procedures; the security system cannot be bypassed.
Usability | The system is usable as per the specified requirements.
User accessibility | The specified type of access to data is provided to users.
Connection parameters | Test the connection.
Data provider | Check for the right universe and for duplicate data.
Conditions/Selection criteria | Test the selection criteria for the correct logic.
Object testing | Test the object definitions.
Context testing | Ensure the formula is within the input or output context.
Variable testing | Test the variables for syntax and data type compatibility.
Formulas or calculations | Test the formula for its syntax and validate the data given by the formula.
Filters | Test that the data has been filtered correctly.
Alerts | Check report alerts for extreme limits.
Sorting | Test the sorting order of section headers, fields and blocks.
Totals and subtotals | Validate the data results.
Universe structure | The integrity of the universe is maintained and there are no divergences in terms of joins / objects / prompts.
4.5 User Acceptance Testing
The objective of this testing is to ensure that the system meets the expectations of the business users. It aims to prove that the entire system operates effectively in a production environment and that the system successfully supports the business processes from a user's perspective. Essentially, these tests will run through "a day in the life of" the business users. The tests will also include functions that involve source system connectivity, job scheduling and business reports functionality.
4.6 Operational Readiness Testing
(ORT)
This
is the final phase of testing which focuses on
verifying
the deployment of software and the operational
readiness
of the application. The main areas of testing in
this
phase include:
Deployment Test
1. Tests the deployment of the
solution
2. Tests overall technical
deployment “checklist” and
timeframes
3. Tests the security aspects of
the system including
user
authentication and authorization, and
user-access
levels.
Operational and Business Acceptance Testing
1. Tests the operability of the
system including job
control
and scheduling.
2. Tests include normal scenarios,
abnormal, and fatal
scenarios
5 Test Data
Given the complexity of data warehouse projects, preparation of test data is a daunting task. The volume of data required for each level of testing is given below.
Unit Testing - This phase of testing will be performed with a small subset (about 20%) of production data for each source system.
Integration Testing - This phase of testing will be performed with a small subset of production data for each source system.
System Testing - In this phase a subset of live data will be used which is sufficient in volume to contain all required test conditions (normal, abnormal and fatal scenarios) but small enough that workflow execution time does not unduly impact the test schedule.
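One hedged way to carve out roughly a 20% subset in SQL Server is TABLESAMPLE; the table names here are hypothetical, and TABLESAMPLE is page-based and therefore only approximate, so a modulo on a key is a more even alternative:

-- Approximate 20 percent sample (page-based, so counts vary run to run)
SELECT *
INTO qa.customer_sample
FROM src.customer TABLESAMPLE (20 PERCENT);

-- Deterministic alternative: roughly every fifth row based on the key
SELECT *
FROM src.customer
WHERE customer_id % 5 = 0;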
6 Conclusion
Data
warehouse solutions are becoming almost ubiquitous as
a
supporting technology for the operational and strategic
functions
at most companies. Data warehouses play an
integral
role in business functions as diverse as
enterprise
process management and monitoring, and
production
of financial statements. The approach described
here
combines an understanding of the business rules
applied
to the data with the ability to develop and use
testing
procedures that check the accuracy of entire data
sets.
This level of testing rigor requires additional
effort
and more skilled resources. However, by employing
this
methodology, the team can be more confident, from day
one
of the implementation of the DW, in the quality of the
data.
This will build the confidence of the end-user
community,
and it will ultimately lead to a more effective
implementation.
Please go through the above document. As far as I know, we don't have any ETL testing certifications.
ETL testing
You can do this by comparing the source tables against the warehouse tables; in SQL Server you can use something like this:
(SELECT * FROM source.table1
 EXCEPT
 SELECT * FROM warehouse.table1)
UNION
(SELECT * FROM warehouse.table1
 EXCEPT
 SELECT * FROM source.table1)
The reason for the union: you first take the source table as the base and compare it with the warehouse table, then take the warehouse table as the base and compare it with the source table; together the two EXCEPT results give you the differences in both directions.
ETL testing
ETL testing validates that data is transformed correctly from the OLTP systems to the data warehouse.
Validating the transformed data includes the following main verification points:
- Ensure that all expected data is loaded, comparing record counts between the source data, the data loaded to the warehouse and the rejected records.
- Ensure that all data is transformed correctly according to the design specifications.
- Ensure that the ETL application substitutes default values and ignores invalid data.
- Validate correct processing of ETL-generated fields such as surrogate keys.
- Validate that data types in the warehouse are as specified in the design and/or the data model.
- Validate the referential integrity between tables (a sample check is sketched below).
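For the last point, a minimal referential-integrity check between a hypothetical fact table and its dimension looks like this:

-- Fact rows whose customer key has no matching dimension row (should return no rows)
SELECT f.sales_id, f.customer_sk
FROM dw.fact_sales f
LEFT JOIN dw.dim_customer d ON d.customer_sk = f.customer_sk
WHERE d.customer_sk IS NULL;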