Chapter 5. Analyze

Abstract

This chapter covers one of the most important concepts in DataCleaner: the analysis of data quality.

An analyzer is a component that consumes one or more columns and generates an analysis result based on the values in those columns.

Here is an example of a configuration panel pertaining to an analyzer:

The panel always contains one or more column selections. It may also contain additional configuration properties.
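
To make the idea of an analyzer concrete, the following is a minimal, standalone Python sketch of the concept. It is not DataCleaner's API: the class name NullCountAnalyzer, the treat_blank_as_null property, and the sample rows are all hypothetical and chosen only for illustration. The sketch shows the two kinds of configuration described above (a column selection and an additional property) and how consumed values turn into a result.

```python
from dataclasses import dataclass, field


@dataclass
class NullCountAnalyzer:
    """Hypothetical analyzer: counts missing values in the selected columns."""

    columns: list[str]                # the column selection
    treat_blank_as_null: bool = True  # an additional, hypothetical property
    row_count: int = field(default=0, init=False)
    null_counts: dict[str, int] = field(default_factory=dict, init=False)

    def __post_init__(self) -> None:
        # Start every selected column at zero missing values.
        self.null_counts = {name: 0 for name in self.columns}

    def run(self, row: dict) -> None:
        """Consume one row, looking only at the selected columns."""
        self.row_count += 1
        for name in self.columns:
            value = row.get(name)
            is_blank = isinstance(value, str) and value.strip() == ""
            if value is None or (self.treat_blank_as_null and is_blank):
                self.null_counts[name] += 1

    def result(self) -> dict:
        """Return the analysis result built from all consumed rows."""
        return {"row_count": self.row_count,
                "null_counts": dict(self.null_counts)}


# Made-up sample data, for illustration only.
rows = [
    {"name": "Ann", "email": "ann@example.com"},
    {"name": "", "email": None},
    {"name": "Bob", "email": " "},
]

analyzer = NullCountAnalyzer(columns=["name", "email"])
for row in rows:
    analyzer.run(row)
report = analyzer.result()
```

Changing the property changes the result without changing the column selection: with treat_blank_as_null=False, only the explicit None value in the email column would count as missing.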

Table of Contents

Boolean analyzer
Completeness analyzer
Character set distribution
Date gap analyzer
Date/time analyzer
Number analyzer
Pattern finder
Reference data matcher
Referential integrity
String analyzer
Unique key check
Value distribution
Value matcher
Weekday distribution
Machine Learning analyzers