Value distribution

Gets the distributions of values that occur in a dataset.

Often used as a first check of whether many repeated values are to be expected, whether nulls occur, and whether a few non-repeated values form exceptions to the typical usage pattern.

Analyzer rundown

Analyzer, concurrent, distributed execution possible

Result metrics (ValueDistributionAnalyzerResult)

Distinct count: not parameterized
Null count: not parameterized
Total count: not parameterized
Unique count: not parameterized
Value count: parameterized by value
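
To make the metrics concrete, here is a minimal Python sketch of how such counts could be derived from a list of values. It is not the analyzer's implementation: the function name `value_distribution` and the dictionary keys are hypothetical, and it assumes that nulls are counted separately and excluded from the distinct count, and that a unique value is one that occurs exactly once. The actual analyzer may treat nulls differently.

```python
from collections import Counter


def value_distribution(values):
    """Illustrative only: compute distribution metrics for a list of values.

    Assumption: None represents a null and is excluded from the
    distinct and unique counts.
    """
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "total_count": len(values),
        "null_count": len(values) - len(non_null),
        # Number of different non-null values.
        "distinct_count": len(counts),
        # Number of values that occur exactly once.
        "unique_count": sum(1 for c in counts.values() if c == 1),
        # "Parameterized by value": one count per distinct value.
        "value_count": dict(counts),
    }


result = value_distribution(["DK", "US", "DK", None, "NL", "US", "DK"])
print(result["value_count"]["DK"])  # count for the value "DK"
```

The value count is the only metric that needs a parameter, because it answers "how often does this particular value occur?", while the other metrics summarize the whole column.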

Configured properties

Column: InputColumn<Object>, required
Group column: InputColumn<String>, optional
Record unique values: boolean, optional
Record drill-down information: boolean, optional. Records extra information to allow drilling down to the records that represent a particular value in the distribution.
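
As a rough illustration of the optional group column, the Python sketch below assumes that it splits the result into one value distribution per group value. That reading is an assumption, not something stated above, and the function name `grouped_value_counts` and the sample column names are hypothetical.

```python
from collections import Counter, defaultdict


def grouped_value_counts(rows, column, group_column):
    """Illustrative only: count the values of `column` per `group_column` value.

    Assumption: each row is a dict, and the group column partitions
    the rows into separate distributions.
    """
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[group_column]][row[column]] += 1
    return {group: dict(counts) for group, counts in groups.items()}


rows = [
    {"city": "Copenhagen", "country": "DK"},
    {"city": "Aarhus", "country": "DK"},
    {"city": "Copenhagen", "country": "DK"},
    {"city": "Amsterdam", "country": "NL"},
]
per_country = grouped_value_counts(rows, column="city", group_column="country")
print(per_country)
```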