Value distribution

The value distribution analyzer (often also referred to as 'Frequency analysis') lets you identify all the values of a particular column. Furthermore, you can investigate which rows pertain to specific values.
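As a rough illustration of the idea (not the analyzer's actual implementation), a value distribution can be sketched in Python as a count per distinct value, plus a lookup of the rows that hold each value. The sample column `cities` and the names `distribution` and `rows_by_value` are made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: the values of one column, in row order.
cities = ["Copenhagen", "Aarhus", "Copenhagen", "Odense", "Copenhagen", "Aarhus"]

# Count how often each distinct value occurs in the column.
distribution = Counter(cities)

# Remember which row indices hold each value, so the rows can be inspected later.
rows_by_value = defaultdict(list)
for row_index, value in enumerate(cities):
    rows_by_value[value].append(row_index)

for value, count in distribution.most_common():
    print(value, count, rows_by_value[value])
```

Keeping the row indices next to the counts is what makes it possible to go from a value back to the rows that contain it.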

Here are the configuration properties for the value distribution analyzer:

Table 5.5. Value distribution properties

Property: Description
Group column: Lets you define a column for grouping the result. With a group column, the identified value distributions are separated into buckets/groups. For example, you can use it to check whether postal codes and city names correspond, or to segment the value distribution by country, gender, or a similar attribute.
Record unique values: By default, all unique values are included in the result of the value distribution. This can cause memory issues if the analyzed columns contain a large number of unique values (e.g. if a column is a unique key). If the actual unique values are not of interest, uncheck this checkbox to only count the unique values, without saving them for inspection.
Top n most frequent values: An optional number, used if the analysis should only display e.g. the "top 5 most frequent values". If this property is supplied, the result of the analysis will contain only the top/bottom n most frequent values.
Bottom n most frequent values: An optional number, used if the analysis should only display e.g. the "bottom 5 most frequent values". If this property is supplied, the result of the analysis will contain only the top/bottom n most frequent values.
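To make the effect of these properties more concrete, here is a minimal Python sketch of how they could interact. It is an illustration under stated assumptions, not the analyzer's code: the function `value_distribution`, its parameter names, and the sample rows are hypothetical. It also assumes that a "unique value" is one that occurs exactly once within its group, and that "bottom n" means the n least frequent values.

```python
from collections import Counter, defaultdict


def value_distribution(rows, column, group_column=None,
                       record_unique_values=True, top_n=None, bottom_n=None):
    """Return {group: summary}; the only group is None without a group column."""
    counters = defaultdict(Counter)
    for row in rows:
        group = row[group_column] if group_column is not None else None
        counters[group][row[column]] += 1

    result = {}
    for group, counter in counters.items():
        ranked = counter.most_common()  # (value, count) pairs, highest count first
        # Assumption: a value is "unique" when it occurs exactly once in the group.
        unique = [value for value, count in ranked if count == 1]
        if top_n is None and bottom_n is None:
            selected = ranked
        else:
            top = ranked[:top_n] if top_n else []
            bottom = ranked[-bottom_n:] if bottom_n else []
            selected = top + [pair for pair in bottom if pair not in top]
        result[group] = {
            "frequencies": selected,
            "unique_count": len(unique),
            # Only keep the values themselves when asked to; always keep the count.
            "unique_values": unique if record_unique_values else None,
        }
    return result


# Hypothetical sample rows, grouped by country.
rows = [
    {"country": "DK", "city": "Copenhagen"},
    {"country": "DK", "city": "Aarhus"},
    {"country": "DK", "city": "Copenhagen"},
    {"country": "SE", "city": "Stockholm"},
    {"country": "SE", "city": "Stockholm"},
    {"country": "SE", "city": "Malmo"},
    {"country": "DK", "city": "Copenhagen"},
]

result = value_distribution(rows, "city", group_column="country",
                            record_unique_values=False, top_n=1)
```

In this sketch, turning off `record_unique_values` still reports how many unique values exist per group but drops the values themselves, which is the memory trade-off described for the "Record unique values" property.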