Numerical Binning
Description
The Numerical Binning activity groups numerical data into “bins” or “intervals” based on a chosen rule. This helps to make large sets of continuous numbers easier to analyze by simplifying them into smaller, predefined ranges.
Input
Data Only
Output
Transformed Data
Configuration Fields
- Column
Choose the column with numerical data that you want to bin.
- Binning mode
How the system decides where to split the numbers into bins. You can choose from the following:
  - Sturges: Derives the number of bins from the number of data points using a logarithmic formula. Good for smaller datasets.
  - Freedman-Diaconis: Derives the bin width from the interquartile range, so it works well when there are outliers in the data.
  - Scott: Derives the bin width from the standard deviation and the number of data points, aiming to minimize the error of the resulting histogram.
  - Square Root: A simple method where the number of bins is the square root of the number of data points.
  - Fixed Size Intervals: You define how wide the bins should be, and that width is used for every bin.
  - Custom: You define the exact bins yourself.
- Output column
The new column that will store the binned values.
- Include original
Whether to keep the original numbers alongside the new binned values.
  - Enabled: Keep the original values.
  - Disabled: Only show the binned values.
- Number Of Bins
How many bins you want to divide the data into.
- Minimum Value (shown only when the binning mode is Fixed Size Intervals)
The smallest number that should be included in the bins.
- Maximum Value (shown only when the binning mode is Fixed Size Intervals)
The largest number that should be included in the bins.
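
To give a feel for how the automatic binning modes listed above differ, here is a minimal Python sketch of the textbook formulas behind Sturges, Square Root, Scott, and Freedman-Diaconis. The function names are hypothetical, and the activity's own implementation may differ in details such as rounding, the quartile definition, or whether it uses the sample or population standard deviation.

```python
import math
import statistics


def sturges_bins(values):
    """Sturges: number of bins = ceil(log2(n)) + 1."""
    return math.ceil(math.log2(len(values))) + 1


def square_root_bins(values):
    """Square root: number of bins = ceil(sqrt(n))."""
    return math.ceil(math.sqrt(len(values)))


def scott_width(values):
    """Scott: bin width = 3.49 * standard deviation / n^(1/3).

    Assumption: the sample standard deviation is used here.
    """
    return 3.49 * statistics.stdev(values) / len(values) ** (1 / 3)


def freedman_diaconis_width(values):
    """Freedman-Diaconis: bin width = 2 * IQR / n^(1/3).

    Assumption: quartiles come from statistics.quantiles with its
    default method; other tools may compute them slightly differently.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2 * (q3 - q1) / len(values) ** (1 / 3)


def bins_from_width(values, width):
    """Convert a bin width into a bin count that covers the data range."""
    if width <= 0:
        return 1
    return max(1, math.ceil((max(values) - min(values)) / width))


ages = [23, 35, 47, 59, 72, 89]
print("Sturges bins:", sturges_bins(ages))
print("Square root bins:", square_root_bins(ages))
print("Scott bins:", bins_from_width(ages, scott_width(ages)))
print("Freedman-Diaconis bins:", bins_from_width(ages, freedman_diaconis_width(ages)))
```

For the six ages used in the sample below, Sturges gives 4 bins and the square root rule gives 3. The two width-based rules depend on the spread of the data, not only on how many values there are.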
Sample Input
Age |
---|
23 |
35 |
47 |
59 |
72 |
89 |
Sample Configuration
Sample Output
Age | Age_Binned |
---|---|
23 | 20-40 |
35 | 20-40 |
47 | 40-60 |
59 | 40-60 |
72 | 60-80 |
89 | 80-100 |
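
The sample output above can be reproduced with a short Python sketch. It assumes fixed intervals with a minimum of 20, a maximum of 100, and 4 bins; these values are not taken from the sample configuration and were chosen only because they match the table. The function names, the rule that a value on an inner boundary goes into the upper bin, and the handling of values outside the range are also assumptions, not a description of how the activity is implemented.

```python
from bisect import bisect_right


def fixed_interval_edges(minimum, maximum, number_of_bins):
    """Return number_of_bins + 1 equally spaced edges from minimum to maximum."""
    if number_of_bins < 1 or maximum <= minimum:
        raise ValueError("need number_of_bins >= 1 and maximum > minimum")
    width = (maximum - minimum) / number_of_bins
    return [minimum + i * width for i in range(number_of_bins + 1)]


def bin_label(value, edges):
    """Return a 'low-high' label, or None if value lies outside the edges."""
    if value < edges[0] or value > edges[-1]:
        return None
    # A value equal to an inner edge goes into the upper bin (assumption);
    # min() keeps the maximum value inside the last bin.
    index = min(bisect_right(edges, value) - 1, len(edges) - 2)
    return f"{edges[index]:g}-{edges[index + 1]:g}"


def bin_column(rows, column, output_column, edges, include_original=True):
    """Add output_column to each row; optionally drop the original column."""
    result = []
    for row in rows:
        new_row = dict(row)
        new_row[output_column] = bin_label(row[column], edges)
        if not include_original and column != output_column:
            del new_row[column]
        result.append(new_row)
    return result


rows = [{"Age": age} for age in (23, 35, 47, 59, 72, 89)]
edges = fixed_interval_edges(20, 100, 4)

for row in bin_column(rows, "Age", "Age_Binned", edges):
    print(row)
```

Passing `include_original=False` mirrors the Disabled setting of Include original: each row then contains only `Age_Binned`. The same `bin_label` function also works with a hand-written list of edges, which is one way to think about the Custom mode.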