The equal count algorithm — equal

This function applies the equal count algorithm to divide a set of observations into intervals that can overlap to a chosen degree. It calls `lattice::equal.count` but extends the output.

equal_count(df, vble, n_int = 6, frac = 0.5)

Arguments

df	a data frame
vble	numeric variable of `df` to be analyzed
n_int	number of intervals (default 6)
frac	overlapping fraction between adjacent intervals (default 0.5)
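
To give a feel for what the overlapping fraction does, the sketch below calls `lattice::equal.count` directly, which this function is documented to wrap, with the number of intervals held fixed and two different overlaps. The helper `counts_per_interval` is a hypothetical name introduced only for this illustration, and the closed-interval membership test is an assumption rather than something taken from the package source.

```r
library(lattice)

x <- iris$Sepal.Length

# Hypothetical helper: number of observations falling in each interval
# produced by lattice::equal.count for a given number and overlap.
counts_per_interval <- function(x, number, overlap) {
  sh <- equal.count(x, number = number, overlap = overlap)
  # levels() of a shingle is a list of c(lower, upper) pairs
  lims <- do.call(rbind, unclass(levels(sh)))
  vapply(
    seq_len(nrow(lims)),
    function(i) sum(x >= lims[i, 1] & x <= lims[i, 2]),  # assumed closed intervals
    integer(1)
  )
}

no_overlap   <- counts_per_interval(x, number = 6, overlap = 0)
half_overlap <- counts_per_interval(x, number = 6, overlap = 0.5)

print(no_overlap)
print(half_overlap)
```

With the number of intervals fixed, a larger fraction should make each interval hold more observations, since neighbouring intervals share part of their contents.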

Value

A list with two elements:

intervals: a tibble in which each row refers to one of the generated intervals, with its index (`n`), its lower and upper limits (`lower`, `upper`), the number of values it contains (`count`) and the number of values it shares with the next interval (`overlap`, `NA` for the last interval)
df_long: a tibble in long format in which each observation appears once for every interval it belongs to, with an identifier of the observation (`id`, its position in the original data frame) and an identifier of the interval (`interval`).
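
As a rough illustration of how the two elements relate to the shingle returned by `lattice::equal.count`, the following base R sketch rebuilds comparable tables for `iris$Sepal.Length`. It is not the package's implementation: the closed-interval membership test, the definition of `overlap` as the values lying in both an interval and the next one, and the use of plain data frames instead of tibbles are all assumptions made for the example.

```r
library(lattice)

df <- iris
x  <- df$Sepal.Length

# Shingle with 15 intervals and an overlapping fraction of 0.3
sh   <- equal.count(x, number = 15, overlap = 0.3)
lims <- do.call(rbind, unclass(levels(sh)))  # one row per interval: lower, upper
k    <- nrow(lims)

# Assumed membership rule: lower <= value <= upper
in_interval <- function(i) x >= lims[i, 1] & x <= lims[i, 2]

intervals <- data.frame(
  n     = seq_len(k),
  lower = lims[, 1],
  upper = lims[, 2],
  count = vapply(seq_len(k), function(i) sum(in_interval(i)), integer(1)),
  # Assumed definition: values belonging to interval i and to interval i + 1
  overlap = c(
    vapply(seq_len(k - 1),
           function(i) sum(in_interval(i) & in_interval(i + 1)),
           integer(1)),
    NA_integer_
  )
)

# Long format: one row per (observation, interval) membership
df$id <- seq_len(nrow(df))
df_long <- do.call(rbind, lapply(seq_len(k), function(i) {
  cbind(df[in_interval(i), ], interval = factor(i, levels = seq_len(k)))
}))
df_long <- df_long[order(df_long$id, df_long$interval), ]

head(intervals)
head(df_long)
```

In this sketch the number of rows of `df_long` equals the sum of the `count` column, because every membership of an observation in an interval contributes exactly one row.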

Examples

equal_count(iris, Sepal.Length, 15, 0.3)
#> $intervals
#> # A tibble: 15 × 5
#>        n lower upper count   overlap
#>    <int> <dbl> <dbl> <table>   <int>
#>  1     1  4.25  4.85 16            7
#>  2     2  4.65  5.05 23           16
#>  3     3  4.85  5.15 25           19
#>  4     4  4.95  5.25 23           13
#>  5     5  5.05  5.55 27           13
#>  6     6  5.35  5.65 19           13
#>  7     7  5.45  5.75 21            8
#>  8     8  5.65  5.95 18           10
#>  9     9  5.75  6.15 22           12
#> 10    10  5.95  6.35 25           13
#> 11    11  6.15  6.45 20           16
#> 12    12  6.25  6.65 23            7
#> 13    13  6.45  6.85 18           11
#> 14    14  6.65  7.25 20            9
#> 15    15  6.85  7.95 17           NA
#> 
#> $df_long
#> # A tibble: 317 × 7
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species    id interval
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>   <int> <fct>   
#>  1          5.1         3.5          1.4         0.2 setosa      1 3       
#>  2          5.1         3.5          1.4         0.2 setosa      1 4       
#>  3          5.1         3.5          1.4         0.2 setosa      1 5       
#>  4          4.9         3            1.4         0.2 setosa      2 2       
#>  5          4.9         3            1.4         0.2 setosa      2 3       
#>  6          4.7         3.2          1.3         0.2 setosa      3 1       
#>  7          4.7         3.2          1.3         0.2 setosa      3 2       
#>  8          4.6         3.1          1.5         0.2 setosa      4 1       
#>  9          5           3.6          1.4         0.2 setosa      5 2       
#> 10          5           3.6          1.4         0.2 setosa      5 3       
#> # … with 307 more rows
#>