This function applies the equal count algorithm to divide a set of observations into intervals that can overlap to a specified degree. It calls `lattice::equal.count` but extends the output.

equal_count(df, vble, n_int = 6, frac = 0.5)

Arguments

df

a data frame

vble

numeric variable to be analyzed

n_int

number of intervals

frac

fraction of overlap between consecutive intervals
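
To illustrate what `n_int` and `frac` control, the sketch below calls `lattice::equal.count` directly. It assumes that `n_int` and `frac` are passed on to the `number` and `overlap` arguments of `lattice::equal.count`; this page does not state that mapping, so treat it as an illustration rather than the function's actual internals.

```r
library(lattice)

x <- iris$Sepal.Length

# Assumption: n_int corresponds to `number`, frac corresponds to `overlap`
no_overlap   <- equal.count(x, number = 6, overlap = 0)
half_overlap <- equal.count(x, number = 6, overlap = 0.5)

# The levels of a shingle are a list of c(lower, upper) pairs;
# bind them into a two-column matrix with one row per interval
lims_none <- do.call(rbind, unclass(levels(no_overlap)))
lims_half <- do.call(rbind, unclass(levels(half_overlap)))

print(lims_none)  # consecutive intervals share no observations
print(lims_half)  # consecutive intervals share part of their observations
```

Comparing the two matrices shows how a larger overlap fraction widens each interval so that it reaches into the next one.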

Value

a list with two elements:

intervals

a tibble where each row refers to one of the generated intervals, with its lower and upper limits, the number of values in it, and the number of values it shares with the next interval

df_long

a tibble in long format where each observation appears once for every interval it belongs to, with an identifier of the observation (`id`, its position in the original data frame) and an identifier of the interval.
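
The two elements described above can be approximated in base R plus `lattice`. The following is a minimal sketch, not the package's implementation: the object names (`lims`, `member`, `idx`) are hypothetical, it returns plain data frames instead of tibbles, and it assumes interval limits are inclusive on both ends.

```r
library(lattice)

df <- iris
x  <- df$Sepal.Length

# Interval limits from the equal count algorithm: one row per interval
sh   <- equal.count(x, number = 15, overlap = 0.3)
lims <- do.call(rbind, unclass(levels(sh)))

# member[i, j] is TRUE when observation i falls inside interval j
member <- outer(x, lims[, 1], ">=") & outer(x, lims[, 2], "<=")

# Summary table: limits, number of values, values shared with the next interval
intervals <- data.frame(
  n     = seq_len(nrow(lims)),
  lower = lims[, 1],
  upper = lims[, 2],
  count = colSums(member)
)
intervals$overlap <- c(
  colSums(member[, -ncol(member), drop = FALSE] & member[, -1, drop = FALSE]),
  NA  # the last interval has no following interval
)

# Long format: one row per (observation, interval) membership
idx <- which(member, arr.ind = TRUE)
idx <- idx[order(idx[, "row"], idx[, "col"]), , drop = FALSE]
df_long <- cbind(
  df[idx[, "row"], ],
  id       = idx[, "row"],
  interval = factor(idx[, "col"])
)
rownames(df_long) <- NULL

head(df_long)
```

Because an observation is repeated once per interval that contains it, `nrow(df_long)` equals `sum(intervals$count)` in this sketch.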

Examples

equal_count(iris, Sepal.Length, 15, 0.3)
#> $intervals
#> # A tibble: 15 × 5
#>        n lower upper count   overlap
#>    <int> <dbl> <dbl> <table>   <int>
#>  1     1  4.25  4.85 16            7
#>  2     2  4.65  5.05 23           16
#>  3     3  4.85  5.15 25           19
#>  4     4  4.95  5.25 23           13
#>  5     5  5.05  5.55 27           13
#>  6     6  5.35  5.65 19           13
#>  7     7  5.45  5.75 21            8
#>  8     8  5.65  5.95 18           10
#>  9     9  5.75  6.15 22           12
#> 10    10  5.95  6.35 25           13
#> 11    11  6.15  6.45 20           16
#> 12    12  6.25  6.65 23            7
#> 13    13  6.45  6.85 18           11
#> 14    14  6.65  7.25 20            9
#> 15    15  6.85  7.95 17           NA
#>
#> $df_long
#> # A tibble: 317 × 7
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species    id interval
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>   <int> <fct>
#>  1          5.1         3.5          1.4         0.2 setosa      1 3
#>  2          5.1         3.5          1.4         0.2 setosa      1 4
#>  3          5.1         3.5          1.4         0.2 setosa      1 5
#>  4          4.9         3            1.4         0.2 setosa      2 2
#>  5          4.9         3            1.4         0.2 setosa      2 3
#>  6          4.7         3.2          1.3         0.2 setosa      3 1
#>  7          4.7         3.2          1.3         0.2 setosa      3 2
#>  8          4.6         3.1          1.5         0.2 setosa      4 1
#>  9          5           3.6          1.4         0.2 setosa      5 2
#> 10          5           3.6          1.4         0.2 setosa      5 3
#> # … with 307 more rows
#>