Solve for a good set of right-exclusive x-cuts such that the overall graph of y~x is well-approximated by a piecewise constant function. The solution is ready for use with base::findInterval() and stats::approx() (demonstrated in the examples).

solve_for_partitionc(
  x,
  y,
  ...,
  w = NULL,
  penalty = 0,
  min_n_to_chunk = 1000,
  min_seg = 1,
  max_k = length(x)
)

Arguments

x

numeric, input variable (no NAs).

y

numeric, result variable (no NAs, same length as x).

...

not used; forces later arguments to be passed by name.

w

numeric, weights (no NAs, positive, same length as x).

penalty

per-segment cost penalty.

min_n_to_chunk

minimum n at which to subdivide the problem.

min_seg

positive integer, minimum segment size.

max_k

maximum number of segments to divide into.

Value

a data frame appropriate for stats::approx().

Examples


# example data
d <- data.frame(
  x = 1:8,
  y = c(-1, -1, -1, -1, 1, 1, 1, 1))

# solve for break points
soln <- solve_for_partitionc(d$x, d$y)
# show solution
print(soln)
#>   x pred group  what
#> 1 1   -1     1  left
#> 2 2   -1     1 right
#> 3 3   -1     2  left
#> 4 4   -1     2 right
#> 5 5    1     3  left
#> 6 6    1     3 right
#> 7 7    1     4  left
#> 8 8    1     4 right

# label each point
d$group <- base::findInterval(
  d$x,
  soln$x[soln$what=='left'])
# apply piecewise approximation
d$estimate <- stats::approx(
  soln$x,
  soln$pred,
  xout = d$x,
  method = 'constant',
  rule = 2)$y
# show result
print(d)
#>   x  y group estimate
#> 1 1 -1     1       -1
#> 2 2 -1     1       -1
#> 3 3 -1     2       -1
#> 4 4 -1     2       -1
#> 5 5  1     3        1
#> 6 6  1     3        1
#> 7 7  1     4        1
#> 8 8  1     4        1
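
# A further sketch, using base R only, of how the cuts behave on new x values.
# soln_demo is a hand-built frame in the same format as the solution printed
# above; its values and the names soln_demo, x_new, groups_new, and est_new
# are illustrative assumptions, not output of solve_for_partitionc().
soln_demo <- data.frame(
  x = c(1, 4, 5, 8),
  pred = c(-1, -1, 1, 1),
  group = c(1, 1, 2, 2),
  what = c('left', 'right', 'left', 'right'))
# new points: below the first cut, on a cut, just before the next cut,
# on the next cut, and beyond the last segment
x_new <- c(0.5, 1, 4.9, 5, 9)
# the left edges act as right-exclusive cuts: 4.9 stays in group 1 and
# 5 starts group 2; a point below the first cut is labeled 0
groups_new <- base::findInterval(
  x_new,
  soln_demo$x[soln_demo$what=='left'])
print(groups_new)
#> [1] 0 1 1 2 2
# with rule = 2, stats::approx() still returns the nearest end value for
# points outside the range of soln_demo$x
est_new <- stats::approx(
  soln_demo$x,
  soln_demo$pred,
  xout = x_new,
  method = 'constant',
  rule = 2)$y
print(est_new)
#> [1] -1 -1 -1  1  1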