bag_tree() is a way to generate a specification of a model before fitting and allows the model to be created using different packages in R. The main arguments for the model are:

  • cost_complexity: The cost/complexity parameter (a.k.a. Cp) used by CART models (rpart only).

  • tree_depth: The maximum depth of a tree (rpart).

  • min_n: The minimum number of data points in a node that are required for the node to be split further.

  • class_cost: A cost value to assign to the class corresponding to the first factor level (for 2-class models, rpart and C5.0 only).

These arguments are converted to their engine-specific names at the time that the model is fit. Other options and arguments can be set using set_engine(). If left to their defaults here (NULL), the values are taken from the underlying model functions. If parameters need to be modified, update() can be used in lieu of recreating the object from scratch.
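The renaming described above can be inspected before fitting. The following is a minimal sketch, assuming that the baguette package is installed and loaded so that the "rpart" engine is registered for bag_tree(); the argument values are arbitrary and chosen only for illustration. parsnip's translate() prints the call template that would be used at fit time, in which the main arguments are expected to appear under the engine's own names.

```r
library(parsnip)
library(baguette)  # assumption: supplies the engines for bag_tree()

# Illustrative values only; none of these come from the documentation above.
spec <- bag_tree(cost_complexity = 0.01, tree_depth = 5, min_n = 10) %>%
  set_mode("regression") %>%
  set_engine("rpart", times = 5)  # `times` is an engine argument, set via set_engine()

# Show the engine-level call template that fit() would use.
translated <- translate(spec)
translated
```

Nothing is fit at this point; the specification only records the arguments, which is why it can still be changed with update() afterwards.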

bag_tree(
  mode = "unknown",
  cost_complexity = 0,
  tree_depth = NULL,
  min_n = 2,
  class_cost = NULL
)

# S3 method for bag_tree
update(
  object,
  parameters = NULL,
  cost_complexity = NULL,
  tree_depth = NULL,
  min_n = NULL,
  class_cost = NULL,
  fresh = FALSE,
  ...
)

Arguments

mode

A single character string for the type of model. Possible values for this model are "unknown", "regression", or "classification".

cost_complexity

A positive number for the cost/complexity parameter (a.k.a. Cp) used by CART models (rpart only).

tree_depth

An integer for the maximum depth of the tree.

min_n

An integer for the minimum number of data points in a node that are required for the node to be split further.

class_cost

A non-negative scalar for a class cost (where a cost of 1 means no extra cost). This is useful for when the first level of the outcome factor is the minority class. If this is not the case, values between zero and one can be used to bias to the second level of the factor.

object

A bagged tree model specification.

parameters

A 1-row tibble or named list with main parameters to update. If the individual arguments are used, these will supersede the values in parameters. Also, using engine arguments in this object will result in an error.

fresh

A logical for whether the arguments should be modified in-place (FALSE, the default) or replaced wholesale (TRUE).

...

Not used for update().
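As an illustration of the parameters argument, the sketch below updates a specification from a named list instead of individual arguments. It assumes that parsnip and baguette are installed; the object names and values are hypothetical and used only for this example.

```r
library(parsnip)
library(baguette)  # assumption: loaded so the bag_tree() engines are available

# A starting specification with two main arguments set.
base_spec <- bag_tree(cost_complexity = 0.1, min_n = 3)

# Hypothetical new values, for example taken from a tuning result.
new_values <- list(min_n = 5, tree_depth = 8)

# With the default fresh = FALSE, arguments not named in `parameters`
# (here cost_complexity) are kept as they were.
updated_spec <- update(base_spec, parameters = new_values)
updated_spec
```

A 1-row tibble with the same column names could be passed in place of the list.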

Details

The model can be created using the fit() function using the following engines:

  • R: "rpart" (the default) or "C5.0" (classification only)

Note that, for rpart models, both cost_complexity and tree_depth can be specified, but the package will give precedence to cost_complexity. Also, for tree_depth values greater than 30, rpart will give nonsense results on 32-bit machines.
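To see which engines are registered in a given session, and to select the non-default one, something like the following can be used. This is a sketch that assumes parsnip and baguette are installed; fitting with the "C5.0" engine would additionally require the C50 package, so the example stops at the specification.

```r
library(parsnip)
library(baguette)  # assumption: registers the engines listed above

# List the engine/mode combinations currently registered for bag_tree().
show_engines("bag_tree")

# Select the C5.0 engine, which supports classification only.
c5_spec <- bag_tree(min_n = 5) %>%
  set_mode("classification") %>%
  set_engine("C5.0", times = 3)  # `times` is an engine argument, as in the rpart example
c5_spec
```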

Examples

library(parsnip)

set.seed(9952)
bag_tree(tree_depth = 5) %>%
  set_mode("classification") %>%
  set_engine("rpart", times = 3) %>%
  fit(Species ~ ., data = iris)
#> parsnip model object
#>
#> Fit time: 615ms
#> Bagged CART (classification with 3 members)
#>
#> Variable importance scores include:
#>
#> # A tibble: 4 x 4
#>   term         value std.error  used
#>   <chr>        <dbl>     <dbl> <int>
#> 1 Petal.Length  88.9      1.05     3
#> 2 Petal.Width   88.5      2.78     3
#> 3 Sepal.Length  65.4      3.08     3
#> 4 Sepal.Width   44.8      2.01     3
#>

model <- bag_tree(cost_complexity = 10, min_n = 3)
model
#> Bagged Decision Tree Model Specification (unknown)
#>
#> Main Arguments:
#>   cost_complexity = 10
#>   min_n = 3
#>

update(model, cost_complexity = 1)
#> Bagged Decision Tree Model Specification (unknown)
#>
#> Main Arguments:
#>   cost_complexity = 1
#>   min_n = 3
#>

update(model, cost_complexity = 1, fresh = TRUE)
#> Bagged Decision Tree Model Specification (unknown)
#>
#> Main Arguments:
#>   cost_complexity = 1
#>