Feature
When testing multiple model types (e.g., in an ensemble workflow), one could create a baseline recipe that works for most of the models. Then, if one model has slightly different pre-processing requirements (e.g., you'd prefer to remove `step_dummy` for a `rand_forest` model, or remove `step_rm` to include more predictors in an elastic net than in a knn), a single call to this function could remove those steps, avoiding the need to copy/paste the original recipe definition and delete certain steps. This helps keep recipe definitions DRY.
Reproducible Example
FWIW, I chose the verb `ignore_` to avoid confusion with 'remove' from `step_rm`.
Happy to submit a pull request if this seems valuable.
suppressPackageStartupMessages(devtools::load_all("."))
#> Loading recipes
rec <- recipe(Species ~ ., data = iris) %>%
  step_rm(Petal.Width, id = "rm_UDLut") %>%
  step_rm(starts_with("Sepal"), id = "custom_id")

rec %>% ignore_step("custom_id") # remove by custom id value
#> Recipe
#>
#> Inputs:
#>
#>       role #variables
#>    outcome          1
#>  predictor          4
#>
#> Operations:
#>
#> Variables removed Petal.Width

rec %>% ignore_step(1) # remove the first step
#> Recipe
#>
#> Inputs:
#>
#>       role #variables
#>    outcome          1
#>  predictor          4
#>
#> Operations:
#>
#> Variables removed starts_with("Sepal")

rec %>% ignore_step("rm") # remove all `step_rm` steps (i.e., all steps here)
#> Recipe
#>
#> Inputs:
#>
#>       role #variables
#>    outcome          1
#>  predictor          4
#>
#> Operations:
If the recipe has been estimated (or partially estimated), this wouldn't be feasible since the recipe would have some critical data that can't be rolled back. We are verifying this.
If the recipe has not been estimated, this is pretty easy to do (just by subsetting the recipe's `steps` list). In this case, we could offer an easy API that takes number and/or id vectors as inputs and checks whether the recipe has been trained.
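For illustration, here is a minimal sketch of what such a helper might look like. It is not the function proposed in this issue and not part of recipes: the name `ignore_step_sketch` is hypothetical, and it assumes each step object carries `id` and `trained` fields. It handles only numeric positions and id strings, not matching by step type as in `ignore_step("rm")` above.

```r
library(recipes)

# Hypothetical helper (not part of recipes): drop steps from an
# untrained recipe by position or by id.
ignore_step_sketch <- function(rec, which) {
  stopifnot(inherits(rec, "recipe"))
  steps <- rec$steps
  if (length(steps) == 0) {
    return(rec)  # nothing to remove
  }

  # Refuse to touch a recipe that has been (even partially) estimated,
  # since the estimated quantities cannot be rolled back.
  # Assumption: each step stores a logical `trained` field.
  trained <- vapply(steps, function(s) isTRUE(s$trained), logical(1))
  if (any(trained)) {
    stop("Steps can only be removed from an untrained recipe.", call. = FALSE)
  }

  if (is.numeric(which)) {
    # Match by position in the steps list
    drop <- seq_along(steps) %in% which
  } else {
    # Match by step id. Assumption: each step stores a character `id` field.
    ids <- vapply(steps, function(s) as.character(s$id), character(1))
    drop <- ids %in% which
  }

  rec$steps <- steps[!drop]
  rec
}

# Usage, mirroring the recipe from the reproducible example
rec <- recipe(Species ~ ., data = iris)
rec <- step_rm(rec, Petal.Width, id = "rm_UDLut")
rec <- step_rm(rec, starts_with("Sepal"), id = "custom_id")

by_id  <- ignore_step_sketch(rec, "custom_id")  # drop by id
by_pos <- ignore_step_sketch(rec, 1)            # drop the first step
```

Calling the sketch on a prepped recipe, e.g. `ignore_step_sketch(prep(rec), 1)`, raises an error rather than returning a recipe in an inconsistent state.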
Yeah, the function I have now only works on unprepped recipes, as you describe. I can imagine that removing steps from estimated recipes would be much more complicated.