Function that removes step(s) from an existing recipe #887

Open
walrossker opened this issue Jan 27, 2022 · 2 comments · May be fixed by #1324
Labels
feature a feature request or enhancement

Comments

@walrossker

Feature

When testing multiple model types (e.g., in an ensemble workflow), one could create a baseline recipe that works for most of the models. If one model has slightly different pre-processing requirements (e.g., you'd prefer to drop step_dummy for a rand_forest model, or drop step_rm so that an elastic net sees more predictors than a knn model), a single call to this function could remove those steps. That avoids copy/pasting the original recipe definition and deleting certain steps by hand, which helps keep recipe definitions DRY.

Reproducible Example

FWIW, I chose the verb ignore_ to avoid confusion with 'remove' from step_rm.

Happy to submit a pull request if this seems valuable.

suppressPackageStartupMessages(devtools::load_all("."))
#> Loading recipes

rec <- recipe(Species ~ ., data = iris) %>%
  step_rm(Petal.Width, id = "rm_UDLut") %>%
  step_rm(starts_with("Sepal"), id = "custom_id")

rec %>% ignore_step("custom_id") # remove by custom id value
#> Recipe
#> 
#> Inputs:
#> 
#>       role #variables
#>    outcome          1
#>  predictor          4
#> 
#> Operations:
#> 
#> Variables removed Petal.Width
rec %>% ignore_step(1) # remove the first step
#> Recipe
#> 
#> Inputs:
#> 
#>       role #variables
#>    outcome          1
#>  predictor          4
#> 
#> Operations:
#> 
#> Variables removed starts_with("Sepal")
rec %>% ignore_step("rm") # remove all `step_rm` steps  (i.e., all steps here)
#> Recipe
#> 
#> Inputs:
#> 
#>       role #variables
#>    outcome          1
#>  predictor          4
#> 
#> Operations:
@topepo
Member

topepo commented Feb 3, 2022

We are looking into this a bit.

If the recipe has been estimated (or partially estimated), this wouldn't be feasible since the recipe would have some critical data that can't be rolled back. We are verifying this.

If the recipe has not been trained, this is pretty easy to do (just by subsetting the recipe's steps list). In this case, we could offer an easy API that takes number and/or id vectors as inputs and checks whether the recipe has been trained.
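
For illustration, the subsetting approach for an untrained recipe might look roughly like the sketch below. This is not the package's implementation: `drop_steps()` is a hypothetical name, and the sketch assumes each step object stores its `id` and a `trained` flag as list elements.

```r
library(recipes)

# Hypothetical helper (not part of recipes): drop steps from an untrained
# recipe, selecting them either by position (numeric) or by id (character).
drop_steps <- function(rec, which) {
  steps <- rec$steps

  # Assumption: every step carries a logical `trained` element set by prep().
  trained <- vapply(steps, function(s) isTRUE(s$trained), logical(1))
  if (any(trained)) {
    stop("Steps cannot be removed from a trained recipe.", call. = FALSE)
  }

  if (is.numeric(which)) {
    drop <- seq_along(steps) %in% which
  } else {
    # Assumption: every step carries a character `id` element.
    ids <- vapply(steps, function(s) s$id, character(1))
    drop <- ids %in% which
  }

  rec$steps <- steps[!drop]
  rec
}

rec <- recipe(Species ~ ., data = iris) %>%
  step_rm(Petal.Width, id = "rm_first") %>%
  step_rm(starts_with("Sepal"), id = "custom_id")

by_id  <- drop_steps(rec, "custom_id")  # drop by id
by_pos <- drop_steps(rec, 1)            # drop the first step
```

Calling the helper on the result of `prep(rec)` should hit the `stop()` branch, which matches the restriction to untrained recipes discussed above.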

@walrossker
Author

Yeah, the function I have now only works on unprepped recipes, as you describe. I can imagine that removing steps from estimated recipes would be much more complicated.

@juliasilge juliasilge added the feature a feature request or enhancement label Feb 4, 2022
@EmilHvitfeldt EmilHvitfeldt self-assigned this May 26, 2024
@EmilHvitfeldt EmilHvitfeldt linked a pull request May 29, 2024 that will close this issue
@EmilHvitfeldt EmilHvitfeldt removed their assignment Jun 6, 2024
4 participants