[Question] How can I make sure AutoSklearn is always using StandardScaler for feature preprocessing? #1548

@LeSasse

Hi!

First of all, thanks for this nice tool for the community. It is very useful in finding good models quickly without too much effort.

Short Question Description

My question is this: I would like to make sure that auto-sklearn only evaluates pipelines that scale features using the StandardScaler from sklearn. It is not clear to me how this can be done. I have tried different configurations of the "include" and "exclude" arguments, but all of my inputs seem to be invalid.

  • How did this question come about?

I am just trying to make sure that autosklearn is always using the StandardScaler on the input features.

  • Would a small code snippet help?

Yes, very much.

  • What have you already looked at?

I have been through the API documentation, the examples, and the code on GitHub. I am not sure whether what I want is possible anymore. One thing in particular that I don't understand about the API is the distinction between the "data_preprocessor" and the "feature_preprocessor".
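
To make the goal concrete, here is a minimal sketch in plain scikit-learn (not auto-sklearn) of the behaviour wanted from every candidate pipeline: a StandardScaler is fitted first and the estimator only ever sees the scaled features. The synthetic dataset and the LogisticRegression estimator are placeholders chosen for illustration and are not part of the original setup. The sketch does not show how to configure auto-sklearn itself, which is exactly the open question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data (assumption): any numeric feature matrix would do.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Desired structure: scaling always comes first, then the model.
# LogisticRegression is an arbitrary stand-in for whichever estimator
# the AutoML search would pick.
pipeline = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)
pipeline.fit(X, y)

# Inspect what the estimator actually received: each feature should have
# (approximately) zero mean and unit variance on the training data.
X_scaled = pipeline.named_steps["scaler"].transform(X)
print("per-feature mean:", np.round(X_scaled.mean(axis=0), 6))
print("per-feature std: ", np.round(X_scaled.std(axis=0), 6))
```

What I am looking for is a way to constrain auto-sklearn so that every pipeline it evaluates has this "scaler first" structure.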

System Details (if relevant)

  • Which version of auto-sklearn are you using?
    auto-sklearn==0.14.7

  • Are you running this on Linux / Mac / ... ?
    Debian 10.11
