Multiple outputs for regression task #1296
Comments
Thanks @tdhock. This would probably be a relatively large change, so this needs to be discussed in depth. |
I can see two different ways forward:
I do have a current project for which this would be useful. |
Currently I get an error when instantiating the task:
N_row <- 100
D_in <- 10
D_out <- 2
set.seed(1)
df <- data.frame(
  feature=matrix(rnorm(N_row*D_in), N_row, D_in),
  target=matrix(rnorm(N_row*D_out), N_row, D_out))
df[1,]
reg_task <- mlr3::TaskRegr$new(
  "example", df, target=paste0("target.", 1:D_out))
I got:
> df[1,]
feature.1 feature.2 feature.3 feature.4 feature.5 feature.6 feature.7
1 -0.6264538 -0.6203667 0.4094018 0.8936737 1.074441 0.07730312 -0.341067
feature.8 feature.9 feature.10 target.1 target.2
1 -0.7075682 -1.086909 -1.541403 1.134965 0.2418959
> reg_task <- mlr3::TaskRegr$new(
+ "example", df, target=paste0("target.", 1:D_out))
Error in .__TaskRegr__initialize(self = self, private = private, super = super, :
Assertion on 'target' failed: Must have length 1.
> sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Paris
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] digest_0.6.37 backports_1.5.0 R6_2.6.1
[4] codetools_0.2-20 lgr_0.4.4 parallel_4.5.0
[7] palmerpenguins_0.1.1 mlr3misc_0.16.0 parallelly_1.43.0
[10] future_1.34.0 mlr3_0.23.0 data.table_1.17.0
[13] compiler_4.5.0 paradox_1.0.1 globals_0.16.3
[16] tools_4.5.0 checkmate_2.3.2 listenv_0.9.1
[19] crayon_1.5.3 uuid_1.2-1 |
For comparison, scikit-learn supports some learners that are natively multi-output, such as MultiTaskLasso (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.MultiTaskLasso.html), and provides an adaptor class, MultiOutputRegressor, that converts a single-output regression learner into a multi-output one (https://scikit-learn.org/stable/modules/generated/sklearn.multioutput.MultiOutputRegressor.html). |
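To make the adaptor idea concrete, here is a minimal base R sketch of wrapping a single-output fitting function so that it handles several target columns. The names `fit_multi_output` and `multi_output_fit` are hypothetical and chosen only for illustration; they are not part of mlr3 or scikit-learn, and the sketch simply fits one independent model per target.

```r
# Hypothetical adaptor: fit one single-output model per target column.
# fit_one is assumed to accept (formula, data = ...) like stats::lm does.
fit_multi_output <- function(data, target_names, fit_one = stats::lm) {
  feature_names <- setdiff(names(data), target_names)
  models <- lapply(target_names, function(target) {
    # Formula of the form: target ~ feature.1 + feature.2 + ...
    form <- stats::reformulate(feature_names, response = target)
    fit_one(form, data = data)
  })
  names(models) <- target_names
  structure(list(models = models), class = "multi_output_fit")
}

# Predictions are returned as a matrix with one column per target.
predict.multi_output_fit <- function(object, newdata, ...) {
  sapply(object$models, stats::predict, newdata = newdata)
}

# Toy data with 3 features and 2 targets (column names feature.1, ..., target.2).
set.seed(1)
toy <- data.frame(
  feature = matrix(rnorm(100 * 3), 100, 3),
  target = matrix(rnorm(100 * 2), 100, 2))
fit <- fit_multi_output(toy, c("target.1", "target.2"))
pred <- predict(fit, newdata = toy)
dim(pred)
```

Because each target gets its own independent model, this kind of adaptor cannot share information across outputs, which is what natively multi-output learners are able to do.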
So this feature was already on the roadmap at some point; we just never got around to implementing it.
Regarding the conversion of single-output regression to multi-output: What exactly are you trying to do in your current project? Maybe we can find an easy workaround for now. |
In the current project we can work around this by creating one TaskRegr for each output and fitting one single-task model per output (with a single-task measure such as MSE for each). But this is sub-optimal for two reasons:
|
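The one-task-per-output workaround described above could be sketched roughly as follows, reusing the synthetic data from the error report. The choice of `regr.featureless` as the learner and the in-sample scoring are assumptions made only to keep the example small; any single-target regression learner and resampling scheme could be substituted.

```r
library(mlr3)

# Same synthetic data as in the error report.
N_row <- 100
D_in <- 10
D_out <- 2
set.seed(1)
df <- data.frame(
  feature = matrix(rnorm(N_row * D_in), N_row, D_in),
  target = matrix(rnorm(N_row * D_out), N_row, D_out))
target_names <- paste0("target.", 1:D_out)
feature_names <- setdiff(names(df), target_names)

# One single-target task per output column. The other target columns are
# dropped from the backend so they are not used as features.
task_list <- lapply(target_names, function(target) {
  TaskRegr$new(
    id = paste0("example_", target),
    backend = df[, c(feature_names, target)],
    target = target)
})
names(task_list) <- target_names

# One single-target learner and one MSE value per output.
# Assumption: in-sample scoring, for brevity only.
mse_per_target <- sapply(task_list, function(task) {
  learner <- lrn("regr.featureless")
  learner$train(task)
  prediction <- learner$predict(task)
  prediction$score(msr("regr.mse"))[[1]]
})
mse_per_target
```

Each task here has a target of length 1, so the assertion that fails in the report is not triggered, at the cost of training and scoring every output separately.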
Following up from mlr-org/mlr3torch#385 (review), I would like to request support for regression tasks with multiple targets / outputs (several columns to predict, not just one).
@sebffischer