
Commit 1b87eb3

fix multilabel vignette
1 parent 04a3b37 commit 1b87eb3

1 file changed: +21 -21 lines changed


vignettes/tutorial/multilabel.Rmd

@@ -19,16 +19,16 @@ set.seed(123)
 
 Multilabel classification is a classification problem where multiple target labels can be assigned to each observation instead of only one like in multiclass classification.
 
-Two different approaches exist for multilabel classification.
-*Problem transformation methods* try to transform the multilabel classification into binary or multiclass classification problems.
+Two different approaches exist for multilabel classification.
+*Problem transformation methods* try to transform the multilabel classification into binary or multiclass classification problems.
 *Algorithm adaptation methods* adapt multiclass algorithms so they can be applied directly to the problem.
 
 # Creating a task
 
 The first thing you have to do for multilabel classification in `mlr` is to
-get your data in the right format.
-You need a `data.frame` which consists of the features and a logical vector for each label which indicates if the label is present in the observation or not. After that you can create a `MultilabelTask` (`Task()`) like a normal `ClassifTask` (`Task()`).
-Instead of one target name you have to specify a vector of targets which correspond to the names of logical variables in the `data.frame`.
+get your data in the right format.
+You need a `data.frame` which consists of the features and a logical vector for each label which indicates if the label is present in the observation or not. After that you can create a `MultilabelTask` (`Task()`) like a normal `ClassifTask` (`Task()`).
+Instead of one target name you have to specify a vector of targets which correspond to the names of logical variables in the `data.frame`.
 In the following example we get the yeast data frame from the already existing `yeast.task()`, extract the 14 label names and create the task again.
 
 ```{r}
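For context, the task-creation workflow this hunk leads into can be sketched as follows (a minimal R sketch, assuming the `mlr` package and its bundled `yeast.task` are available):

```r
library(mlr)

# Pull the yeast data out of the predefined task; the first 14
# columns are logical label indicators.
yeast = getTaskData(yeast.task)
labels = colnames(yeast)[1:14]

# Recreate the multilabel task, passing the vector of label names
# as the target argument.
yeast.task = makeMultilabelTask(id = "multi", data = yeast, target = labels)
yeast.task
```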
@@ -48,18 +48,18 @@ Multilabel classification in `mlr` can currently be done in two ways:
 
 ## Algorithm adaptation methods
 
-Currently the available algorithm adaptation methods in **R** are the multivariate random forest in the [%randomForestSRC] package and the random ferns multilabel algorithm in the [%rFerns] package.
+Currently only the random ferns multilabel algorithm in the [%rFerns] package is available for multilabel classification tasks.
+
 You can create the learner for these algorithms like in multiclass classification problems.
 
 ```{r}
-lrn.rfsrc = makeLearner("multilabel.randomForestSRC")
 lrn.rFerns = makeLearner("multilabel.rFerns")
 lrn.rFerns
 ```
 
 ## Problem transformation methods
 
-For generating a wrapped multilabel learner first create a binary (or multiclass) classification learner with `makeLearner()`.
+For generating a wrapped multilabel learner first create a binary (or multiclass) classification learner with `makeLearner()`.
 Afterwards apply a function like `makeMultilabelBinaryRelevanceWrapper()`, `makeMultilabelClassifierChainsWrapper()`, `makeMultilabelNestedStackingWrapper()`, `makeMultilabelDBRWrapper()` or `makeMultilabelStackingWrapper()` on the learner to convert it to a learner that uses the respective problem transformation method.
 
 You can also generate a binary relevance learner directly, as you can see in the example.
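The two construction routes described here, algorithm adaptation and problem transformation, can be sketched as (assuming `mlr` is loaded):

```r
library(mlr)

# Algorithm adaptation: random ferns handles multilabel tasks directly.
lrn.rFerns = makeLearner("multilabel.rFerns")

# Problem transformation: wrap an ordinary binary learner, here rpart
# with probability predictions, into a binary relevance learner.
lrn.br = makeLearner("classif.rpart", predict.type = "prob")
lrn.br = makeMultilabelBinaryRelevanceWrapper(lrn.br)

# Or pass the learner name directly to the wrapper.
lrn.br2 = makeMultilabelBinaryRelevanceWrapper("classif.rpart")
```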
@@ -73,20 +73,20 @@ lrn.br2 = makeMultilabelBinaryRelevanceWrapper("classif.rpart")
 lrn.br2
 ```
 
-The different methods are shortly described in the following.
+The different methods are shortly described in the following.
 
 ### Binary relevance
 
 This problem transformation method converts the multilabel problem to binary
-classification problems for each label and applies a simple binary classificator on these.
+classification problems for each label and applies a simple binary classificator on these.
 In `mlr` this can be done by converting your binary learner to a wrapped binary relevance multilabel learner.
 
 ### Classifier chains
 
-Trains consecutively the labels with the input data.
+Trains consecutively the labels with the input data.
 The input data in each step is augmented by the already trained labels (with the real observed values).
-Therefore an order of the labels has to be specified.
-At prediction time the labels are predicted in the same order as while training.
+Therefore an order of the labels has to be specified.
+At prediction time the labels are predicted in the same order as while training.
 The required labels in the input data are given by the previous done prediction of the respective label.
 
 ### Nested stacking
@@ -95,7 +95,7 @@ Same as classifier chains, but the labels in the input data are not the real one
 
 ### Dependent binary relevance
 
-Each label is trained with the real observed values of all other labels.
+Each label is trained with the real observed values of all other labels.
 In prediction phase for a label the other necessary labels are obtained in a previous step by a base learner like the binary relevance method.
 
 ### Stacking
@@ -104,7 +104,7 @@ Same as the dependent binary relevance method, but in the training phase the lab
 
 # Train
 
-You can `train()` a model as usual with a multilabel learner and a multilabel task as input.
+You can `train()` a model as usual with a multilabel learner and a multilabel task as input.
 You can also pass ``subset`` and ``weights`` arguments if the
 learner supports this.
 
@@ -113,13 +113,13 @@ mod = train(lrn.br, yeast.task)
 mod = train(lrn.br, yeast.task, subset = 1:1500, weights = rep(1 / 1500, 1500))
 mod
 
-mod2 = train(lrn.rfsrc, yeast.task, subset = 1:100)
+mod2 = train(lrn.rFerns, yeast.task, subset = 1:100)
 mod2
 ```
 
 # Predict
 
-Prediction can be done as usual in `mlr` with `predict` (`predict.WrappedModel()`) and by passing a trained model and either the task to the ``task`` argument or some new data to the ``newdata`` argument.
+Prediction can be done as usual in `mlr` with `predict` (`predict.WrappedModel()`) and by passing a trained model and either the task to the ``task`` argument or some new data to the ``newdata`` argument.
 As always you can specify a ``subset`` of the data which should be predicted.
 
 ```{r}
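The train-then-predict pattern this hunk touches can be sketched briefly (assumes a wrapped learner `lrn.br` and the `yeast.task` from the earlier chunks):

```r
# Train on a subset of the yeast task ...
mod = train(lrn.br, yeast.task, subset = 1:1500)

# ... and predict the held-out rows via the subset argument.
pred = predict(mod, task = yeast.task, subset = 1501:1600)
head(as.data.frame(pred))
```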
@@ -166,9 +166,9 @@ listMeasures("multilabel")
 
 # Resampling
 
-For evaluating the overall performance of the learning algorithm you can do some [resampling](resample.html){target="_blank"}.
-As usual you have to define a resampling strategy, either via `makeResampleDesc()` or `makeResampleInstance()`.
-After that you can run the `resample()` function.
+For evaluating the overall performance of the learning algorithm you can do some [resampling](resample.html){target="_blank"}.
+As usual you have to define a resampling strategy, either via `makeResampleDesc()` or `makeResampleInstance()`.
+After that you can run the `resample()` function.
 Below the default measure Hamming loss is calculated.
 
 ```{r echo = FALSE, results='hide'}
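The resampling steps named here follow the usual mlr pattern; a short sketch, assuming `lrn.br` and `yeast.task` from above:

```r
# Define a 3-fold cross-validation strategy ...
rdesc = makeResampleDesc("CV", iters = 3)

# ... and run it; Hamming loss is the default multilabel measure.
r = resample(lrn.br, yeast.task, resampling = rdesc, show.info = FALSE)
r
```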
@@ -204,7 +204,7 @@ r
 # Binary performance
 
 If you want to calculate a binary performance measure like, e.g., the [accuracy](measures.html){target="_blank"}, the [mmce](measures.html){target="_blank"} or the [auc](measures.html){target="_blank"} for each label, you can use function `getMultilabelBinaryPerformances()`.
-You can apply this function to any multilabel prediction, e.g., also on the resample multilabel prediction.
+You can apply this function to any multilabel prediction, e.g., also on the resample multilabel prediction.
 For calculating the [auc](measures.html){target="_blank"} you need predicted probabilities.
 
 ```{r}
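A sketch of per-label evaluation with `getMultilabelBinaryPerformances()`, assuming a prediction `pred` produced by a learner built with `predict.type = "prob"` (auc needs the probabilities):

```r
# Accuracy, mean misclassification error and auc, computed separately
# for each of the 14 labels of the yeast task.
getMultilabelBinaryPerformances(pred, measures = list(acc, mmce, auc))
```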
