episodes/05-dplyr.Rmd
27 lines changed: 27 additions & 0 deletions
@@ -699,6 +699,33 @@ left %>%
```
This result makes it easier to see the accumulation of more SNPs at later generations, without us having to know the sample IDs.
+::::::::::: challenge
+
+## What about right joins?
+
+1. How many rows and columns would you expect from the following right join?
+
+`right_join(variants, metadata_sub, by = join_by(sample_id == run))`
+
+2. How many rows and columns would you expect from the following right join?
+
+`right_join(metadata_sub, variants, by = join_by(run == sample_id))`
+
+Think carefully about the data in question and about which data frame is on the left and which is on the right.
+
+:::::::: solution
+
+**Part 1** There will be 860 rows and 31 variables, just like the full join.
+Every `sample_id` in the `variants` data frame has a match in `metadata_sub`, so all of those rows are kept. The right join also adds the `run` values that appear only in `metadata_sub`, with `NA` in the columns that come from `variants`, because no rows in `variants` match them.
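A minimal sketch of the Part 1 reasoning on toy data. The tibbles `toy_variants` and `toy_metadata` and all of their values are hypothetical stand-ins, not the lesson's `variants` and `metadata_sub`; the sketch assumes dplyr 1.1.0 or later, which provides `join_by()`.

```r
library(dplyr)

# Hypothetical toy data (illustrative values only, not the lesson's data).
# Every sample_id here has a matching run; run "C" has no variants.
toy_variants <- tibble(
  sample_id = c("A", "A", "B"),
  pos       = c(10, 20, 30)
)
toy_metadata <- tibble(
  run        = c("A", "B", "C"),
  generation = c(5000, 15000, 50000)
)

# A right join keeps every row of the right-hand table (toy_metadata),
# so the unmatched run "C" is kept with NA in the toy_variants columns.
right_res <- right_join(toy_variants, toy_metadata,
                        by = join_by(sample_id == run))
full_res  <- full_join(toy_variants, toy_metadata,
                       by = join_by(sample_id == run))

nrow(right_res)  # 4
nrow(full_res)   # 4
```

In this toy case the right join and the full join agree only because every row of `toy_variants` has a match, which mirrors the situation described in Part 1.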
+
+**Part 2** There will be 801 rows and 31 variables, just like the inner and left joins.
+This join always returns the same rows as the corresponding left join, because it is that left join with the two data frames swapped.
+It matches the inner join only when every key value in the right data frame also appears in the left data frame; otherwise the inner join drops the right-hand rows that have no match in the left data frame.
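A small sketch of the Part 2 reasoning, again on hypothetical toy data (`vars_toy` and `meta_toy` are invented for illustration and assume dplyr 1.1.0 or later). Unlike the lesson's data, one sample here (`"D"`) has no metadata, so the inner join visibly differs from the mirrored right join.

```r
library(dplyr)

# Hypothetical toy data: sample "D" has no matching run,
# and run "C" has no matching variants.
vars_toy <- tibble(
  sample_id = c("A", "A", "B", "D"),
  pos       = c(10, 20, 30, 40)
)
meta_toy <- tibble(
  run        = c("A", "B", "C"),
  generation = c(5000, 15000, 50000)
)

# Right join with vars_toy on the right keeps every row of vars_toy.
mirrored  <- right_join(meta_toy, vars_toy, by = join_by(run == sample_id))
# The corresponding left join keeps the same rows.
left_res  <- left_join(vars_toy, meta_toy, by = join_by(sample_id == run))
# The inner join drops sample "D", which has no match in meta_toy.
inner_res <- inner_join(vars_toy, meta_toy, by = join_by(sample_id == run))

nrow(mirrored)   # 4
nrow(left_res)   # 4
nrow(inner_res)  # 3
```

The mirrored right join and the left join hold the same rows, though the column order and the name of the key column (`run` versus `sample_id`) follow whichever data frame is given first.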