
Commit 4663a10

add errata
1 parent 6a20626 commit 4663a10

2 files changed (+13, -1 lines)


README.md (+1, -1)

@@ -1,4 +1,4 @@
-# *Machine Learning and AI Beyond the Basics* Book
+# *Machine Learning Q and AI Beyond the Basics* Book

errata/README.md (new file, +12 lines)

# Errata

#### Chapter 8

The following sentence in Chapter 8

> Transformers are easy to parallelize because they take a fixed-length sequence of word or image tokens as input.

is misleading because we only work with fixed-size sequences during pretraining, finetuning, and batched inference, that is, when we collect multiple sequences in a batch. A better explanation could be the following:

> Like other deep learning architectures, transformers facilitate parallelization in batch training by handling sequences of word or image tokens. Although they can process variable-length sequences, in practice, sequences are often padded or truncated to fixed lengths for efficient parallel computation across multiple sequences.
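
To illustrate the padding and truncation described in the corrected explanation, here is a minimal sketch in PyTorch; it is not part of the book or the errata file, and the token IDs and the fixed length of 6 are made-up values for demonstration:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three tokenized sequences of different lengths (hypothetical token IDs)
sequences = [
    torch.tensor([101, 7592, 2088, 102]),
    torch.tensor([101, 2023, 2003, 1037, 2936, 6251, 102]),
    torch.tensor([101, 2460, 102]),
]

max_len = 6  # fixed length chosen for this batch

# Pad shorter sequences with 0 (a common padding token ID) ...
padded = pad_sequence(sequences, batch_first=True, padding_value=0)

# ... and truncate anything longer than the fixed length
batch = padded[:, :max_len]

# The attention mask marks real tokens (1) vs. padding (0) so the model
# can ignore padded positions during parallel computation over the batch
attention_mask = (batch != 0).long()

print(batch.shape)      # torch.Size([3, 6])
print(attention_mask)
```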
