Commit 5ba9395

Update 03-preprocessing.md
1 parent 65a0296 commit 5ba9395

1 file changed: +2 -6 lines changed

episodes/03-preprocessing.md

Lines changed: 2 additions & 6 deletions
@@ -377,16 +377,13 @@ Now that we've built a tokenizer we're happy with, lets use it to create lemmati
 
 That is, we want to turn this:
 
-```txt
-Emma Woodhouse, handsome, clever, and rich, with a comfortable home
+"Emma Woodhouse, handsome, clever, and rich, with a comfortable home
 and happy disposition, seemed to unite some of the best blessings
 of existence; and had lived nearly twenty-one years in the world
-with very little to distress or vex her.
-```
+with very little to distress or vex her."
 
 into this:
 
-```txt
 handsome
 clever
 rich
@@ -407,7 +404,6 @@ very
 little
 distress
 vex
-```
 
 To help make this *relatively* quick for all the text in all our books, we'll use a helper function we prepared for learners to use our tokenizer, do the casing and lemmatization we discussed earlier, and write the results to a file:
 
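The last context line of the diff describes a helper that tokenizes, normalizes case, lemmatizes, and writes the results to a file, but the helper itself is not part of this commit. As a rough illustration only, the pipeline might be sketched as below. Every name here (`tokenize`, `toy_lemmatize`, `preprocess_to_file`) is hypothetical, the regex tokenizer and the suffix-stripping "lemmatizer" are simplified stand-ins for whatever the lesson actually uses, and the filtering of names and stop words implied by the expected output in the diff is not reproduced.

```python
import re
from pathlib import Path

# Hypothetical stand-in for the lesson's tokenizer: runs of letters,
# optionally joined by hyphens (so "twenty-one" stays one token).
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def tokenize(text):
    """Split text into word tokens, dropping punctuation."""
    return TOKEN_PATTERN.findall(text)


def toy_lemmatize(token):
    """Very crude plural stripping; an assumption, not a real lemmatizer."""
    if len(token) > 4 and token.endswith("ies"):
        return token[:-3] + "y"
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess_to_file(text, out_path):
    """Tokenize, lowercase, lemmatize, and write one lemma per line."""
    lemmas = [toy_lemmatize(tok.lower()) for tok in tokenize(text)]
    Path(out_path).write_text("\n".join(lemmas) + "\n", encoding="utf-8")
    return lemmas


if __name__ == "__main__":
    sample = "with very little to distress or vex her."
    print(preprocess_to_file(sample, "lemmas.txt"))
```

Writing one lemma per line mirrors the shape of the expected output shown in the diff; a real lemmatizer would replace `toy_lemmatize` without changing the rest of the flow.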