Commit 6508c13

committed
minor updates (clear outputs since we're using prefilled notebook versions of episodes)
1 parent b9b4e05 commit 6508c13

1 file changed

+0
-46
lines changed

episodes/04-vectorSpace.md

Lines changed: 0 additions & 46 deletions
@@ -69,13 +69,6 @@ corpus = np.array([[1,10],[8,8],[2,2],[2,2]])
6969
print(corpus)
7070
```
7171

72-
```txt
73-
[[ 1 10]
74-
[ 8 8]
75-
[ 2 2]
76-
[ 2 2]]
77-
```
78-
7972
### Graphing our model
8073

8174
We don't just have to think of our words as columns. We can also think of them as dimensions, and the values as coordinates for each document.
@@ -87,11 +80,6 @@ corpusT = np.transpose(corpus)
8780
print(corpusT)
8881
```
8982

90-
```txt
91-
[[ 1 8 2 2]
92-
[10 8 2 2]]
93-
```
94-
9583
```python
9684
X = corpusT[0]
9785
Y = corpusT[1]
@@ -154,10 +142,6 @@ origin = np.zeros([1,4])
154142
print(origin)
155143
```
156144

157-
```txt
158-
[[0. 0. 0. 0.]]
159-
```
160-
161145
```python
162146
# draw our vectors
163147
plt.quiver(origin, origin, X, Y, color=mycolors, angles='xy', scale_units='xy', scale=1)
@@ -166,8 +150,6 @@ plt.ylim(0, 12)
166150
plt.show()
167151
```
168152

169-
![](fig/02-plot-vectors.png){alt='png'}
170-
171153
Document B and document D are headed in exactly the same direction, which matches our intuition that both documents are in some way similar to each other, even though they differ in length.
172154

173155
#### Cosine Similarity
@@ -183,13 +165,6 @@ from sklearn.metrics.pairwise import cosine_similarity as cs
183165
cs(corpus, D)
184166
```
185167

186-
```txt
187-
array([[0.7739573],
188-
[1. ],
189-
[1. ],
190-
[1. ]])
191-
```
192-
193168
By this metric, documents B and C each score 1 against D, the maximum, while A scores about 0.77. Cosine similarity is used by many models as a measure of similarity between documents and words.
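
To make the metric concrete, here is a minimal sketch that computes the same quantity by hand with NumPy: the dot product of two vectors divided by the product of their lengths. The helper name `cosine_to` is hypothetical and not part of the episode; the corpus values are the ones defined earlier, and `D` is assumed to be the fourth row.

```python
import numpy as np

# Same toy corpus as above: rows are documents A-D, columns are word counts.
corpus = np.array([[1, 10], [8, 8], [2, 2], [2, 2]])
D = corpus[3]  # assumption: D is the last row, [2, 2]

def cosine_to(docs, query):
    """Cosine similarity between each row of docs and one query vector (hypothetical helper)."""
    dots = docs @ query  # dot product of every row with the query
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(query)  # product of vector lengths
    return dots / norms

sims = cosine_to(corpus, D)
print(np.round(sims, 4))
```

Because the lengths are divided out, a long document and a short one that use words in the same proportions score 1, which is why cosine similarity is insensitive to document length.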
194169

195170
### Generalizing over more dimensions
@@ -218,13 +193,6 @@ corpus = np.hstack((corpus, np.zeros((4,2))))
218193
print(corpus)
219194
```
220195

221-
```txt
222-
[[ 1. 10. 0. 0.]
223-
[ 8. 8. 0. 0.]
224-
[ 2. 2. 0. 0.]
225-
[ 2. 2. 0. 0.]]
226-
```
227-
228196
```python
229197
E = np.array([[0,2,1,1]])
230198
F = np.array([[2,2,1,1]])
@@ -234,27 +202,13 @@ corpus = np.vstack((corpus, E))
234202
print(corpus)
235203
```
236204

237-
```txt
238-
[[ 1. 10. 0. 0.]
239-
[ 8. 8. 0. 0.]
240-
[ 2. 2. 0. 0.]
241-
[ 2. 2. 0. 0.]
242-
[ 0. 2. 1. 1.]]
243-
```
244205

245206
Which document do you think is most similar to document F?
246207

247208
```python
248209
cs(corpus, F)
249210
```
250211

251-
```txt
252-
array([[0.69224845],
253-
[0.89442719],
254-
[0.89442719],
255-
[0.89442719],
256-
[0.77459667]])
257-
```
258212

259213
This new document seems most similar to documents B, C and D.
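
As a rough sketch of how that comparison could be automated, the snippet below ranks the five documents by their cosine similarity to F using plain NumPy. The `labels` list and the sorting step are assumptions added for illustration; the vectors are the ones built in this episode.

```python
import numpy as np

# Four-word corpus from this episode: documents A-E as rows.
corpus = np.array([
    [1, 10, 0, 0],
    [8, 8, 0, 0],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
    [0, 2, 1, 1],
], dtype=float)
F = np.array([2, 2, 1, 1], dtype=float)
labels = ["A", "B", "C", "D", "E"]  # hypothetical labels, one per row

# Cosine similarity of every row with F: dot product over product of lengths.
sims = (corpus @ F) / (np.linalg.norm(corpus, axis=1) * np.linalg.norm(F))

# Indices sorted from most to least similar.
order = np.argsort(-sims, kind="stable")
for i in order:
    print(labels[i], round(float(sims[i]), 4))
```

B, C and D tie because they all point in the same direction, so their relative order in the ranking is arbitrary.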
260214
