
Commit c0ee4b9

add lesson 13: langchain caching reference scripts
1 parent 03f9f65 commit c0ee4b9

File tree

2 files changed: +111 -0 lines changed


13_caching_memory.py

+65
@@ -0,0 +1,65 @@
"""
LangChain has an incredible caching system layered on top of the Large Language Models
(LLMs) that you use it with. The resources on this, though, are pretty scarce, and as of this
recording many of them are outdated. If you search the official documentation, you will see it
still uses the old .predict(), and if you try to run that you'll hit issues like this
https://github.com/hwchase17/langchain/issues/6740 and it simply doesn't run. I did a quick
Google search, checked the first two results, and they were just lazy copy-and-pastes of the
docs, which means they are also wrong and plainly don't work.

But caching is so, so useful. Especially if you're building LLM applications for production
where you're feeding a large context to the model, like using GPT to query your own personal
knowledge base (I have a video on exactly that, where I load in my bullet journals in markdown
format and build a query engine on top of them with GPT), or if you're trying to learn a
foreign language by getting GPT to tutor you on a book (I also have a video on that). In both
cases you're feeding a large context to the model, and you'll incur quite a bit of cost and
your queries will be slow if you don't cache.

So let me show you how to do caching with LangChain; it's surprisingly easy. All of this code
will be on my GitHub, along with the rest of this LLM series if you've been following along.
We're on video number 13 now, so there's a lot we've covered, and caching is a great addition
to your LLM development toolkit.

Let's open up a file and start with LangChain's implementation of an in-memory cache. Name the
demo file whatever you like. Before looking at the code, if you had asked me to guess, I would
have said it uses Python's LRU cache, which is part of the standard library. I love caching,
and I have a video on LRU cache if you want to introduce built-in caching to your Python
programs. But I took a look at the code and realized I was wrong: it's far simpler than that,
it's just a dictionary. https://github.com/hwchase17/langchain/blob/master/langchain/cache.py#L102

And as a quick primer, a cache is just a dictionary that stores the result of a function call,
so that repeated calls with the same arguments don't have to recompute the result. If you have
a function that takes a long time to run, you can cache the result of that call, and the next
time you use the same input you get the same result via a quick dictionary lookup instead of
burning your OpenAI credits, your computation power, or whatever resource you're using for the
computation. It saves you lots of time and money, and if you're not using a cache for these
repeated queries, you're leaving money on the table.
"""

import time
from dotenv import load_dotenv
import langchain
from langchain.llms import OpenAI
from langchain.callbacks import get_openai_callback
from langchain.cache import InMemoryCache

load_dotenv()

# to make caching obvious, we use a slow model
llm = OpenAI(model_name="text-davinci-002")

# every LLM call made through langchain now checks this cache first
langchain.llm_cache = InMemoryCache()

# first call: nothing is cached yet, so this is a real API request
with get_openai_callback() as cb:
    start = time.time()
    result = llm("What doesn't fall far from the tree?")
    print(result)
    end = time.time()
    print("--- cb")
    print(str(cb) + f" ({end - start:.2f} seconds)")

# same prompt again: both calls should be served from the in-memory cache,
# so no new tokens are counted and the calls return almost instantly
with get_openai_callback() as cb2:
    start = time.time()
    result2 = llm("What doesn't fall far from the tree?")
    result3 = llm("What doesn't fall far from the tree?")
    end = time.time()
    print(result2)
    print(result3)
    print("--- cb2")
    print(str(cb2) + f" ({end - start:.2f} seconds)")

13_caching_sqlite.py

+46
@@ -0,0 +1,46 @@
import time
from dotenv import load_dotenv
import langchain
from langchain.llms import OpenAI
from langchain.callbacks import get_openai_callback

from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document
from langchain.cache import SQLiteCache
from langchain.chains.summarize import load_summarize_chain

# the cache persists to disk; add this file to .gitignore if you don't want to commit it
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")

load_dotenv()

text_splitter = CharacterTextSplitter()
llm = OpenAI(model_name="text-davinci-002")
# a second LLM with caching disabled, so the reduce step is always generated fresh
no_cache_llm = OpenAI(model_name="text-davinci-002", cache=False)

with open("news/summary.txt") as f:
    news = f.read()

texts = text_splitter.split_text(news)
print(texts)

docs = [Document(page_content=t) for t in texts[:3]]

# map_reduce: the per-document (map) calls use the cached llm,
# while the combining (reduce) call uses no_cache_llm
chain = load_summarize_chain(llm, chain_type="map_reduce", reduce_llm=no_cache_llm)

# first run: every call hits the API and the results are written to .langchain.db
with get_openai_callback() as cb:
    start = time.time()
    result = chain.run(docs)
    end = time.time()
    print("--- result1")
    print(result)
    print(str(cb) + f" ({end - start:.2f} seconds)")

# second run: the map calls come from the SQLite cache,
# only the uncached reduce step hits the API again
with get_openai_callback() as cb2:
    start = time.time()
    result = chain.run(docs)
    end = time.time()
    print("--- result2")
    print(result)
    print(str(cb2) + f" ({end - start:.2f} seconds)")
