| 1 | +# Design Overview for `PGVectorStore` |
| 2 | + |
| 3 | +This document outlines the design choices behind the `PGVectorStore` integration for LangChain, focusing on how an async PostgreSQL driver can support both synchronous and asynchronous usage. |
| 4 | + |
| 5 | +## Motivation: Performance through Asynchronicity |
| 6 | + |
| 7 | +Database interactions are often I/O-bound, making asynchronous programming crucial for performance. |
| 8 | + |
| 9 | +- **Non-Blocking Operations:** Asynchronous code prevents the application from stalling while waiting for database responses, improving throughput and responsiveness. |
| 10 | +- **Asynchronous Foundation (`asyncio` and Drivers):** Built on Python's `asyncio` library, the integration works with asynchronous PostgreSQL drivers. Any compatible async driver can be used, but `asyncpg` is recommended for its high performance under concurrent load. See this [overview of its benefits](https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/) and these [performance benchmarks](https://fernandoarteaga.dev/blog/psycopg-vs-asyncpg/) for details. |
| 11 | + |
| 12 | +This native async foundation ensures the core database interactions are fast and scalable. |
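The non-blocking behavior described above can be illustrated with a minimal sketch. It does not touch a database: `fake_query` and `run_concurrently` are hypothetical names, and `asyncio.sleep` stands in for the time a real driver would spend waiting on PostgreSQL.

```python
import asyncio
import time


async def fake_query(query_id: int, delay: float = 0.1) -> int:
    """Hypothetical stand-in for a database round trip.

    asyncio.sleep simulates waiting on I/O; a real driver call would be
    awaited in the same way.
    """
    await asyncio.sleep(delay)
    return query_id


async def run_concurrently(n: int) -> tuple[list[int], float]:
    """Issue n simulated queries at once and report the wall-clock time."""
    start = time.perf_counter()
    # gather schedules all coroutines on the event loop, so their waits overlap.
    results = await asyncio.gather(*(fake_query(i) for i in range(n)))
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently(10))
    print(results, f"{elapsed:.2f}s")
```

Because the waits overlap, ten simulated queries of 0.1 s each should finish in roughly the time of one, rather than the full second a sequential loop would need.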
| 13 | + |
| 14 | +## The Two-Class Approach: Enabling a Mixed Interface |
| 15 | + |
| 16 | +To cater to different application architectures while maintaining performance, we provide two classes: |
| 17 | + |
| 18 | +1. **`AsyncPGVectorStore` (Core Asynchronous Implementation):** |
| 19 | + * This class contains the pure `async/await` logic for all database operations. |
| 20 | + * It's designed for **direct use within asynchronous applications**. Users working in an `asyncio` environment can `await` its methods for maximum efficiency and direct control within the event loop. |
| 21 | + * It represents the fundamental, non-blocking way of interacting with the database. |
| 22 | + |
| 23 | +2. **`PGVectorStore` (Mixed Sync/Async API):** |
| 24 | + * This class provides both asynchronous and synchronous APIs. |
| 25 | + * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncPGVectorStore`. |
| 26 | + * It **manages the execution of this underlying asynchronous logic**, handling the necessary `asyncio` event loop interactions (e.g., starting/running the coroutine) behind the scenes. |
| 27 | + * This allows users of synchronous codebases to leverage the performance benefits of the asynchronous core without needing to rewrite their application structure. |
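One way the two-class split could be structured is sketched below. This is an assumption for illustration, not the library's actual implementation: `AsyncStoreSketch` and `MixedStoreSketch` are hypothetical stand-ins for the two classes, the in-memory list replaces the database, and running the core on a dedicated background event loop is just one possible bridging strategy.

```python
import asyncio
import threading
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


class AsyncStoreSketch:
    """Hypothetical async core: every operation is a coroutine."""

    def __init__(self) -> None:
        self._rows: list[str] = []

    async def aadd_texts(self, texts: list[str]) -> int:
        await asyncio.sleep(0)  # placeholder for an awaited driver call
        self._rows.extend(texts)
        return len(self._rows)


class MixedStoreSketch:
    """Hypothetical wrapper exposing sync and async methods over the core."""

    def __init__(self) -> None:
        self._core = AsyncStoreSketch()
        # Assumption: a dedicated event loop on a daemon thread runs all
        # core coroutines, so callers never have to manage a loop themselves.
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def _run_sync(self, coro: Coroutine[Any, Any, T]) -> T:
        # Submit to the background loop and block this thread until it finishes.
        return asyncio.run_coroutine_threadsafe(coro, self._loop).result()

    async def _run_async(self, coro: Coroutine[Any, Any, T]) -> T:
        # Submit to the background loop, then await without blocking the
        # caller's own event loop.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return await asyncio.wrap_future(future)

    def add_texts(self, texts: list[str]) -> int:
        return self._run_sync(self._core.aadd_texts(texts))

    async def aadd_texts(self, texts: list[str]) -> int:
        return await self._run_async(self._core.aadd_texts(texts))

    def close(self) -> None:
        # Stop the background loop and release its resources.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()
```

In this sketch, synchronous code calls `store.add_texts([...])` directly, while code already inside an event loop uses `await store.aadd_texts([...])`; both paths end up in the same coroutine on the async core.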
| 28 | + |
| 29 | +## Benefits of this Dual Interface Design |
| 30 | + |
| 31 | +This two-class structure provides significant advantages: |
| 32 | + |
| 33 | +- **Interface Flexibility:** Developers can **choose the interface that best fits their needs**: |
| 34 | + * Use `PGVectorStore` for easy integration into existing synchronous applications. |
| 35 | + * Use `AsyncPGVectorStore` for optimal performance and integration within `asyncio`-based applications. |
| 36 | +- **Ease of Use:** `PGVectorStore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user. |
| 37 | +- **Robustness:** The clear separation helps prevent common errors associated with mixing synchronous and asynchronous code incorrectly, such as blocking the event loop from synchronous calls within an async context. |
| 38 | +- **Efficiency for Async Users:** `AsyncPGVectorStore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `PGVectorStore`. |
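The kind of mixing error mentioned under robustness can be reproduced with the standard library alone. The sketch below uses hypothetical helper names and a deliberately naive bridge that calls `asyncio.run` on every invocation; it works from plain synchronous code but fails as soon as it is called from inside a running event loop.

```python
import asyncio


async def fetch_value() -> int:
    """Hypothetical async operation standing in for a database call."""
    await asyncio.sleep(0)
    return 42


def naive_sync_wrapper() -> int:
    """Naive bridge: starts a fresh event loop for every call."""
    coro = fetch_value()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        # asyncio.run refuses to start while another loop is running in this
        # thread; close the unused coroutine to avoid a "never awaited" warning.
        coro.close()
        raise


async def caller_inside_event_loop() -> str:
    """Call the naive wrapper from async code and report what happens."""
    try:
        naive_sync_wrapper()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"
    return "no error"


if __name__ == "__main__":
    print(naive_sync_wrapper())                     # fine from synchronous code
    print(asyncio.run(caller_inside_event_loop()))  # reports the RuntimeError
```

Keeping the async core in its own class and confining the bridging logic to the wrapper means this pitfall has to be handled in only one place.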