diff --git a/docs/v2_design_overview.md b/docs/v2_design_overview.md new file mode 100644 index 0000000..8224a1f --- /dev/null +++ b/docs/v2_design_overview.md @@ -0,0 +1,38 @@ +# Design Overview for `PGVectorStore` + +This document outlines the design choices behind the PGVectorStore integration for LangChain, focusing on how an async PostgreSQL driver can supports both synchronous and asynchronous usage. + +## Motivation: Performance through Asynchronicity + +Database interactions are often I/O-bound, making asynchronous programming crucial for performance. + +- **Non-Blocking Operations:** Asynchronous code prevents the application from stalling while waiting for database responses, improving throughput and responsiveness. +- **Asynchronous Foundation (`asyncio` and Drivers):** Built upon Python's `asyncio` library, the integration is designed to work with asynchronous PostgreSQL drivers to handle database operations efficiently. While compatible drivers are supported, the `asyncpg` driver is specifically recommended due to its high performance in concurrent scenarios. You can explore its benefits ([link](https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/)) and performance benchmarks ([link](https://fernandoarteaga.dev/blog/psycopg-vs-asyncpg/)) for more details. + +This native async foundation ensures the core database interactions are fast and scalable. + +## The Two-Class Approach: Enabling a Mixed Interface + +To cater to different application architectures while maintaining performance, we provide two classes: + +1. **`AsyncPGVectorStore` (Core Asynchronous Implementation):** + * This class contains the pure `async/await` logic for all database operations. + * It's designed for **direct use within asynchronous applications**. Users working in an `asyncio` environment can `await` its methods for maximum efficiency and direct control within the event loop. + * It represents the fundamental, non-blocking way of interacting with the database. + +2. **`PGVectorStore` (Mixed Sync/Async API ):** + * This class provides both asynchronous & synchronous APIs. + * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncPGVectorStore`. + * It **manages the execution of this underlying asynchronous logic**, handling the necessary `asyncio` event loop interactions (e.g., starting/running the coroutine) behind the scenes. + * This allows users of synchronous codebases to leverage the performance benefits of the asynchronous core without needing to rewrite their application structure. + +## Benefits of this Dual Interface Design + +This two-class structure provides significant advantages: + +- **Interface Flexibility:** Developers can **choose the interface that best fits their needs**: + * Use `PGVectorStore` for easy integration into existing synchronous applications. + * Use `AsyncPGVectorStore` for optimal performance and integration within `asyncio`-based applications. +- **Ease of Use:** `PGVectorStore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user. +- **Robustness:** The clear separation helps prevent common errors associated with mixing synchronous and asynchronous code incorrectly, such as blocking the event loop from synchronous calls within an async context. +- **Efficiency for Async Users:** `AsyncPGVectorStore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `PGVectorStore`.