From 57fd9cf2bb75ebf9fe612e5144cb5197e94d6815 Mon Sep 17 00:00:00 2001 From: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Date: Mon, 14 Apr 2025 15:44:25 +0530 Subject: [PATCH 1/7] chore(docs): Add an Async Explainer doc --- Async explainer.md | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) create mode 100644 Async explainer.md diff --git a/Async explainer.md b/Async explainer.md new file mode 100644 index 00000000..803b40be --- /dev/null +++ b/Async explainer.md @@ -0,0 +1,38 @@ +# Design Rationale: Mixed Interface Postgres VectorStore + +This document outlines the design choices behind the PGVectorStore integration for LangChain, focusing on its dual interface that supports both synchronous and asynchronous usage patterns. + +## Motivation: Performance through Asynchronicity + +Database interactions are often I/O-bound, making asynchronous programming crucial for performance. + +- **Non-Blocking Operations:** Asynchronous code prevents the application from stalling while waiting for database responses, improving throughput and responsiveness. +- **Asynchronous Foundation (`asyncio` and Drivers):** Built upon Python's `asyncio`, the integration is designed to work with asynchronous PostgreSQL drivers to handle database operations efficiently. While compatible drivers are supported, the `asyncpg` driver is specifically recommended due to its high performance in concurrent scenarios. You can explore its benefits ([link](https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/)) and performance benchmarks ([link](https://fernandoarteaga.dev/blog/psycopg-vs-asyncpg/)) for more details. + +This foundation ensures the core database interactions are fast and scalable. + +## The Two-Class Approach: Enabling a Mixed Interface + +To cater to different application architectures while maintaining performance, we provide two classes: + +1. 
**`AsyncVectorstore` (Core Asynchronous Implementation):** + * This class contains the pure `async/await` logic for all database operations using `asyncpg`. + * It's designed for **direct use within asynchronous applications**. Users working in an `asyncio` environment can `await` its methods for maximum efficiency and direct control within the event loop. + * It represents the fundamental, non-blocking way of interacting with the database. + +2. **`Vectorstore` (Synchronous Interface / Asynchronous Internals):** + * This class provides both asynchronous & synchronous APIs. + * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncVectorstore`. + * It **manages the execution of this underlying asynchronous logic**, handling the necessary `asyncio` event loop interactions (e.g., starting/running the coroutine) behind the scenes. + * This allows users of synchronous codebases to leverage the performance benefits of the asynchronous core without needing to rewrite their application structure. + +## Benefits of this Dual Interface Design + +This two-class structure provides significant advantages: + +- **Interface Flexibility:** Developers can **choose the interface that best fits their needs**: + * Use `Vectorstore` for easy integration into existing synchronous applications. + * Use `AsyncVectorstore` for optimal performance and integration within `asyncio`-based applications. +- **Ease of Use:** `Vectorstore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user. +- **Robustness:** The clear separation helps prevent common errors associated with mixing synchronous and asynchronous code incorrectly, such as blocking the event loop from synchronous calls within an async context. 
+- **Efficiency for Async Users:** `AsyncVectorstore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `Vectorstore`. From 43034624b57f00f7cfe38599fca166ca62bf6f26 Mon Sep 17 00:00:00 2001 From: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Date: Mon, 14 Apr 2025 15:46:54 +0530 Subject: [PATCH 2/7] update class names --- Async explainer.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/Async explainer.md b/Async explainer.md index 803b40be..89ff6a52 100644 --- a/Async explainer.md +++ b/Async explainer.md @@ -15,14 +15,14 @@ This foundation ensures the core database interactions are fast and scalable. To cater to different application architectures while maintaining performance, we provide two classes: -1. **`AsyncVectorstore` (Core Asynchronous Implementation):** +1. **`AsyncPGVectorstore` (Core Asynchronous Implementation):** * This class contains the pure `async/await` logic for all database operations using `asyncpg`. * It's designed for **direct use within asynchronous applications**. Users working in an `asyncio` environment can `await` its methods for maximum efficiency and direct control within the event loop. * It represents the fundamental, non-blocking way of interacting with the database. -2. **`Vectorstore` (Synchronous Interface / Asynchronous Internals):** +2. **`PGVectorstore` (Synchronous Interface / Asynchronous Internals):** * This class provides both asynchronous & synchronous APIs. - * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncVectorstore`. + * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncPGVectorstore`. * It **manages the execution of this underlying asynchronous logic**, handling the necessary `asyncio` event loop interactions (e.g., starting/running the coroutine) behind the scenes. 
* This allows users of synchronous codebases to leverage the performance benefits of the asynchronous core without needing to rewrite their application structure. @@ -31,8 +31,8 @@ To cater to different application architectures while maintaining performance, w This two-class structure provides significant advantages: - **Interface Flexibility:** Developers can **choose the interface that best fits their needs**: - * Use `Vectorstore` for easy integration into existing synchronous applications. - * Use `AsyncVectorstore` for optimal performance and integration within `asyncio`-based applications. -- **Ease of Use:** `Vectorstore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user. + * Use `PGVectorstore` for easy integration into existing synchronous applications. + * Use `AsyncPGVectorstore` for optimal performance and integration within `asyncio`-based applications. +- **Ease of Use:** `PGVectorstore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user. - **Robustness:** The clear separation helps prevent common errors associated with mixing synchronous and asynchronous code incorrectly, such as blocking the event loop from synchronous calls within an async context. -- **Efficiency for Async Users:** `AsyncVectorstore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `Vectorstore`. +- **Efficiency for Async Users:** `AsyncPGVectorstore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `PGVectorstore`. 
From 4caff7abb7a783ce69712213b761b727ac5d269e Mon Sep 17 00:00:00 2001 From: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Date: Mon, 14 Apr 2025 15:48:35 +0530 Subject: [PATCH 3/7] update class names --- Async explainer.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/Async explainer.md b/Async explainer.md index 89ff6a52..6d6ee0e3 100644 --- a/Async explainer.md +++ b/Async explainer.md @@ -15,14 +15,14 @@ This foundation ensures the core database interactions are fast and scalable. To cater to different application architectures while maintaining performance, we provide two classes: -1. **`AsyncPGVectorstore` (Core Asynchronous Implementation):** +1. **`AsyncPGVectorStore` (Core Asynchronous Implementation):** * This class contains the pure `async/await` logic for all database operations using `asyncpg`. * It's designed for **direct use within asynchronous applications**. Users working in an `asyncio` environment can `await` its methods for maximum efficiency and direct control within the event loop. * It represents the fundamental, non-blocking way of interacting with the database. -2. **`PGVectorstore` (Synchronous Interface / Asynchronous Internals):** +2. **`PGVectorStore` (Synchronous Interface / Asynchronous Internals):** * This class provides both asynchronous & synchronous APIs. - * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncPGVectorstore`. + * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncPGVectorStore`. * It **manages the execution of this underlying asynchronous logic**, handling the necessary `asyncio` event loop interactions (e.g., starting/running the coroutine) behind the scenes. * This allows users of synchronous codebases to leverage the performance benefits of the asynchronous core without needing to rewrite their application structure. 
@@ -31,8 +31,8 @@ To cater to different application architectures while maintaining performance, w This two-class structure provides significant advantages: - **Interface Flexibility:** Developers can **choose the interface that best fits their needs**: - * Use `PGVectorstore` for easy integration into existing synchronous applications. - * Use `AsyncPGVectorstore` for optimal performance and integration within `asyncio`-based applications. -- **Ease of Use:** `PGVectorstore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user. + * Use `PGVectorStore` for easy integration into existing synchronous applications. + * Use `AsyncPGVectorStore` for optimal performance and integration within `asyncio`-based applications. +- **Ease of Use:** `PGVectorStore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user. - **Robustness:** The clear separation helps prevent common errors associated with mixing synchronous and asynchronous code incorrectly, such as blocking the event loop from synchronous calls within an async context. -- **Efficiency for Async Users:** `AsyncPGVectorstore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `PGVectorstore`. +- **Efficiency for Async Users:** `AsyncPGVectorStore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `PGVectorStore`. 
From 3ee1262040f33be7009724b29c220550326a83f8 Mon Sep 17 00:00:00 2001 From: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Date: Tue, 15 Apr 2025 12:48:46 +0530 Subject: [PATCH 4/7] heading change --- Async explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Async explainer.md b/Async explainer.md index 6d6ee0e3..1aa9e990 100644 --- a/Async explainer.md +++ b/Async explainer.md @@ -1,4 +1,4 @@ -# Design Rationale: Mixed Interface Postgres VectorStore +# Design Rationale: Mixed Interface PGVectorStore This document outlines the design choices behind the PGVectorStore integration for LangChain, focusing on its dual interface that supports both synchronous and asynchronous usage patterns. From c0a8ccda923f40749504e3681b26999d9a437e3c Mon Sep 17 00:00:00 2001 From: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Date: Fri, 18 Apr 2025 02:43:12 +0530 Subject: [PATCH 5/7] Update Async explainer after review --- Async explainer.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/Async explainer.md b/Async explainer.md index 1aa9e990..8224a1f9 100644 --- a/Async explainer.md +++ b/Async explainer.md @@ -1,26 +1,26 @@ -# Design Rationale: Mixed Interface PGVectorStore +# Design Overview for `PGVectorStore` -This document outlines the design choices behind the PGVectorStore integration for LangChain, focusing on its dual interface that supports both synchronous and asynchronous usage patterns. +This document outlines the design choices behind the PGVectorStore integration for LangChain, focusing on how an async PostgreSQL driver can support both synchronous and asynchronous usage. ## Motivation: Performance through Asynchronicity Database interactions are often I/O-bound, making asynchronous programming crucial for performance. 
-- **Non-Blocking Operations:** Asynchronous code prevents the application from stalling while waiting for database responses, improving throughput and responsiveness. -- **Asynchronous Foundation (`asyncio` and Drivers):** Built upon Python's `asyncio`, the integration is designed to work with asynchronous PostgreSQL drivers to handle database operations efficiently. While compatible drivers are supported, the `asyncpg` driver is specifically recommended due to its high performance in concurrent scenarios. You can explore its benefits ([link](https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/)) and performance benchmarks ([link](https://fernandoarteaga.dev/blog/psycopg-vs-asyncpg/)) for more details. +- **Non-Blocking Operations:** Asynchronous code prevents the application from stalling while waiting for database responses, improving throughput and responsiveness. +- **Asynchronous Foundation (`asyncio` and Drivers):** Built upon Python's `asyncio` library, the integration is designed to work with asynchronous PostgreSQL drivers to handle database operations efficiently. While compatible drivers are supported, the `asyncpg` driver is specifically recommended due to its high performance in concurrent scenarios. You can explore its benefits ([link](https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/)) and performance benchmarks ([link](https://fernandoarteaga.dev/blog/psycopg-vs-asyncpg/)) for more details. -This foundation ensures the core database interactions are fast and scalable. +This native async foundation ensures the core database interactions are fast and scalable. ## The Two-Class Approach: Enabling a Mixed Interface To cater to different application architectures while maintaining performance, we provide two classes: 1. **`AsyncPGVectorStore` (Core Asynchronous Implementation):** - * This class contains the pure `async/await` logic for all database operations using `asyncpg`. 
+ * This class contains the pure `async/await` logic for all database operations. * It's designed for **direct use within asynchronous applications**. Users working in an `asyncio` environment can `await` its methods for maximum efficiency and direct control within the event loop. * It represents the fundamental, non-blocking way of interacting with the database. -2. **`PGVectorStore` (Synchronous Interface / Asynchronous Internals):** +2. **`PGVectorStore` (Mixed Sync/Async API):** * This class provides both asynchronous & synchronous APIs. * When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncPGVectorStore`. * It **manages the execution of this underlying asynchronous logic**, handling the necessary `asyncio` event loop interactions (e.g., starting/running the coroutine) behind the scenes. From 99fb6859841f07f4a18d043a73f93d3ef8e578d9 Mon Sep 17 00:00:00 2001 From: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Date: Fri, 18 Apr 2025 03:08:25 +0530 Subject: [PATCH 6/7] Rename Async explainer.md to v2_design_overview.md --- Async explainer.md => v2_design_overview.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename Async explainer.md => v2_design_overview.md (100%) diff --git a/Async explainer.md b/v2_design_overview.md similarity index 100% rename from Async explainer.md rename to v2_design_overview.md From c715fafad23f7076d78cc1d3a92d26d8bc97093d Mon Sep 17 00:00:00 2001 From: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Date: Sat, 19 Apr 2025 19:03:35 +0530 Subject: [PATCH 7/7] Rename v2_design_overview.md to docs/v2_design_overview.md --- v2_design_overview.md => docs/v2_design_overview.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename v2_design_overview.md => docs/v2_design_overview.md (100%) diff --git a/v2_design_overview.md b/docs/v2_design_overview.md similarity index 100% rename from v2_design_overview.md rename to docs/v2_design_overview.md
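Reviewer note on the two-class approach described in this series: the doc says the mixed class "manages the execution of this underlying asynchronous logic" but does not show how. One common way to do this in Python is to run the async core on a dedicated background event loop and submit coroutines to it with `asyncio.run_coroutine_threadsafe`. The sketch below illustrates that pattern only; it is not taken from the `PGVectorStore` source. The class names `AsyncStore` and `Store`, the methods `add_texts`/`aadd_texts`/`close`, and the in-memory list standing in for the database are all hypothetical.

```python
import asyncio
import threading
from typing import Any, Coroutine, List, TypeVar

T = TypeVar("T")


class AsyncStore:
    """Hypothetical async core: every operation is a coroutine."""

    def __init__(self) -> None:
        self._docs: List[str] = []  # stands in for a database table

    async def aadd_texts(self, texts: List[str]) -> int:
        await asyncio.sleep(0)  # stands in for a non-blocking driver call
        self._docs.extend(texts)
        return len(self._docs)


class Store:
    """Hypothetical mixed wrapper: sync and async methods delegate to AsyncStore."""

    def __init__(self) -> None:
        self._core = AsyncStore()
        # Assumption: a private event loop running on a daemon thread.
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def _run_sync(self, coro: Coroutine[Any, Any, T]) -> T:
        # Submit the coroutine to the background loop and block until it finishes.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result()

    def add_texts(self, texts: List[str]) -> int:
        return self._run_sync(self._core.aadd_texts(texts))

    async def aadd_texts(self, texts: List[str]) -> int:
        # Await the background future without blocking the caller's own loop.
        future = asyncio.run_coroutine_threadsafe(
            self._core.aadd_texts(texts), self._loop
        )
        return await asyncio.wrap_future(future)

    def close(self) -> None:
        # Stop the background loop, wait for its thread, then release the loop.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()
```

With this shape, synchronous callers use `Store().add_texts([...])` directly, while async callers `await store.aadd_texts([...])`; both paths end up in the same coroutine on the same loop. Whether the real integration uses a background thread, and whether its async methods go through the bridge or call the core directly, is something the doc could state explicitly.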