feat: Add the PGVectorStore class #168
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,10 @@ | ||
"""Index class to add vector indexes on the PGVectorStore. | ||
|
||
Learn more about vector indexes at https://github.com/pgvector/pgvector?tab=readme-ov-file#indexing | ||
""" | ||
|
||
import enum | ||
averikitsch marked this conversation as resolved.
|
||
import re | ||
import warnings | ||
from abc import ABC, abstractmethod | ||
from dataclasses import dataclass, field | ||
|
@@ -26,6 +32,18 @@ class DistanceStrategy(StrategyMixin, enum.Enum): | |
|
||
@dataclass | ||
class BaseIndex(ABC): | ||
""" | ||
Abstract base class for defining vector indexes. | ||
|
||
Attributes: | ||
name (Optional[str]): A human-readable name for the index. Defaults to None. | ||
index_type (str): A string identifying the type of index. Defaults to "base". | ||
distance_strategy (DistanceStrategy): The strategy used to calculate distances | ||
between vectors in the index. Defaults to DistanceStrategy.COSINE_DISTANCE. | ||
partial_indexes (Optional[list[str]]): A list of names of partial indexes. Defaults to None. | ||
extension_name (Optional[str]): The name of the extension to be created for the index, if any. Defaults to None. | ||
""" | ||
|
||
name: Optional[str] = None | ||
index_type: str = "base" | ||
distance_strategy: DistanceStrategy = field( | ||
|
@@ -44,6 +62,20 @@ def index_options(self) -> str: | |
def get_index_function(self) -> str: | ||
return self.distance_strategy.index_function | ||
|
||
def __post_init__(self) -> None: | ||
"""Check if initialization parameters are valid. | ||
|
||
Raises: | ||
ValueError: If extension_name is not a valid PostgreSQL identifier. | ||
""" | ||
if ( | ||
self.extension_name | ||
and re.match(r"^[a-zA-Z_][a-zA-Z0-9_]*$", self.extension_name) is None | ||
nit: If we are doing validation at the application layer, this should probably be in a standalone function and used in other places as well (e.g., any of the index classes has the same injection issue).
This function is added as a post init to the BaseIndex, which is extended by all Index classes, so all the indexes run this check after init.
This check is only for the We can also handle this in a follow up PR if easier?
You're right! I missed that. I've separated that function and now it validates for both extension_name and index_type. I've also wrapped the index_name in double quotes to allow the same flexibility as tables.
): | ||
raise ValueError( | ||
f"Invalid identifier: {self.extension_name}. Identifiers must start with a letter or underscore, and subsequent characters can be letters, digits, or underscores." | ||
) | ||
|
||
|
||
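A minimal sketch of the inheritance point made in the review thread: a dataclass subclass that does not define its own `__post_init__` inherits the base one, so the identifier check runs for every index type. The class names `SketchBaseIndex` and `SketchHNSWIndex`, the `m` field, and the helper `is_accepted` are hypothetical stand-ins, not names from this PR.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Same pattern as in the diff above.
_IDENTIFIER_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")


@dataclass
class SketchBaseIndex:
    """Hypothetical stand-in for BaseIndex, reduced to the validated field."""

    name: Optional[str] = None
    extension_name: Optional[str] = None

    def __post_init__(self) -> None:
        # Runs automatically at the end of the generated __init__.
        if self.extension_name and _IDENTIFIER_RE.match(self.extension_name) is None:
            raise ValueError(f"Invalid identifier: {self.extension_name}")


@dataclass
class SketchHNSWIndex(SketchBaseIndex):
    """Hypothetical subclass; adds a field but no __post_init__ of its own."""

    m: int = 16


def is_accepted(extension_name: str) -> bool:
    """Return True if the subclass can be built with this extension name."""
    try:
        SketchHNSWIndex(extension_name=extension_name)
    except ValueError:
        return False
    return True
```

Note that this only covers `extension_name`; any other field interpolated into SQL would need its own check.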
@dataclass | ||
class ExactNearestNeighbor(BaseIndex): | ||
|
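The last review reply describes moving the check into a standalone function and double-quoting the index name. One way that could look is sketched below, under stated assumptions: the function names `validate_identifier` and `quote_identifier` are hypothetical, and the PR's actual helper may differ. Two choices here are mine rather than the PR's: the sketch uses `re.fullmatch` instead of `re.match` with `$`, because in Python `$` also matches just before a trailing newline, and it doubles embedded double quotes, which is how PostgreSQL escapes them inside quoted identifiers.

```python
import re

# Letters, digits, and underscores; must not start with a digit.
_IDENTIFIER_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def validate_identifier(identifier: str) -> None:
    """Raise ValueError unless `identifier` is a plain unquoted identifier.

    Hypothetical helper: fullmatch() requires the whole string to match,
    so a trailing newline is rejected as well.
    """
    if _IDENTIFIER_RE.fullmatch(identifier) is None:
        raise ValueError(
            f"Invalid identifier: {identifier!r}. Identifiers must start with a "
            "letter or underscore, and subsequent characters can be letters, "
            "digits, or underscores."
        )


def quote_identifier(identifier: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote.

    Hypothetical helper for names (such as an index name) that should be
    allowed to contain spaces or mixed case.
    """
    return '"' + identifier.replace('"', '""') + '"'
```

With helpers like these, `extension_name` and `index_type` would go through `validate_identifier`, while the index name would go through `quote_identifier` before being placed in the SQL statement.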