postgres_retriever
Leverage a postgres database to store and retrieve documents.
Classes
| DistanceToOperator | Enum for the distance to operator. |
| PostgresRetriever | Use a postgres database to store and retrieve documents. |
- class DistanceToOperator(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
Enum for the distance to operator.
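About pgvector: it supports L2 distance (<->), inner product (<#>), cosine distance (<=>), and L1 distance (<+>, added in 0.7.0).
Note that pgvector's <#> operator returns the negative inner product, so smaller values indicate closer vectors for every operator.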
- L2 = '<->'
- INNER_PRODUCT = '<#>'
- COSINE = '<=>'
- L1 = '<+>'
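To make the four operators concrete, here is a minimal pure-Python sketch of the value each one computes in pgvector. The `DistanceToOperator` class below is a local stand-in that mirrors the members listed above, and the helper names (`l2_distance`, `negative_inner_product`, `cosine_distance`, `l1_distance`, `evaluate`) are hypothetical and not part of this module.

```python
import math
from enum import Enum
from typing import Callable, Dict, Sequence


class DistanceToOperator(Enum):
    """Local stand-in mirroring the documented enum members."""
    L2 = "<->"
    INNER_PRODUCT = "<#>"
    COSINE = "<=>"
    L1 = "<+>"


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Euclidean distance: square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    # pgvector's <#> returns the inner product multiplied by -1.
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine distance is 1 minus cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1 - dot / (norm_a * norm_b)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Manhattan distance: summed absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical lookup table from enum member to the value it computes.
SEMANTICS: Dict[DistanceToOperator, Callable[[Sequence[float], Sequence[float]], float]] = {
    DistanceToOperator.L2: l2_distance,
    DistanceToOperator.INNER_PRODUCT: negative_inner_product,
    DistanceToOperator.COSINE: cosine_distance,
    DistanceToOperator.L1: l1_distance,
}


def evaluate(op: DistanceToOperator, a: Sequence[float], b: Sequence[float]) -> float:
    """Compute in Python what the given pgvector operator would return."""
    return SEMANTICS[op](a, b)
```

For example, `evaluate(DistanceToOperator.L2, [0.0, 0.0], [3.0, 4.0])` computes the Euclidean distance between the two points, while the `INNER_PRODUCT` entry yields a negative number whenever the vectors point in similar directions.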
- class PostgresRetriever(embedder: Embedder, top_k: int | None = 1, database_url: str = None, table_name: str = 'document', distance_operator: DistanceToOperator = DistanceToOperator.INNER_PRODUCT)
Bases: Retriever[Any, str]
Use a postgres database to store and retrieve documents.
Users can follow this example to customize the prompt, or additionally ask it to output the score along with the indices.
- Parameters:
top_k (Optional[int], optional) – The number of top documents to fetch. Defaults to 1.
database_url (str) – The database URL to connect to. Defaults to postgresql://postgres:password@localhost:5432/vector_db.
References: [1] pgvector extension: pgvector/pgvector
- classmethod format_vector_search_query(table_name: str, vector_column: str, query_embedding: List[float], top_k: int, distance_operator: DistanceToOperator, sort_desc: bool = True) -> str
Formats a SQL query string to select all columns from a table, order the results by the distance or similarity score to a provided embedding, and also return that score.
- Parameters:
table_name (str) – The name of the table to query.
vector_column (str) – The name of the column containing the vector data.
query_embedding (list or str) – The embedding vector to compare against.
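top_k (int) – The number of top results to return.
distance_operator (DistanceToOperator) – The pgvector operator used to compute the score.
sort_desc (bool) – Whether to sort the results by score in descending order. Defaults to True.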
- Returns:
A formatted SQL query string that includes the score.
- Return type:
str
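As an illustration of the kind of string such a method might produce, the sketch below builds a pgvector query from the same arguments. This is an assumption about the output shape, not the library's actual implementation: the function name `sketch_vector_search_query`, the `score` alias, and the exact SQL layout are hypothetical, and the enum is a local stand-in for `DistanceToOperator`.

```python
from enum import Enum
from typing import List


class DistanceToOperator(Enum):
    """Local stand-in mirroring the documented enum members."""
    L2 = "<->"
    INNER_PRODUCT = "<#>"
    COSINE = "<=>"
    L1 = "<+>"


def sketch_vector_search_query(
    table_name: str,
    vector_column: str,
    query_embedding: List[float],
    top_k: int,
    distance_operator: DistanceToOperator,
    sort_desc: bool = True,
) -> str:
    """Hypothetical sketch: select all columns plus a score, ordered by that score."""
    # pgvector accepts vector literals written as '[x1,x2,...]'.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    direction = "DESC" if sort_desc else "ASC"
    # Identifiers are interpolated directly, so this sketch is only safe
    # when table_name and vector_column come from trusted code.
    return (
        f"SELECT *, ({vector_column} {distance_operator.value} '{vector_literal}') AS score "
        f"FROM {table_name} ORDER BY score {direction} LIMIT {int(top_k)};"
    )


query = sketch_vector_search_query(
    table_name="document",
    vector_column="embedding",
    query_embedding=[0.1, 0.2],
    top_k=3,
    distance_operator=DistanceToOperator.COSINE,
    sort_desc=False,
)
print(query)
```

The sketch leaves the raw operator result untransformed. Since pgvector operators return distances (and the negative inner product for <#>), ascending order puts the closest vectors first in this sketch; how the real method defines the score and sort direction should be checked against its source.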