The options for the two kinds of data source (table and volume) differ so much that we decided to offer a distinct "register" function for each. A single register function would either have to accept an opaque JSON argument or take a long list of arguments, some of them mutually exclusive. Neither makes for good UX.
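To make the trade-off concrete, here is a minimal Python sketch of the two API shapes. It is only an illustration of the design reasoning: every function and option name in it (`register_for_table`, `register_for_volume`, `register_opaque`, `key_column`, `path_prefix`, and so on) is hypothetical and is not the actual aidb signature.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


# All names below are hypothetical and exist only to illustrate the design choice.
@dataclass
class TableRetrieverConfig:
    name: str
    source_table: str
    key_column: str
    data_column: str


@dataclass
class VolumeRetrieverConfig:
    name: str
    volume_name: str
    path_prefix: Optional[str]


def register_for_table(name: str, source_table: str,
                       key_column: str = "id",
                       data_column: str = "content") -> TableRetrieverConfig:
    """Distinct function: only table options exist, and most have defaults."""
    return TableRetrieverConfig(name, source_table, key_column, data_column)


def register_for_volume(name: str, volume_name: str,
                        path_prefix: Optional[str] = None) -> VolumeRetrieverConfig:
    """Distinct function: only volume options exist."""
    return VolumeRetrieverConfig(name, volume_name, path_prefix)


def register_opaque(name: str, options_json: str
                    ) -> Union[TableRetrieverConfig, VolumeRetrieverConfig]:
    """Single entry point: the caller gets no help from the signature, and
    mutually exclusive options can only be rejected after parsing the JSON."""
    options = json.loads(options_json)
    has_table = "source_table" in options
    has_volume = "volume_name" in options
    if has_table == has_volume:
        raise ValueError("specify exactly one of source_table or volume_name")
    if has_table:
        return register_for_table(name, options["source_table"],
                                  options.get("key_column", "id"),
                                  options.get("data_column", "content"))
    return register_for_volume(name, options["volume_name"],
                               options.get("path_prefix"))
```

With the distinct functions, an invalid combination of options cannot even be written down, whereas the opaque variant has to detect it itself and report it at call time.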
Retriever for a table data source
The register_retriever_for_table function creates a retriever for a table data source. Many of its arguments are optional and have defaults.
Example: Registering a retriever
In this example, we use all the defaults.
Creating the Embeddings
If the source table already contains data, run a bulk embedding to generate embeddings for the existing rows.
Enable auto-embedding to keep the embeddings up to date with any future changes.
Auto-embedding can also be disabled again.
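The interplay between bulk embedding and auto-embedding can be sketched with a toy in-memory model. This is not how aidb implements it: `ToyRetriever`, `toy_embed`, and the `write` method are hypothetical names, and the "embedding" is a trivial two-number vector standing in for a real model.

```python
from typing import Dict, List


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(text)), float(text.count(" ") + 1)]


class ToyRetriever:
    """Toy model of a source table plus its embeddings (illustration only)."""

    def __init__(self, source: Dict[int, str]) -> None:
        self.source = source
        self.embeddings: Dict[int, List[float]] = {}
        self.auto_embedding = False

    def bulk_embed(self) -> None:
        # Embed every row that already exists in the source.
        for key, text in self.source.items():
            self.embeddings[key] = toy_embed(text)

    def write(self, key: int, text: str) -> None:
        # Simulate an insert or update on the source table.
        self.source[key] = text
        if self.auto_embedding:
            self.embeddings[key] = toy_embed(text)


retriever = ToyRetriever({1: "red shoes", 2: "blue hat"})
retriever.bulk_embed()               # covers the rows that existed beforehand
retriever.auto_embedding = True
retriever.write(3, "green scarf")    # embedded automatically
retriever.auto_embedding = False
retriever.write(4, "yellow coat")    # stored, but not embedded
```

In this model, row 4 ends up in the source without an embedding, so it would stay invisible to retrieval until another bulk embedding runs.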
Retrieving
A basic key retriever is available that does not look up the source data, but just returns the ID/key of the matching embeddings:
Retrieving the key
Example: Retrieving the key
Use this if you want to do the join or lookup yourself based on the key. It is especially useful for retrievers with external (volume) data sources, where the application usually fetches the data from the external source itself, or where you want to push the actual retrieval down to a client application.
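As a rough illustration of that pattern, the following Python sketch performs a key-only similarity search and then lets the application look up the data itself. Everything here is an assumption made for the example: `toy_retrieve_key` is a hypothetical in-memory stand-in for the key retriever, the vectors are made up, and a plain dictionary plays the role of the external data source.

```python
import math
from typing import Dict, List, Tuple


def l2_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_retrieve_key(embeddings: Dict[str, List[float]],
                     query: List[float],
                     topk: int) -> List[Tuple[str, float]]:
    """Hypothetical key retriever: returns (key, distance) pairs only,
    without touching the source data."""
    scored = [(key, l2_distance(vec, query)) for key, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:topk]


# Made-up embeddings, keyed by document ID.
embeddings = {
    "doc-1": [0.0, 1.0],
    "doc-2": [1.0, 0.0],
    "doc-3": [0.9, 0.1],
}

# Stand-in for an external data source that the application reads directly.
external_store = {
    "doc-1": "winter coats",
    "doc-2": "running shoes",
    "doc-3": "hiking boots",
}

matches = toy_retrieve_key(embeddings, [1.0, 0.0], topk=2)

# The application performs the lookup itself, based on the returned keys.
results = [(key, external_store[key], distance) for key, distance in matches]
```

Because only keys and distances cross the boundary, the application stays free to fetch the data from wherever it actually lives.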
Retrieving the text
The retrieve_text function joins the embeddings with the source data and directly returns the results:
Example
Listing the retrievers
The aidb.retrievers view lists all retrievers and also includes some of each retriever's configuration.
We recommend selecting only the columns you are interested in.
Deleting a retriever
Deleting a retriever removes only its configuration. The vector table and everything else are left in place.
End to end example
You can find an end-to-end example for a table/text retriever on the Retrievers example page.