Tmp doc retrieval
Temporary Document Retrieval (Tmp Doc Retrieval) allows the pipeline to retrieve relevant content from documents that users upload during a conversation in the chat UI. This feature is particularly useful for answering questions that depend on conversation-specific documents uploaded in real time. It enriches the context for responses by incorporating information directly from user-provided files.
To use the Tmp Doc Retrieval feature, it must first be activated and properly configured. Navigate to your project's settings, go to the "PDF Upload" tab, and enable the "Allow file upload" option. Additionally, ensure that the maximum file size, maximum folder size, and language settings are configured. Set up the extraction configuration by selecting a PDF extractor and providing the required environment variables and, if necessary, LLM settings. Configure the transformation settings by choosing an LLM provider, entering the API key, and selecting a model. Once everything is set up, users can upload documents in the chat UI for question answering.
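The upload limits described above can be pictured as a simple check. This is only a minimal sketch: the `UploadSettings` class, its field names, and the `can_accept_upload` function are hypothetical and do not reflect the product's actual configuration schema or API. It also assumes that the folder limit applies to the total size after the new file is added.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadSettings:
    """Hypothetical mirror of the "PDF Upload" tab; field names are illustrative."""
    allow_file_upload: bool
    max_file_size_mb: float
    max_folder_size_mb: float
    language: str


def can_accept_upload(settings: UploadSettings, file_size_mb: float,
                      folder_size_mb: float) -> bool:
    """Return True if a new file fits within the configured limits."""
    if not settings.allow_file_upload:
        return False
    if file_size_mb > settings.max_file_size_mb:
        return False
    # Assumption: the folder limit covers existing files plus the new one.
    return folder_size_mb + file_size_mb <= settings.max_folder_size_mb


settings = UploadSettings(allow_file_upload=True, max_file_size_mb=10,
                          max_folder_size_mb=25, language="en")
print(can_accept_upload(settings, file_size_mb=5, folder_size_mb=10))
```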
Once the feature is activated, the project pipeline can include the retrieval strategy step. This step provides several configuration options to fine-tune the retrieval process: "Entries to be collected" specifies how many similar entries are retrieved per query; "TF-IDF Minimum Data Frequency" adjusts the term frequency-inverse document frequency (TF-IDF) threshold used to filter results; and "Search Input" customizes the search input used for the query. By default, the step uses the user's question as the search input, but this can be changed to account for query variations or to use the custom output of previous steps.
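To make these options more concrete, the following is a minimal, self-contained sketch of TF-IDF retrieval. It is not the product's implementation. The function `retrieve` and its parameters are hypothetical: `top_k` stands in for "Entries to be collected", and `min_df` assumes that "TF-IDF Minimum Data Frequency" behaves like a minimum document frequency cutoff, meaning terms that appear in fewer entries are ignored. The actual semantics of that setting may differ.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, entries: list[str], top_k: int = 3,
             min_df: int = 1) -> list[tuple[int, float]]:
    """Rank entries by TF-IDF cosine similarity to the query.

    Returns up to top_k (entry index, score) pairs with a non-zero score.
    """
    docs = [Counter(tokenize(entry)) for entry in entries]
    # Document frequency: the number of entries that contain each term.
    df = Counter(term for doc in docs for term in doc)
    # Assumed meaning of the threshold: drop terms found in fewer than min_df entries.
    vocab = {term for term, count in df.items() if count >= min_df}
    idf = {t: math.log((1 + len(docs)) / (1 + df[t])) + 1.0 for t in vocab}

    def vectorize(counts: Counter) -> dict[str, float]:
        return {t: c * idf[t] for t, c in counts.items() if t in vocab}

    def cosine(a: dict[str, float], b: dict[str, float]) -> float:
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm else 0.0

    q = vectorize(Counter(tokenize(query)))
    scored = [(i, cosine(q, vectorize(doc))) for i, doc in enumerate(docs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(i, score) for i, score in scored[:top_k] if score > 0.0]


entries = [
    "The invoice total is 420 euros",
    "Shipping takes five business days",
    "The invoice is due in thirty days",
]
for index, score in retrieve("When is the invoice due?", entries, top_k=2):
    print(index, round(score, 3))
```

In this sketch, raising `min_df` removes rare terms such as "due" from the comparison, which can change the ranking. This is why a threshold of this kind is worth tuning against real documents.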
When users upload documents, the ETL preparation process is shown in real time within the chat interface. This lets users track the progress of document parsing; the pipeline begins strategy execution only once this process is complete. Users can also view the names of the files attached to the current chat, which makes it transparent which documents are being used to generate responses. To improve speed, the system may internally cache the extractions and transformations of the uploaded documents, keyed on the organization, the uploaded files, and the configuration settings. Cached results allow the system to skip parts of the ETL process and significantly improve response times. The chat-specific files are stored on MinIO; after two weeks they are deleted to free up space, and the corresponding conversation is archived (see the "Uploading data" section).
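The caching behavior described above can be illustrated with a short sketch. The actual cache key and storage are internal and not documented here, so `cache_key`, `extract_with_cache`, and the in-memory dictionary are hypothetical. The sketch assumes the key is derived from the organization, the file contents, and the configuration, and that the upload order of files does not matter.

```python
import hashlib
import json
from typing import Callable

# Hypothetical in-memory cache; the real system's storage is not specified.
_cache: dict[str, str] = {}


def cache_key(organization_id: str, file_contents: list[bytes], config: dict) -> str:
    """Derive a deterministic key from organization, files, and configuration."""
    # Assumption: file order is irrelevant, so the content hashes are sorted.
    file_hashes = sorted(hashlib.sha256(c).hexdigest() for c in file_contents)
    payload = json.dumps(
        {"org": organization_id, "files": file_hashes, "config": config},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def extract_with_cache(organization_id: str, file_contents: list[bytes],
                       config: dict,
                       extract: Callable[[list[bytes]], str]) -> str:
    """Run the extraction only when no cached result exists for this key."""
    key = cache_key(organization_id, file_contents, config)
    if key not in _cache:
        _cache[key] = extract(file_contents)
    return _cache[key]
```

Under these assumptions, changing any extraction or transformation setting produces a different key, so stale results are never reused for a new configuration.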