Ingestion, conversion, analysis, and storage of documents for retrieval-augmented generation (RAG)
Monitors Google Drive for new files every few hours, checks whether they have already been imported, and moves or processes them accordingly.
Analyzes the document with GPT-4.1 to generate a summary, classification, and metadata.
Creates a SHA-256 cryptographic hash of the text to check for duplicates in the vector database.
Converts documents (TXT, RTF, PDF, images) to text using OCR for further processing.
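
The duplicate check based on a SHA-256 hash of the extracted text could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual implementation: the names text_fingerprint and InMemoryHashIndex are hypothetical, the in-memory dictionary only stands in for a lookup against the vector database, and the Unicode and whitespace normalization before hashing is an assumption that the description does not state.

```python
import hashlib
import unicodedata


def text_fingerprint(text: str) -> str:
    """Return the SHA-256 hex digest of the (normalized) document text."""
    # Assumption: normalize Unicode form and collapse whitespace so that
    # trivial formatting differences do not produce different hashes.
    normalized = unicodedata.normalize("NFC", text)
    normalized = " ".join(normalized.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class InMemoryHashIndex:
    """Hypothetical stand-in for a hash lookup against the vector database."""

    def __init__(self) -> None:
        # Maps content hash -> ID of the first document stored with that hash.
        self._seen: dict[str, str] = {}

    def is_duplicate(self, text: str) -> bool:
        """True if a document with the same content hash was already stored."""
        return text_fingerprint(text) in self._seen

    def add(self, doc_id: str, text: str) -> str:
        """Record the document's hash and return it (e.g. to store as metadata)."""
        digest = text_fingerprint(text)
        self._seen.setdefault(digest, doc_id)
        return digest


if __name__ == "__main__":
    index = InMemoryHashIndex()
    index.add("doc-1", "Quarterly report\n\nRevenue grew.")
    # Same content with different whitespace is treated as a duplicate.
    print(index.is_duplicate("Quarterly report Revenue grew."))
    # Different content is not.
    print(index.is_duplicate("Quarterly report: revenue fell."))
```

In a real pipeline the hash would typically be stored as a metadata field alongside each vector, and the duplicate check would query that field before the document is embedded and inserted.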