The Proposed Solution → Develop a RAG Search Platform that supports complex natural-language queries and responses, returns results much faster, and provides substantially more accurate search results.
Technical Definition: A RAG (Retrieval-Augmented Generation) model is an AI framework that improves the output of a Large Language Model (LLM) by letting it reference relevant context before generating a response.
| Steps | Description | Specific Use Case at C1 |
|---|---|---|
| Retrieval | When you ask a question, the system searches an external database to find specific, relevant snippets of information. | |
| Augmentation | The system "augments" your original prompt by attaching the retrieved facts to it, giving the AI "open-book" notes to read. | The raw text of the user's question and the retrieved historical information are then passed to the Gemini LLM via API (along with a standard prompt text). |
| Generation | The LLM processes both your question and the added context to generate a final answer that is factually grounded in the provided data. | The Gemini LLM can use this high-value, niche “in-house” technical information to return a much more accurate and faster diagnosis and solution for the technical problem. |
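
As a rough illustration of the three steps in the table above, the sketch below chains retrieval, augmentation, and generation in plain Python. Everything in it is an assumption made for illustration: the `KNOWLEDGE_BASE` records, the word-overlap ranking (a simple stand-in for a real similarity search), the prompt wording, and the `llm` callable, which takes the place of the actual Gemini API call.

```python
import re
from typing import Callable

# Hypothetical in-house records; a real system would search an external database.
KNOWLEDGE_BASE = [
    "Door latch cable freezes in cold weather. Fix: install the improved cable design.",
    "Engine stutters when accelerating. Fix: replace the fuel filter and spark plugs.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, records: list[str], top_k: int = 1) -> list[str]:
    """Retrieval: rank records by the number of words shared with the question."""
    q_words = tokenize(question)
    ranked = sorted(records, key=lambda r: len(q_words & tokenize(r)), reverse=True)
    return ranked[:top_k]

def augment(question: str, snippets: list[str]) -> str:
    """Augmentation: attach the retrieved snippets to the original question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """Generation: pass the augmented prompt to the LLM (any callable here)."""
    return llm(prompt)

def answer(question: str, llm: Callable[[str], str]) -> str:
    """Run the three steps in order for one question."""
    snippets = retrieve(question, KNOWLEDGE_BASE)
    return generate(augment(question, snippets), llm)
```

Calling `answer("Why does my engine stutter when accelerating?", llm)` with any function that maps a prompt string to a response string runs the full chain; swapping in a real API client only changes what `llm` does.
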
| Team Member | Assigned Tasks |
|---|---|
| 1. Manager | Leveraged his experience at AWS to: |
| 2. Business Lead | Leveraged his finance background to: |
| 3. Cyber Security Specialist | Leveraged his CyberSecurity skillset to: |
| 4. Myself (Lead Engineer for this Project) | Develop a Prototype Model to Demonstrate Value to our Organization |
The model needed to work end-to-end to properly demonstrate the functionality and efficacy of a RAG model.
Outlining a simple end-to-end architecture and workflow diagram was critical to arriving at a functional final result (see diagram to the right).
To reduce prototype development time and cost, I looked for opportunities to simplify architecture components that were not needed at small scale:
My Roadmap to Achieve Functional Prototype on Short Deadline (Contd.)
To save development time, I wrote the frontend in HTML for static elements and JavaScript for dynamic functionality (e.g., updating the table, calling backend functions).
The backend was written in Python, with Flask as the web framework bridging the frontend and backend.
To keep our platform fast and user-friendly, I opted to use AJAX requests so that shorter tasks could run while longer tasks were still waiting to complete.
For example: an AJAX request is first made to call the function that submits our prompt to the Gemini API (which usually takes several seconds to respond). Because AJAX requests are asynchronous, we can call the functions that update our results table while we are waiting for Gemini to return a response.
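
The non-blocking pattern described above can be sketched in Python with `asyncio`. This is only an analogy for the browser-side AJAX calls, not the prototype's actual code: `ask_llm` is a hypothetical stand-in that simulates the slow Gemini request with a short sleep, and `refresh_results_table` is a hypothetical stand-in for the quick table update.

```python
import asyncio

async def ask_llm(prompt: str, events: list[str]) -> str:
    """Stand-in for the slow LLM API call; the sleep simulates network latency."""
    events.append("llm request sent")
    await asyncio.sleep(0.05)  # control returns to the event loop while waiting
    events.append("llm response received")
    return f"(placeholder answer for: {prompt})"

async def refresh_results_table(rows: list[tuple], events: list[str]) -> int:
    """Stand-in for the quick task of updating the results table."""
    events.append("table updated")
    return len(rows)

async def handle_query(prompt: str, rows: list[tuple]):
    """Start both tasks together; the quick one finishes while the slow one waits."""
    events: list[str] = []
    answer, row_count = await asyncio.gather(
        ask_llm(prompt, events),
        refresh_results_table(rows, events),
    )
    return answer, row_count, events

def run_demo():
    """Run one query against two hypothetical result rows."""
    rows = [("row 1",), ("row 2",)]
    return asyncio.run(handle_query("car won't start", rows))
```

The `events` list records the order in which things happen, which makes it easy to confirm that the table update completes before the simulated LLM response arrives.
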
Incident Diagnostic & Root Cause Extraction Engine
| Match % (Cosine Similarity) | Problem Description | Root Cause | Fix / Resolution |
|---|---|---|---|
| n/a | My Old Honda Accord Won't Start! | It's an old car! | Jiggle the keys and then turn! |
| n/a | My Old Honda Accord Won't Start! | It's an old car! | Jiggle the keys and then turn! |
| n/a | F150 has door that won't open in freezing cold | Door Latch cable gets frozen stuck | Bring to Dealership and technician will install improved cable design free of charge! |
| n/a | My Old Honda Accord Won't Start! | It's an old car! | Jiggle the keys and then turn! |
| n/a | Engine stutters and loses power when accelerating | Clogged fuel filter or failing spark plugs | Replace the fuel filter and install new spark plugs |
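
One way the "Match % (Cosine Similarity)" column could be filled in is sketched below. It assumes each text is represented by simple word counts, which is a stand-in chosen for illustration; the source does not say how the texts are actually vectorized. The function names are hypothetical.

```python
import math
import re
from collections import Counter

def word_counts(text: str) -> Counter:
    """Represent a text as a bag of lowercase alphanumeric word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def match_percent(query: str, problem: str) -> float:
    """Express the similarity as a percentage rounded to one decimal place."""
    return round(100 * cosine_similarity(word_counts(query), word_counts(problem)), 1)
```

With this scoring, a query such as "engine loses power" would rank the "Engine stutters and loses power when accelerating" row above the Honda Accord row, because the two texts share more words.
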