How to implement RAG (Retrieval-Augmented Generation) with custom embeddings?
Asked about 2 months ago · Viewed 169 times
I want to build a RAG system for our internal documentation, but I'm confused about the embedding strategy.
Current setup:
- 500+ markdown documentation files
- Using OpenAI's text-embedding-3-small
- Storing in Pinecone vector database
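For reference, here is roughly what my ingestion side looks like (simplified sketch; the index name `internal-docs` and the `chunk-*` id scheme are placeholders, and it assumes `OPENAI_API_KEY` and `PINECONE_API_KEY` are set in the environment):

```python
import os

def batch(items, size):
    """Yield successive slices so a single embeddings call stays under request limits."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def ingest(chunks):
    """Embed text chunks with text-embedding-3-small and upsert them into Pinecone."""
    from openai import OpenAI
    from pinecone import Pinecone

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index("internal-docs")

    for n, group in enumerate(batch(chunks, 100)):
        resp = client.embeddings.create(model="text-embedding-3-small", input=group)
        index.upsert(vectors=[
            {"id": f"chunk-{n}-{i}", "values": d.embedding, "metadata": {"text": t}}
            for i, (d, t) in enumerate(zip(resp.data, group))
        ])
```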
Questions:
- Should I fine-tune embeddings on our domain-specific content?
- What chunk size works best for technical documentation?
- How do I handle code snippets vs prose differently?
- What's the best way to re-rank retrieved chunks before sending to LLM?
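To make the chunk-size and code-vs-prose questions concrete, this is the direction I've been experimenting with: split on blank lines but keep fenced code blocks whole, then pack blocks into chunks with a small character overlap. A rough sketch (the `max_chars`/`overlap` defaults are just my current guesses, not recommendations):

```python
import re

def chunk_markdown(text, max_chars=1200, overlap=200):
    """Pack markdown into ~max_chars chunks; fenced code blocks are never split."""
    # Split so each fenced code block is one unit; prose splits on blank lines.
    blocks = []
    for part in re.split(r"(```.*?```)", text, flags=re.DOTALL):
        if part.startswith("```"):
            blocks.append(part)
        else:
            blocks.extend(p for p in part.split("\n\n") if p.strip())

    chunks, current = [], ""
    for block in blocks:
        if current and len(current) + len(block) > max_chars:
            chunks.append(current)
            current = current[-overlap:]  # carry a tail of context into the next chunk
        current = (current + "\n\n" + block).strip()
    if current:
        chunks.append(current)
    return chunks
```

Is this kind of structure-aware splitting the right instinct, or is a fixed token window good enough in practice?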
I've seen some teams use hybrid search (keyword + semantic). Is that worth the added complexity?
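The fusion step itself seems cheap, if I understand it correctly. Something like Reciprocal Rank Fusion would merge the keyword and vector result lists without having to tune score weights (sketch only; `k=60` is the commonly cited default, not something I've validated):

```python
def rrf(keyword_ranked, vector_ranked, k=60):
    """Reciprocal Rank Fusion: merge two ranked id lists into one ranking.
    Each list contributes 1 / (k + rank) per document; ids high in either
    list (or present in both) float to the top."""
    scores = {}
    for ranking in (keyword_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

So the complexity I'm worried about is less the merge and more running and maintaining two retrieval paths.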