How to implement RAG (Retrieval-Augmented Generation) with custom embeddings?

Asked about 2 months ago · Viewed 169 times

I want to build a RAG system for our internal documentation, but I'm confused about the embedding strategy.

Current setup:

  • 500+ markdown documentation files
  • Using OpenAI's text-embedding-3-small
  • Storing in Pinecone vector database

Questions:

  1. Should I fine-tune embeddings on our domain-specific content?
  2. What chunk size works best for technical documentation?
  3. How do I handle code snippets vs prose differently?
  4. What's the best way to re-rank retrieved chunks before sending to LLM?
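For context on questions 2 and 3, here's the naive chunker I've been experimenting with: it groups paragraphs up to a target size but keeps fenced code blocks intact so a snippet never gets split mid-fence. The 800-character default is just my starting guess, not a recommendation, and it doesn't yet split oversized single paragraphs:

```python
import re

def chunk_markdown(text: str, max_chars: int = 800) -> list[str]:
    """Split markdown into chunks, keeping fenced code blocks whole."""
    # Capture group keeps the fenced blocks in the split output.
    blocks = re.split(r"(```.*?```)", text, flags=re.DOTALL)

    units: list[str] = []
    for block in blocks:
        if block.startswith("```"):
            units.append(block)  # never split inside a code fence
        else:
            # Prose splits on blank lines into paragraphs.
            units.extend(p for p in block.split("\n\n") if p.strip())

    chunks: list[str] = []
    current = ""
    for unit in units:
        # Flush the current chunk before it would exceed the budget.
        # Note: a single oversized unit still becomes its own chunk.
        if current and len(current) + len(unit) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += unit + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

I'd be interested to hear whether people chunk on markdown headings instead, and whether code blocks should go into a separate index entirely.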

I've seen some teams use hybrid search (keyword + semantic). Is that worth the added complexity?
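For what it's worth, the fusion step itself looks small if I understand it correctly. A sketch using Reciprocal Rank Fusion over the two ranked result lists (the k=60 constant is the value commonly cited from the original RRF paper; the document IDs are made up):

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: keyword (BM25) list and semantic (vector) list, best first.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([bm25_hits, vector_hits])  # doc_b wins: ranked high in both lists
```

So the real cost seems to be running and maintaining two retrieval paths, not combining them. Is that the right way to think about the trade-off?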



0 Answers