Here’s a comprehensive and categorized list of open source AI stack components that you can mix and match when building GenAI applications — especially when focusing on modularity, scalability, and performance. This includes components for data processing, model serving, retrieval-augmented generation (RAG), vector search, and orchestration.
## 🧠 Foundational Model Alternatives

Models you can self-host or fine-tune:

- **LLMs**
- **Multimodal**
- **Fine-Tuning**
  - QLoRA, LoRA, PEFT (via 🤗 Transformers + PEFT) – see the sketch after this list.
  - Axolotl – full-stack fine-tuning.
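As a minimal sketch of the LoRA/PEFT path with 🤗 Transformers + PEFT; the base model name and the LoRA hyperparameters below are illustrative placeholders, not recommendations:

```python
# Attach a LoRA adapter to a frozen base model with 🤗 Transformers + PEFT.
# Model name and hyperparameters are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-v0.1"  # any self-hostable causal LM
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=8,                                  # low-rank dimension of the adapter
    lora_alpha=16,                        # scaling applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
# Train with transformers.Trainer (or your own loop), then model.save_pretrained("adapter/")
```

QLoRA follows the same pattern with the base model loaded in 4-bit, and Axolotl wraps this whole flow behind a config file instead of hand-written code.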
## 📚 RAG (Retrieval-Augmented Generation) Stack

Tools to power knowledge-based Q&A systems:

- **Embeddings**
  - sentence-transformers – see the sketch after this list.
  - Instructor-XL – instruction-based embeddings.
- **Vector Databases**
- **Document Loaders & Chunking**
  - LangChain or LlamaIndex
  - Haystack – full RAG pipelines.
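A minimal embedding-and-retrieval sketch with sentence-transformers; the in-memory cosine search here stands in for a real vector database, and the model name is just a common small default:

```python
# Embed a few documents and answer a query by cosine similarity.
# In production the brute-force search below would be a vector database lookup.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedding model

docs = [
    "Vector databases index embeddings for fast similarity search.",
    "LoRA trains a small adapter on top of a frozen base model.",
    "Paged attention lets a server batch many concurrent requests efficiently.",
]
doc_emb = model.encode(docs, convert_to_tensor=True)

query = "How do I search embeddings quickly?"
query_emb = model.encode(query, convert_to_tensor=True)

hits = util.semantic_search(query_emb, doc_emb, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {docs[hit['corpus_id']]}")
```

The retrieved chunks then get assembled into the prompt by whichever orchestrator you pick (LangChain, LlamaIndex, or Haystack).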
## 🔧 Serving & Orchestration

Serving models behind APIs and managing prompts, memory, and tool chaining:

- **Model Servers**
  - vLLM – fast LLM serving with paged attention (sketch after this list).
  - TGI – Hugging Face's scalable inference server.
  - Triton Inference Server
  - LMDeploy – model optimization & serving.
- **Agent / Workflow Frameworks**
  - LangChain
  - LlamaIndex
  - Haystack
  - CrewAI – multi-agent framework.
  - AutoGen
- **Prompt Management**
  - PromptLayer
  - Langfuse
  - Helicone (for logging OpenAI usage)
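As a sketch of the model-server layer, here is vLLM's offline Python API; the same engine also backs its OpenAI-compatible HTTP server, and the model name and sampling settings are illustrative placeholders:

```python
# Batched offline generation with vLLM (paged attention under the hood).
# Model name and sampling parameters are illustrative placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain paged attention in one sentence."], params)
print(outputs[0].outputs[0].text)
```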
## 🖼️ Frontend / Chat UI

For chatbots or multimodal interfaces:

- Next.js – UI + SSR/ISR.
- shadcn/ui – design system for building clean UIs.
- Chatbot UI – open-source ChatGPT-style interface.
- Open WebUI – web UI for LM Studio / Ollama.
## 🚀 Inference & Runtime Optimization

- llm.rs – LLM inference in Rust.
- ggml – quantized models that run on CPU (see the sketch after this list).
- exllama – high-performance quantized inference.
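For the ggml/GGUF route, one common way to drive it from Python is the llama-cpp-python binding for llama.cpp; that binding is not part of the list above, so treat this as one possible setup, with the model path as a placeholder:

```python
# Run a quantized GGUF model on CPU via llama-cpp-python (bindings for llama.cpp/ggml).
# The model path is a placeholder for any GGUF file you have downloaded.
from llama_cpp import Llama

llm = Llama(model_path="./models/model-q4_k_m.gguf", n_ctx=2048)
result = llm("Q: What does quantization trade off?\nA:", max_tokens=64, stop=["\n"])
print(result["choices"][0]["text"])
```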
## 🔒 Security & DevOps (for production)

- AuthN/AuthZ: Auth.js (NextAuth), Clerk, Ory, ZITADEL
- Logging/Tracing: Langfuse, OpenTelemetry, Sentry (see the tracing sketch after this list)
- DevOps: Docker, Kubernetes, GitHub Actions, Terraform
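A minimal OpenTelemetry tracing sketch for a RAG request path; it exports spans to the console for demonstration, where a real deployment would export to a collector, and the span names are placeholders:

```python
# Trace the stages of a RAG request with OpenTelemetry (console exporter for demo purposes).
# Span names are illustrative; a real setup would export to an OTLP collector instead.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("genai-app")
with tracer.start_as_current_span("rag.request"):
    with tracer.start_as_current_span("rag.retrieve"):
        pass  # embed the query and hit the vector database
    with tracer.start_as_current_span("llm.generate"):
        pass  # call the model server (vLLM, TGI, ...)
```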
## 🧱 Full Stack Boilerplates

If you're looking to start fast:

- AI Engineer OS – full-stack open source GenAI stack.
- LangChainHub – reusable chains and prompts.
- OpenChatKit – chatbot framework.
- Flowise – visual LangChain builder.
## 🧪 Experimental Tools

- Ollama – run and manage LLMs locally.
- Modal – serverless infra for AI.
- LiteLLM – drop-in proxy for OpenAI-compatible APIs (see the sketch after this list).
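A minimal LiteLLM sketch that routes an OpenAI-style chat call to a local Ollama model; it assumes an Ollama server on its default port, and the model name is a placeholder:

```python
# One OpenAI-style completion() call, routed through LiteLLM to a local Ollama model.
# Assumes an Ollama server at its default address; the model name is a placeholder.
from litellm import completion

response = completion(
    model="ollama/llama3",              # provider/model string understood by LiteLLM
    messages=[{"role": "user", "content": "Say hello in five words."}],
    api_base="http://localhost:11434",  # default Ollama endpoint
)
print(response.choices[0].message.content)
```

Pointing the same call at a hosted provider or at vLLM's OpenAI-compatible server mostly comes down to changing the model string and `api_base`.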