large-language-model – Ali Darbehani

Building a High-Quality RAG System: Challenges and Solutions

In the fast-evolving field of AI, Retrieval-Augmented Generation (RAG) has become a standout technique by effectively bridging the gap between information retrieval and text generation. Essentially, a RAG system retrieves relevant documents from a large corpus in response to a user query, then uses a generative model to produce a coherent response grounded in the…

Alireza Darbehani

August 19, 2024

GenAI, Retrieval Augmented Generation

ai, GenAI, large-language-model, RAG, Retrieval Augmented Generation

Supercharging Your Inference of Large Language Models with vLLM (part-2)

As discussed in part 1 of this blog post, vLLM is a high-throughput distributed system for serving large language models (LLMs) efficiently. It addresses the challenge of memory management in LLM serving systems by introducing PagedAttention, an innovative attention algorithm inspired by virtual memory techniques in operating systems. This approach allows for near-zero waste in…

Alireza Darbehani

August 10, 2024

GenAI, Large Language Models, LLM Inference, MLOps

Distributed Inference, GenAI, large-language-model, llm, llm-serving, Paged Attention

Supercharging Your Inference of Large Language Models with vLLM (part-1)

As the demand for large language models (LLMs) continues to rise, optimizing inference performance becomes crucial. vLLM is an innovative library designed to enhance the efficiency and speed of LLM inference and serving. This blog post gives a high-level view of vLLM’s capabilities, its unique features, and how it compares to similar solutions in…

Alireza Darbehani

August 4, 2024

GenAI, Large Language Models, LLM Inference

Continuous Batching, GenAI, large-language-model, llm, LLM Inference, Paged Attention, vLLM

Challenges and Best Practices in Developing Multi-Agent AI Applications

The development of multi-agent AI applications, especially those leveraging large language models (LLMs), involves navigating numerous challenges. Ensuring these systems perform optimally requires a blend of strategic planning, robust design principles, and advanced monitoring techniques. Here, we cover the challenges, best practices, and recommendations for developing and deploying multi-agent AI applications. Challenges in…

Alireza Darbehani

July 22, 2024

GenAI, Multi Agent Applications

ai, artificial-intelligence, GenAI, large-language-model, llm, multi-agent-apps, RAG

Deploying Agentic Systems: Navigating the Complexities of Multi-Agent LLM Applications

Deploying agentic Large Language Model (LLM) systems is a multifaceted challenge, involving intricate multi-agent coordination, scalability, and real-time processing. As organizations increasingly depend on LLMs for tasks such as customer service automation and data analysis, ensuring seamless and efficient operation becomes paramount. The complexity lies in managing interactions between multiple agents, each…

Alireza Darbehani

July 17, 2024

GenAI

ai, artificial-intelligence, chatgpt, GenAI, large-language-model, llm, llm-evaluation, llm-observability, multi-agent-apps, technology

Tag: large-language-model
