LangChain - Dev Weekends

Introduction
What is LangChain?
Installation
Core Concepts
1. LCEL (LangChain Expression Language)
2. Prompts
3. Chains
4. Memory
5. Tools
6. RAG with LangChain
7. Streaming
8. Observability with LangSmith
Production Patterns
Error Handling
Caching
Batch Processing
Output Parsing
When to Use LangChain
Limitations
Performance Tips
Key Takeaways
What’s Next

December 2025 Update: Covers LangChain 0.3+ with LCEL (LangChain Expression Language), async support, and production patterns.

Introduction

Language models have become capable enough to tackle complex tasks and extract information from large documents. However, each model has a fixed context window, which limits how much text it can consider in a single call. LLM chains address this by splitting the work across multiple LLM calls and passing the result of one call to the next, which makes large volumes of data easier to handle.

An LLM chain combines components such as prompts, models, and output parsers into a single pipeline. This article walks through the main components and conventions in LangChain.

What is LangChain?

LangChain provides AI developers with tools to connect language models with external data sources.

LLMs are large deep-learning models pre-trained on vast amounts of text that generate responses to user prompts, for example by answering questions or summarizing documents. LangChain is a framework for building applications powered by LLMs. It provides:

Chains: Composable sequences of LLM calls
Prompts: Template management and optimization
Memory: Conversation and context management
Tools: Integration with external APIs and functions
Agents: Autonomous decision-making workflows

Why LangChain? While you can build AI apps with raw APIs, LangChain provides abstractions that make production systems easier to build, test, and maintain.

Installation

pip install langchain langchain-openai langchain-core langchain-community

Core Concepts

1. LCEL (LangChain Expression Language)

LCEL is LangChain’s declarative way to compose chains:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define components
llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_template("Translate to {language}: {text}")
output_parser = StrOutputParser()

# Compose chain
chain = prompt | llm | output_parser

# Invoke
result = chain.invoke({"language": "French", "text": "Hello, world!"})
print(result)  # e.g. "Bonjour, le monde !" (exact wording varies by model and run)

Key Benefits:

Declarative syntax with pipe operator (|)
Automatic async support
Built-in streaming
Easy debugging and observability
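
To make the pipe operator less magical, here is a minimal sketch of how `|` composition can be built on Python's `__or__` method. The `Step` class and the toy components are hypothetical stand-ins, not LangChain's `Runnable` implementation; the sketch only illustrates the idea that `a | b` produces a new object whose `invoke` feeds the output of `a` into `b`.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable: wraps a single function."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b returns a new Step that runs a, then passes its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy components that mimic prompt -> model -> parser (no real LLM call)
format_prompt = Step(lambda d: f"Translate to {d['language']}: {d['text']}")
fake_model = Step(lambda prompt: {"content": prompt.upper()})
parse_output = Step(lambda message: message["content"])

toy_chain = format_prompt | fake_model | parse_output
toy_result = toy_chain.invoke({"language": "French", "text": "Hello"})
print(toy_result)  # TRANSLATE TO FRENCH: HELLO
```

Because every step exposes the same `invoke` interface, any step can be swapped or tested on its own, which is the main practical benefit of the declarative style.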

2. Prompts

Prompt Templates:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# System + User prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Answer in {style}."),
    ("user", "{question}")
])

# With conversation history
conversation_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

Few-Shot Examples:

from langchain_core.prompts import FewShotChatMessagePromptTemplate

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
]

example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}"),
])

few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

final_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that gives antonyms."),
    few_shot_prompt,
    ("human", "{input}"),
])

Prompt Management:

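from langchain_core.prompts import PromptTemplate, load_prompt

# Save prompts to file
# Note: save() works for string PromptTemplate objects;
# ChatPromptTemplate.save() raises NotImplementedError.
translation_prompt = PromptTemplate.from_template("Translate to {language}: {text}")
translation_prompt.save("prompts/translation.yaml")

# Load from file
loaded_prompt = load_prompt("prompts/translation.yaml")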

3. Chains

Simple Chain:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_template("What is {topic}?")

chain = prompt | llm

response = chain.invoke({"topic": "quantum computing"})

Sequential Chain:

from langchain_core.output_parsers import StrOutputParser

# Chain 1: Generate question
question_chain = (
    ChatPromptTemplate.from_template("Generate a question about {topic}")
    | llm
    | StrOutputParser()
)

# Chain 2: Answer question
answer_chain = (
    ChatPromptTemplate.from_template("Answer this question: {question}")
    | llm
    | StrOutputParser()
)

# Combine
def qa_chain(topic: str):
    question = question_chain.invoke({"topic": topic})
    answer = answer_chain.invoke({"question": question})
    return {"question": question, "answer": answer}

RunnableParallel (Parallel Execution):

from langchain_core.runnables import RunnableParallel

parallel_chain = RunnableParallel({
    "summary": ChatPromptTemplate.from_template("Summarize: {text}") | llm,
    "sentiment": ChatPromptTemplate.from_template("Sentiment of: {text}") | llm,
    "keywords": ChatPromptTemplate.from_template("Extract keywords from: {text}") | llm,
})

results = parallel_chain.invoke({"text": "LangChain is a framework for LLM applications."})

Conditional Chains:

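from langchain_core.runnables import RunnableLambda

# Two example branches (the prompt wording is illustrative)
technical_chain = ChatPromptTemplate.from_template("Answer as an engineer: {question}") | llm
general_chain = ChatPromptTemplate.from_template("Answer in simple terms: {question}") | llm

def route_chain(input_data: dict):
    # When a RunnableLambda returns a runnable, LCEL invokes it with the same input
    if input_data["type"] == "technical":
        return technical_chain
    return general_chain

routed_chain = RunnableLambda(route_chain)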

4. Memory

Conversation Buffer:

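# Legacy API: ConversationChain and the langchain.memory classes are deprecated
# in the 0.3 line in favor of RunnableWithMessageHistory and LangGraph persistence.
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain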

memory = ConversationBufferMemory()
chain = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

chain.predict(input="Hi, I'm Alice")
chain.predict(input="What's my name?")  # Remembers "Alice"

Conversation Summary Memory:

from langchain.memory import ConversationSummaryMemory

summary_memory = ConversationSummaryMemory(llm=llm)
chain = ConversationChain(llm=llm, memory=summary_memory)

# Long conversation gets summarized automatically

Conversation Buffer Window:

from langchain.memory import ConversationBufferWindowMemory

window_memory = ConversationBufferWindowMemory(k=5)  # Last 5 exchanges
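
The trade-off of a window is that anything older than the last k exchanges is forgotten. The following is a minimal sketch of that behavior using only the standard library; `WindowBuffer` is a hypothetical class for illustration, not how `ConversationBufferWindowMemory` is implemented.

```python
from collections import deque


class WindowBuffer:
    """Toy sliding-window memory: keeps only the last k exchanges."""

    def __init__(self, k: int):
        # A deque with maxlen drops the oldest item automatically
        self.exchanges = deque(maxlen=k)

    def save(self, user: str, ai: str) -> None:
        self.exchanges.append((user, ai))

    def as_history(self) -> str:
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.exchanges)


buffer = WindowBuffer(k=2)
buffer.save("Hi, I'm Alice", "Hello Alice!")
buffer.save("What's the weather?", "I can't check that.")
buffer.save("What's my name?", "That is outside my window now.")

history = buffer.as_history()
print(history)  # the first exchange (with the name) is no longer included
```

If the application needs facts from early in a conversation, a summary or vector-store memory is usually a better fit than a small window.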

Vector Store Memory:

from langchain.memory import VectorStoreRetrieverMemory
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts([""], embeddings)

vector_memory = VectorStoreRetrieverMemory(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3})
)

Custom Memory:

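from typing import Any, Dict, List

from langchain_core.memory import BaseMemory  # legacy base class, deprecated in 0.3
from pydantic import Field

class CustomMemory(BaseMemory):
    # BaseMemory is a Pydantic model, so state is declared as a field
    # instead of being assigned in __init__.
    memories: List[Dict[str, Any]] = Field(default_factory=list)

    @property
    def memory_variables(self) -> List[str]:
        # Keys this memory adds to the chain inputs
        return ["history"]

    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, Any]) -> None:
        self.memories.append({"inputs": inputs, "outputs": outputs})

    def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        return {"history": self.memories}

    def clear(self) -> None:
        self.memories = []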

5. Tools

Creating Tools:

from langchain_core.tools import tool

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    # Implementation here
    return f"Results for {query}"

@tool
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression."""
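    # WARNING: eval() executes arbitrary Python. Do not expose it to untrusted
    # or model-generated input without sandboxing or a restricted parser.
    try:
        result = eval(expression)
        return str(result)
    except Exception:
        return "Invalid expression"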

tools = [search_web, calculator]
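
Since a calculator tool receives model-generated text, calling `eval` on it is risky. One possible alternative, sketched here with the standard-library `ast` module, is to walk the parsed expression and allow only numeric literals and basic arithmetic. `safe_eval` is a hypothetical helper written for this example, not part of LangChain.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected
_ALLOWED_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_ALLOWED_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> float:
        # Numeric literals only (type() check excludes booleans)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_BINARY:
            return _ALLOWED_BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _ALLOWED_UNARY:
            return _ALLOWED_UNARY[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here
        raise ValueError("Unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_eval("15 * 23"))  # 345
```

Function calls such as `__import__('os')` raise `ValueError` here, and malformed input raises `SyntaxError`, so a tool wrapper would catch both and return an error message to the model.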

Using Tools with LLM:

from langchain_core.messages import HumanMessage

llm_with_tools = llm.bind_tools(tools)

response = llm_with_tools.invoke([
    HumanMessage(content="What's 15 * 23? Then search for Python tutorials.")
])

# Check for tool calls
if response.tool_calls:
    for tool_call in response.tool_calls:
        print(f"Tool: {tool_call['name']}")
        print(f"Args: {tool_call['args']}")
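
The model only requests tool calls; the application still has to run them and pass the results back. A minimal sketch of that dispatch step is shown below. The registry, the two toy functions, and the hand-written calls are assumptions made for illustration; they mirror the `name`/`args` shape printed above but do not involve a real model.

```python
from typing import Any, Callable, Dict, List


def multiply(a: int, b: int) -> int:
    return a * b


def search_stub(query: str) -> str:
    return f"Results for {query}"


# Registry keyed by tool name (names are illustrative)
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "multiply": multiply,
    "search_stub": search_stub,
}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Execute each requested call and collect the outputs in order."""
    results = []
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call["name"])
        if func is None:
            output = f"Unknown tool: {call['name']}"
        else:
            output = func(**call["args"])
        results.append({"id": call.get("id"), "name": call["name"], "output": output})
    return results


# Hand-written calls in the same {"name", "args"} shape as response.tool_calls
example_calls = [
    {"id": "call_1", "name": "multiply", "args": {"a": 15, "b": 23}},
    {"id": "call_2", "name": "search_stub", "args": {"query": "Python tutorials"}},
]
tool_results = run_tool_calls(example_calls)
print(tool_results)
```

In a real application each output would be sent back to the model as a tool message tied to the call id, so the model can produce its final answer.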

Structured Tools:

from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    query: str = Field(description="Search query")
    max_results: int = Field(default=5, description="Maximum results")

def search_function(query: str, max_results: int = 5) -> str:
    # Implementation
    return f"Found {max_results} results for {query}"

structured_tool = StructuredTool.from_function(
    func=search_function,
    args_schema=SearchInput,
    name="web_search",
    description="Search the web for information"
)

6. RAG with LangChain

Complete RAG Chain:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.runnables import RunnablePassthrough

# Setup vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_texts(
    texts=["LangChain is a framework for building LLM applications..."],
    embedding=embeddings
)

retriever = vectorstore.as_retriever()

# RAG chain
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_prompt = ChatPromptTemplate.from_template("""
Answer the question based on the context:

Context: {context}

Question: {question}
""")

rag_chain = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough()
    }
    | rag_prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What is LangChain?")
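
To show what the retriever step contributes, here is a toy retrieval sketch that needs no API key. It uses word counts and cosine similarity in place of real embeddings, so `embed`, `cosine`, and `retrieve` are simplified, hypothetical helpers rather than anything Chroma or OpenAI embeddings actually do.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (real systems use dense vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


corpus = [
    "LangChain is a framework for building LLM applications",
    "FAISS is a library for vector similarity search",
    "Redis is an in-memory data store",
]
top_docs = retrieve("what is langchain", corpus)

# The retrieved text is placed into the prompt as context
toy_prompt = "Context: " + "\n\n".join(top_docs) + "\n\nQuestion: what is langchain"
print(toy_prompt)
```

The pattern is the same as in the chain above: retrieve, format the documents into a context string, then fill the prompt.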

RAG with Sources:

from langchain_core.runnables import RunnableLambda

def format_docs_with_sources(docs):
    formatted = []
    for i, doc in enumerate(docs):
        formatted.append(f"[Source {i+1}]: {doc.page_content}")
    return "\n\n".join(formatted)

rag_with_sources = (
    {
        "context": retriever | format_docs_with_sources,
        "question": RunnablePassthrough()
    }
    | rag_prompt
    | llm
    | StrOutputParser()
)

7. Streaming

Streaming Responses:

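from langchain_core.callbacks import StreamingStdOutCallbackHandler

llm = ChatOpenAI(
    model="gpt-4o",
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()]
)

# Rebuild the chain so that it uses the streaming model defined above
chain = ChatPromptTemplate.from_template("What is {topic}?") | llm

chain.invoke({"topic": "AI"})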

Custom Streaming Handler:

from langchain_core.callbacks import BaseCallbackHandler

class CustomStreamHandler(BaseCallbackHandler):
    def __init__(self, on_token):
        self.on_token = on_token
        self.tokens = []
    
    def on_llm_new_token(self, token: str, **kwargs):
        self.tokens.append(token)
        self.on_token(token)
    
    def get_full_response(self) -> str:
        return "".join(self.tokens)

# Usage
from langchain_core.messages import HumanMessage

tokens_received = []
handler = CustomStreamHandler(lambda t: tokens_received.append(t))

llm = ChatOpenAI(model="gpt-4o", streaming=True, callbacks=[handler])
response = llm.invoke([HumanMessage(content="Explain streaming")])

Async Streaming:

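import asyncio

# Assumes chain = prompt | llm, so each chunk is a message chunk with .content.
# If the chain ends in StrOutputParser, chunks are plain strings instead.
async def stream_chain():
    async for chunk in chain.astream({"topic": "AI"}):
        print(chunk.content, end="", flush=True)

asyncio.run(stream_chain())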
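
The consumption pattern behind `astream` can be tried without a model. In this sketch, `fake_token_stream` is a hypothetical async generator standing in for the model's stream; the `async for` loop is the part that carries over to real code.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_token_stream(text: str) -> AsyncIterator[str]:
    """Stand-in for a model stream: yields one word at a time."""
    for word in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word + " "


async def collect_stream(text: str) -> List[str]:
    chunks = []
    async for chunk in fake_token_stream(text):
        print(chunk, end="", flush=True)  # show each chunk as soon as it arrives
        chunks.append(chunk)
    print()
    return chunks


streamed_chunks = asyncio.run(collect_stream("Streaming sends partial output early"))
```

Collecting the chunks while printing them is a common way to show output immediately and still keep the full response for logging or storage.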

8. Observability with LangSmith

import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-project"

# All chains automatically traced
chain.invoke({"input": "test"})
# View at smith.langchain.com

Custom Tracing:

from langchain_core.tracers import LangChainTracer

tracer = LangChainTracer(project_name="my-project")
chain.invoke({"input": "test"}, config={"callbacks": [tracer]})

Production Patterns

Error Handling

from langchain_core.runnables import RunnableLambda

def safe_invoke(chain, input_data):
    try:
        return chain.invoke(input_data)
    except Exception as e:
        return {"error": str(e), "input": input_data}

safe_chain = RunnableLambda(lambda x: safe_invoke(chain, x))

Retry Logic:

# with_retry() is available on every runnable; no extra import is needed
retry_chain = chain.with_retry(
    retry_if_exception_type=(Exception,),
    stop_after_attempt=3,
)
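
Retrying on every `Exception` also retries programming errors that will never succeed, so it often helps to retry only transient failures and to wait between attempts. The following standard-library sketch illustrates that idea; `call_with_retry` and `flaky_call` are hypothetical names, and the delays are kept tiny so the example runs quickly.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.01,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call func, retrying only on the listed exception types with backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff plus a little random jitter
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))
    raise AssertionError("loop always returns or raises")


attempts = {"count": 0}


def flaky_call() -> str:
    # Fails twice, then succeeds, to simulate a transient outage
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


retry_result = call_with_retry(flaky_call)
print(retry_result, attempts["count"])  # ok 3
```

An error type that is not listed in `retry_on`, such as `KeyError` from a missing prompt variable, propagates immediately instead of consuming retries.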

Caching

from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache

set_llm_cache(InMemoryCache())

# First call - hits API
chain.invoke({"input": "test"})

# Second call - uses cache
chain.invoke({"input": "test"})
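
An exact-match cache only helps when the prompt and the model settings are identical. The sketch below shows one way such a key could be built; `PromptCache` is a hypothetical class for illustration and does not reflect how LangChain's caches are implemented internally.

```python
import hashlib
import json
from typing import Callable, Dict


class PromptCache:
    """Toy exact-match cache keyed on the prompt and model settings."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def make_key(prompt: str, settings: dict) -> str:
        # sort_keys makes the key independent of dict ordering
        payload = json.dumps({"prompt": prompt, "settings": settings}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, prompt: str, settings: dict, call_model: Callable[[str], str]) -> str:
        key = self.make_key(prompt, settings)
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        self.store[key] = call_model(prompt)
        return self.store[key]


def fake_model(prompt: str) -> str:
    # Stands in for a paid API call
    return f"answer to: {prompt}"


cache = PromptCache()
cache.get_or_call("test", {"model": "gpt-4o", "temperature": 0}, fake_model)  # miss
cache.get_or_call("test", {"model": "gpt-4o", "temperature": 0}, fake_model)  # hit
cache.get_or_call("test", {"model": "gpt-4o", "temperature": 1}, fake_model)  # miss: settings differ
print(cache.hits, cache.misses)  # 1 2
```

Small changes to the input, such as extra whitespace or a different temperature, produce a different key, which is why caching pays off mainly for repeated, identical queries.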

Redis Cache:

from langchain_community.cache import RedisCache
import redis

redis_client = redis.Redis()
set_llm_cache(RedisCache(redis_client))

Batch Processing

# Process multiple inputs in parallel
inputs = [{"topic": "AI"}, {"topic": "ML"}, {"topic": "NLP"}]
results = chain.batch(inputs)
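
Batching helps because LLM calls are network-bound, so several can be in flight at once. As a rough sketch of the idea, the example below fans calls out over a thread pool and keeps results in input order; `run_batch` and `fake_chain` are hypothetical names, and the placeholder function makes no network request.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_batch(
    func: Callable[[Dict], str],
    inputs: List[Dict],
    max_concurrency: int = 4,
) -> List[str]:
    """Run func over inputs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(func, inputs))


def fake_chain(payload: Dict) -> str:
    # Placeholder for a network-bound LLM call
    return f"Summary of {payload['topic']}"


batch_results = run_batch(fake_chain, [{"topic": "AI"}, {"topic": "ML"}, {"topic": "NLP"}])
print(batch_results)  # ['Summary of AI', 'Summary of ML', 'Summary of NLP']
```

Capping concurrency matters in practice because providers enforce rate limits. With LangChain runnables, the cap can be passed as `config={"max_concurrency": 5}` to `batch`.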

Async Batch:

import asyncio

async def batch_process():
    inputs = [{"topic": "AI"}, {"topic": "ML"}]
    results = await chain.abatch(inputs)
    return results

asyncio.run(batch_process())

Output Parsing

Structured Output:

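from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

class Answer(BaseModel):
    answer: str = Field(description="The answer")
    confidence: float = Field(description="Confidence score")

parser = PydanticOutputParser(pydantic_object=Answer)

# Tell the model which JSON shape the parser expects
prompt = ChatPromptTemplate.from_template(
    "Answer the question.\n{format_instructions}\n\nQuestion: {question}"
).partial(format_instructions=parser.get_format_instructions())

chain = prompt | llm | parser

result = chain.invoke({"question": "What is AI?"})
print(result.answer)
print(result.confidence)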

JSON Output:

from langchain_core.output_parsers import JsonOutputParser

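json_parser = JsonOutputParser()

# The prompt must instruct the model to reply with JSON, and its input
# variables must match the keys passed to invoke()
chain = prompt | llm | json_parser
result = chain.invoke({"input": "Extract key info"})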
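
Models often wrap JSON in a Markdown code fence, which a plain `json.loads` call rejects. The sketch below shows one way a parser might tolerate that; `parse_json_reply` is a hypothetical helper for illustration, not `JsonOutputParser`'s actual logic, and the reply string is hand-written rather than model output.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_reply(reply: str) -> Any:
    """Parse JSON from a model reply, tolerating a Markdown code fence."""
    match = FENCED_BLOCK.search(reply)
    candidate = match.group(1) if match else reply
    return json.loads(candidate.strip())


# Hand-written reply that imitates a fenced model response
fenced_reply = (
    "Here is the data:\n"
    + FENCE + "json\n"
    + '{"name": "LangChain", "topic": "LLM apps"}\n'
    + FENCE
)
parsed = parse_json_reply(fenced_reply)
print(parsed["name"])  # LangChain
```

Invalid JSON still raises `json.JSONDecodeError`, so production code would typically catch it and either retry the call or return a structured error.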

When to Use LangChain

Use LangChain when:

Building complex multi-step workflows
Need prompt management and versioning
Require memory/conversation management
Integrating multiple tools and APIs
Want built-in observability (LangSmith)
Building production systems that need maintainability
Need to compose different LLM providers

Consider raw APIs when:

Simple one-off LLM calls
Maximum performance is critical
Want minimal dependencies
Building lightweight prototypes
Need fine-grained control over every API call

Limitations

Additional abstraction layer adds overhead
Learning curve for LCEL syntax
Dependency on LangChain ecosystem
Can be overkill for simple use cases
Version changes can break code

Performance Tips

Use async methods (ainvoke, astream) for better concurrency
Enable caching for repeated queries
Batch process when possible
Use streaming for better UX
Monitor with LangSmith to identify bottlenecks
Cache embeddings and prompts when possible

Key Takeaways

LCEL is Powerful

Use the pipe operator (|) to compose chains declaratively.

Prompts as Templates

Manage prompts separately from code for easier iteration.

Memory Built-in

LangChain provides multiple memory types for conversations.

Production Ready

Built-in observability, caching, and error handling.

What’s Next

LangGraph

Learn how to build complex agent workflows with state machines
