[Week of 4/29] LangChain Release Notes

Regression Testing for LangSmith, 3 LangGraph tutorials for RAG agents, and Dosu achieves 30% accuracy improvement with LangSmith

🦜🛠️ New in LangSmith

  • 😍 Improved Regression Testing Experience: When making changes to your LLM application, it’s important to understand whether behavior has regressed compared to your prior test cases. Changes to the prompt, retrieval strategy, or model choice can have big implications for the responses your application produces. We’ve updated our Comparison View to make it easier to explore data across multiple experiments:
    • We’ve released more display options so you can customize the granularity of information you want to see. View the columns, datapoints, and charts you find important.
    • Automatically highlight test runs that have increased or decreased on your evaluation metric compared to the baseline, and filter to show only the ones that deviate.
    • An improved side panel view allows you to dive into the details of specific runs that interest you.

Check out our video walkthrough and blog to learn more!

  • Hotkeys in Annotation Queue: We’ve introduced hotkeys in Annotation Queues to help you navigate more quickly 🔥! Look for hotkey indicators next to buttons that now support keyboard shortcuts.
  • Mustache Support: You now have the option to switch the template language for your prompts in the Playground and Prompt Hub between f-string and Mustache. This gives you more flexibility in how you manage and format your variables.
  • Evaluations: Our new video series shows you how LangSmith Experiments can help you add testing coverage, spot regressions, and make informed tradeoffs among latency, cost, and quality. We’ve released three more concepts focused on RAG evaluation, showing you how to evaluate response quality: correctness (Video, Docs), hallucinations (Video, Docs), and retrieved document relevance (Video, Docs).
  • Azure Marketplace: We’re thrilled to announce that LangSmith is now available in the Azure Marketplace. When you purchase LangSmith through the Azure Marketplace with your Azure credits, you’ll keep data fully contained in your Azure VPC, get ease of deployment, and experience a smoother procurement process. Learn more on our blog and contact sales to start a conversation with one of our experts.
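The template-language toggle above changes how variables are interpolated in your prompts. As a rough illustration (not LangChain’s actual implementation), f-string-style templates use single braces while Mustache-style templates use double braces; the two renderers below sketch the difference in plain Python:

```python
import re

def render_fstring(template: str, **vars) -> str:
    # f-string-style templates interpolate single braces: "Hello {name}"
    return template.format(**vars)

def render_mustache(template: str, **vars) -> str:
    # Mustache-style templates interpolate double braces: "Hello {{name}}"
    # (Real Mustache also supports sections, partials, and more.)
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(vars.get(m.group(1), "")),
        template,
    )

print(render_fstring("Summarize {doc} in {n} words.", doc="the report", n=50))
print(render_mustache("Summarize {{doc}} in {{n}} words.", doc="the report", n=50))
```

One practical consequence of the choice: with Mustache-style braces, literal JSON in a prompt doesn’t need the brace-escaping that single-brace templates require.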

🕸️ LangGraph

Check out three new tutorials using LangGraph, our framework for building stateful, multi-actor applications with LLMs.

We’re hosting an online workshop with AI Makerspace next Wednesday, May 8th, where we’ll show you how to build and orchestrate multi-agent RAG applications with LangGraph. Sign up here!

  • Reliable Local RAG Agents with LangGraph and Llama3: In this video, we dive into how to build your own local RAG agent from scratch using LangGraph and Llama3-8b. We show you how to increase agent performance by combining techniques from three advanced RAG papers into a single control flow with LangGraph. Code.
  • Advanced RAG Agents with LangGraph and Mistral: We’ve released a video and set of cookbooks showing you how to build an advanced RAG agent using control flows in LangGraph and Mixtral 8x22B.
  • Build Computing Olympiad Agents with LangGraph: Princeton researchers recently published the USA Computing Olympiad contest benchmark dataset, revealing that a zero-shot GPT-4 agent solves just 8.7% of the challenges. While LLMs can't solve the hardest problems solo, with some clever flow engineering, you can build a hybrid human-AI system to arrive at winning solutions. We've implemented their paper and released a video showing you how to build a programming Olympian with LangGraph. Code.
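The agents in these tutorials share a common control-flow pattern: retrieve, grade what came back, then branch to either generation or a retry path. Setting LangGraph’s actual API aside, the branching idea can be sketched in plain Python (the node names, keyword-matching retriever, and loop cap below are illustrative, not taken from the tutorials):

```python
def retrieve(state):
    # Hypothetical retriever: keep documents sharing a keyword with the question.
    words = state["question"].lower().split()
    docs = [d for d in state["corpus"] if any(w in d.lower() for w in words)]
    return {**state, "docs": docs}

def grade_documents(state):
    # Conditional edge: did retrieval find anything relevant?
    return "generate" if state["docs"] else "rewrite_question"

def rewrite_question(state):
    # Fallback node: rephrase the query, then loop back to retrieval.
    return {**state, "question": state["question"].replace("?", "").strip()}

def generate(state):
    # Terminal node: produce an answer grounded in the retrieved documents.
    return {**state, "answer": f"Based on {len(state['docs'])} document(s): {state['docs'][0]}"}

def run(state, max_loops=3):
    # A tiny loop standing in for a compiled graph's execution.
    for _ in range(max_loops):
        state = retrieve(state)
        if grade_documents(state) == "generate":
            return generate(state)
        state = rewrite_question(state)
    return {**state, "answer": "Could not retrieve relevant documents."}
```

In LangGraph itself, `grade_documents` would become a conditional edge on a `StateGraph`, and the retry loop falls out of the graph topology rather than an explicit `for` loop.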

👀 In Case You Missed It

  • 😍 How Dosu Used LangSmith to Achieve a 30% Accuracy Improvement with No Prompt Engineering: Prompt engineering is the most common approach to help LLMs learn, but in-context (few-shot) learning can provide significantly better results. Discover how Dosu, the GitHub bot that auto-labels issues and PRs, uses LangSmith to power their in-context learning pipeline — check out our video, blog, and cookbook.
  • 🤖 Benchmarking Agents: We’ve updated our benchmark results that evaluate agents powered by different models. These benchmarks leverage LangSmith and LangChain’s standard tool calling interface so you can see which LLMs are best for certain tasks. Learn more about our methodology in our original blog post.
  • 🌊 Flow Engineering with CodiumAI: Flow Engineering has grown in popularity for its ability to outperform naive prompt engineering using an iterative process. LangGraph, our highly controllable framework for creating agents as graphs, is closely aligned with Flow Engineering principles. Hear from our CEO Harrison Chase and CodiumAI CEO Itamar Friedman in this webinar replay.
  • 🔁 Optimizing Vector Retrieval with Graph-based Metadata Techniques Using LangChain and Neo4j: Text embeddings help identify document relevance, but struggle with specific criteria like dates or categories. By pairing metadata filtering with vector similarity search, you can refine search results to better match specific attributes. Dive deeper into this approach in our blog showing you how to use LangChain and Neo4j for more precise document retrieval.
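The filter-then-rank pattern behind that last post is simple to state: apply a hard metadata predicate first, then rank only the survivors by embedding similarity. A minimal self-contained sketch (the toy vectors, `search` helper, and metadata schema are illustrative; the blog uses LangChain and Neo4j rather than in-memory lists):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query_vec, docs, metadata_filter, k=2):
    # 1) Hard-filter on metadata (e.g. a date range or category)...
    candidates = [d for d in docs if metadata_filter(d["meta"])]
    # 2) ...then rank only the survivors by vector similarity.
    return sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)[:k]

docs = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"year": 2023}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"year": 2024}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"year": 2024}},
]
hits = search([1.0, 0.0], docs, lambda m: m["year"] == 2024, k=1)
print([d["id"] for d in hits])  # the most similar doc from 2024 only
```

Note that document "a" is the closest vector overall but is excluded up front by the metadata predicate, which is exactly the behavior embeddings alone struggle to provide.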

🤝 From the Community