CMU & Inspired Cognition’s DocPrompting Improves Code Generation by Retrieving Relevant Documentation

by Synced

2023-02-28

The ability of large language models (LLMs) to generate computer code from natural language (NL) prompts has revolutionized the programming domain. Most contemporary models, however, can only generate code for libraries and function calls they have already seen, and they struggle with the new libraries and functions that are constantly being introduced. A human programmer facing such a challenge would typically consult user manuals and other relevant documents to become familiar with the new library or function. Could LLMs be taught to do the same?

In the new paper DocPrompting: Generating Code by Retrieving the Docs, a research team from Carnegie Mellon University and Inspired Cognition presents DocPrompting, a novel NL-to-code generation approach. Given an NL intent that calls for unseen functions or libraries, DocPrompting retrieves the corresponding code documentation so that the model can learn to perform the task.

DocPrompting is inspired by programmers’ use of manuals and documentation when encountering unseen/unused functions or libraries. The approach first learns to retrieve relevant documents from an external documentation pool, then learns to generate code using prompts based on the information it gleaned from the documents.
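The two stages described above can be illustrated with a minimal sketch. This is not the authors' implementation: the paper trains its retriever and generator, whereas the code below assumes a toy bag-of-words cosine retriever and stops at building the prompt that a generator would receive. The documentation pool, the function names, and the prompt layout are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical documentation pool; entries are illustrative, not real man pages.
DOC_POOL = {
    "tar": "tar: archive utility. -x extract files, -z filter through gzip, -f use archive file",
    "grep": "grep: search text for lines matching a pattern. -r search directories recursively",
    "wc": "wc: print newline, word, and byte counts. -l print the line count",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(intent, pool, k=1):
    """Stage 1 (toy stand-in): rank docs by similarity to the intent, keep the top k."""
    query = Counter(tokenize(intent))
    ranked = sorted(
        pool,
        key=lambda name: cosine(query, Counter(tokenize(pool[name]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(intent, pool, k=1):
    """Stage 2 input: prepend the retrieved docs to the intent (assumed layout)."""
    docs = [pool[name] for name in retrieve(intent, pool, k)]
    return "\n".join(docs) + "\n# Intent: " + intent + "\n# Code:"


print(retrieve("extract files from a gzip archive", DOC_POOL))
print(build_prompt("search directories recursively for a pattern", DOC_POOL))
```

In a real system the string returned by build_prompt would be passed to a code generation model; the sketch only shows how retrieved documentation ends up in front of the intent.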

The documentation pool can be regularly updated with new content, enabling DocPrompting to generate code that uses unseen and unused functions and libraries without costly retraining of any model component. DocPrompting is also a general method: it can be applied to any programming language, is not bound to a particular underlying neural model, and can be instantiated with any base retriever and generator.
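To make the "update the pool, swap the components" idea concrete, here is a small sketch under stated assumptions. The class DocPipeline, the keyword retriever, the stub generator, and the library name newlib.read_table are all hypothetical and exist only for illustration; a real instantiation would plug in a trained retriever and a code generation model behind the same two callables.

```python
from typing import Callable, Dict, List

# Any function with these shapes can be plugged in (assumed interfaces).
Retriever = Callable[[str, Dict[str, str]], List[str]]
Generator = Callable[[str], str]


def keyword_retriever(intent: str, pool: Dict[str, str]) -> List[str]:
    """Toy retriever: docs sharing at least one word with the intent, best overlap first."""
    words = set(intent.lower().split())
    overlap = {name: len(words & set(text.lower().split())) for name, text in pool.items()}
    hits = [name for name in pool if overlap[name] > 0]
    return sorted(hits, key=lambda name: overlap[name], reverse=True)


def stub_generator(prompt: str) -> str:
    """Stand-in for a code generation model; it only reports the prompt size."""
    return f"<code conditioned on {len(prompt.splitlines())} prompt lines>"


class DocPipeline:
    """Hypothetical wrapper: an updatable doc pool plus a pluggable retriever and generator."""

    def __init__(self, retriever: Retriever, generator: Generator):
        self.pool: Dict[str, str] = {}
        self.retriever = retriever
        self.generator = generator

    def add_doc(self, name: str, text: str) -> None:
        # Updating the pool changes what can be retrieved; no model is retrained.
        self.pool[name] = text

    def retrieved(self, intent: str) -> List[str]:
        return self.retriever(intent, self.pool)

    def generate(self, intent: str) -> str:
        docs = [self.pool[name] for name in self.retrieved(intent)]
        return self.generator("\n".join(docs + [intent]))


pipeline = DocPipeline(keyword_retriever, stub_generator)
pipeline.add_doc("json.loads", "json.loads parses JSON text into Python objects")

intent = "read parquet table"
before = pipeline.retrieved(intent)  # nothing relevant in the pool yet

# A new (hypothetical) library appears: only its documentation is added.
pipeline.add_doc("newlib.read_table", "newlib.read_table: read parquet table from disk")
after = pipeline.retrieved(intent)

print(before, after, pipeline.generate(intent))
```

The retriever and generator are passed in as plain callables, so either one can be replaced without touching the other or the pool.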

In their empirical study, the team evaluated DocPrompting on two NL-to-code tasks and benchmarks: shell scripting and Python programming. In the shell scripting task, DocPrompting consistently improved on the base model, while in Python programming, CodeT5+DocPrompting performed exceptionally well on unseen functions and achieved a 1.65 BLEU score improvement over the state-of-the-art result.

This work opens a promising new direction for the evolution of code generation. The team says that, to the best of their knowledge, DocPrompting is the first approach to explicitly and effectively leverage documentation for NL-to-code tasks.

The code is available on the project’s GitHub. The paper DocPrompting: Generating Code by Retrieving the Docs is on arXiv.

Author: Hecate He | Editor: Michael Sarazen
