ServiceNow Research & Hugging Face Release The Stack: 3 TB of Permissively Licensed Source Code for LLMs
In the new paper The Stack: 3 TB of Permissively Licensed Source Code, a team from ServiceNow Research and Hugging Face advances open and responsible research on code LLMs by releasing The Stack, a 3.1 TB dataset of permissively licensed source code in 30 programming languages.