Meta AI’s Nougat Enables Conversion of Mathematic Expressions from PDF Files to Machine Readable Texts

Most scientific knowledge is stored in Portable Document Format (PDF) files, which are also the second most prominent data format on the internet. However, extracting information from this format or transforming it into machine-readable text is challenging, especially when mathematical expressions are involved.

To address this issue, previous studies have applied Optical Character Recognition (OCR), an effective technology for detecting and classifying individual characters and words in an image, to scientific documents by treating their pages as images. However, these approaches fail to capture the relationship between sentences because they process text line by line.

In a new paper, Nougat: Neural Optical Understanding for Academic Documents, a Meta AI research team presents Neural Optical Understanding for Academic Documents (Nougat), a Visual Transformer model that can effectively convert scientific documents stored in PDF format to a lightweight markup language, even when dense mathematical equations are involved.

The team summarizes their primary contributions as follows:

Release of a pre-trained model capable of converting a PDF to a lightweight markup language. We release the code and the model on GitHub.
We introduce a pipeline to create a dataset for pairing PDFs to source code.
Our method is only dependent on the image of a page, allowing access to scanned papers and books.

Nougat is built on the Donut architecture. A Swin Transformer encoder takes a document image as input and outputs a sequence of latent embeddings. A transformer decoder with cross-attention then decodes the encoded image into a sequence of tokens in an autoregressive manner. Finally, the output is projected to the size of the vocabulary.
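The decoding steps described above can be illustrated with a toy sketch in plain Python. This is not Nougat's implementation: the function names, the single-head attention, and the tiny hand-written weights are all assumptions made for illustration, and the toy decoder conditions only on the most recent token, whereas a real transformer decoder attends over all previously generated tokens.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, memory):
    """Single-head dot-product attention of one query over the encoder outputs."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, vec)) / scale for vec in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    return [sum(w * vec[i] for w, vec in zip(weights, memory)) for i in range(dim)]

def project_to_vocab(hidden, vocab_matrix):
    """Linear projection: one logit per vocabulary entry."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in vocab_matrix]

def greedy_decode(memory, token_embeddings, vocab_matrix, bos_id, eos_id, max_len=16):
    """Generate tokens one at a time, feeding each prediction back in."""
    tokens = [bos_id]
    for _ in range(max_len):
        # Toy simplification (assumption): the query is the last token's embedding.
        query = token_embeddings[tokens[-1]]
        context = cross_attention(query, memory)
        logits = project_to_vocab(context, vocab_matrix)
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

# Hypothetical toy inputs: two "image patch" embeddings and a 3-token vocabulary.
memory = [[1.0, 0.0], [0.0, 1.0]]
token_embeddings = {0: [10.0, 0.0], 1: [0.0, 10.0], 2: [0.0, 0.0]}
vocab_matrix = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
decoded = greedy_decode(memory, token_embeddings, vocab_matrix, bos_id=0, eos_id=2)
```

Greedy selection is used here only to keep the loop short; the article does not say which decoding strategy Nougat uses.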

Notably, the researchers apply recent advances in visual document understanding to a novel OCR task. In contrast to previous approaches, Nougat does not rely on OCR or embedded text representations; only the rasterized document pages are needed.

In their empirical study, the team compared Nougat with the baseline model GROBID. Nougat achieves the best performance on all metrics, including edit distance, BLEU, METEOR and F-measure.
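As a rough illustration of the first of these metrics, the sketch below computes a character-level Levenshtein edit distance between a predicted string and a reference string. The function names are hypothetical, and dividing by the length of the longer string is an assumption made here; the article does not specify how the reported edit distance is normalized.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance scaled to [0, 1]; the normalization choice is an assumption."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0
    return levenshtein(prediction, reference) / longest
```

Lower values are better for this metric, while higher values are better for BLEU, METEOR and F-measure.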

Overall, this work demonstrates that Nougat not only has great potential for extracting text from born-digital PDFs but can also handle scanned papers and textbooks. The team hopes their work can serve as a starting point for future research in related fields.

The code is available on the project’s GitHub. The paper Nougat: Neural Optical Understanding for Academic Documents is on arXiv.

Author: Hecate He | Editor: Chain Zhang

