Why Your Base64 PDF Is Corrupted (and How to Fix It)

Nothing is more annoying than successfully decoding Base64, downloading the file, and then seeing “file is damaged” from your PDF viewer. The good news: this almost always comes down to a small set of repeatable mistakes.

TL;DR: Most corrupted Base64 PDFs are caused by truncated input, unstripped data URI prefixes, whitespace/newlines, decoding the wrong field, or writing the decoded bytes in text mode.

The 5 most common causes

1. The input is incomplete

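If the Base64 string was cut off during copy/paste, the decoded file will be missing its tail. A PDF keeps its cross-reference table and trailer at the end of the file, so a viewer usually cannot open a truncated one even when the first pages are fully present.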
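A rough way to spot truncation is sketched below. The function name looks_truncated is hypothetical, and the check is a heuristic: it assumes standard padded Base64 with any data: prefix already removed, and it relies on the %%EOF marker that a complete PDF normally carries near its end.

```python
import base64

def looks_truncated(b64_text: str) -> bool:
    """Heuristic: True if the Base64 text or the decoded PDF appears cut off."""
    compact = "".join(b64_text.split())  # drop whitespace before measuring
    if len(compact) % 4 != 0:
        # Padded Base64 always has a length that is a multiple of 4.
        return True
    data = base64.b64decode(compact)
    # A complete PDF normally has an %%EOF marker near the end of the file.
    return b"%%EOF" not in data[-1024:]
```

A string can pass the length check and still be truncated (the cut may land on a 4-character boundary), which is why the decoded tail is inspected as well.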

2. You decoded the full data URI without cleaning it

This string:

data:application/pdf;base64,JVBERi0xLjQK...

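must usually be cleaned before decoding. Only the part after the comma is Base64. Everything up to and including "base64," is a label, and most decoders will either reject it or turn it into junk bytes at the start of the file.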

3. The string contains whitespace

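Line breaks from email or formatted JSON make strict decoders reject the string, even though the underlying data is intact. Lenient decoders skip them silently, which is why the same string can work in one tool and fail in another.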
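The difference between lenient and strict decoding can be seen with Python's standard base64 module. This is a small illustration, not a recommendation for either mode; the sample string is the article's JVBERi0xLjQK prefix with a line break inserted.

```python
import base64
import binascii

wrapped = "JVBERi0x\nLjQK"  # the same Base64 payload, split by a line break

# Lenient mode (the default) silently discards characters outside the alphabet.
lenient = base64.b64decode(wrapped)  # b'%PDF-1.4\n'

# Strict mode rejects the newline instead of ignoring it.
try:
    base64.b64decode(wrapped, validate=True)
    strict_ok = True
except binascii.Error:
    strict_ok = False  # this branch runs for the wrapped input

# Removing whitespace first makes both modes agree.
cleaned = "".join(wrapped.split())
strict = base64.b64decode(cleaned, validate=True)
```

Stripping whitespace before decoding costs little and removes the dependency on how a particular decoder behaves.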

4. The field is not actually PDF data

A payload may contain metadata, thumbnails, or another file type in nearby fields.
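When it is unclear which field holds the document, one option is to decode every string field and look for the PDF header. The sketch below assumes a flat dictionary; the function name find_pdf_fields and the field names in the usage note are hypothetical.

```python
import base64

def find_pdf_fields(payload: dict) -> list:
    """Return the keys whose string values decode to bytes starting with %PDF-."""
    matches = []
    for key, value in payload.items():
        if not isinstance(value, str):
            continue
        # Drop whitespace, then anything before the first comma (a data: prefix).
        text = "".join(value.split()).split(",", 1)[-1]
        try:
            data = base64.b64decode(text, validate=True)
        except ValueError:
            # binascii.Error is a ValueError: the field is not valid Base64.
            continue
        if data.startswith(b"%PDF-"):
            matches.append(key)
    return matches
```

For a payload with hypothetical keys such as "filename", "thumbnail" and "document", only the key whose value decodes to a PDF header is returned. A thumbnail would typically decode to PNG or JPEG bytes instead.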

5. The file was saved incorrectly

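In Python or Node, the decoded output must be written as raw bytes. In Python, open the file with mode 'wb'. In Node, pass a Buffer to fs.writeFileSync rather than a string. Converting the bytes to text at any point before saving is what corrupts the file.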
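A minimal Python sketch of a binary-safe save follows. The function name save_pdf is hypothetical, and the input is assumed to be already cleaned.

```python
import base64

def save_pdf(b64_text: str, path: str) -> int:
    """Decode Base64 and write the raw bytes; returns the number of bytes written."""
    data = base64.b64decode(b64_text)
    # "wb" writes the bytes untouched. Text mode ("w") expects str and raises
    # TypeError for bytes; converting the bytes to str to get around that
    # error is a common way to end up with a damaged file.
    with open(path, "wb") as handle:
        return handle.write(data)
```

Comparing the returned byte count with the size you expect from the source is a cheap extra check.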

Quick debugging checklist

Remove whitespace/newlines
Strip any data: prefix
Decode with a known-good tool like GoGood.dev Base64 Converter
Save the decoded bytes in binary mode with a .pdf extension
Confirm the source field is really the PDF content
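The code-side steps of this checklist can be folded into one function. This is a sketch under assumptions: decode_pdf_base64 is a hypothetical name, the PDF header is expected at the very start of the decoded bytes, and the external-tool and save steps are left out.

```python
import base64
import binascii

def decode_pdf_base64(raw: str) -> bytes:
    """Clean the input, strip a data: prefix, decode strictly, check the header."""
    text = "".join(raw.split())          # remove whitespace and newlines
    if text.startswith("data:"):
        text = text.split(",", 1)[-1]    # strip the data: prefix
    try:
        data = base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"input is not valid Base64: {exc}") from exc
    if not data.startswith(b"%PDF-"):
        # Valid Base64, but probably the wrong field or another file type.
        raise ValueError("decoded bytes do not start with a PDF header")
    return data
```

The two error messages map onto the two main failure classes: bad or damaged input versus the wrong field being decoded.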

Useful sanity check

Many PDF Base64 strings start with:

JVBERi0x

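JVBERi0x is the Base64 encoding of %PDF-1, the start of the PDF file header. Seeing it does not guarantee that the rest of the data is intact, but it is a useful sign that you are decoding the right field.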
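The prefix can be reproduced by encoding the PDF header bytes directly, shown here with Python's standard library. The second line is included on the assumption that you may also meet PDF 2.0 files, whose header starts with %PDF-2.

```python
import base64

# Encode the first six bytes of a PDF header and compare with the prefix above.
pdf1_prefix = base64.b64encode(b"%PDF-1").decode("ascii")
pdf2_prefix = base64.b64encode(b"%PDF-2").decode("ascii")

print(pdf1_prefix)  # JVBERi0x
print(pdf2_prefix)  # JVBERi0y
```

Six bytes encode to exactly eight Base64 characters with no padding, which is why the prefix is stable no matter what follows it in the file.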

Example cleanup in JavaScript

// Remove all whitespace, then drop a leading data URI prefix (with or without extra parameters)
const cleanBase64 = input
  .replace(/\s+/g, '')
  .replace(/^data:[^,]*;base64,/i, '');

Example cleanup in Python

# Remove all whitespace (spaces and tabs too, not just newlines), then drop any data: prefix
clean_base64 = ''.join(value.split())
clean_base64 = clean_base64.split(',', 1)[-1]

Best workflow when you are stuck

First, test the exact string in GoGood.dev Base64 Converter. If the file still fails, the source data is bad. If the tool works but your code does not, your decode/save path is wrong.

FAQ

Why does my Base64 PDF download but not open?

Because the decoded bytes are incomplete or malformed, or the wrong content was decoded.

Can whitespace really break Base64 decoding?

Yes. Some decoders handle it, some do not, and copy/paste often adds noise.

Is the issue usually in the decoder or in the source data?

In practice, both happen, but bad input is more common than people expect.

Corrupted Base64 PDFs are frustrating mostly because the error message is vague. Start with the basics: clean the input, strip the prefix, validate the field, and test with GoGood.dev Base64 Converter.