Supported File Formats in Read PDF Aloud

Despite the name, Read PDF Aloud is not limited to PDF files. It supports a wide range of document formats, making it a versatile text-to-speech tool for almost any reading material you have.

Here are all the formats you can upload:

PDF (.pdf) — standard PDF documents and scanned PDFs
EPUB (.epub) — open ebook format
MOBI (.mobi) — Amazon Kindle legacy ebook format
AZW3 / KF8 (.azw3, .kf8) — Amazon Kindle Format 8
Word (.docx) — Microsoft Word documents
PowerPoint (.pptx) — Microsoft PowerPoint presentations
OpenDocument Text (.odt) — LibreOffice / OpenOffice text documents
OpenDocument Presentation (.odp) — LibreOffice / OpenOffice slides
Rich Text Format (.rtf) — cross-platform rich text documents
Plain Text (.txt) — plain text files
Markdown (.md) — Markdown documents

Additionally, Read PDF Aloud offers a Text Input mode where you can paste any text directly and have it read aloud — no file upload needed.

PDF

PDF documents are parsed to extract text page by page, preserving the original layout as closely as possible. Each page is kept as an individual unit, so navigating by page during playback follows the original document structure.

Scanned PDF OCR

For scanned PDFs (image-based documents without embedded text), Read PDF Aloud offers OCR (Optical Character Recognition) powered by AI. When a scanned PDF is detected, you will be prompted to enable OCR. This is a premium feature with the following limits:

Maximum file size: 50 MB
Maximum page count: 1,000 pages

OCR results are returned as structured text organized by page.
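
The limits above can be expressed as a small pre-flight check. This is only a sketch: the function name, the result shape, and the reading of 50 MB as 50 × 1,048,576 bytes are assumptions made for illustration, not details confirmed by Read PDF Aloud.

```typescript
// Limits quoted in the text above.
// Assumption: 1 MB is treated as 1,048,576 bytes.
const MAX_OCR_FILE_BYTES = 50 * 1024 * 1024;
const MAX_OCR_PAGES = 1000;

interface OcrLimitCheck {
  allowed: boolean;
  reason?: string;
}

// Hypothetical helper: decides whether a scanned PDF fits within the OCR limits.
function checkOcrLimits(fileSizeBytes: number, pageCount: number): OcrLimitCheck {
  if (fileSizeBytes > MAX_OCR_FILE_BYTES) {
    return { allowed: false, reason: "file is larger than 50 MB" };
  }
  if (pageCount > MAX_OCR_PAGES) {
    return { allowed: false, reason: "document has more than 1,000 pages" };
  }
  return { allowed: true };
}
```

Running a check like this before OCR starts lets an oversized document be rejected with a clear reason instead of failing partway through.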

How scanned PDFs are detected

Read PDF Aloud calculates the text density of a PDF, defined as the average number of characters per page. If the density falls below 50 characters per page, the document is flagged as a scanned PDF. This avoids wasting resources on documents that have no meaningful text to extract.
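
A minimal sketch of this density check follows. The function names are hypothetical, and whether surrounding whitespace counts toward the character total is not stated in the description, so trimming each page first is an assumption.

```typescript
// Threshold from the text above: fewer than 50 characters per page on average.
const SCANNED_THRESHOLD_CHARS_PER_PAGE = 50;

// Average number of extracted characters per page.
// Assumption: leading and trailing whitespace on each page is not counted.
function textDensity(pageTexts: string[]): number {
  if (pageTexts.length === 0) {
    return 0;
  }
  const totalChars = pageTexts.reduce((sum, text) => sum + text.trim().length, 0);
  return totalChars / pageTexts.length;
}

// A document whose density falls below the threshold is treated as scanned.
function looksScanned(pageTexts: string[]): boolean {
  return textDensity(pageTexts) < SCANNED_THRESHOLD_CHARS_PER_PAGE;
}
```

Because the check uses an average, a document with a few text-heavy pages and many image-only pages can still land above the threshold.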

Ebook Formats

Ebook files (EPUB, MOBI, AZW3/KF8) are processed chapter by chapter. The original HTML content of each chapter is extracted and converted to plain text, so you can listen to your ebooks with proper structure and pacing.
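
To illustrate the HTML-to-text step, here is a deliberately naive sketch based on regular expressions. It is not the converter Read PDF Aloud uses; the function name is hypothetical, only a handful of entities are decoded, and a browser implementation would more likely rely on a real HTML parser.

```typescript
// Naive chapter converter: strips markup and keeps one line per block element.
function chapterHtmlToText(html: string): string {
  return html
    // Drop script and style elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, "")
    // Turn line breaks and the ends of common block elements into newlines.
    .replace(/<br\s*\/?>/gi, "\n")
    .replace(/<\/(p|div|h[1-6]|li|blockquote)>/gi, "\n")
    // Remove every remaining tag.
    .replace(/<[^>]+>/g, "")
    // Decode a few common entities; "&amp;" goes last to avoid double decoding.
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, "&")
    // Collapse runs of whitespace and drop empty lines.
    .split("\n")
    .map((line) => line.replace(/\s+/g, " ").trim())
    .filter((line) => line.length > 0)
    .join("\n");
}
```

Keeping one line per block element preserves paragraph boundaries, which a speech engine can use as natural pauses.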

Note that AZW3 and KF8 refer to the same format (Amazon Kindle Format 8), so they are handled identically.

Office Documents

Office documents (DOCX, PPTX, ODT, ODP, RTF) are parsed into plain text. Unlike PDF, these formats have no built-in concept of pages, so the extracted text is grouped into logical segments of 20 lines each for smooth playback.
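
The grouping step can be sketched as a simple chunking function. The name groupIntoSegments and the configurable size parameter are assumptions; only the segment size of 20 lines comes from the description above.

```typescript
// Segment size from the text above.
const LINES_PER_SEGMENT = 20;

// Splits extracted lines into consecutive segments of at most `size` lines.
// The final segment may be shorter than `size`.
function groupIntoSegments(lines: string[], size: number = LINES_PER_SEGMENT): string[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const segments: string[][] = [];
  for (let start = 0; start < lines.length; start += size) {
    segments.push(lines.slice(start, start + size));
  }
  return segments;
}
```

Each segment can then stand in for a page during playback, so the same navigation controls apply to formats that have no pages of their own.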

The Office parser is loaded on demand — it won’t affect your browsing experience until you actually upload an Office file.
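
One common way to get this behavior is to wrap the loader in a memoizing helper so the work happens on first use only. The sketch below is an assumption about the general pattern, not a description of the actual code; in a browser, the loader passed in would typically be a dynamic import of the parser module.

```typescript
// Wraps an async loader so it runs at most once, on the first call.
// Later calls reuse the same promise instead of loading again.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}
```

With this pattern nothing is fetched or parsed until the returned function is first called, and uploading a second Office file reuses the parser that is already loaded.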

Plain Text and Markdown

Plain text files and Markdown documents are the simplest to process. They are read line by line; empty lines are dropped and surrounding whitespace is trimmed, and the remaining lines are grouped into logical segments for playback.
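
As a rough sketch of the line cleanup, assuming that "whitespace removed" means trimming the start and end of each line (the description does not say whether whitespace inside a line is touched), a hypothetical helper might look like this:

```typescript
// Splits raw text into lines, trims each one, and drops lines that end up empty.
// Handles both Unix (\n) and Windows (\r\n) line endings.
function cleanLines(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}
```

Markdown syntax such as heading markers is left in place by this sketch, since the description does not mention stripping it.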

Processing Pipeline

Regardless of the input format, all files go through the same pipeline:

Format detection — based on file extension
Text extraction — using format-specific parsing logic
Hash generation — a SHA-256 hash is computed for deduplication and caching
Local storage — extracted text and metadata are saved locally in your browser
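
The first step, format detection by file extension, can be sketched as a lookup table built from the formats listed earlier. The grouping into four kinds and the name detectFormat are assumptions made for illustration.

```typescript
type FormatKind = "pdf" | "ebook" | "office" | "text";

// Extensions taken from the supported-format list; the grouping is an assumption.
const EXTENSION_KINDS = new Map<string, FormatKind>([
  ["pdf", "pdf"],
  ["epub", "ebook"],
  ["mobi", "ebook"],
  ["azw3", "ebook"],
  ["kf8", "ebook"],
  ["docx", "office"],
  ["pptx", "office"],
  ["odt", "office"],
  ["odp", "office"],
  ["rtf", "office"],
  ["txt", "text"],
  ["md", "text"],
]);

// Returns the format kind for a file name, or null if the extension is
// missing or unsupported. Matching is case-insensitive.
function detectFormat(fileName: string): FormatKind | null {
  const dot = fileName.lastIndexOf(".");
  // No dot, a leading dot only (such as ".pdf"), or a trailing dot: no usable extension.
  if (dot <= 0 || dot === fileName.length - 1) {
    return null;
  }
  const extension = fileName.slice(dot + 1).toLowerCase();
  return EXTENSION_KINDS.get(extension) ?? null;
}
```

Detection by extension is cheap, but it trusts the file name, so a mislabeled file would only be caught later, during text extraction.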

All text processing happens locally on your device, ensuring your documents remain private and secure.