How images live inside a PDF
How a PDF stores pictures as image XObjects, why vector art is different, and what a browser can and cannot pull back out of the file.
Pictures are stored as image XObjects
Inside a PDF, a photo is not a loose JPEG sitting in the file. It is packaged as an image XObject, a numbered object that holds the compressed pixel data plus its width, height and colour information. When a page needs to show that picture, its content stream issues a paint instruction that references the object by its resource name. This tool reads those paint instructions, follows each name to its resource, and asks pdf.js to decode the stored pixels so it can hand you a clean PNG.
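The lookup described above can be sketched in a few lines. This is an illustration, not this tool's actual code: it assumes the operator list has the shape pdf.js returns from page.getOperatorList() (parallel fnArray and argsArray arrays), and the function name collectImageNames is hypothetical.

```typescript
// Minimal shape of a pdf.js operator list: fnArray holds operator codes,
// argsArray holds the matching argument lists (null when an operator has none).
interface OperatorListLike {
  fnArray: number[];
  argsArray: (unknown[] | null)[];
}

// Collect the resource names referenced by image paint instructions, in paint order.
// paintImageOp is the operator code that paints an image XObject; with pdf.js this
// is assumed to be pdfjsLib.OPS.paintImageXObject.
function collectImageNames(opList: OperatorListLike, paintImageOp: number): string[] {
  const names: string[] = [];
  opList.fnArray.forEach((fn, i) => {
    if (fn !== paintImageOp) return; // skip text, path and state operators
    const first = opList.argsArray[i]?.[0]; // first argument is the resource name
    if (typeof first === "string") names.push(first);
  });
  return names;
}
```

With a real page, the names returned here would then be passed to page.objs.get(name) to obtain the decoded pixels. Passing the operator code in as a parameter keeps the sketch independent of any particular pdf.js build.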
Raster pixels versus vector drawings
A raster image is a grid of coloured pixels, the kind of thing a camera or scanner produces. A vector drawing, by contrast, is a set of mathematically described drawing instructions such as lines, curves and fills that the viewer redraws at any size. Charts, logos built from shapes, and ruled table borders are usually vector, which is why they look crisp when you zoom in. Because there are no stored pixels to recover, vector content never shows up in the extraction grid even though it is clearly visible on the page.
Why one image can appear more than once
PDFs are efficient about repetition. A header logo or a background texture is stored once, then painted on many pages by reference, which keeps the file small. This tool reports each image on every page where it is painted, so a logo on all ten pages shows up ten times in the grid. If you only want one copy, save the first occurrence and skip the duplicates, since they are pixel-for-pixel the same picture.
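If you export the grid programmatically, skipping duplicates amounts to keeping the first occurrence of each image. The sketch below assumes each occurrence carries a key that identifies the stored image; a resource name is used here, though a hash of the decoded pixels would be a sturdier key. The ImageOccurrence type and its field names are hypothetical.

```typescript
// One entry in the extraction grid: which page painted which image.
// Both field names are made up for this sketch.
interface ImageOccurrence {
  page: number;
  name: string; // key identifying the stored image (assumed stable across pages)
}

// Keep only the first occurrence of each image, preserving the original order.
function firstOccurrences(items: ImageOccurrence[]): ImageOccurrence[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    if (seen.has(item.name)) return false; // already kept an earlier copy
    seen.add(item.name);
    return true;
  });
}
```

For a logo painted on ten pages plus one photo on page 3, this would return two entries: the logo from page 1 and the photo.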
The limits of browser-side extraction
Everything happens locally through pdf.js, which is what keeps your document private, but it also sets the boundaries. Password-protected files must be unlocked in a reader first, and a corrupted structure can stop decoding entirely. Very large scans consume a lot of memory, because each image is decoded to uncompressed pixels before it is re-encoded as a PNG. Within those limits the trade is a good one, because your files are never sent anywhere and the extraction is repeatable.
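To get a feel for the memory cost of a large scan, a back-of-the-envelope estimate is width times height times bytes per pixel. The sketch below assumes the decoded image is held as RGBA at 4 bytes per pixel, which is how a browser canvas stores pixels. The function name is hypothetical, and actual peak usage will be higher once intermediate buffers and the encoded PNG are counted.

```typescript
// Rough size in bytes of one decoded image, defaulting to RGBA (4 bytes per pixel).
function decodedBytes(width: number, height: number, bytesPerPixel = 4): number {
  return width * height * bytesPerPixel;
}

// An A4 page scanned at 300 dpi is about 2480 x 3508 pixels.
const a4Scan = decodedBytes(2480, 3508);
console.log((a4Scan / (1024 * 1024)).toFixed(1) + " MiB"); // prints "33.2 MiB"
```

A fifty-page document of such scans would therefore need well over a gigabyte if every page were decoded at once, which is why processing one image at a time matters.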