Problems with scanned documents
- Unable to perform a word search
- Unable to copy text
- Unable to edit text
- Too large to email
- Too large to archive
Acrobat’s Optical Character Recognition (OCR) can fix these problems!
Within the Enhance Scans/Recognize Text toolbar, select Settings. The primary setting to adjust is Output, which offers three options. Using my 129-page sample scanned proposal, I ran OCR with each of the three:
- Searchable Image: This option deskews scanned pages for a plumb and true document, recognizes text and places an invisible text layer on top, recognizes images, and discards whitespace.
  - 49 percent reduction in size to 60.7 MB
  - 7:08 minutes to complete
- Searchable Image (Exact): This option skips the deskew step, as well as the discarding of whitespace.
  - No reduction in size
  - 4:08 minutes to complete
- Editable Text and Images: This option, which was called “ClearScan” in previous versions of Acrobat, offers the best results. It synthesizes a new Type 3 font that is close to the original, places it on an invisible text layer, then downsamples the original background.
  - 73 percent reduction in size to 31.8 MB
  - 8:12 minutes to complete (roughly 7:00 minutes to scan and 1 minute to process)
- Downsample To: When using the Searchable Image option, this sets the downsampling resolution for content that Acrobat detects as an image. In my opinion, most of the file size reduction with Searchable Image comes from discarding whitespace in the scanned image. Lowering the Downsample To value can reduce the file size further. Consider leaving this setting at its default of 600 dpi and using the Editable Text and Images (ClearScan) option instead.
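To compare the three runs on equal footing, the reported figures can be turned into seconds per page and an implied original file size. The following is only a rough sketch: the original size of the sample proposal is not stated above, so the value is backed out from the reported percentages, and the function and variable names are made up for illustration.

```python
# Figures reported above for the 129-page sample proposal.
PAGES = 129

# (output option, reduced size in MB or None, percent reduction, elapsed "m:ss")
RESULTS = [
    ("Searchable Image", 60.7, 49, "7:08"),
    ("Searchable Image (Exact)", None, 0, "4:08"),
    ("Editable Text and Images", 31.8, 73, "8:12"),
]


def implied_original_mb(reduced_mb, percent_reduction):
    """Back out the original size from a reduced size and a percent reduction."""
    return reduced_mb / (1 - percent_reduction / 100)


def seconds(m_ss):
    """Convert an "m:ss" string to a number of seconds."""
    minutes, secs = m_ss.split(":")
    return int(minutes) * 60 + int(secs)


def seconds_per_page(m_ss, pages=PAGES):
    """Average OCR time per page for a run that took m_ss in total."""
    return seconds(m_ss) / pages


for label, reduced_mb, pct, elapsed in RESULTS:
    line = f"{label}: {seconds_per_page(elapsed):.1f} s/page"
    if reduced_mb is not None:
        # Only an estimate: the percentages above are rounded.
        line += f", implied original ~{implied_original_mb(reduced_mb, pct):.0f} MB"
    print(line)
```

Both size figures point to an original of roughly 118 to 119 MB, so the two reductions are consistent with each other once rounding is taken into account.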
Acrobat Pro DC has integrated OCR into its Edit PDF tool, which makes it an on-demand feature when you are only interested in a portion of the document.
OCR was completed on a 2.5 GHz 6-core Intel Xeon E5 Mac Pro; your times may vary.