Extracting PDF Text Using Markup Tools
PDF text can be extracted using the highlighter, strikethrough, underline, and replace text commenting tools.
Important Setting
Both Acrobat and Reader have a preferences setting that lets the markup tools listed in the subtitle of this article extract text from PDFs. Press Ctrl + K to open the Preferences window and select the Comments category at the top of the list. Under the Making Comments section, select "Copy selected text into Highlight, Strikethrough, Underline and Replace Text comment pop-ups". Once this setting is selected, any text marked up with those tools becomes the contents of the annotation and appears in that annotation's pop-up.
Extracting Contents With JavaScript
PDFs can be converted to Excel spreadsheets by selecting File > Export To > Spreadsheet > Microsoft Excel Workbook. While the result might resemble the PDF visually, the process is far from perfect, and the data might not be organized into usable rows and columns. This is especially true for scanned documents that have been OCR'd (Recognize Text). Consider a bank or credit card statement from which you need to extract transactions. Suppose you need data from four columns:
Date
Transaction description
Funds out
Funds in
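One way to approach the four columns listed above is to mark up each cell with one of the markup tools and then read the annotation contents back with a script. The following is a minimal sketch, not a definitive implementation. It assumes the preference described earlier is enabled, that each cell was marked up separately, and that cells on the same printed line have nearly the same top edge. The helper names collectMarkupText and groupIntoRows, the MARKUP_TYPES list, and the tolerance value are hypothetical. Replace Text markups may be reported under a different annotation type, so adjust the list if needed.

```javascript
// Annotation types to read, as reported by each annotation's `type` property.
// Assumption: extend this list if Replace Text markups show up under another type.
var MARKUP_TYPES = ["Highlight", "StrikeOut", "Underline"];

// Collect the pop-up text of markup annotations from a document object.
// `doc` is expected to provide getAnnots() like the Acrobat Doc object,
// which returns null when the document has no annotations.
function collectMarkupText(doc) {
  var annots = doc.getAnnots() || [];
  var cells = [];
  for (var i = 0; i < annots.length; i++) {
    var a = annots[i];
    if (MARKUP_TYPES.indexOf(a.type) === -1 || !a.contents) {
      continue; // skip other annotation types and empty pop-ups
    }
    cells.push({
      page: a.page,
      left: a.rect[0], // rect is [xll, yll, xur, yur] in default user space
      top: a.rect[3],
      text: String(a.contents).replace(/^\s+|\s+$/g, "")
    });
  }
  // Reading order: page first, then top to bottom (larger y is higher on the page).
  cells.sort(function (p, q) {
    return p.page - q.page || q.top - p.top || p.left - q.left;
  });
  return cells;
}

// Group cells into rows: cells on the same page whose top edges lie within
// `tolerance` points of the row's first cell are treated as one row.
// Each row is returned as an array of strings ordered left to right.
function groupIntoRows(cells, tolerance) {
  var rows = [];
  var current = null;
  for (var i = 0; i < cells.length; i++) {
    var c = cells[i];
    if (!current || current.page !== c.page ||
        Math.abs(current.top - c.top) > tolerance) {
      current = { page: c.page, top: c.top, cells: [] };
      rows.push(current);
    }
    current.cells.push(c);
  }
  return rows.map(function (row) {
    row.cells.sort(function (p, q) { return p.left - q.left; });
    return row.cells.map(function (c) { return c.text; });
  });
}
```

In the Acrobat JavaScript console, where this refers to the open document, the sketch could be tried with groupIntoRows(collectMarkupText(this), 3). A row with fewer than four cells means a column was left blank or not marked up. Mapping such cells to the correct column would need the column positions, which this sketch does not attempt.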




