How to extract text from all PDF files in a folder?
Question
Solution
A-PDF Text Extractor Command line can extract text from all PDF files in the same folder in one pass, and the commands needed are short.
Run the two commands below at the Command Prompt, from the folder that contains the PDF files (inside a batch file, write %%f instead of %f):
C:\Files>set ptCmd="C:\Program Files\A-PDF text extractor\ptcmd.exe"
C:\Files>for %f in (*.pdf) do %ptCmd% "%f"
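The same loop can also be driven from a script. The following is a minimal Python sketch, assuming ptcmd.exe is installed at the path used above; the helper names build_commands and extract_folder are illustrative and not part of A-PDF Text Extractor. It passes an explicit output file as the second argument, following the [Source] and [Output File] parameters. Whether ptcmd returns a nonzero exit code on failure is not documented here, so the error check may never trigger.

```python
import subprocess
from pathlib import Path

# Assumed install location, copied from the command above; adjust as needed.
PTCMD = r"C:\Program Files\A-PDF text extractor\ptcmd.exe"


def build_commands(folder, ptcmd=PTCMD):
    """Return one ptcmd argument list per PDF in `folder` (hypothetical helper)."""
    commands = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        out_txt = pdf.with_suffix(".txt")  # my.pdf -> my.txt, same folder
        commands.append([ptcmd, str(pdf), str(out_txt)])
    return commands


def extract_folder(folder, ptcmd=PTCMD):
    """Run ptcmd once per PDF; raises CalledProcessError on a nonzero exit code."""
    for cmd in build_commands(folder, ptcmd):
        subprocess.run(cmd, check=True)
```

Calling extract_folder(r"C:\Files") would then process every PDF in that folder. Building the argument lists separately from running them makes it easy to print and inspect the commands first.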
Usage:
Parameters:
[Source]: The PDF file to extract text from.
[Output File]: The output text file.
Options:
-W[password] : Password of the PDF file, if applicable.
-B[BeginPage] and -E[EndPage]: First and last page of the range to extract.
-P[Extract option] : Extract all pages, only odd pages, or only even pages.
Default is All. Options available: All, Odd, Even
-H[Header] and -F[Footer] : Special variables can be placed in the header or footer area of every page to display page information. The variables are:
&p Current page number
&a Total page count
&f PDF file name with full path, such as c:\pdfs\my.pdf
&n PDF file name, such as my.pdf
&d Extraction date
-O[Output type] : Output type to use, depending on the situation. Options available:
Original: Follow the internal text order of the PDF file.
Smart: Rearrange text based on its position.
Position: Output text with positions. Format:
@X=[xpos],Y=[ypos]@[text]@ENDTEXT@
X and Y are measured in points (1/72 inch).
-T : Print the extracted text to the screen instead of writing it to a file.
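Output in the Position format can be post-processed with a short script. The following Python sketch assumes the square brackets in the documented format are placeholders, so that a real record looks like @X=72,Y=144@Hello@ENDTEXT@, and that coordinates are plain decimal numbers; neither assumption has been checked against actual ptcmd output. The function name parse_positions is illustrative.

```python
import re

# Assumed record shape: "@X=72,Y=144@Hello@ENDTEXT@" (brackets treated as placeholders).
RECORD = re.compile(
    r"@X=(-?\d+(?:\.\d+)?),Y=(-?\d+(?:\.\d+)?)@(.*?)@ENDTEXT@",
    re.DOTALL,
)

POINTS_PER_INCH = 72  # coordinates are given in points (1/72 inch)


def parse_positions(raw):
    """Return (x_inches, y_inches, text) tuples for every record found in `raw`."""
    return [
        (float(x) / POINTS_PER_INCH, float(y) / POINTS_PER_INCH, text)
        for x, y, text in RECORD.findall(raw)
    ]


print(parse_positions("@X=72,Y=144@Hello@ENDTEXT@"))  # [(1.0, 2.0, 'Hello')]
```

Converting to inches is only for illustration; keep the raw point values if the text is to be placed back onto a PDF page.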
EXAMPLES:
PTCMD my.pdf
PTCMD c:\pdfs\my.pdf c:\pdfs\out.txt -W"P@ssw0rd" -B4 -E20 -Peven
PTCMD "c:\pdfs\my.pdf" -H" http://a-pdf.com" -F" =Page&p="
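When ptcmd is called from a script, the option switches can be assembled programmatically. The following Python sketch mirrors the second example above; build_options is a hypothetical helper, and the exact spelling ptcmd accepts for the -O value is an assumption based on the option names listed under Options.

```python
def build_options(password=None, begin_page=None, end_page=None,
                  pages=None, output_type=None, to_screen=False):
    """Build a list of ptcmd option switches from keyword arguments."""
    opts = []
    if password is not None:
        opts.append(f"-W{password}")
    if begin_page is not None:
        opts.append(f"-B{begin_page}")
    if end_page is not None:
        opts.append(f"-E{end_page}")
    if pages is not None:
        # Documented choices for -P: All, Odd, Even
        if pages.lower() not in ("all", "odd", "even"):
            raise ValueError(f"unsupported page selection: {pages!r}")
        opts.append(f"-P{pages}")
    if output_type is not None:
        # Assumed to be passed as written: Original, Smart, or Position
        opts.append(f"-O{output_type}")
    if to_screen:
        opts.append("-T")
    return opts


print(build_options(password="P@ssw0rd", begin_page=4, end_page=20, pages="even"))
# ['-WP@ssw0rd', '-B4', '-E20', '-Peven']
```

The resulting list can be appended to the source and output file arguments and passed to subprocess.run, which handles quoting, so the password needs no extra quotation marks.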
Related products
- A-PDF Text Extractor - Extract plain text from Adobe PDF files
- A-PDF Text Extractor Command line - A command line tool to convert PDF files to text.
- A-PDF OCR - OCR scanned PDF paper books and documents into editable electronic text files fast and easily.