How to extract text from all PDF files in a folder?

Question

Which tool can I use to extract text from more than one PDF file at a time? I want to batch extract the text from all the files saved in the same folder.

Solution

You can try A-PDF Text Extractor Command Line.

The command-line tool lets you extract text from all PDF files saved in the same folder in one pass, and the command itself is short.

Run the two commands below at a command prompt opened in the folder that contains the PDF files (inside a batch file, write %%f instead of %f):

C:\Files>set ptCmd="C:\Program Files\A-PDF text extractor\ptcmd.exe"
C:\Files>for %f in (*.pdf) do %ptCmd% "%f"
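
The same loop can also be driven from a script when explicit output names or a list of failures is needed. The following is a minimal Python sketch and not part of A-PDF Text Extractor: the function names build_commands and extract_folder are hypothetical, the ptcmd.exe path is the one used in the command above, and both the .txt naming of the output and the meaning of the exit code are assumptions.

```python
import subprocess
from pathlib import Path

# Install path taken from the command above; adjust it to your system.
PTCMD = r"C:\Program Files\A-PDF text extractor\ptcmd.exe"


def build_commands(folder, ptcmd=PTCMD):
    """Return one [ptcmd, source, output] argument list per PDF in folder."""
    commands = []
    for pdf in sorted(Path(folder).iterdir()):
        if pdf.is_file() and pdf.suffix.lower() == ".pdf":
            # Assumption: write the output next to the source, as <name>.txt.
            commands.append([ptcmd, str(pdf), str(pdf.with_suffix(".txt"))])
    return commands


def extract_folder(folder, ptcmd=PTCMD):
    """Run ptcmd once per PDF and return the source files that failed."""
    failed = []
    for cmd in build_commands(folder, ptcmd):
        result = subprocess.run(cmd)
        # Assumption: a non-zero exit code signals failure; the tool's
        # exit codes are not documented on this page.
        if result.returncode != 0:
            failed.append(cmd[1])
    return failed
```

Calling extract_folder(r"C:\Files") would process the same folder as the command above. Passing the arguments as a list avoids quoting problems with file names that contain spaces.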

Usage:

Parameters:
[Source]: The PDF file to extract text from.
[Output File]: The output text file.

Options:
-W[password] : Password of the PDF file, if applicable.
-B[BeginPage] and -E[EndPage]: First and last page of the range to extract.
-P[Extract option] : Extract only odd pages, only even pages, or all pages. Available values: All, Odd, Even. Default is All.
-H[Header] and -F[Footer] : Text to add in the header or footer area of every page. The following special variables can be used to display page information:
&p Current page number
&a Total page count
&f PDF file name with full path, such as c:\pdfs\my.pdf
&n PDF file name, such as my.pdf
&d Extraction date
-O[Output type] : Controls how the extracted text is ordered and formatted. Available values:
Original: Follow the internal order of the text in the PDF file.
Smart: Rearrange the text based on its position on the page.
Position: Output the text together with its position. Format:
@X=[xpos],Y=[ypos]@[text]@ENDTEXT@
X and Y are given in points (1/72 inch).
-T : Write the extracted text to the screen instead of a file.
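
Output produced with the Position output type is meant to be read by another program. The following is a minimal Python sketch of such a reader, based only on the record format documented above. It assumes that [xpos], [ypos] and [text] are placeholders rather than literal brackets and that the coordinates are plain decimal numbers. The function name parse_position_output and the sample string are hypothetical.

```python
import re

# Assumption: coordinates are plain decimal numbers, optionally negative,
# and the text of a record never contains the marker "@ENDTEXT@" itself.
RECORD = re.compile(
    r"@X=(-?\d+(?:\.\d+)?),Y=(-?\d+(?:\.\d+)?)@(.*?)@ENDTEXT@",
    re.DOTALL,
)

POINTS_PER_INCH = 72  # X and Y are documented as points (1/72 inch)


def parse_position_output(raw):
    """Return a list of (x_inches, y_inches, text) tuples found in raw."""
    records = []
    for x, y, text in RECORD.findall(raw):
        records.append(
            (float(x) / POINTS_PER_INCH, float(y) / POINTS_PER_INCH, text)
        )
    return records


# Hypothetical sample in the documented format, not real tool output.
sample = "@X=72,Y=144@Hello@ENDTEXT@"
print(parse_position_output(sample))  # [(1.0, 2.0, 'Hello')]
```

Converting to inches is optional. Keep the raw point values if the result is going to be compared with other PDF coordinates.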

EXAMPLES:

PTCMD my.pdf
PTCMD c:\pdfs\my.pdf c:\pdfs\out.txt -W"P@ssw0rd" -B4 -E20 -Peven
PTCMD "c:\pdfs\my.pdf" -H" http://a-pdf.com" -F" =Page&p="
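
When the options are chosen by a script, the argument list can be assembled programmatically. The following is a minimal Python sketch that mirrors the second example above. The function name build_args is hypothetical, and the sketch assumes that ptcmd accepts each option and its value as a single argument, as the examples suggest.

```python
def build_args(source, output=None, password=None, begin=None, end=None, pages=None):
    """Build the ptcmd argument list (without the program name itself)."""
    args = [source]
    if output is not None:
        args.append(output)
    if password is not None:
        args.append("-W" + password)
    if begin is not None:
        args.append("-B" + str(begin))
    if end is not None:
        args.append("-E" + str(end))
    if pages is not None:
        # The documented values are All, Odd and Even; the example above
        # passes "even" in lowercase, so the check ignores case.
        if pages.lower() not in ("all", "odd", "even"):
            raise ValueError("pages must be All, Odd or Even")
        args.append("-P" + pages)
    return args


# Reproduces the arguments of the second example above.
print(build_args(r"c:\pdfs\my.pdf", r"c:\pdfs\out.txt",
                 password="P@ssw0rd", begin=4, end=20, pages="even"))
```

The returned list can be appended to the path of ptcmd.exe and passed to subprocess.run, which takes care of quoting values that contain spaces.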
