#1
7th December 2005, 21:14
badben (Senior Member)
Converting PDF to .txt File

Does anybody know if it is possible to convert a PDF file to a plain text file using PHP, so that my site search engine can index it?

I can't seem to find anything.
#2
7th December 2005, 21:24
falko (Super Moderator)

You can use xpdf to extract text from PDF files (pdftotext). http://www.foolabs.com/xpdf/
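
For illustration, here is a minimal shell sketch of how pdftotext might be called, assuming xpdf is installed and on the PATH. The helper names txt_name_for and pdf_to_txt and the file name manual.pdf are made up for this example; they are not part of xpdf.

```shell
#!/bin/sh
# Derive the output name, e.g. report.pdf -> report.txt.
# (txt_name_for is a hypothetical helper, not part of xpdf.)
txt_name_for () {
    printf '%s\n' "${1%.*}.txt"
}

# Convert one PDF to plain text next to the original file.
# Assumes the pdftotext binary from xpdf is installed.
pdf_to_txt () {
    pdf=$1
    txt=$(txt_name_for "$pdf")
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found; install xpdf first" >&2
        return 127
    fi
    # -enc UTF-8 selects the output text encoding
    pdftotext -enc UTF-8 "$pdf" "$txt"
}

# Example call (manual.pdf is a placeholder file name):
# pdf_to_txt manual.pdf
```

From PHP, the same command could probably be run with exec() and escapeshellarg() around the file names, provided the host allows shell commands.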
#3
7th December 2005, 21:38
till (Super Moderator)

Or you can use pdf2ps (http://www.csit.fsu.edu/~burkardt/g_...ps/pdf2ps.html) to convert the PDF to a PostScript file and then ps2ascii to extract the text (http://annys.eines.info/cgi-bin/man/man2html?ps2ascii+1).
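
The two steps could be chained roughly like this. It is only a sketch: it assumes pdf2ps and ps2ascii (both ship with Ghostscript) and mktemp are available, and the function name pdf_to_txt_via_ps is made up for the example.

```shell
#!/bin/sh
# Two-step conversion: PDF -> PostScript -> plain text.
# (pdf_to_txt_via_ps is a hypothetical helper name.)
pdf_to_txt_via_ps () {
    pdf=$1
    txt=$2
    # Both tools are expected to come from a Ghostscript install.
    for tool in pdf2ps ps2ascii; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found; it ships with Ghostscript" >&2
            return 127
        fi
    done
    # Temporary file for the intermediate PostScript output.
    ps=$(mktemp) || return 1
    pdf2ps "$pdf" "$ps" && ps2ascii "$ps" "$txt"
    status=$?
    rm -f "$ps"
    return $status
}

# Example call (both file names are placeholders):
# pdf_to_txt_via_ps manual.pdf manual.txt
```

The function returns a non-zero status if either tool is missing or a conversion step fails, so a calling script can skip files that could not be converted.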
#4
25th March 2006, 17:10
sbovisjb1 (Senior Member)
Here is a script to index the stuff

I don't know if this will work, but here is a shell function for telling .txt and .pdf files apart by name:
# Usage: globmatches [ -q ] string globpattern
# Does $1 match the glob pattern $2?
# With -q: only set the return status, 0 (true) or 1 (false).
# Without -q: also echo "1" (true) or "0" (false).
# Note that the return status is the opposite of the echoed string.
globmatches () {
    if [ "$1" = "-q" ]; then
        shift
        case "$1" in
            $2 ) true ;;
            * ) false ;;
        esac
    else
        case "$1" in
            $2 ) echo 1 ; true ;;
            * ) echo 0 ; false ;;
        esac
    fi
}

# $file must hold the name of the file to check
if globmatches -q "$file" "*.txt" ; then
    echo "Found a txt file"
elif globmatches -q "$file" "*.pdf" ; then
    echo "Found a pdf file"
fi
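
To tie this back to the original question, a check like the one above could drive a loop that gives every PDF in a directory a plain text copy for the search engine to index. This is only a sketch under assumptions: classify_file and index_dir are made-up names, the directory is a placeholder, and pdftotext from xpdf is assumed to be installed.

```shell
#!/bin/sh
# Classify a file name by its extension.
# (classify_file is a hypothetical helper.)
classify_file () {
    case "$1" in
        *.txt ) echo txt ;;
        *.pdf | *.PDF ) echo pdf ;;
        * ) echo other ;;
    esac
}

# Walk one directory and write a .txt copy next to every PDF.
# Assumes pdftotext (from xpdf) is on the PATH.
index_dir () {
    for file in "$1"/*; do
        [ -f "$file" ] || continue
        case "$(classify_file "$file")" in
            txt ) echo "Already text: $file" ;;
            pdf ) echo "Converting: $file"
                  pdftotext "$file" "${file%.*}.txt" ;;
        esac
    done
}

# Example call (the directory is a placeholder):
# index_dir /var/www/docs
```

Files that are neither .txt nor .pdf are skipped silently.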