VectorLinux
April 24, 2014, 09:16:58 pm *
Topic: pdftotext (Read 2675 times)
InTheWoods
Vectorite
***
Posts: 302


« on: November 16, 2009, 05:25:35 am »

I have a PDF file, originally created in AbiWord, that I would like to convert back to text for further editing.

I have tried "pdftotext", "pdftohtml", and opening the file in Acroread then saving as text.

All of these produce a new file that consists of nothing but symbols.

Is there any way to edit or convert this PDF file to an editable format?
M0E-lnx
Administrator
Vectorian
*****
Posts: 3134



« Reply #1 on: November 16, 2009, 06:06:05 am »

Have you tried pdfedit from the repos?

Daniel
Packager
Vectorian
****
Posts: 704


« Reply #2 on: November 16, 2009, 08:53:53 pm »

Quote: I have tried "pdftotext", "pdftohtml", and opening the file in Acroread then saving as text.

Did you use the commands pdf2txt or pdf2html? (I think those commands exist.) Or did you use pdftotext and pdftohtml?

The following sentence is true. The previous sentence is false.

VL 6.0 SOHO KDE-Classic on 2.3 Ghz Dual-core AMD with 3 Gigs of RAM
Hamzah
Member
*
Posts: 20


Wanna be hacker


« Reply #3 on: January 03, 2010, 06:45:15 pm »

I just tried the command "pdftotext", and it worked.
Type pdftotext -h to get usage information.
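
For anyone who wants to script the conversion, here is a minimal Python sketch of calling pdftotext. It assumes pdftotext (from poppler or xpdf) is installed and on the PATH. The function names build_pdftotext_cmd and convert are made up for illustration, and the file names in the usage note are placeholders. The -layout option is a real pdftotext flag; whether it helps depends on the document.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, txt_path, keep_layout=True):
    """Return the argument list for a pdftotext call (hypothetical helper)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the original physical layout.
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def convert(pdf_path, txt_path, keep_layout=True):
    """Run pdftotext if it is installed; return True on success."""
    if shutil.which("pdftotext") is None:
        return False  # pdftotext was not found on the PATH
    result = subprocess.run(build_pdftotext_cmd(pdf_path, txt_path, keep_layout))
    return result.returncode == 0
```

Calling convert("letter.pdf", "letter.txt") would run the same thing as typing pdftotext -layout letter.pdf letter.txt in a terminal.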

1001 0101 0010 0110 0011 0000 1010 0111 1101 0100 1001 1011 1000 0100 1111 1100
sledgehammer
Vectorian
****
Posts: 1398



« Reply #4 on: January 03, 2010, 08:06:34 pm »

pdftotext should work on a PDF that was created by AbiWord. I call them electronic PDF files, but I am sure there is a better name. pdftotext will also work on other PDF files that were originally created by saving a file from a word processor. However, it won't work on PDF files that were scanned; perhaps the word is "flattened". You first have to OCR the scanned PDF with tesseract and then edit the result with a word processor.

John
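
One rough way to tell the two cases apart is to look at what pdftotext actually produced. The Python sketch below is only a heuristic, not part of pdftotext or tesseract: the function names and the 0.6 threshold are assumptions chosen for illustration. It treats output that is empty (typical for a scanned PDF with no text layer) or mostly punctuation and symbols as unusable, in which case OCR is probably the next thing to try.

```python
def readable_ratio(text):
    """Fraction of characters that are letters, digits, or whitespace."""
    if not text:
        return 0.0
    readable = sum(1 for ch in text if ch.isalnum() or ch.isspace())
    return readable / len(text)


def looks_unusable(text, threshold=0.6):
    """Guess whether extracted text is empty or mostly symbols.

    threshold is an arbitrary cut-off picked for this sketch.
    """
    # strip() also drops the form feeds pdftotext puts between pages.
    stripped = text.strip()
    if not stripped:
        return True
    return readable_ratio(stripped) < threshold
```

Note that tesseract reads image files, so a scanned PDF usually has to be turned into page images first (pdftoppm from poppler can do that) before running OCR. The check is crude: garbled output made of wrong letters rather than symbols would slip past it.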


VL7.0 xfce4 Samsung RF511
sledgehammer
Vectorian
****
Posts: 1398



« Reply #5 on: February 09, 2012, 11:23:20 pm »

InTheWoods,

zmcmay's post reminded me that for the past several months I have been using Google Docs to convert PDFs to text. Just upload the PDF to Google Docs (first check the "convert pdf" box).

VL7.0 xfce4 Samsung RF511