• Welcome! The TrekBBS is the number one place to chat about Star Trek with like-minded fans.
    If you are not already a member then please register an account and join in the discussion!

PDFs to Word docs

Aramis

Commander
Red Shirt
Hiya,

My boss has tasked me with taking large PDF files and pulling out specific sections of text, while keeping the text as a PDF.

I can't copy/paste to a new PDF document; have to create a new one for these specific sections. So far what I've had to do is take the entire PDF file, convert it to Word, and then select the specific section I want to convert back to PDF. A tedious process if there ever was one. Further complicating the task is that there are often images, tables or graphs included.

Does anybody know of software or a program that I can download that will let me "pull" specific sections of PDF, while keeping it as a PDF so I could just save it as a new PDF? The tech guys here at work haven't had any suggestions.
 
Adobe Acrobat Professional lets you manipulate PDFs, remove content, add new content, etc.
 
I've used the edit facilities in Acrobat Pro and, well, let's say it's not a particularly easy process, especially if you need to make changes. The changes aren't made to the actual document but are layered over it.

It's also damn expensive. I'd first do a search for alternative PDF programs and see which ones will allow the documents to be edited (Bluebeam is one alternative to Acrobat Pro that might do the trick).

The best way I found was to OCR the document to a .doc (if it's straight text, the process should be fairly painless, if a bit time-consuming). A client of mine used Scansoft's Omnipage OCR program, which allowed you to either OCR from a scanner or read in from a PDF file. As a Word file you can edit as normal and then convert it back to PDF, and you'll always have an editable version around.

Just remember trial versions will be your best friend. If this is just a one-off, you can do the job without cost (not quite kosher, though), and a trial also lets you see how things go, which might lead to a purchase.
 
Doesn't the standard Adobe Acrobat work too (not the free version, but the one below Pro)?

If your boss doesn't want to pay for it, you'll have no choice but to use some kind of work-around.
 
Yeah, the tech guys are not willing to install Acrobat Pro on my machine - too pricey. Some individuals have it here, but in order for me to work on their computers, they would have to find somewhere else to sit/work for up to 1-2 days.

I have the Scansoft software - or at least some recent free version. It's giving me a lot of problems which is what prompted this post in the first place. :-(
 
Problem with PDF files is that they are designed to not be edited really. The general point is that all fonts are embedded and everything is locked down so you know for sure it will look exactly the same whichever machine you open it on.

They are a final file format to write to from working files such as original Word and Quark docs, so you won't find anything that will manipulate them particularly effectively. Even Acrobat Professional is very limited in what it will do, and that is Adobe's own software.

I hear PDFedit is one of the best free tools, but it is Unix and complicated, probably not worth your time.

Really, don't expect too much, they are not designed to be fucked with.
 
I've used the edit facilities in Acrobat Pro and, well, let's say it's not a particularly easy process, especially if you need to make changes. The changes aren't made to the actual document but are layered over it.

Wrong tool. Use touch-up text after running OCR.

The best way I found was to OCR the document to a .doc (if it's straight text, the process should be fairly painless, if a bit time-consuming). A client of mine used Scansoft's Omnipage OCR program, which allowed you to either OCR from a scanner or read in from a PDF file. As a Word file you can edit as normal and then convert it back to PDF, and you'll always have an editable version around.

Or just hop on someone else's computer who has Pro and use the File->Export->Word Document option. It takes a grand total of about 30 seconds and maintains most formatting and images without fucking them up like OCR can.

Yeah, the tech guys are not willing to install Acrobat Pro on my machine - too pricey. Some individuals have it here, but in order for me to work on their computers, they would have to find somewhere else to sit/work for up to 1-2 days.

I have the Scansoft software - or at least some recent free version. It's giving me a lot of problems which is what prompted this post in the first place. :-(

You can extract specific pages from the PDFs and then either convert them to Word or run OCR so that they are editable, rather than wasting time with the whole thing. Extraction is instant; you select the pages you want and tell it to extract them. If you could get on someone else's computer for an hour or so, you could probably run through 30 or 40 of them and then take your extracted pages back to manipulate on your machine...
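
For anyone who can't get near a copy of Acrobat at all, the same "pick the pages, extract them" idea can be roughed out in a few lines of Python. This is only a sketch, not what Acrobat does internally: it assumes the third-party pypdf package is installed, and the function names `parse_page_ranges` and `extract_pages` and the file names in the usage line are made up for illustration. It copies whole pages as they are (images, tables and graphs included), so it won't help if you only want part of a page.

```python
from typing import List


def parse_page_ranges(spec: str) -> List[int]:
    """Turn a 1-based spec such as "3-5,9" into 0-based page indexes."""
    indexes: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        first, _, last = part.partition("-")
        start = int(first)
        end = int(last) if last else start
        if start < 1 or end < start:
            raise ValueError(f"bad page range: {part!r}")
        # Page numbers are 1-based for people, 0-based for the library.
        indexes.extend(range(start - 1, end))
    return indexes


def extract_pages(src_path: str, dst_path: str, spec: str) -> int:
    """Copy the selected pages of src_path into a new PDF at dst_path.

    Returns the number of pages written.
    """
    # Assumption: pypdf is available (pip install pypdf). It is imported
    # here so the range parser above works without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    written = 0
    for index in parse_page_ranges(spec):
        if index >= len(reader.pages):
            raise ValueError(f"page {index + 1} is past the end of the document")
        writer.add_page(reader.pages[index])
        written += 1
    with open(dst_path, "wb") as out_file:
        writer.write(out_file)
    return written
```

Something like `extract_pages("big_report.pdf", "section.pdf", "12-15,20")` would then save pages 12 to 15 and page 20 as a new PDF, with no round trip through Word.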

Or, you know, go to your boss and say "I need the tools to do my job" and ask that he tell the all-powerful tech guys to install the appropriate software. They obviously have a site license for it if others have it, for fuck's sake.
 
^ Definitely, extracting everything and creating new documents is the way to go.

I doubt that Acrobat Pro, even if they did install it, would do what Aramis wants, especially where graphics and tables are concerned. I use it all the time to make last-minute changes to outputted print files, and it's really not very flexible at all.
 
Or just hop on someone else's computer who has Pro and use the File->Export->Word Document option. It takes a grand total of about 30 seconds and maintains most formatting and images without fucking them up like OCR can.

Adobe must have finally added that as a version 8 feature, because it wasn't in the v7 that the client had.

go to your boss and say "I need the tools to do my job" and ask that he tell the all-powerful tech guys to install the appropriate software. They obviously have a site license for it if others have it, for fuck's sake.

And those licenses might be in use, or there could be a number of reasons, but let's face it, Adobe Acrobat is not a cheap program.
 
I've found this tool to be useful on more than one occasion. If I understand you right, it should do what you need.
 
^ Definitely, extracting everything and creating new documents is the way to go.

I doubt that Acrobat Pro, even if they did install it, would do what Aramis wants, especially where graphics and tables are concerned. I use it all the time to make last-minute changes to outputted print files, and it's really not very flexible at all.

It depends on what he's doing. If he converts to a Word doc, he can probably do the manipulations he needs, assuming there's little re-design work involved. It just converts the graphics to an image object. There'd probably be some cleaning up involved, but it's the best way to go.

Or just hop on someone else's computer who has Pro and use the File->Export->Word Document option. It takes a grand total of about 30 seconds and maintains most formatting and images without fucking them up like OCR can.

Adobe must have finally added that as a version 8 feature, because it wasn't in the v7 that the client had.

You probably also need to have the Office Suite Adobe plug-in or whatever it is installed...

go to your boss and say "I need the tools to do my job" and ask that he tell the all-powerful tech guys to install the appropriate software. They obviously have a site license for it if others have it, for fuck's sake.

And those licenses might be in use, or there could be a number of reasons, but let's face it, Adobe Acrobat is not a cheap program.

Yeah, but he's been assigned a task that requires the software. They either need to assign said task to someone with the software, arrange for him to have access to a computer that has the software, or give him the software. Most bosses don't know what the tasks they assign entail, and one must explain to them in small words that "I can do this in x amount of time and with x amount of success with x program, or I can do it in x+y amount of time with x-y amount of success with some random compilation of programs from the internets." They'll generally cough up the money or make something happen to allow the task to be completed. ;)
 
And the wider business process issue is always keep a copy of your content in an editable format... :)
 
^That drives me crazy. When I started my current job, I found that they had kept every goddamned draft of the text for hundreds of documents and not a single design file. :scream:
 