Merge and split MS Word files
Thread poster: Samuel Murray

Samuel Murray, Netherlands
G'day everyone
A client sent me 100 MS Word files that I need to run some macros on and do some stuff with. It would be a lot simpler if it was just one loooooooong MS Word file. Do you know of reliable utilities that can merge multiple MS Word files and then split it up again into the original file names later?
I can do merging and splitting if the files are plain text, but in this case the files contain formatting, so I can't just convert it to text and then merge and split it.
The files are in DOC format and I generally use Word 2003 (although I do have Word 2007 available too). There are no tables and text boxes, fortunately -- just formatted text.
I understand that there are utilities that allow you to process multiple MS Word files, but I need to run some pretty complex macros on the file(s), so such multifile utilities probably won't work for me. I just need to merge it all and later split it all.
Any ideas?
Thanks
Samuel

Tony M, France
Handy utility | Jan 18, 2013
I did once come across a handy utility (around $20) called something like 'split/join' (it actually comes as 2 separate utilities, one for each feature), which does exactly that.
The only real snag I found was that when splitting the files out again, I couldn't find any way to reinstate their original filenames. My workaround was to include a 'filename' field in an added header/footer in each doc (still a pain, though, doing that on 100 docs!), which meant the customer did at least have that as a reference at the end of the day.
Sorry I can't give you the actual source of this utility (it was on a PC 4 lives ago!), but I originally found it quite easily using Google. Hopefully, they may even have improved it by now!

Folder Action | Jan 18, 2013
Samuel Murray wrote:
It would be a lot simpler if it was just one loooooooong MS Word file.
Even simpler - and less risky - would be to automatically process the files one by one, so without joining them. I can think of something for OS X - a "folder action" - and I suppose there is something similar for Windows.
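A minimal sketch of that one-by-one approach, written in Python purely for illustration. The `process_one` callback is hypothetical: it stands for whatever runs the macros on a single file, which this thread does not specify. The sketch only shows the loop and how a failure in one file can be recorded without stopping the rest.

```python
from pathlib import Path
from typing import Callable, List


def process_folder(folder: Path, process_one: Callable[[Path], None],
                   pattern: str = "*.doc") -> List[str]:
    """Apply process_one to each matching file, one at a time.

    Returns a list of "filename: error" strings for the files that failed.
    """
    failed = []
    for path in sorted(folder.glob(pattern)):
        try:
            process_one(path)  # hypothetical per-file step (e.g. run the macros)
        except Exception as error:
            # Record the failure and carry on with the remaining files.
            failed.append(f"{path.name}: {error}")
    return failed
```

Because every file is handled separately, the original file names never need to be reconstructed, and a problem in one file does not affect the other 99.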
Cheers,
Hans
[Edited at 2013-01-18 10:21 GMT]

Rolf Keller, Germany
Macros may consume a long time | Jan 18, 2013
Samuel Murray wrote:
A client sent me 100 MS Word files that I need to run some macros on and do some stuff with. It would be a lot simpler if it was just one loooooooong MS Word file.
Caution! It depends on the macros. Some macros have quadratic run-time behaviour: if one file needs 8 seconds, two such files combined will need 32 seconds, and 128 files combined will need about 36 hours. It would be very annoying to discover, after 20 hours, that something has gone wrong because one of the original files contains a tiny peculiarity.
So, be aware of what your macros do. If they all have linear run-time behaviour, there will be no problem, though.
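The arithmetic behind those figures can be checked with a short Python sketch. It assumes the simplest quadratic model, in which run time is the single-file time multiplied by the square of the number of merged files; real macros may behave differently.

```python
def estimated_seconds(files_merged, seconds_per_file=8.0, quadratic=True):
    """Estimate macro run time on one document made of `files_merged` files."""
    if quadratic:
        # Run time grows with the square of the merged document's size.
        return seconds_per_file * files_merged ** 2
    # Linear case: run time grows in step with the merged document's size.
    return seconds_per_file * files_merged


for n in (1, 2, 128):
    slow = estimated_seconds(n)
    fast = estimated_seconds(n, quadratic=False)
    print(f"{n:>3} files: quadratic {slow:.0f} s ({slow / 3600:.1f} h), "
          f"linear {fast:.0f} s")
```

For 128 files the quadratic estimate is 8 × 128² = 131,072 seconds, or roughly 36.4 hours, whereas the linear estimate is 1,024 seconds, about 17 minutes.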

Potential workaround with TagEditor | Jan 18, 2013
Trados 2007 has a Glue feature (SDL 2007>SDL Trados 2007 Freelance>Trados>Tools>SDL Trados Glue in the Programs tree), so maybe you could drag/drop your Word files to the TagEditor window, save them all (100 clicks/shortcuts to save icons, 100 clicks/shortcuts to close each file), use the Glue feature, work on the resulting TagEditor file, then split it back to the original files.
There may also be options to work in Word (rtf?) with the glued file if you need to run macros.
Philippe

Diana Coada (X), United Kingdom
I use Nitro Pdf | Jan 18, 2013
Save the Word files as pdf, use Nitro to merge or split them and then convert them back to Word with all the formatting intact.

Rolf Keller, Germany
What means "formatting" anyway? | Jan 18, 2013
Diana Coada wrote:
Save the Word files as pdf, use Nitro to merge or split them and then convert them back to Word with all the formatting intact.
PDF files do not contain any "real" formatting info. They contain only an optical representation.
Any links between text elements and style templates get lost, because all the style templates get lost. Even simple elements like tabulators get lost. The difference between hard and soft hyphens gets lost, as well as the difference between hard and soft linebreaks.
Actually you get a file that may look similar to the original, but is not properly editable - it's just data garbage like a telefax.

Samuel Murray, Netherlands (topic starter)
PDF idea seems a bit odd | Jan 18, 2013
Rolf Keller wrote:
Diana Coada wrote:
Save the Word files as pdf, use Nitro to merge or split them and then convert them back to Word with all the formatting intact.
PDF files do not contain any "real" formatting info. They contain only an optical representation.
I must say that I did not seriously think that the PDF method would work either. However, if Diana is willing, I can send her two or three sample files to convert to PDF and then convert back again, to see if the formatting remains intact after all. But I doubt if it would.

Diana Coada (X), United Kingdom
No problem, Samuel | Jan 18, 2013
You're welcome to send me the files

Samuel Murray, Netherlands (topic starter)
Diana Coada wrote:
You're welcome to send me the files
I sent Diana three sample files for the round-trip experiment. She merged them in PDF and then converted the PDF to an output DOC file. As I had suspected, the round-trip was not very successful, even though it appeared quite promising at first.
99.9% of the actual text appears to have survived the conversion (if we don't take extra line breaks into account, and if we don't take hidden text into account).
In the output DOC file, hard returns were inserted in mid-sentence in several places (as can be expected from a conversion from PDF). Also, the margins were awfully narrow in at least one place, which caused text to appear broken in mid-word.
In the converted output DOC file, several lines of text from the top of each file were missing (even though they were present in the PDF file).
There were also one or two instances of character changes between the PDF and the output DOC. In one file, backslashes were replaced by yen signs (though interestingly when I copied the yen signs to a new, blank document, they magically changed back to backslashes again). Most characters remained intact, however (e.g. the original files had multiple types of quotes, and all of the quote types survived the round-trip).
Which brings us to the most important changes:
1. The original files had hidden text, which was not present in the PDF, and consequently was not present in the output DOC file.
2. The original files had text in black, grey and red, but in the original files the text also had styles, and these styles were missing in the PDF, and consequently were missing in the output DOC file.
Thanks, Diana, for doing this experiment with us.

The wrong way | Jan 20, 2013
I'm still convinced merging/splitting is not the way to go. Even if the PDF trick had worked, I see no way to arrive at the original (100) files with their appropriate names, apart from splitting and naming them manually.
I would create an Automator action more or less like this:
You just dump your 100 files on the Folder Action (that looks like a folder), wait a few seconds, and that's all there is to it.
This is not blatant OS X promotion. In fact, Microsoft made around 100 ready-to-use MS Office "actions" available - if not more - to be used with Automator, and those are the actions I use most often. The nice guys from Redmond wouldn't have done that if there wasn't a Windows alternative. The trouble is, I don't know of any, and although AutoHotkey comes to mind, I don't know if it works.
Cheers,
Hans

Samuel Murray, Netherlands (topic starter)
A merger would have a splitter | Jan 20, 2013
Meta Arkadia wrote:
I'm still convinced merging/splitting is not the way to go. Even if the PDF trick had worked, I see no way to arrive at the original (100) files with their appropriate names, apart from splitting and naming them manually.
Well, a good merger would have an appropriate splitter to go along with it, that names the files correctly. It would not be strange to me if the merger would place an extra page between merged files (with page breaks) that includes a special code with the file's original name in it.
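The marker idea can be sketched in a few lines of Python. This is only an illustration on plain strings, not on Word documents: the `@@FILE:` code, the function names and the sample file names are all hypothetical, and a real tool would have to carry formatting across as well.

```python
MARKER = "@@FILE:"  # hypothetical special code; must never occur in the real text


def merge(files):
    """Join (name, text) pairs, putting a marker line before each text."""
    parts = []
    for name, text in files:
        parts.append(f"{MARKER}{name}")  # the marker carries the original name
        parts.append(text)
    return "\n".join(parts)


def split(merged):
    """Undo merge(): return (name, text) pairs in their original order."""
    files = []
    name, lines = None, []
    for line in merged.split("\n"):
        if line.startswith(MARKER):
            if name is not None:
                files.append((name, "\n".join(lines)))
            name, lines = line[len(MARKER):], []
        else:
            lines.append(line)
    if name is not None:
        files.append((name, "\n".join(lines)))
    return files


# Hypothetical sample data for a round trip.
originals = [("intro.doc", "First file.\nSecond line."),
             ("notes.doc", "Another file.")]
restored = split(merge(originals))
print(restored == originals)
```

The round trip only works as long as the processing done between merging and splitting leaves the marker lines untouched.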
I would create an Automator action more or less like this...
I've never heard of this feature for Windows. It would have to be integrated with MS Word itself, wouldn't it?
Although AutoHotkey comes to mind, I don't know if it works.
Well, yes, I could script the repetitive actions in AutoIt or AutoHotKey. But doing so means having to figure out what happens during each of my macros (e.g. what possible error messages may pop up) so that I can script the appropriate responses to those actions. Otherwise the script will break (best case scenario) or do some damage to the computer while I'm not looking (worst case scenario).
I'm positive that merging and splitting must be possible in an MS Word macro. I can script it in AutoIt (in fact, I worked out how to do it and managed to do the merging already), but surely an MS Word macro would be better.

Rolf Keller, Germany
XXX->PDF->XXX will not work, if XXX is a fully editable format | Jan 20, 2013
Samuel Murray wrote:
As I had suspected, the round-trip was not very successful
That was to be expected. You would probably have seen even more problems if you had converted all 100 files.
Merging/splitting via PDF can work only if the DOC files meet a **lot** of rigid criteria. In general the DOC->PDF conversion deletes much info. The PDF->DOC conversion adds much info, but this will never be the previously deleted info - except if the converter software includes a clever clairvoyant.
How would you check the 100 files upfront to make sure that none of them contains any unconvertible features?
styles were missing in the PDF, and consequently were missing in the output DOC file.
Even if the DOC->PDF converter included the styles, maybe in the form of comments, this cannot work under all circumstances. Just imagine a setting like "Apply this to the whole document". Or imagine that there are differently defined styles with identical names.

Tony M, France
A Word macro | Jan 20, 2013
Can't vouch for it, but I found this, and from the posting date, it predates W2007, 2010, etc.
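Sub ConcatenateAllWordFiles()
    'Run this from an empty target document, with no other documents open.
    'Application.FileSearch exists in Word 2003 and earlier; it was removed in Word 2007.
    Dim i As Long
    Dim current As String
    With Application.FileSearch
        .NewSearch
        .LookIn = "C:\Test" 'Set this to your directory full of files.
        .SearchSubFolders = True 'Set this to False if you don't want subfolders included
        .Execute
        For i = 1 To .FoundFiles.Count
            'Compare the extension case-insensitively so ".DOC" is matched too
            If LCase(Right(.FoundFiles(i), 4)) = ".doc" Then
                Documents.Open FileName:=.FoundFiles(i), _
                    ConfirmConversions:=False, ReadOnly:=False, AddToRecentFiles:=False, _
                    PasswordDocument:="", PasswordTemplate:="", Revert:=False, _
                    WritePasswordDocument:="", WritePasswordTemplate:="", Format:= _
                    wdOpenFormatAuto
                current = ActiveDocument.Name
                Selection.WholeStory
                Selection.Copy
                'Close the source file without saving; the target becomes active again
                Documents(current).Close SaveChanges:=wdDoNotSaveChanges
                Selection.EndKey Unit:=wdStory 'Go to the end of the target document
                Selection.Paste
            End If
        Next i
    End With
End Sub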
posted by pompomtom at 5:33 PM on April 7, 2005
HOWEVER, that doesn't solve the problem of how to split them out again