Toward a Sneaky Way To Export Better to Word

Post by laup » Tue Nov 10, 2015 10:27 pm

Although I sometimes am able to write in Mellel, it is necessary for me to go into Word to deal with others in my organization. That is problematic for reasons readers of the forum understand. However, I have come up with perhaps 80% of a better way.

As background, if I export to RTF and then convert in Word to .docx, my manuscripts typically have a lot of problems in the .docx version:
-- some empty squares instead of bullets
-- all arabic-numbered endnotes get converted to roman-numeral endnotes
-- many cross-reference-related items, such as page numbers, get replaced by question marks
-- the front section's special numbering in roman numerals is replaced by regular numbering 1, 2....
-- page numbering gets fouled up
-- colored table cells lose their colors.
(pretty ugly)

Now, suppose instead that I export from Mellel to pdf and then use Adobe Acrobat to save-as to Word? I find that many (but not all) of the problems do not arise. In particular:
-- The bullets translate properly
-- Endnote numbering is preserved in arabic numbers
-- The cross-reference items, such as figure numbers or page numbers, are correct
-- The front matter's numbering is still a bit wrong for a reason I don't yet understand (the page numbers come out as blank, ii, iii, iv, 5, 6, vii,...). That is probably idiosyncratic to my manuscript.
-- Colored table cells preserve their colors
In addition, there are a few odd glitches in graphics (pdf images inserted in the text). Also, if I show all nonprinting characters (hidden characters, etc.), the figures are surrounded by a lot of confusing section-break markers. That clutter goes away if one unchecks the nonprinting characters in Word's Preferences/View.

Still, it seems that by going through Adobe Acrobat, which some Mellel users may have, I avoid the vast majority of the export to Word problems. I can fix the exceptions pretty easily.

This report is tentative, based on one 100-page manuscript with lots of formatting, footnotes, and figures, but it seemed encouraging to me. So, I pass it on for what it's worth. Perhaps, with a bit of work on the Redlers' part, there could be a recipe that would be pretty reliable.
Paul
laup
Knows everything, can prove it
 
Posts: 210
Joined: Tue Aug 22, 2006 4:13 am
Location: Topanga, California

Re: Toward a Sneaky Way To Export Better to Word

Post by laup » Fri Nov 13, 2015 6:37 pm

***Encouraging Follow-up Using a Different PDF to Word Converter***
I obtained close to 100% accuracy in translating a complex 100-page Mellel document with figures, tables, endnotes, footnotes, and cross references by (1) exporting to pdf and (2) using the free docs.zone [rather than Adobe Acrobat] to translate the result to Word. In the docs.zone app, I chose the option to make the mapping "exact" rather than "flowing". That takes longer, but is important.

Correctly mapped:
    bullets
    Front-matter and main-text pagination (numbers and formats)
    Page breaks and page numbering
    Page numbers based on cross references
    Coloring of cells in tables
    Endnote numbering and format
    Nearly all figures (which were pdfs of graphics), including a few figures that Adobe Acrobat messed up

Incorrectly mapped:
    Figure and table titles in Lucida Grande were mapped to Lucida Console
    Sub section titles went from Lucida Sans to Lucida Sans Unicode
    A few figures were slightly bollixed up. Apparently, the translation operates on the content of the pdf figures and doesn't do so quite perfectly, e.g., a legend "(did deterrence work?)" came through with the same text, but with the "(" and "?)" a couple of points higher than the rest of the label.

If I used the "flowing" option, rather than the "exact" option, there were some errors in page breaks, more pages overall, and some cosmetic problems in tables in which I had highlighted certain rows by thickening cell boundaries. Thus, the "exact" option is worth the trouble.

Lost, as expected:
the document's knowledge of its own structure (e.g., the Document Map Pane shows nothing and one can't easily update the table of contents).

Implications:
I believe that results can be extremely close to 100% by avoiding certain fonts and using .png images for figures, rather than vector-pdf images. I did some experimentation with this, using Arial rather than variants of Lucida, and using .png files. Results appear dead-on, although I'm sure that a proofreader-quality check would find something, somewhere.

It would be better, of course, if the export to rtf from Mellel were perfect. That must be difficult or the Redlers would have done it years ago. Lacking that, it seems that the recipe I provide here can make generation of .docx versions extremely accurate. Further, the Redlers might be able to provide a list of fonts that are very unlikely to cause any trouble. It's probably pretty long at this point, more than just Times New Roman, Arial, and Helvetica, but, regrettably, the format my organization uses managed to find some that docs.zone, at least, didn't handle perfectly.

All in all, this seems like very good news. I can't say what would happen with the many other pdf-to-word converters, other than to note that Adobe Acrobat consistently messed up the pdf to Word mapping of front-matter pagination, generating front matter with page numbers like i, ii, 3, 4, v, 6. I was unable to see any reason for this. Otherwise, as I reported initially, the results were excellent.
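
For anyone who wants to check front-matter numbering after a conversion, here is a minimal Python sketch. It assumes you have typed in (or copied out) the page labels as they appear in the Word file; it does not read the .docx itself, and the function names are made up for illustration. It compares each label with the lowercase roman numeral expected at that position and lists the mismatches, using the i, ii, 3, 4, v, 6 pattern described above as the sample input.

```python
def to_roman(n: int) -> str:
    """Return the lowercase roman numeral for a positive integer."""
    if n <= 0:
        raise ValueError("roman numerals need a positive integer")
    pairs = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    out = []
    for value, numeral in pairs:
        # Take as many copies of this numeral as fit, keep the remainder.
        count, n = divmod(n, value)
        out.append(numeral * count)
    return "".join(out)


def wrong_front_matter_labels(labels):
    """Return (position, found, expected) for each label that is not
    the roman numeral expected at that 1-based position."""
    problems = []
    for position, found in enumerate(labels, start=1):
        expected = to_roman(position)
        if found.strip().lower() != expected:
            problems.append((position, found, expected))
    return problems


# Labels as observed in the converted file (entered by hand).
observed = ["i", "ii", "3", "4", "v", "6"]
for position, found, expected in wrong_front_matter_labels(observed):
    print(f"front-matter page {position}: found {found!r}, expected {expected!r}")
```

A blank label is reported as a mismatch too, so an unnumbered title page will show up in the list and can simply be ignored.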
Paul

Re: Toward a Sneaky Way To Export Better to Word

Post by Manu31 » Wed Nov 25, 2015 1:56 am

Thanks a lot for this invaluable tip. I have a 500-page manuscript with lots of cross-references, and was anxiously preparing for painful days of correction.
You made my day.
Thanks again.
Emmanuel
Manu31
Got the styles thing figured out
 
Posts: 12
Joined: Mon Aug 18, 2014 5:34 pm

Re: Toward a Sneaky Way To Export Better to Word

Post by laup » Thu Nov 26, 2015 1:13 am

Emmanuel, although I had no trouble with a 100-page complex document, you may run into difficulties with your manuscript because it is so big. If so, I would think that you could convert chunks of the pdf one 100-page chunk at a time and then combine them in Word using Insert File. This is just speculation on my part, however. It will be good to get your data point.
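
To make the chunking idea concrete, here is a small Python sketch that only works out which page ranges to convert; it does not split the pdf or call any converter. The function name and the default chunk size of 100 are assumptions for illustration.

```python
def chunk_page_ranges(total_pages: int, chunk_size: int = 100):
    """Return 1-based inclusive (first, last) page ranges that cover
    a document of total_pages pages in chunks of at most chunk_size."""
    if total_pages < 0 or chunk_size <= 0:
        raise ValueError("total_pages must be >= 0 and chunk_size > 0")
    ranges = []
    first = 1
    while first <= total_pages:
        # The last chunk may be shorter than chunk_size.
        last = min(first + chunk_size - 1, total_pages)
        ranges.append((first, last))
        first = last + 1
    return ranges


# A 500-page manuscript converted 100 pages at a time.
for first, last in chunk_page_ranges(500):
    print(f"convert pages {first}-{last}")
```

Each range would be saved as its own pdf, converted, and then the resulting Word files combined in order.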
Paul

Re: Toward a Sneaky Way To Export Better to Word

Post by jas » Thu Jul 27, 2017 7:05 pm

In an effort to keep the conversation going about the need for Mellel to export to .docx format, I would like to add that I recently used pdf2docx.com to convert a 92-page exported PDF to .docx with no problems. Unlike docs.zone it was free. If I export to Word directly from Mellel 3.5xx, it converts all of my footnotes to endnotes and moves charts, photos, etc.

In any case, I hope that export to .docx will be included in Mellel 4.0. Fingers crossed!
jas
Got the styles thing figured out
 
Posts: 11
Joined: Thu Oct 14, 2010 10:30 am

Re: Toward a Sneaky Way To Export Better to Word

Post by nadinbrzezinski » Sat Jul 29, 2017 5:23 am

I used PDF converter. It is the endnotes that it has issues with. I am starting to think that this is a serious problem. I am looking at days of correcting endnotes. I am considering going back to Scrivener. Too bad, but I am afraid text correction will take as long as editing in Scrivener.

I like this program. But it only does PDF on iPad, and its exports to anything but PDF are mostly useless.
nadinbrzezinski
Got the styles thing figured out
 
Posts: 8
Joined: Tue Mar 04, 2008 6:57 pm

Re: Toward a Sneaky Way To Export Better to Word

Post by Bulow » Wed Aug 02, 2017 8:48 am

Everyone has different problems. I often need a docx, and I have always used export to rtf and then import into whatever program does docx. It works reasonably well for the sort of thing I do: formatting important, lots of ancient Greek, no endnotes but many footnotes. I tried converting with both docs.zone (which is not free) and pdf2docx.com, and each has its disadvantages, like pdf2docx suppressing all tabs and some ends-of-lines, but the worst is that both chose Arial for the Greek, which is inadequate and leaves all special signs as squares. The 'sneaky' way was definitely not an option for me.
A recent experience was equally horrible: importing rtf into Mellel. Not the fault of Mellel, for the horrors had already taken place in my correspondent's Mac/Word. Greek was again transformed into a mixture of Arial and TimesNR, with squares for special signs, although we both have and use the necessary Greek font. Apparently the font substitution had already taken place inside Word when it created the rtf.
I hope that Mellel 4 will attack this problem seriously and well.
Bulow
Bulow
Knows everything, can prove it
 
Posts: 122
Joined: Tue Nov 25, 2008 11:49 am
Location: Paris (sometimes Copenhagen)

Re: Toward a Sneaky Way To Export Better to Word

Post by laup » Thu Aug 03, 2017 6:41 pm

Today I converted a 100-page complex Mellel document to Word .docx using export to pdf, open in Adobe Acrobat Pro DC, and export to Word. The conversion is essentially perfect. It
-- Handled front-matter pagination as intended (i, ii, iii,...), as in the Mellel manuscript but not as in the rtf export
-- Maintained pagination, which meant that cross references are correct
-- Treated graphics properly (i.e., didn't mangle them)
-- Created footnotes properly (I had no endnotes)
-- Handled tables properly (with some very minor exceptions relating to cell borders in complex tables)

I would urge anyone wanting to get an excellent .docx version with minimum hassle or risk to consider this option, even though Acrobat is regrettably expensive (less so for academics and companies with licenses). I had tried some of the less expensive options on the web for pdf to Word conversion (some of them claiming to be free). My virus detector went off with one of them, which spooked me; another wouldn't handle the lengthy document on the free trial; and another one didn't work quite right. One of them may have been perfect for my needs at a reasonable price, but I gave up on the hassle and went to Adobe.

Cautions. I did have some problems with all this. I resolved them with three steps:
1. Insert each image in the Mellel manuscript (i.e., each photo, equation, hand-drawn figure, etc.) as a .png file (NOT .pdf). With some drawing programs, a .pdf may be the default for a simple copy and paste. That causes trouble because Acrobat's optical character recognition tries to read such images and fouls things up.
2. Insert each image with the format option of in-line, wrap, and center the image between blank spaces. This is NOT the Mellel default, and the default caused a lot of trouble in the exporting from Mellel. I have reported this as a problem to the Redlers.
3. Change the preference in Acrobat from Page layout: Flowing text to Retain page layout. Otherwise, pagination will get fouled up and cross references will be wrong. If you do use the Retain page layout option, the resulting Word document is not immediately editable. At first glance, it seems as though it's just a bunch of images, paragraph by paragraph. Fortunately, you can edit any particular paragraph by CONTROL-SELECT/Edit Text. That would be important for an editor or collaborator. I don't know the equivalent command in Word for Windows. Of course, if you're not using cross references or indexes, then you may just want to use the Acrobat default.
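
As a rough aid for step 1, here is a minimal Python sketch that flags figure files that are not .png. It assumes the figures exist as separate files whose names you can list before inserting them into Mellel; the file names and the function name are hypothetical, and nothing here touches the Mellel document itself.

```python
from pathlib import PurePath


def figures_needing_png(filenames):
    """Return the filenames whose extension is not .png (case-insensitive)."""
    return [name for name in filenames
            if PurePath(name).suffix.lower() != ".png"]


# Hypothetical figure files, listed by hand for illustration.
figures = ["fig1.png", "fig2.pdf", "equation3.PNG", "sketch4.pdf"]
for name in figures_needing_png(figures):
    print(f"re-export as .png before inserting: {name}")
```

Images that were pasted straight from a drawing program have no file name to check, so those still need to be looked at by hand.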
Paul

