Office HTML Cleansing Revisited

Another file full of villainous Word-generated HTML crossed the geekroom desk today. It had footnotes, which worked fine in Firebird but not at all in Internet Explorer. In that browser, the first footnote number looked like:

[1]

That’s ugly!

Following John’s suggestion from the last time this issue came up, I installed Microsoft’s Office HTML Filter to remove all that weird Office-specific markup, and ran it on the file. The results were very agreeable, and the file now renders correctly in IE as well as Firebird.
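
For the curious, here is a rough idea of the kind of cleanup involved. This is only a sketch in Python, not how the Office HTML Filter actually works. The function name `strip_office_markup` and the particular patterns are my own assumptions about typical Word output (conditional comments, namespaced tags like `<o:p>`, `mso-*` style declarations, and `Mso*` class names).

```python
import re


def strip_office_markup(html: str) -> str:
    """Rough, illustrative cleanup of Word-generated HTML.

    Hypothetical helper: it does not reproduce the Office HTML Filter,
    it only removes a few common kinds of Office-specific markup.
    """
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(
        r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->",
        "",
        html,
        flags=re.DOTALL | re.IGNORECASE,
    )
    # Drop namespaced tags like <o:p> and </o:p>, keeping any text inside them
    html = re.sub(r"</?[a-z]+:[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop mso-* declarations (assumed to appear only in style attributes)
    html = re.sub(r"mso-[a-z\-]+\s*:[^;\"']*;?\s*", "", html, flags=re.IGNORECASE)
    # Drop style attributes that are now empty
    html = re.sub(r"\s+style=(\"\s*\"|'\s*')", "", html, flags=re.IGNORECASE)
    # Drop class attributes whose value is an Mso* name, quoted or unquoted
    html = re.sub(r"\s+class=(\"?)Mso[A-Za-z0-9]*\1", "", html)
    return html


if __name__ == "__main__":
    sample = "<p class=MsoNormal style='mso-bidi-font-size:10.0pt'>Hi<o:p></o:p></p>"
    print(strip_office_markup(sample))
```

Regular expressions are a blunt tool for HTML, so this would also mangle ordinary text that happens to contain something like `mso-foo:`. For a real file, a dedicated tool is the safer choice.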
