I'm pleased to announce the release of the
Microsoft Word Unmunger,
a small Python program to remove cruft from Microsoft Word 2002's HTML
output (Freshmeat page). It removes XML namespace declarations, smart tags, meta
tags, HTML comments, style sheets, DIVs, the file list, CSS classes, and
Office grammar and spelling error markers -- perfect for making Microsoft
Word-produced HTML hand-editable. The Word Unmunger is released under the
permissive MIT License.
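
To give a rough idea of what this kind of cleanup involves, here is a minimal sketch that handles a few of the items listed above using regular expressions. It is not the Word Unmunger's actual code: the function name `strip_word_cruft` and the sample markup are made up for illustration, and the regex approach is an assumption on my part, one that a real tool would likely need to make more robust (for example, the class pattern below does not distinguish attributes from body text).

```python
import re


def strip_word_cruft(html: str) -> str:
    """Hypothetical helper: strip some Word-style cruft from an HTML string."""
    # HTML comments, which also covers Word's conditional comments.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Embedded style sheets.
    html = re.sub(r"<style\b.*?</style>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Meta tags and link tags (Word references its file list via <link>).
    html = re.sub(r"<(meta|link)\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Namespaced tags such as <o:p> and smart tags like <st1:place>;
    # only the tags are dropped, the text between them is kept.
    html = re.sub(r"</?\w+:\w+\b[^>]*>", "", html)
    # XML namespace declarations on the remaining tags.
    html = re.sub(r'\s+xmlns(:\w+)?="[^"]*"', "", html)
    # CSS class attributes, quoted or unquoted.
    html = re.sub(r'\s+class=("[^"]*"|\'[^\']*\'|[^\s>]+)', "", html,
                  flags=re.IGNORECASE)
    return html


# Made-up sample in the style of Word's HTML export.
SAMPLE = (
    '<html xmlns:o="urn:schemas-microsoft-com:office:office">'
    '<head><meta name=Generator content="Microsoft Word 10">'
    '<style><!-- p.MsoNormal {margin:0in;} --></style></head>'
    '<body><p class=MsoNormal>Hello <st1:place>Paris</st1:place>'
    '<o:p></o:p></p><!-- end --></body></html>'
)

print(strip_word_cruft(SAMPLE))
# -> <html><head></head><body><p>Hello Paris</p></body></html>
```

The sketch leaves DIVs and the spelling and grammar marker spans in place; it only removes their class attributes.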
Enjoy, and send bug reports to look@recursion.org.