XML.com: Word to XML and Back Again: “In this article, I will show you how to take the frighteningly messy result of Word’s ‘Save as Web Page’ and turn it into well-formed XML, using a few lines of Python and a touch of XSLT. Grab the sample Python application, and if you have libxml2 installed, you can type:

python wordconverter.py mydoc.htm > mydoc.xml

python wordconverter.py mydoc.xml > mynewdoc.htm”
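
For a rough feel of the first half of that round trip, here is a minimal sketch of my own. It is not the article's wordconverter.py and it does not use libxml2 or XSLT. It swaps in Python's standard-library html.parser to re-emit tag soup as well-formed XML. The names WellFormer, to_well_formed and is_well_formed are hypothetical, and the sketch assumes the input has a single root element and that attribute names are already legal XML names.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that never take a closing tag in HTML.
VOID = {"br", "hr", "img", "meta", "link", "input"}


class WellFormer(HTMLParser):
    """Hypothetical helper: re-emit tag soup as well-formed XML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments
        self.stack = []  # currently open elements

    def handle_starttag(self, tag, attrs):
        if ":" in tag:
            return  # drop Word's namespaced tags such as <o:p>
        attr_text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
            if ":" not in name
        )
        if tag in VOID:
            self.out.append(f"<{tag}{attr_text}/>")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray or already-dropped closing tag
        # Close everything left open inside the element being closed.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

    def result(self):
        self.close()
        while self.stack:  # close anything still open at end of input
            self.out.append(f"</{self.stack.pop()}>")
        return "".join(self.out)


def to_well_formed(html):
    parser = WellFormer()
    parser.feed(html)
    return parser.result()


def is_well_formed(xml_text):
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

Comments and Word's conditional blocks are simply dropped here, since the parser's default handlers ignore them. A real converter would more likely want to keep some of that information, which is presumably where the article's XSLT step comes in.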

-m