minidom.writexml handling unicode errors

Problem: Writing XML that contains Unicode characters does not work, even with the encoding set to 'utf-8'. It always results in something like:

    UnicodeEncodeError('ascii', u'kr\xf6se1989', 2, 3, 'ordinal not in range(128)')

Solution (from here): Use the codecs module and manually open the file in UTF-8 mode for writing:

    import codecs
    doc.writexml(codecs.open(fname, 'wb', 'utf-8'), encoding='utf-8')
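For comparison, here is a minimal sketch of the same idea on Python 3, where the built-in open accepts an encoding argument and codecs is not needed. The helper name write_xml_utf8, the element name and the temporary file path are made up for illustration; only the sample string comes from the error message above.

```python
import os
import tempfile
from xml.dom import minidom


def write_xml_utf8(doc, fname):
    """Write a minidom document to fname as UTF-8 (hypothetical helper)."""
    # Opening the file in text mode with an explicit encoding lets
    # writexml pass plain strings; the with block closes the file.
    with open(fname, "w", encoding="utf-8") as out:
        doc.writexml(out, encoding="utf-8")


# Build a tiny document containing a non-ASCII character.
doc = minidom.Document()
root = doc.createElement("user")
root.appendChild(doc.createTextNode("kr\xf6se1989"))
doc.appendChild(root)

fname = os.path.join(tempfile.mkdtemp(), "out.xml")
write_xml_utf8(doc, fname)

# Read the file back to confirm the character survived the round trip.
with open(fname, "r", encoding="utf-8") as check:
    written = check.read()
```

The encoding argument to writexml only sets the encoding named in the XML declaration; the actual byte encoding is decided by how the file was opened, which is why the two have to agree.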

Compress camera video for a Windows machine

    mencoder -ovc lavc -lavcopts vcodec=msmpeg4v2:vpass=1:vbitrate=375000:mbd=2:keyint=132:vqblur=1.0:cmp=2:subcmp=2:dia=2:mv0:last_pred=3 -oac copy -o /dev/null MVI_0052.AVI

    mencoder -ovc lavc -lavcopts vcodec=msmpeg4v2:vpass=2:vbitrate=375000:mbd=2:keyint=132:vqblur=1.0:cmp=2:subcmp=2:dia=2:mv0:last_pred=3 -oac copy -o output.avi MVI_0052.AVI
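The two commands differ only in the vpass value and the output file, so they can be generated from one template. The sketch below is one way to do that from Python; the helper names two_pass_commands and encode are hypothetical, and the options are copied unchanged from the commands above. It only builds the argument lists; calling encode would require mencoder to be installed.

```python
import subprocess

# Shared libavcodec options from the two commands; only vpass differs.
LAVC_OPTS = (
    "vcodec=msmpeg4v2:vpass={vpass}:vbitrate=375000:mbd=2:keyint=132:"
    "vqblur=1.0:cmp=2:subcmp=2:dia=2:mv0:last_pred=3"
)


def two_pass_commands(src, dst):
    """Return the argument lists for pass 1 and pass 2 (hypothetical helper)."""
    commands = []
    # Pass 1 discards its video output; pass 2 writes the real file.
    for vpass, out in ((1, "/dev/null"), (2, dst)):
        commands.append([
            "mencoder", "-ovc", "lavc",
            "-lavcopts", LAVC_OPTS.format(vpass=vpass),
            "-oac", "copy",
            "-o", out,
            src,
        ])
    return commands


def encode(src, dst):
    """Run both passes in order, stopping if either one fails."""
    for cmd in two_pass_commands(src, dst):
        subprocess.check_call(cmd)


commands = two_pass_commands("MVI_0052.AVI", "output.avi")
```

Passing the arguments as a list avoids shell quoting problems if a file name contains spaces.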

Python and Word documents

From here, using only standard Python modules:

    import zipfile, re
    docx = zipfile.ZipFile('/path/to/file/mydocument.docx')
    content = docx.read('word/document.xml')
    # assumed pattern: strip anything that looks like an XML tag
    cleaned = re.sub('<[^>]+>', '', content)
    print cleaned

But if you want to work with the document in more detail, you can use the python-docx module.
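As a rough Python 3 sketch of the standard-library approach, the snippet below reads word/document.xml from the archive and strips the tags. The function name docx_plain_text, the tag-stripping regex and the in-memory sample archive (with a placeholder namespace) are assumptions made so that the example runs without a real .docx file.

```python
import io
import re
import zipfile


def docx_plain_text(source):
    """Return the text of word/document.xml with all XML tags removed."""
    with zipfile.ZipFile(source) as docx:
        # read() returns bytes on Python 3, so decode before using a str regex.
        xml = docx.read("word/document.xml").decode("utf-8")
    # Assumed pattern: drop anything that looks like an XML tag.
    return re.sub(r"<[^>]+>", "", xml)


# Build a stand-in archive in memory so the sketch runs without a real file.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr(
        "word/document.xml",
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<w:document xmlns:w="urn:example"><w:body><w:p>'
        "<w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r>"
        "</w:p></w:body></w:document>",
    )
sample.seek(0)

text = docx_plain_text(sample)
```

Because the tags are removed without inserting anything in their place, text from adjacent paragraphs runs together. That is fine for searching or indexing but not for preserving layout.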

discount bug

    import discount
    a = discount.Markdown('')
    a.get_html_content()

--> Yields TypeError

    TypeError: 'NoneType' object is unsubscriptable

Should return a blank. I found that markdown and markdown2 give comparable performance, so I no longer use discount.