willCode4Beer: Extracting Images from MSWord Documents

Monday, November 13, 2006

A new post is up on how to extract images from Word documents.
Comments may be left here.

At 22 November, 2006 17:36, Anonymous said...:
javax.xml.parsers.ParserConfigurationException: AElfred parser is non-validating
    at com.icl.saxon.aelfred.SAXParserFactoryImpl.newSAXParser(SAXParserFactoryImpl.java:34)
    at com.doylecentral.word.BinaryExtractor.parseXml(BinaryExtractor.java:23)
    at com.doylecentral.word.FileTester.main(FileTester.java:22)
At 22 November, 2006 18:19, Anonymous said...: The above problem can be solved by removing Saxon from the classpath.
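
If removing Saxon from the classpath is not an option, one possible alternative is to ask for the JDK's built-in SAX parser explicitly. The stack trace suggests that JAXP picked up Saxon's AElfred factory from the classpath, and that factory appears to refuse parser requests it cannot satisfy. The sketch below is only an illustration: the class name PlatformSaxParser is hypothetical and is not part of the original BinaryExtractor code, and SAXParserFactory.newDefaultInstance() requires Java 9 or later.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not taken from the original post.
public class PlatformSaxParser {

    /**
     * Creates a SAX parser from the JDK's built-in factory,
     * ignoring any third-party factory (such as Saxon's AElfred)
     * that happens to be on the classpath.
     */
    public static SAXParser newParser() {
        // newDefaultInstance() (Java 9+) returns the platform default
        // implementation instead of running the usual JAXP lookup.
        SAXParserFactory factory = SAXParserFactory.newDefaultInstance();
        factory.setNamespaceAware(true);
        try {
            return factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create a SAX parser", e);
        }
    }
}
```

A caller would then use PlatformSaxParser.newParser() wherever it previously called SAXParserFactory.newInstance().newSAXParser(). On older Java versions, setting the javax.xml.parsers.SAXParserFactory system property to a specific factory class is the usual way to get a similar effect.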
At 23 May, 2007 04:37, code66 said...: A nice but somewhat difficult way to get the images from Word.
There is a much simpler way:
Save your document as an HTML page. Word will create a directory containing all images used in the document, saved as PNG/GIF/JPG files.
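
Once Word has written that directory, collecting the images is a matter of listing it. The following is a minimal sketch under stated assumptions: the class name SavedHtmlImages is hypothetical, the directory path has to be supplied by the caller (its name depends on the document and the Word version), and images are recognised purely by file extension.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not taken from the original post.
public class SavedHtmlImages {

    /** Returns the PNG, GIF and JPEG files directly inside dir, sorted by path. */
    public static List<Path> listImages(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(SavedHtmlImages::hasImageExtension)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException("Could not list " + dir, e);
        }
    }

    // Assumption: the file extension is enough to identify an image.
    private static boolean hasImageExtension(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".png") || name.endsWith(".gif")
                || name.endsWith(".jpg") || name.endsWith(".jpeg");
    }
}
```

Calling SavedHtmlImages.listImages(Paths.get(...)) with the directory Word created returns the image paths, which can then be copied wherever they are needed.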
At 29 June, 2007 06:28, Anonymous said...: Absolutely brilliant and soooo simple.

Thanks
Bruce