Handling Microsoft Word .doc documents on the Linux command line with Antiword
Sometimes it is convenient or necessary to work with Microsoft Word .doc documents directly on the Linux command line. The Antiword program makes this possible. Antiword is a small, lightweight, easy-to-use command-line program that can convert Microsoft Word documents to plain text, PDF, PostScript, and XML.
For example, Antiword is a very useful tool in the following situations:
- An MS Word document on a remote server needs to be read quickly and easily.
- The text of several MS Word documents needs to be combined into a single text file.
- Certain lines of text need to be found in several MS Word documents.
- An MS Word document must be converted to PDF format.
Antiword usage
Basic usage, reading a Microsoft Word document on the console:
antiword microsoft_word.doc
Converting an MS Word document to a text file:
antiword microsoft_word.doc > text_file.txt
Converting an MS Word document to PDF:
antiword -a a4 microsoft_word.doc > pdf_file.pdf
Note: a4 is the paper size; other accepted values include letter and legal.
Converting an MS Word document to PostScript:
antiword -p a4 microsoft_word.doc > ps_file.ps
Note: a4 is the paper size; other accepted values include letter and legal.
Converting an MS Word document to XML:
antiword -x db microsoft_word.doc > xml_file.xml
Note: db selects the DTD (document type definition) used for the XML output; db stands for DocBook.
Finding lines of text in multiple MS Word documents:
antiword *.doc | grep text_to_find
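A single pipeline over many documents does not show which document each matching line came from. The following is a minimal sketch of one way to add that information, assuming a POSIX shell; the function name search_docs is hypothetical and not part of Antiword.

```shell
# search_docs PATTERN FILE...
# Hypothetical helper: print each matching line prefixed with the name of
# the document it came from. The body runs in a subshell so its variables
# do not leak into the calling shell.
search_docs() (
    pattern=$1
    shift
    for doc in "$@"; do
        # Convert one document at a time so every match can be labelled.
        antiword "$doc" | grep -e "$pattern" | while IFS= read -r line; do
            printf '%s: %s\n' "$doc" "$line"
        done
    done
)
```

It could be called as: search_docs text_to_find *.doc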
Combining multiple MS Word documents into one text file:
antiword document1.doc document2.doc document3.doc > text_file.txt
Note: PDF, PS, and XML files can be produced from multiple MS Word documents in the same way.
Other Antiword options can be listed with the command:
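When the documents should stay separate instead of being merged, a small loop can write one text file per document. This is only a sketch under a few assumptions: the function name convert_each is hypothetical, the input names end in .doc, and antiword is assumed to report a failed conversion through a non-zero exit status.

```shell
# convert_each FILE...
# Hypothetical helper: write the text of every document to a sibling
# .txt file (report.doc -> report.txt). Runs in a subshell so its
# variables stay local.
convert_each() (
    for doc in "$@"; do
        out="${doc%.doc}.txt"
        # Assumption: antiword returns non-zero when a conversion fails.
        if antiword "$doc" > "$out"; then
            printf 'wrote %s\n' "$out"
        else
            printf 'failed: %s\n' "$doc" >&2
            rm -f "$out"    # do not leave an empty or partial file behind
        fi
    done
)
```

It could be called as: convert_each document1.doc document2.doc document3.doc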
antiword --help