This is a completely non-SQL Server post. I recently had to convert a large number of Word documents into Web Archive (.MHT) format. I would not mind doing that by hand for one or two documents, but with over 50 documents to convert, it gets cumbersome and monotonous! This is where PowerShell came to the rescue.
The most common example available on the web converts Word documents to PDF. What I needed for my work was a way to convert Word documents to the Web Archive format. After a few Bing searches, I was able to determine that the enumeration value for the Web Archive format (wdFormatWebArchive) is 9. The following link has information about all the enumeration values: http://msdn.microsoft.com/en-us/library/office/bb238158(v=office.12).aspx
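To avoid scattering a bare 9 through a script, the value can be looked up by name. This is only a minimal sketch: the values are copied from the enumeration linked above, while the variable names $WdSaveFormat and $SaveFormat are my own choice and not part of Word.

```powershell
# A few WdSaveFormat values, copied from the enumeration linked above.
# The variable name $WdSaveFormat is a hypothetical choice for this sketch.
$WdSaveFormat = @{
    wdFormatDocument        = 0
    wdFormatHTML            = 8
    wdFormatWebArchive      = 9
    wdFormatFilteredHTML    = 10
    wdFormatDocumentDefault = 16
    wdFormatPDF             = 17
}

# Look the value up by name instead of hard-coding it
$SaveFormat = $WdSaveFormat['wdFormatWebArchive']
Write-Host "Web Archive save format: $SaveFormat"
```

Switching to another output type, PDF for instance, then only means changing the name used in the lookup.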
The PowerShell script below traverses a folder recursively, picks up all the Word documents (.docx) and converts each of them into an .MHT file with the same name in the same location.
The script can also be downloaded from OneDrive.
<#
#################################################################################
Script Name: ConvertToWord
Author: Amit Banerjee
Date: April 28, 2014
Description: This script takes a folder as an input and then converts the docx files present in the folder to web archive documents
#################################################################################
This Sample Code is provided for the purpose of illustration only and is not intended to be used in a production environment.
THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE.
We grant You a nonexclusive, royalty-free right to use and modify the Sample Code and to reproduce and distribute the object code form of the Sample Code, provided that You agree:
(i) to not use Our name, logo, or trademarks to market Your software product in which the Sample Code is embedded;
(ii) to include a valid copyright notice on Your software product in which the Sample Code is embedded; and
(iii) to indemnify, hold harmless, and defend Us and Our suppliers from and against any claims or lawsuits, including attorneys' fees, that arise or result from the use or distribution of the Sample Code.

The enumeration for the various document types that you can save as from Microsoft Word when you use the SaveAs option
Const wdFormatDocument = 0
Const wdFormatDocument97 = 0
Const wdFormatDocumentDefault = 16
Const wdFormatDOSText = 4
Const wdFormatDOSTextLineBreaks = 5
Const wdFormatEncodedText = 7
Const wdFormatFilteredHTML = 10
Const wdFormatFlatXML = 19
Const wdFormatFlatXMLMacroEnabled = 20
Const wdFormatFlatXMLTemplate = 21
Const wdFormatFlatXMLTemplateMacroEnabled = 22
Const wdFormatHTML = 8
Const wdFormatPDF = 17
Const wdFormatRTF = 6
Const wdFormatTemplate = 1
Const wdFormatTemplate97 = 1
Const wdFormatText = 2
Const wdFormatTextLineBreaks = 3
Const wdFormatUnicodeText = 7
Const wdFormatWebArchive = 9
Const wdFormatXML = 11
Const wdFormatXMLDocument = 12
Const wdFormatXMLDocumentMacroEnabled = 13
Const wdFormatXMLTemplate = 14
Const wdFormatXMLTemplateMacroEnabled = 15
Const wdFormatXPS = 18
#>

# Function to convert the file
# (defined before the loop because PowerShell has to read a function definition before the function can be called)
function ConvertToMHT ($FileName)
{
    # Replace with the appropriate document format type using the enumeration provided above in the comments
    $SaveFormat = 9   # wdFormatWebArchive

    # Build the target name by changing only the file extension,
    # so that a "docx" elsewhere in the path is left untouched
    $Target = [System.IO.Path]::ChangeExtension($FileName, "mht")

    # Create a Word application object
    $Word = New-Object -ComObject Word.Application

    # Open the Word document
    $Doc = $Word.Documents.Open($FileName)

    # Save the document in the required format in the same location
    $Doc.SaveAs([ref]$Target, [ref]$SaveFormat)

    # Close the document and then quit Word
    $Doc.Close()
    $Word.Quit()
}

# Replace with the correct folder path
# Remove the -Recurse option if you only want to convert the documents in the first level folders
# Retrieve the list of documents
$Files = Get-ChildItem -Path "C:\Windows" -Filter "*.docx" -Recurse

foreach ($File in $Files)
{
    # Create the name of the new document
    $Name = [System.IO.Path]::ChangeExtension($File.FullName, "mht")

    if (Test-Path $Name)
    {
        # The file already exists, so do not do anything
        Write-Host "Skipping conversion for" $Name
    }
    else
    {
        # Save the file as a web archive if it does not exist
        Write-Host "Creating file" $Name
        ConvertToMHT $File.FullName
    }
}
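
Before pointing the conversion at a large folder tree, a dry run can show which documents would be converted and which would be skipped. The sketch below is one way to do that without starting Word. The function name Get-MhtConversionPlan and its Source/Target/Action properties are hypothetical, and it assumes PowerShell 3.0 or later (for the -File switch and [pscustomobject]).

```powershell
# Hypothetical helper for a dry run: lists each .docx under $Root together with
# the .mht path it would be saved to and whether that file already exists.
function Get-MhtConversionPlan {
    param([Parameter(Mandatory)] [string] $Root)

    Get-ChildItem -Path $Root -Filter *.docx -Recurse -File | ForEach-Object {
        # Swap only the extension; the rest of the path is left alone
        $Target = [System.IO.Path]::ChangeExtension($_.FullName, 'mht')

        [pscustomobject]@{
            Source = $_.FullName
            Target = $Target
            Action = if (Test-Path -LiteralPath $Target) { 'Skip' } else { 'Convert' }
        }
    }
}

# Example: preview the work for a folder (the path is a placeholder)
# Get-MhtConversionPlan -Root 'C:\Docs' | Format-Table -AutoSize
```

Because the function only reads the file system, it is safe to run repeatedly, and the rows marked Convert are the files the conversion would hand to Word.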