    Convert HTML to Word

    Namespace: Clippit.Html

    Convert an XHTML document to a Word document, with configurable CSS cascading and page layout settings.

    public class HtmlToWmlConverter {
        public static WmlDocument ConvertHtmlToWml(
            string defaultCss, string authorCss, string userCss,
            XElement xhtml, HtmlToWmlConverterSettings settings)
        {...}
    
        public static WmlDocument ConvertHtmlToWml(
            string defaultCss, string authorCss, string userCss,
            XElement xhtml, HtmlToWmlConverterSettings settings,
            WmlDocument emptyDocument, string annotatedHtmlDumpFileName)
        {...}
    
        public static HtmlToWmlConverterSettings GetDefaultSettings()
        {...}
    
        public static string CleanUpCss(string css)
        {...}
    }
    

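    The converter takes XHTML as an XElement and returns a WmlDocument. Three style sheets are applied in cascade order: default (browser-like defaults), author (the document's own styles), and user (overrides). GetDefaultSettings() returns a pre-configured HtmlToWmlConverterSettings with default page layout values; individual fields can be changed before the settings are passed to ConvertHtmlToWml. The second overload additionally takes a WmlDocument (emptyDocument) and a file name (annotatedHtmlDumpFileName).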

    HtmlToWmlConverterSettings

    Field                                       Type      Description
    MajorLatinFont                              string    Major (heading) Latin font
    MinorLatinFont                              string    Minor (body) Latin font
    DefaultFontSize                             double    Default font size
    DefaultSpacingElement                       XElement  Default paragraph spacing
    DefaultSpacingElementForParagraphsInTables  XElement  Default spacing in tables
    SectPr                                      XElement  Section properties (page size, margins)
    DefaultBlockContentMargin                   string    Default margin for block content
    BaseUriForImages                            string    Base URI for resolving image paths

    The settings also expose read-only properties derived from SectPr: PageWidthTwips, PageMarginLeftTwips, PageMarginRightTwips, PageWidthEmus, PageMarginLeftEmus, PageMarginRightEmus.
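The relationship between a section-properties element and the twip and EMU values can be sketched with System.Xml.Linq alone. This is a minimal illustration, not the library's implementation: the sectPr element below is a hypothetical US Letter page with one-inch side margins, and ReadTwips and TwipsToEmus are helper names invented for this example. It relies only on the unit definitions (1 inch = 1440 twips = 914400 EMUs).

```csharp
using System;
using System.Xml.Linq;

// WordprocessingML main namespace (the "w" prefix in document.xml).
XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

// Hypothetical section properties: US Letter, 1-inch left and right margins.
var sectPr = new XElement(w + "sectPr",
    new XElement(w + "pgSz",
        new XAttribute(w + "w", 12240),
        new XAttribute(w + "h", 15840)),
    new XElement(w + "pgMar",
        new XAttribute(w + "left", 1440),
        new XAttribute(w + "right", 1440)));

long pageWidthTwips = ReadTwips(sectPr, w + "pgSz", w + "w");
long marginLeftTwips = ReadTwips(sectPr, w + "pgMar", w + "left");
long marginRightTwips = ReadTwips(sectPr, w + "pgMar", w + "right");

// Width available for content between the side margins.
long printableTwips = pageWidthTwips - marginLeftTwips - marginRightTwips;

Console.WriteLine($"Page width:      {pageWidthTwips} twips = {TwipsToEmus(pageWidthTwips)} EMUs");
Console.WriteLine($"Printable width: {printableTwips} twips = {TwipsToEmus(printableTwips)} EMUs");

// 914400 EMUs per inch / 1440 twips per inch = 635 EMUs per twip.
static long TwipsToEmus(long twips) => twips * 635;

// Illustrative helper: reads an integer attribute from a child of sectPr.
static long ReadTwips(XElement sectPr, XName child, XName attribute) =>
    (long?)sectPr.Element(child)?.Attribute(attribute)
        ?? throw new InvalidOperationException(
            $"Missing {child.LocalName}/@{attribute.LocalName}");
```

For this page the printable width is 9360 twips (6.5 inches). Which of these values the converter uses internally, for example when sizing tables or images, is not stated here.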

    Static Properties

    Property       Type         Description
    EmptyDocument  WmlDocument  A minimal empty Word document used as a template

    HtmlToWmlConverter Sample

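    using System.IO;
    using System.Xml.Linq;
    using Clippit.Html;

    // Start from the default settings and set where relative image paths resolve.
    var settings = HtmlToWmlConverter.GetDefaultSettings();
    settings.BaseUriForImages = "/images/";
    
    // The three style sheets of the cascade; the user sheet is left empty here.
    var defaultCss = File.ReadAllText("default.css");
    var authorCss = File.ReadAllText("styles.css");
    var userCss = "";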
    
    var xhtml = XElement.Parse(@"
    <html xmlns='http://www.w3.org/1999/xhtml'>
    <head><title>Sample</title></head>
    <body>
        <h1>Hello World</h1>
        <p>This is a <strong>sample</strong> document converted from HTML.</p>
        <table>
            <tr><td>Cell 1</td><td>Cell 2</td></tr>
            <tr><td>Cell 3</td><td>Cell 4</td></tr>
        </table>
    </body>
    </html>");
    
    var doc = HtmlToWmlConverter.ConvertHtmlToWml(
        defaultCss, authorCss, userCss, xhtml, settings);
    doc.SaveAs("output.docx");
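Because the converter takes an XElement, the markup has to be well-formed XML before it reaches ConvertHtmlToWml; ordinary HTML with unclosed tags such as <br> will not parse. The following sketch shows one way to check that up front with System.Xml.Linq. IsWellFormed is a hypothetical helper written for this example and is not part of the library.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// HTML-style markup: <br> is never closed, so this is not well-formed XML.
string html = "<html><body><p>Line one<br>Line two</p></body></html>";

// XHTML-style markup: the empty element is self-closed.
string xhtml = "<html><body><p>Line one<br/>Line two</p></body></html>";

Console.WriteLine(IsWellFormed(html, out var htmlError));    // False
Console.WriteLine(htmlError);                                // parser message with line/position
Console.WriteLine(IsWellFormed(xhtml, out var xhtmlError));  // True

// Illustrative helper: reports whether XElement.Parse accepts the markup.
static bool IsWellFormed(string markup, out string error)
{
    try
    {
        XElement.Parse(markup);
        error = "";
        return true;
    }
    catch (XmlException ex)
    {
        // XElement.Parse throws XmlException for markup that is not well-formed.
        error = ex.Message;
        return false;
    }
}
```

Markup that fails this check would need to be tidied into XHTML by a separate tool before conversion.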
    