Convert Word to HTML
Namespace: Clippit.Word
Convert a Word document to an HTML XElement, with configurable CSS generation and image handling.
```csharp
public static class WmlToHtmlConverter {
    public static XElement ConvertToHtml(
        WmlDocument doc, WmlToHtmlConverterSettings htmlConverterSettings)
    {...}
    public static XElement ConvertToHtml(
        WordprocessingDocument wordDoc, WmlToHtmlConverterSettings htmlConverterSettings)
    {...}
}
```
The converter produces a complete HTML document as an XElement (XHTML). It generates CSS classes
for paragraph and character styles, handles numbering/lists, and processes images through a
configurable ImageHandler callback.
An extension method is also available directly on WmlDocument:
```csharp
WmlDocument doc = new WmlDocument("input.docx");
XElement html = doc.ConvertToHtml(settings);
```
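Because the result is a plain `XElement`, writing it to disk is left to the caller. The following is a minimal sketch of one way to serialize it, assuming you want an HTML5 doctype and no pretty-printing (indentation added between inline elements such as adjacent `span`s can render as extra spaces). `HtmlSerializer` and `ToHtmlString` are hypothetical names, not part of Clippit, and the hand-built element stands in for the converter output.

```csharp
using System;
using System.Xml.Linq;

// Stand-in for the XElement returned by ConvertToHtml (hypothetical content).
XNamespace xhtml = "http://www.w3.org/1999/xhtml";
var html = new XElement(xhtml + "html",
    new XElement(xhtml + "head", new XElement(xhtml + "title", "My Document")),
    new XElement(xhtml + "body", new XElement(xhtml + "p", "Hello")));

Console.WriteLine(HtmlSerializer.ToHtmlString(html));

// Hypothetical helper, not part of Clippit.
static class HtmlSerializer
{
    // Prepends a doctype and serializes without indentation, so no
    // whitespace is introduced between elements.
    public static string ToHtmlString(XElement html) =>
        "<!DOCTYPE html>\n" + html.ToString(SaveOptions.DisableFormatting);
}
```

The returned string can then be passed to `File.WriteAllText` in place of `html.ToString()`.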
WmlToHtmlConverterSettings
| Field | Type | Default |
|---|---|---|
| `PageTitle` | `string` | `""` |
| `CssClassPrefix` | `string` | `"pt-"` |
| `FabricateCssClasses` | `bool` | `true` |
| `GeneralCss` | `string` | `"span { white-space: pre-wrap; }"` |
| `AdditionalCss` | `string` | `""` |
| `RestrictToSupportedLanguages` | `bool` | `false` |
| `RestrictToSupportedNumberingFormats` | `bool` | `false` |
| `ImageHandler` | `Func<ImageInfo, XElement>` | `null` |
ImageInfo
The ImageHandler callback receives an ImageInfo object for each image in the document:
| Field | Type | Description |
|---|---|---|
| `Image` | `SkiaSharp.SKBitmap` | The decoded image |
| `ImgStyleAttribute` | `XAttribute` | The computed style attribute (width/height) |
| `ContentType` | `string` | The image MIME type |
| `DrawingElement` | `XElement` | The source OpenXml drawing element |
| `AltText` | `string` | Alternative text for the image |
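As an alternative to inlining image data, a handler could write each image to a file and return an `img` element that links to it. The sketch below covers only the element-building half, using values shaped like the `ImageInfo` fields above. `ImageLinks`, `ExtensionFor`, and `BuildImg` are hypothetical names, the folder and index parameters are assumptions, and the XHTML namespace is declared locally so the block depends only on `System.Xml.Linq`.

```csharp
using System;
using System.Xml.Linq;

// Demo with hypothetical values standing in for ImageInfo fields.
var style = new XAttribute("style", "width: 2in; height: 1in");
Console.WriteLine(ImageLinks.BuildImg("images", 1, "image/jpeg", style, null));

// Hypothetical helper, not part of Clippit.
static class ImageLinks
{
    static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    // JPEG sources keep a .jpg extension; every other MIME type falls back to .png.
    public static string ExtensionFor(string contentType) =>
        contentType == "image/jpeg" ? "jpg" : "png";

    // Builds <img src="folder/imageN.ext" style="..." alt="..."/>.
    public static XElement BuildImg(
        string folder, int index, string contentType, XAttribute style, string altText)
    {
        var src = $"{folder}/image{index}.{ExtensionFor(contentType)}";
        return new XElement(Xhtml + "img",
            new XAttribute("src", src),
            style,                                   // null is ignored by XElement
            new XAttribute("alt", altText ?? ""));
    }
}
```

Inside a real `ImageHandler`, the caller would also encode `imageInfo.Image` in the matching format and save it under the same path before returning the element. That step is left out here.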
WmlToHtmlConverter Sample
```csharp
using System;
using System.IO;
using System.Xml.Linq;
using Clippit;        // assumed namespace for WmlDocument and Xhtml
using Clippit.Word;   // WmlToHtmlConverter, WmlToHtmlConverterSettings
using SkiaSharp;

var doc = new WmlDocument("input.docx");
var settings = new WmlToHtmlConverterSettings
{
    PageTitle = "My Document",
    CssClassPrefix = "doc-",
    FabricateCssClasses = true,
    AdditionalCss = "body { font-family: Calibri, sans-serif; }",
    ImageHandler = imageInfo =>
    {
        // Convert images to inline base64 data URIs
        using var stream = new MemoryStream();
        using var image = SKImage.FromBitmap(imageInfo.Image);
        if (image == null) return null;
        using var data = image.Encode(SKEncodedImageFormat.Png, quality: 80);
        if (data == null) return null;
        data.SaveTo(stream);
        var base64 = Convert.ToBase64String(stream.ToArray());
        var imgElement = new XElement(
            Xhtml.img,
            imageInfo.ImgStyleAttribute,
            new XAttribute("src", $"data:image/png;base64,{base64}"),
            new XAttribute("alt", imageInfo.AltText ?? "")
        );
        return imgElement;
    }
};
var html = WmlToHtmlConverter.ConvertToHtml(doc, settings);
File.WriteAllText("output.html", html.ToString());
```
> [!NOTE]
> The `ImageHandler` callback uses SkiaSharp types (`SKImage`, `SKEncodedImageFormat`).
> Add `using SkiaSharp;` at the top of your file to use them.