Document Metrics
Namespace: Clippit
Analyze Word, Excel, and PowerPoint documents and return detailed metrics as XML.
public class MetricsGetter {
    public static XElement GetMetrics(string fileName, MetricsGetterSettings settings)
    {...}
    public static XElement GetDocxMetrics(WmlDocument wmlDoc, MetricsGetterSettings settings)
    {...}
    public static XElement GetXlsxMetrics(SmlDocument smlDoc, MetricsGetterSettings settings)
    {...}
    public static XElement GetPptxMetrics(PmlDocument pmlDoc, MetricsGetterSettings settings)
    {...}
}
MetricsGetter inspects Office documents and produces an XElement containing metrics such as
paragraph counts, character counts (by script), run counts, content control details, embedded
objects, hyperlinks, revision tracking status, namespaces, and validation errors.
The GetMetrics method auto-detects the file type by extension and delegates to
GetDocxMetrics, GetXlsxMetrics, or GetPptxMetrics.
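The extension-based dispatch described above can be pictured with a minimal sketch. This is not Clippit's implementation: `OfficeKind` and `MetricsDispatchSketch` are hypothetical names introduced for illustration, and the set of extensions Clippit actually accepts may be wider than the three shown here.

```csharp
using System;
using System.IO;

// Hypothetical type, for illustration only; not part of Clippit.
public enum OfficeKind { Word, Excel, PowerPoint, Unknown }

// Hypothetical helper showing one way to pick a handler from a file extension.
public static class MetricsDispatchSketch
{
    public static OfficeKind Classify(string fileName)
    {
        // Path.GetExtension returns the extension with its leading dot,
        // or an empty string when the name has no extension.
        var ext = Path.GetExtension(fileName).ToLowerInvariant();

        return ext switch
        {
            ".docx" => OfficeKind.Word,       // would be handled by GetDocxMetrics
            ".xlsx" => OfficeKind.Excel,      // would be handled by GetXlsxMetrics
            ".pptx" => OfficeKind.PowerPoint, // would be handled by GetPptxMetrics
            _ => OfficeKind.Unknown           // assumption: unsupported types are reported, not guessed
        };
    }
}
```

Because the decision depends only on the file name, a file with a misleading extension would be routed to the wrong handler. When the document type is already known, calling the type-specific method directly avoids that ambiguity.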
MetricsGetterSettings
| Property | Type | Default | Description |
|---|---|---|---|
| IncludeTextInContentControls | bool | false | Include text content of content controls in output |
| IncludeXlsxTableCellData | bool | false | Include cell data from Excel tables |
| RetrieveNamespaceList | bool | false | Include list of namespaces used in the document |
| RetrieveContentTypeList | bool | false | Include list of content types of document parts |
GetMetrics Sample
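using System;
using Clippit;

// Request the optional namespace and content type lists in addition to the default metrics.
var settings = new MetricsGetterSettings
{
    RetrieveNamespaceList = true,
    RetrieveContentTypeList = true
};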
var metrics = MetricsGetter.GetMetrics("document.docx", settings);
Console.WriteLine(metrics.ToString());
Output includes elements such as:
- ParagraphCount, RunCount, AsciiCharCount, EastAsiaCharCount
- RevisionTracking (whether the document has tracked revisions)
- ContentControls (if present)
- Namespaces (if RetrieveNamespaceList is true)
- Validation errors from the OpenXml SDK validator
GetDocxMetrics Sample
var wmlDoc = new WmlDocument("report.docx");
var settings = new MetricsGetterSettings
{
    IncludeTextInContentControls = true
};
var metrics = MetricsGetter.GetDocxMetrics(wmlDoc, settings);
// Extract specific values
var paragraphCount = (int?)metrics.Element("ParagraphCount");
var hasRevisions = (bool?)metrics.Element("RevisionTracking")?.Attribute("Val");
Console.WriteLine($"Paragraphs: {paragraphCount}, Has revisions: {hasRevisions}");
GetXlsxMetrics Sample
var smlDoc = new SmlDocument("data.xlsx");
var settings = new MetricsGetterSettings
{
    IncludeXlsxTableCellData = true
};
var metrics = MetricsGetter.GetXlsxMetrics(smlDoc, settings);
Console.WriteLine(metrics.ToString());
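GetPptxMetrics Sample

The class also exposes GetPptxMetrics, so a presentation can presumably be analyzed the same way as the Word and Excel samples. The sketch below assumes that PmlDocument can be constructed from a file name, as WmlDocument and SmlDocument are in the other samples; the file name "slides.pptx" is a placeholder.

```csharp
using System;
using Clippit;

// Assumption: PmlDocument accepts a file name, mirroring WmlDocument and SmlDocument.
var pmlDoc = new PmlDocument("slides.pptx");

var settings = new MetricsGetterSettings
{
    // Ask for the content types of the package parts as well.
    RetrieveContentTypeList = true
};

var metrics = MetricsGetter.GetPptxMetrics(pmlDoc, settings);
Console.WriteLine(metrics.ToString());
```

The elements in the returned XML are not listed in this section, so printing the whole XElement first is a reasonable way to see what is available before querying specific values.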