
    Document Metrics

    Namespace: Clippit

    Analyze Word, Excel, and PowerPoint documents and return detailed metrics as XML.

    public class MetricsGetter {
        public static XElement GetMetrics(string fileName, MetricsGetterSettings settings)
        {...}
    
        public static XElement GetDocxMetrics(WmlDocument wmlDoc, MetricsGetterSettings settings)
        {...}
    
        public static XElement GetXlsxMetrics(SmlDocument smlDoc, MetricsGetterSettings settings)
        {...}
    
        public static XElement GetPptxMetrics(PmlDocument pmlDoc, MetricsGetterSettings settings)
        {...}
    }
    

    MetricsGetter inspects Office documents and produces an XElement containing metrics such as paragraph counts, character counts (by script), run counts, content control details, embedded objects, hyperlinks, revision tracking status, namespaces, and validation errors.

    The GetMetrics method auto-detects the file type by extension and delegates to GetDocxMetrics, GetXlsxMetrics, or GetPptxMetrics.
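
    The extension-based dispatch described above can be pictured with a small, self-contained sketch. It does not call Clippit and is not the library's actual implementation: ResolveMetricsMethod is a hypothetical helper, and it only covers the three extensions named on this page. How GetMetrics treats other extensions is not shown here.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of Clippit): maps a file name to the name of
// the MetricsGetter method that would handle it. Assumes only .docx, .xlsx,
// and .pptx are of interest.
static string ResolveMetricsMethod(string fileName)
{
    // Path.GetExtension returns the extension with its leading dot,
    // or an empty string when the name has no extension.
    var extension = Path.GetExtension(fileName).ToLowerInvariant();
    return extension switch
    {
        ".docx" => "GetDocxMetrics",
        ".xlsx" => "GetXlsxMetrics",
        ".pptx" => "GetPptxMetrics",
        _ => "unsupported",
    };
}

Console.WriteLine(ResolveMetricsMethod("report.docx")); // GetDocxMetrics
Console.WriteLine(ResolveMetricsMethod("DATA.XLSX"));   // GetXlsxMetrics
Console.WriteLine(ResolveMetricsMethod("notes.txt"));   // unsupported
```

    Lower-casing the extension before matching is a choice made for this sketch, so that names such as DATA.XLSX resolve the same way as data.xlsx.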

    MetricsGetterSettings

    Property                      Type  Default  Description
    IncludeTextInContentControls  bool  false    Include text content of content controls in output
    IncludeXlsxTableCellData      bool  false    Include cell data from Excel tables
    RetrieveNamespaceList         bool  false    Include list of namespaces used in the document
    RetrieveContentTypeList       bool  false    Include list of content types of document parts

    GetMetrics Sample

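    using System;
    using Clippit;

    // Opt in to the namespace and content type lists (both are off by default).
    var settings = new MetricsGetterSettings
    {
        RetrieveNamespaceList = true,
        RetrieveContentTypeList = true
    };
    
    // GetMetrics picks the document type from the file extension.
    var metrics = MetricsGetter.GetMetrics("document.docx", settings);
    Console.WriteLine(metrics.ToString());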
    

    Output includes elements such as:

    • ParagraphCount, RunCount, AsciiCharCount, EastAsiaCharCount
    • RevisionTracking (whether the document has tracked revisions)
    • ContentControls (if present)
    • Namespaces (if RetrieveNamespaceList is true)
    • Validation errors reported by the Open XML SDK validator
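
    To see which of these elements a given result actually contains, the returned XElement can be inspected with LINQ to XML. The sketch below is self-contained and uses a hand-built element as a stand-in for real GetMetrics output. The root name "Metrics", the sample values, and the ChildNames helper are assumptions for illustration only.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for a GetMetrics result. Real output contains more
// elements, and its exact layout is not specified on this page.
var metrics = new XElement("Metrics",
    new XElement("ParagraphCount", 12),
    new XElement("RunCount", 30),
    new XElement("RevisionTracking", new XAttribute("Val", false)));

// Hypothetical helper: distinct local names of the top-level child elements.
static string[] ChildNames(XElement root) =>
    root.Elements().Select(e => e.Name.LocalName).Distinct().ToArray();

var names = ChildNames(metrics);
Console.WriteLine(string.Join(", ", names));
// prints "ParagraphCount, RunCount, RevisionTracking"
```

    With a real result, the same helper gives a quick overview of what was produced before any specific value is read.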

    GetDocxMetrics Sample

    // Load the Word document from disk.
    var wmlDoc = new WmlDocument("report.docx");
    var settings = new MetricsGetterSettings
    {
        // Include the text of each content control in the result.
        IncludeTextInContentControls = true
    };
    
    var metrics = MetricsGetter.GetDocxMetrics(wmlDoc, settings);
    
    // Extract specific values
    var paragraphCount = (int?)metrics.Element("ParagraphCount");
    var hasRevisions = (bool?)metrics.Element("RevisionTracking")?.Attribute("Val");
    Console.WriteLine($"Paragraphs: {paragraphCount}, Has revisions: {hasRevisions}");
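
    The nullable casts used above return null when the element or attribute is absent, which is standard LINQ to XML behavior. The following self-contained sketch shows that behavior on a hand-built element. The root name "Metrics", the sample value, and the fallback strings are assumptions, and RevisionTracking is left out on purpose.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical stand-in for a metrics result with no RevisionTracking element.
var metrics = new XElement("Metrics",
    new XElement("ParagraphCount", 12));

// The cast parses the element value; a missing element yields null.
int? paragraphCount = (int?)metrics.Element("ParagraphCount");

// Element(...) returns null here, so ?. skips Attribute and the cast yields null.
bool? hasRevisions = (bool?)metrics.Element("RevisionTracking")?.Attribute("Val");

Console.WriteLine($"Paragraphs: {paragraphCount?.ToString() ?? "n/a"}");
Console.WriteLine($"Has revisions: {hasRevisions?.ToString() ?? "unknown"}");
// prints "Paragraphs: 12" and "Has revisions: unknown"
```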
    

    GetXlsxMetrics Sample

    // Load the Excel workbook from disk.
    var smlDoc = new SmlDocument("data.xlsx");
    var settings = new MetricsGetterSettings
    {
        // Include cell data from Excel tables in the result.
        IncludeXlsxTableCellData = true
    };
    
    var metrics = MetricsGetter.GetXlsxMetrics(smlDoc, settings);
    Console.WriteLine(metrics.ToString());
    