Metadata Extraction

Overview

The API automatically extracts metadata from every document it processes. No additional API calls or configuration are required to obtain it.

Types of Metadata Extracted

Document Properties

Standard document metadata fields extracted from all file types:

Title: Document title from metadata or content analysis
Author: Document creator information
Subject: Document subject or description
Keywords: Extracted keywords and tags
Creation Date: When the document was originally created
Modification Date: Last modification timestamp
Application: Software used to create the document
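
To illustrate how a client might hold these fields, here is a minimal sketch. It assumes the properties arrive as a dictionary with snake_case keys and ISO 8601 date strings. The key names, the `DocumentProperties` class, and the `parse_properties` helper are all hypothetical and are not taken from the API reference.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DocumentProperties:
    """Illustrative container; field names are assumptions, not the API schema."""
    title: Optional[str] = None
    author: Optional[str] = None
    subject: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    creation_date: Optional[datetime] = None
    modification_date: Optional[datetime] = None
    application: Optional[str] = None


def _parse_date(value: Optional[str]) -> Optional[datetime]:
    # Assumes ISO 8601 strings such as "2024-01-15T09:30:00+00:00".
    return datetime.fromisoformat(value) if value else None


def parse_properties(payload: dict) -> DocumentProperties:
    """Build DocumentProperties from a dict, tolerating missing fields."""
    return DocumentProperties(
        title=payload.get("title"),
        author=payload.get("author"),
        subject=payload.get("subject"),
        keywords=list(payload.get("keywords") or []),
        creation_date=_parse_date(payload.get("creation_date")),
        modification_date=_parse_date(payload.get("modification_date")),
        application=payload.get("application"),
    )
```

Every field defaults to empty because a document may not carry a given property. Treating missing values as `None` keeps downstream code from failing on sparse metadata.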

Content Analysis

Metrics and classifications derived from the document content:

Page Count: Total number of pages
Word Count: Estimated word count
Character Count: Total character count including spaces
Language Detection: Primary language(s) detected in the document
Content Type: Classification of document type (report, article, form, etc.)
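
Because the word count is described as an estimate, it can be useful to compare it against a local count. The sketch below shows one simple way to do that for plain text. The whitespace-based tokenization and the 10% tolerance are assumptions made for illustration; the method the API uses to estimate counts is not documented here.

```python
def count_words(text: str) -> int:
    # Counts whitespace-delimited tokens; an assumption, not the API's method.
    return len(text.split())


def count_characters(text: str) -> int:
    # Counts every character, including spaces, as the field description states.
    return len(text)


def within_tolerance(reported: int, local: int, tolerance: float = 0.10) -> bool:
    """Return True if the reported count is within a relative tolerance of the local one."""
    return abs(reported - local) <= tolerance * max(local, 1)
```

For example, `count_words("Metadata is extracted automatically.")` returns 4 and `count_characters("two words")` returns 9.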

Technical Specifications

File format and technical details:

File Size: Document size in bytes
Format Version: File format and, where applicable, its version (e.g., PDF 1.7, DOCX)
Encoding: Text encoding used in the document
Compression: Compression methods applied
Security: Password protection or encryption status
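
One possible use of these fields is a pre-check before further processing. The following sketch assumes the technical specifications arrive as a dictionary with a `file_size` integer and a `security` object holding `password_protected` and `encrypted` flags. These key names and the `can_process` helper are hypothetical.

```python
from typing import Tuple


def can_process(tech: dict, max_bytes: int) -> Tuple[bool, str]:
    """Return (ok, reason) for a hypothetical technical-specifications dict."""
    security = tech.get("security") or {}
    # "password_protected" and "encrypted" are assumed flag names.
    if security.get("password_protected") or security.get("encrypted"):
        return False, "document is protected"
    size = tech.get("file_size")
    # File size is documented as bytes, so the limit is also in bytes.
    if size is not None and size > max_bytes:
        return False, "file exceeds size limit"
    return True, "ok"
```

Returning a reason alongside the boolean makes it straightforward to log why a document was skipped.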

Structure Information

Document layout and formatting analysis:

Headings: Hierarchy of document headings and sections
Tables: Number and structure of tables detected
Images: Count and basic properties of embedded images
Links: Internal and external links found in the document
Formatting: Text styling and layout information
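
The heading hierarchy lends itself to building an outline. As a rough sketch, assume the headings arrive as an ordered list of entries with a numeric `level` (1 for top-level) and a `text` string. That shape and the `render_outline` helper are assumptions for illustration only.

```python
from typing import List


def render_outline(headings: List[dict], indent: str = "  ") -> str:
    """Render headings as an indented plain-text outline."""
    lines = []
    for heading in headings:
        # Level 1 gets no indentation; each deeper level adds one indent unit.
        depth = max(int(heading["level"]) - 1, 0)
        lines.append(indent * depth + heading["text"])
    return "\n".join(lines)
```

Given two headings, "Overview" at level 1 and "Scope" at level 2, this produces "Overview" on the first line and "Scope" indented by two spaces on the second.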

Related Information

File Formats: Supported document types
Best Practices: Optimization guidelines
API Reference: Technical metadata field specifications
