This paper introduces document normalization and considers whether a controlled document authoring system can be used in reverse mode to normalize legacy documents. A paradigm for deep content analysis using such a system is proposed, and an architecture for a document normalization system is described.
