CAT-Tools/OmegaT/User manual/Tagged (Formatted) Files Text Editing

Previous: Translation Memory and Glossary Files - Next: File Filters

Working with Tags
OmegaT displays the in-line formatting information of supported formats (at present HTML, XHTML, OpenDocument, and OpenOffice.org/StarOffice) as tags. The tags that are visible in OmegaT are not the same as those used by the original format; OmegaT converts them internally, and converts them back at compilation when the target documents are created.
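The round trip described above can be pictured with a minimal sketch. This is not OmegaT's actual code: the function names `to_placeholders` and `from_placeholders`, the regular expressions, and the letter-plus-number tag shape are assumptions made purely for illustration, and only simple, well-formed inline HTML tags are handled.

```python
import re

# Assumption: only simple inline tags such as <i>, </i> or <b class="k">.
HTML_TAG_RE = re.compile(r"<(/?)([a-zA-Z]+)([^<>]*)>")


def to_placeholders(text):
    """Replace each inline tag with a short numbered tag.

    Returns the converted text and a table mapping each numbered tag
    back to the original tag it stands for.
    """
    originals = {}   # numbered tag -> original tag
    open_stack = []  # numbers of tags that are still open
    counter = 0

    def replace(match):
        nonlocal counter
        closing, name = match.group(1), match.group(2).lower()
        if closing and open_stack:
            # A closing tag reuses the number of the most recent open tag.
            number = open_stack.pop()
        else:
            number = counter
            counter += 1
            if not closing:
                open_stack.append(number)
        placeholder = f"<{closing}{name[0]}{number}>"
        originals[placeholder] = match.group(0)
        return placeholder

    return HTML_TAG_RE.sub(replace, text), originals


def from_placeholders(text, originals):
    """Put the original tags back in place of the numbered tags."""
    return re.sub(
        r"</?[a-z]\d+>",
        lambda m: originals.get(m.group(0), m.group(0)),
        text,
    )


source = 'This is <i>italic</i> and <b class="k">bold</b>.'
converted, table = to_placeholders(source)
print(converted)  # This is <i0>italic</i0> and <b1>bold</b1>.

translated = "Ceci est <i0>italique</i0> et <b1>gras</b1>."
print(from_placeholders(translated, table))
```

The translator only ever sees the short numbered tags; the lookup table is what allows the original markup to be restored when the target document is created.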

Care must be exercised with tags. If they are accidentally changed, the formatting of the final file may be corrupted. In the worst case, the file may not open at all; this occurs with OpenOffice.org files in particular.

Although custom formatting cannot be performed from within OmegaT, it is possible to apply some degree of control over the existing format.

OpenDocument and OpenOffice.org File Tag Manipulation
OpenDocument and OpenOffice.org tags must be preserved as is. Tags come in pairs, an opening tag and a closing tag, and are numbered consecutively, starting from 1, in each segment. These tags cannot be deleted, duplicated, or reordered. However, it is possible to:


 * move tags to a different part of the segment, as long as they remain in the same order;
 * cancel tags by placing them next to each other.


For instance, if the tagged sentence

<x1>This text is in italics.</x1>

is changed to

<x1></x1>This text is in italics.

the italic formatting is effectively removed. (Here x stands for the letters of the actual tag.)
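The restrictions above amount to one condition: the target segment must carry exactly the same tags as the source, in the same order. A minimal sketch of such a check follows. It is not OmegaT's own code; the function names and the assumed tag shape (letters followed by a number, such as `<x1>` and `</x1>`) are illustrative only.

```python
import re

# Assumed tag shape: letters followed by a number, e.g. <x1> and </x1>.
TAG_RE = re.compile(r"</?[a-zA-Z]+\d+>")


def tag_sequence(segment):
    """Return the tags of a segment in the order they appear."""
    return TAG_RE.findall(segment)


def tags_preserved(source, target):
    """True if the target has exactly the source's tags, in the same order."""
    return tag_sequence(source) == tag_sequence(target)


source = "<x1>This text is in italics.</x1>"

# Tags moved next to each other: allowed, the formatting is cancelled.
print(tags_preserved(source, "<x1></x1>This text is in italics."))  # True

# Tags deleted: not allowed.
print(tags_preserved(source, "This text is in italics."))  # False

# Tag order changed: not allowed.
print(tags_preserved(source, "</x1>This text is in italics.<x1>"))  # False
```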

HTML/XHTML File Tag Manipulation
In HTML or XHTML, the beginning of formatting is marked by an opening tag of the form <xN>, and the end of formatting by a corresponding closing tag </xN> (where x represents the characters of a particular tag element and N is the tag's number). The two tags of a pair carry the same number, and numbering starts from 0 in each segment. Essentially, any change made to one tag of a pair must also be made to the other.

The following rules for tags are applicable:


 * Deletion is permitted. For example, to remove the italics from

   This text is in <x0>italics</x0>

   simply delete the tags <x0> and </x0>.


 * Duplication is permitted. For example, in

   This text is in <x0>italics</x0>, and so is <x0>this</x0>

   the formatting applied to "italics" is also applied to "this".


 * Change of order is permitted. For example,

   This text is in <x0>italics</x0>, this text is <x1>underlined</x1>

   can be changed to

   This text is <x1>underlined</x1>, this text is in <x0>italics</x0>


 * Nesting is permitted. For example,

   This text is in <x0>italics</x0>, this text is <x1>underlined</x1>

   can be changed to

   This text is in <x0>italics and <x1>this text is also underlined</x1></x0>


 * Overlap is not permitted. Tagging like this

   This text is in <x0>italics, this text is also <x1>underlined</x0>, this text is underlined but not in italics</x1>

   is not allowed and will corrupt the file. When an opening tag is followed by a second opening tag, the closing tag of the second must come before the closing tag of the first.
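The difference between nesting and overlap can be checked mechanically with a stack: every closing tag must match the most recently opened tag that is still open. The following is a minimal sketch of that idea, not a description of how OmegaT itself checks tags. The function name `is_properly_nested` and the assumed tag shape (letters followed by a number) are illustrative.

```python
import re

# Assumed tag shape: <x0> opens, </x0> closes (letters followed by a number).
TAG_RE = re.compile(r"<(/?)([a-zA-Z]+\d+)>")


def is_properly_nested(segment):
    """True if every closing tag matches the most recently opened tag."""
    stack = []
    for match in TAG_RE.finditer(segment):
        is_closing = match.group(1) == "/"
        name = match.group(2)
        if not is_closing:
            stack.append(name)
        elif not stack or stack.pop() != name:
            # Closing tag with nothing open, or closing the wrong tag.
            return False
    # Any tag still open at the end is also an error.
    return not stack


nested = "This text is in <x0>italics and <x1>this text is also underlined</x1></x0>"
overlap = ("This text is in <x0>italics, this text is also <x1>underlined</x0>, "
           "this text is underlined but not in italics</x1>")

print(is_properly_nested(nested))   # True
print(is_properly_nested(overlap))  # False
```

Duplicated pairs such as `<x0>italics</x0>, and so is <x0>this</x0>` also pass this check, since each pair is closed before the next one opens.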

Tag Validation
The Validate Tags function (Tools > Validate Tags) detects tags that have been changed, whether deliberately or by accident, and indicates the affected segments. Running it opens a dialog listing any suspected broken or bad tags in the document.

This function can be useful for tracking down errors in translated tagged text. A common case is an OpenDocument or OpenOffice.org file that will not open because of tag problems introduced during translation. Fixing the tags and creating the target documents again often remedies the problem.
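As a rough picture of what such a check involves, the sketch below compares the tags of each source segment with those of its translation and reports the segments where they differ. It is only an illustration under stated assumptions: the function name `find_changed_segments`, the list-of-pairs input, and the tag shape (letters followed by a number) are hypothetical and do not describe OmegaT's internals.

```python
import re

# Assumed tag shape: letters followed by a number, e.g. <x0> and </x0>.
TAG_RE = re.compile(r"</?[a-zA-Z]+\d+>")


def find_changed_segments(segments):
    """Report segments whose tags differ between source and target.

    `segments` is a list of (source, target) pairs. The result is a list
    of (segment number, source tags, target tags), numbered from 1.
    """
    report = []
    for number, (source, target) in enumerate(segments, start=1):
        source_tags = TAG_RE.findall(source)
        target_tags = TAG_RE.findall(target)
        if source_tags != target_tags:
            report.append((number, source_tags, target_tags))
    return report


segments = [
    ("<x0>Hello</x0> world", "<x0>Bonjour</x0> le monde"),
    ("Press <x0>Enter</x0>", "Appuyez sur <x0>Entrée"),  # closing tag lost
]

for number, source_tags, target_tags in find_changed_segments(segments):
    print(f"Segment {number}: source {source_tags}, target {target_tags}")
```

A report of this kind flags deliberate changes as well as accidental ones, so each listed segment still has to be reviewed by the translator.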
