Overriding the default behavior of the KantanMT Gentry file parser is simple.
Upload your custom Gentry rule file alongside your Client Files for translation.
A Gentry rule file is made up of the following elements:
- <rule>: This is the main element in the file. It defines all of the required extraction and insertion rules.
- <root>: Roots define the elements of your files that you want to extract. They can contain limited RegEx and XPath expressions.
- <gextractRule>: The rule used to identify what to extract from a <root> element.
- <gextractOutputRule>: The formatter rule used to output the matched text.
- <ginsertRule>: The rule used to match the formatted text and re-insert it into the file.
- <roots>: This element contains the individual extraction rules.
- <regex>: This element contains the text processing rules for each matched <root>.
Note: In most cases you only need to modify the <ginsertRule>. The <gextractRule> and <gextractOutputRule> rarely need to be changed.
You can create Gentry rule files using any text editor by following these simple steps:
- Download our sample Gentry rule file (attached below)
- Open with your favorite text editor
- Modify the elements to suit your requirements
- Save the file under the .rul file name defined for that file extension (see table below). Note: ensure the file is saved with UTF-8 encoding.
- Upload your custom .rul file alongside your Client Files for translation
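If you want to confirm the encoding before uploading, the UTF-8 requirement from the steps above can be checked with a short script. This is a minimal sketch, not part of KantanMT or Gentry: the helper names `is_valid_utf8` and `save_rule_file` are hypothetical and use only the Python standard library.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def save_rule_file(path: str, text: str) -> None:
    """Write rule file text to disk, explicitly encoded as UTF-8.

    Hypothetical helper: it only controls the encoding and does not
    validate the Gentry rule syntax in any way.
    """
    Path(path).write_text(text, encoding="utf-8")
```

For example, `is_valid_utf8(Path("docx.rul").read_bytes())` reports whether an existing rule file can be read as UTF-8. Passing the encoding explicitly avoids relying on the platform default when saving.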
You can use some of the examples provided in our examples.zip file to see what sort of modifications you can make to Gentry rule files.
File extensions and their related Gentry file names
File Extension | Gentry File Name |
---|---|
.dita | dita.rul |
.docx | docx.rul |
.htm[l] | xhtml.rul |
.idml | idml.rul |
.inx | indesign.rul |
.mqxliff | mqxliff.rul |
.mxliff | mxliff.rul |
.odt | odt.rul |
.php | xhtml.rul |
.sdlxliff | sdl-xlf.rul |
.svg | svg.rul |
.tmx | tmx.rul |
.ttx | ttx.rul |
.txml | txml.rul |
.xlf | ws-xlf.rul |
.xliff | ws-xlf.rul |
.xlsx | xlsx.rul |
.xml | defaultxml.rul |
- doctype: novdoc | novdoc.rul |
- doctype: svg | svg.rul |
- doctype: serviceanleitung | abortext.rul |
- doctype: montageanleitung | montag.rul |
- doctype: ditabase.dtd | dita.rul |
- doctype: cellavision.dtd | cellavision.rul |
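
The table above can also be expressed as a simple lookup, which may help when preparing rule files for many Client Files at once. This is an illustrative sketch only: the function `rule_file_name` is hypothetical, `.htm[l]` is read as covering both `.htm` and `.html`, and falling back to `defaultxml.rul` for an unlisted doctype is an assumption rather than documented behavior.

```python
import os

# Extension-to-rule-file mapping copied from the table above.
RULE_FILES = {
    ".dita": "dita.rul",
    ".docx": "docx.rul",
    ".htm": "xhtml.rul",
    ".html": "xhtml.rul",
    ".idml": "idml.rul",
    ".inx": "indesign.rul",
    ".mqxliff": "mqxliff.rul",
    ".mxliff": "mxliff.rul",
    ".odt": "odt.rul",
    ".php": "xhtml.rul",
    ".sdlxliff": "sdl-xlf.rul",
    ".svg": "svg.rul",
    ".tmx": "tmx.rul",
    ".ttx": "ttx.rul",
    ".txml": "txml.rul",
    ".xlf": "ws-xlf.rul",
    ".xliff": "ws-xlf.rul",
    ".xlsx": "xlsx.rul",
    ".xml": "defaultxml.rul",
}

# Doctype-specific rule files listed under .xml in the table.
XML_DOCTYPE_RULE_FILES = {
    "novdoc": "novdoc.rul",
    "svg": "svg.rul",
    "serviceanleitung": "abortext.rul",
    "montageanleitung": "montag.rul",
    "ditabase.dtd": "dita.rul",
    "cellavision.dtd": "cellavision.rul",
}


def rule_file_name(filename, doctype=None):
    """Return the Gentry rule file name for a Client File, or None if unlisted."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".xml" and doctype is not None:
        # Assumption: unlisted doctypes fall back to the default XML rule file.
        return XML_DOCTYPE_RULE_FILES.get(doctype, RULE_FILES[".xml"])
    return RULE_FILES.get(ext)
```

For example, `rule_file_name("manual.docx")` gives `docx.rul`, and `rule_file_name("guide.xml", doctype="novdoc")` gives `novdoc.rul`.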