Gap Analysis and Terminology in BuildAnalytics


Gap Analysis is a list of unknown words: words for which your MT engine does not yet have a translation. You can use the Gap Analysis report to quickly add vital terminology to your engine and improve its unique word count.

Gap Analysis can be found in BuildAnalytics. Each unknown word is listed beside the full segment in which it occurs (the context, taken from your training data) and your KantanMT engine's attempted translation of that segment.

To view an Excel document with this information, click Download. Have a linguist check and translate the list. There may be false positives in the list, that is, words that are supposed to remain the same as in the source language. A linguist or terminologist will be able to confirm this.

Upload the translated words to the Training tab (an Excel spreadsheet is fastest). Any words that should remain untranslated can be listed in a simple text file named ignorewords.txt and uploaded to the Training tab. You can also add brands in an Excel spreadsheet: put the source in column A and the target in column B, name the file brands.xlsx, and upload it to the Training tab.
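
If your linguist returns the reviewed list as source/target pairs, sorting it into translated terms and untranslated words can be scripted. The following is a minimal Python sketch, not a KantanMT tool: the function names and sample words are hypothetical, and it assumes that a false positive is marked by an empty target or a target identical to the source, and that ignorewords.txt holds one word per line (the article only says it is a simple text file).

```python
from pathlib import Path


def split_reviewed_terms(rows):
    """Split reviewed (source, target) pairs into translations and ignore words.

    Assumption: a row whose target is empty or identical to the source is a
    false positive, i.e. a word that should stay untranslated.
    """
    translations, ignore_words = [], []
    for source, target in rows:
        source, target = source.strip(), target.strip()
        if not source:
            continue  # skip blank rows
        if not target or target == source:
            ignore_words.append(source)
        else:
            translations.append((source, target))
    return translations, ignore_words


def write_ignore_words(words, directory="."):
    """Write ignorewords.txt with one word per line (assumed layout)."""
    path = Path(directory) / "ignorewords.txt"
    path.write_text("\n".join(words) + "\n", encoding="utf-8")
    return path


# Hypothetical reviewed list, as it might come back from a linguist.
reviewed = [
    ("gearbox", "Getriebe"),
    ("KantanMT", "KantanMT"),  # brand name, stays as in the source
    ("torque", "Drehmoment"),
    ("API", ""),               # left blank: keep untranslated
]
translations, ignore_words = split_reviewed_terms(reviewed)
```

Calling write_ignore_words(ignore_words) then produces the file to upload to the Training tab. The translated pairs still need to be saved as a spreadsheet with the source in column A and the target in column B, which is easiest to do in Excel itself.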

You will need to rebuild the engine to fully train your terminology into it. As a quick fix for individual words, you can upload a terminology file to the Translation tab. More information can be found here.
