The ATA Chronicle - September/October 2023 - 41

The advantage of using
Opus-CAT is that it can
be fine-tuned and runs
locally on your machine,
and no data aside from
the "raw" (i.e., trained but
not fine-tuned) language
models downloaded from
the Opus-CAT repository is
exchanged online. Thus, data
privacy is ensured if you are
working with intellectual
property texts, texts falling
under the Health Insurance
Portability and Accountability
Act, or texts containing
other personally identifying
information. This is not
always the case when using
other translation services
(read the fine print!). With
Opus-CAT, you can turn off
the internet, and it still works
just fine, unless, of course,
you're using it as a plugin
with a cloud-based CAT tool.
In the following, I'll
explain how to prepare your
TM file and then fine-tune
the Opus-CAT engine with
that file. For the visual types,
I made a video with detailed
explanations, which you can
watch here: bit.ly/opuscat.
Translation Memory Preparation
To start, you will need to
collect all your TM data
into one file. As previously
mentioned, the data should
be in a relatively narrow
subject area. You then export
that TM as a TMX file. TMX is
short for Translation Memory
eXchange, and all reasonable
CAT tools are capable of
handling this file format.
If you are lucky, your TMX
file can be read by Opus-CAT
without any problems and
you can proceed straight to
the next step, the fine-tuning.
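One way to find out ahead of time whether a TMX file is likely to load cleanly is to run it through a strict XML parser. The following is a minimal sketch in Python and is not part of Opus-CAT. The function name check_tmx and the file name my_memory.tmx are made up for illustration, and Opus-CAT's own import may be more or less tolerant than Python's parser.

```python
import xml.etree.ElementTree as ET


def check_tmx(tmx_path):
    """Parse a TMX file strictly and count its translation units.

    Returns (True, number_of_tu_elements) if the file is well-formed XML,
    or (False, parser_message) if the parser rejects it.
    """
    count = 0
    try:
        with open(tmx_path, "rb") as handle:
            # iterparse reads the file incrementally instead of all at once.
            for _event, elem in ET.iterparse(handle, events=("end",)):
                # Strip a namespace prefix such as "{uri}tu" if one is present.
                if elem.tag.rsplit("}", 1)[-1] == "tu":
                    count += 1
                    elem.clear()  # drop the contents of the unit just counted
    except ET.ParseError as err:
        # The message normally includes the line and column of the problem.
        return False, str(err)
    return True, count


# Hypothetical usage:
# ok, info = check_tmx("my_memory.tmx")
# print("OK, units:" if ok else "Problem:", info)
```

If the check fails, the parser message usually names a line and column, which gives a starting point for the repair steps described next.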
Figure 1: Olifant Error Log
In many cases, however, the
TMX file contains invalid
characters, which can cause
trouble, particularly with
larger files like the ones we
are handling here. Many CAT
tools or quality assurance
tools such as Xbench have an
option to repair TMX files and
remove invalid characters,
so you may be able to repair
your TMX file with one of
these tools.
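Before reaching for a repair tool, it can help to know where the invalid characters sit. The sketch below is one possible approach in Python, assuming "invalid" means a character that the XML 1.0 specification does not allow, either written directly in the file or as a numeric character reference. The names find_invalid_chars and my_memory.tmx are hypothetical, and the line numbers reported may not match those of a particular CAT or QA tool.

```python
import re

# Raw characters XML 1.0 forbids: control characters other than tab,
# line feed, and carriage return, plus U+FFFE and U+FFFF.
INVALID_RAW = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\uFFFE\uFFFF]")
# Numeric character references such as &#x1E; (hex) or &#30; (decimal).
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def is_valid_xml_char(cp):
    """True if code point cp is allowed by the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def find_invalid_chars(lines):
    """Yield (line_number, code_point) for each invalid character found.

    `lines` is any iterable of text lines, for example an open file.
    Within a line, raw characters are reported before references.
    """
    for lineno, line in enumerate(lines, start=1):
        for match in INVALID_RAW.finditer(line):
            yield lineno, ord(match.group())
        for match in CHAR_REF.finditer(line):
            hex_digits, dec_digits = match.groups()
            cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
            if not is_valid_xml_char(cp):
                yield lineno, cp


# Hypothetical usage (adjust the encoding, e.g. "utf-16", to your export):
# with open("my_memory.tmx", encoding="utf-8") as tmx:
#     for lineno, cp in find_invalid_chars(tmx):
#         print(f"line {lineno}: 0x{cp:X}")
```

The file is iterated line by line rather than split with str.splitlines(), because splitlines() treats some control characters, including 0x1E, as line breaks and would shift the line numbers.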
In the following exercise,
we'll use Olifant, a free
Windows program, together
with a plain text editor such
as Notepad for Windows or
TextEdit for Mac to repair
the TMX file and remove any
invalid characters. To install
Olifant, go to the Okapi
download page
(https://okapi.sourceforge.net/downloads.html),
click on "Download
Olifant--....zip," and
download and unpack the
zip file. The resulting file is
a Windows installer, which
installs Olifant after you
double-click on it.
Next, start Olifant and
import your TMX file by
going to File > Import. Make
sure the checkbox "Check
for invalid XML characters"
is checked. If all goes well,
Olifant can repair/remove the
invalid characters. You can
then save the imported TM
file as a repaired TMX file.
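For readers who prefer a script, or who are not on Windows, a similar cleanup can be approximated in Python. This is only a sketch under the assumption that every character disallowed by XML 1.0 should simply be deleted; it is not how Olifant works internally, and the names strip_invalid_xml_chars, repair_tmx, and the file names are invented for illustration. Work on a copy of your TMX file.

```python
import re

# Raw characters XML 1.0 forbids (tab, line feed, carriage return stay).
_INVALID_RAW = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\uFFFE\uFFFF]")
# Numeric character references, hex (&#x1E;) or decimal (&#30;).
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def _is_valid_xml_char(cp):
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def strip_invalid_xml_chars(text):
    """Return text without invalid raw characters or invalid references."""
    text = _INVALID_RAW.sub("", text)

    def keep_or_drop(match):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Valid references such as &#x41; are left exactly as written.
        return match.group(0) if _is_valid_xml_char(cp) else ""

    return _CHAR_REF.sub(keep_or_drop, text)


def repair_tmx(src_path, dst_path, encoding="utf-8"):
    """Write a cleaned copy of src_path to dst_path, line by line."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(strip_invalid_xml_chars(line))


# Hypothetical usage:
# repair_tmx("my_memory.tmx", "my_memory_repaired.tmx")
```

Deleting the characters outright is a deliberate simplification. If a control character stands in for something meaningful in your segments, replacing it with a space may be the better choice.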
In my case, however, I
got an error message, as
shown in Figure 1. The file
contains invalid characters
("0x1E"), which Olifant
first encountered on line
33702 according to the error
log. These characters couldn't
be removed even though
I had the aforementioned
checkbox checked. To
remove the offending
character(s), first close the
file in Olifant and open it in
Notepad (on Windows) or
TextEdit (on a Mac). Turn
off Word Wrap, if applicable.
Then go to the offending
line, here line 33702, and
look for the invalid character.
Usually, the hexadecimal
character is displayed as
"&#xcode;" in your text
editor instead of "0xcode"
in the Olifant log. That is,
in the present case, we're
looking for "&#x1E;" instead
of "0x1E" (i.e., "&#" instead
of the "0" and an appended
