1.45.0 to 1.46.0
Core
- Added `ThreadSafeFilterConfigurationMapper`. This will be the default in the future as it is lightweight, fast and thread safe.
- Deprecated `FilterConfigurationMapper` and `DefaultFilters`. These classes will be moved to okapi-ui in the future as they were originally designed to be used with Rainbow.
- Deprecated many UI and file system based methods in `IFilterConfigurationMapper`. These will be removed when `FilterConfigurationMapper` is moved to okapi-ui.
- Cleaned up warnings on interfaces and implementors.
- `IPipelineStep` implements `handleStream` to support Java 8 streams.
- `Property` now has a `Property.type` field which gives more information about the Property's intended use (FILTER_ONLY, DISPLAY, ITS, etc.).
Connectors
Filters
-
Created `DefaultParameters`. `IFilter.getParameters()` updated for all filters to never return null.
-
Add `EpubFilter`. Initial implementation.
-
Add `SubtitleFilter`, split into `VTTFilter` and `TTMLFilter`. Initial implementation. File must be splittable by ending punctuation.
- Added: an option to avoid splitting lines in the middle of words at the expense of max character count per line.
-
Add `MessageFormatFilter`. Full support for ICU message strings. This filter is intended to be used as a subfilter (i.e., in JSON, YAML or XML filters). There is support for automatically adding plural forms for the target. To aid in translation, a normalization option will automatically move leading and trailing text inside each message variant. A Pretty Print option reformats the translated ICU message string for easier viewing.
-
AutoXLIFF Filter
- Improved: relevant classloader provided for XML input, output and event factories: issue#1054.
-
IDML Filter
- Improved: relevant classloader provided for XML input, output and event factories: issue#1054.
- Improved: external hyperlinks extraction capability provided: issue#1178.
- Improved: special characters made configurable: issue#1193.
- Fixed: the XML input factory configuration from filter parameters clarified: issue#1194.
- Improved: XML input factory configured for speed and low memory usage, the writing of modified content enhanced: issue#1332.
-
ITS Filter
- Improved: untranslatable textual units extraction provided: issue#1319.
- Fixed: unescaping of newlines escaped as code.
- Added: an option to unescape Android quotes.
- Added: CDATA subfilter.
-
OpenOffice Filter
- Improved: Woodstox specified as dependency, relevant classloader provided for XML input, output and event factories: issue#1054.
-
OpenXML Filter
- Fixed: empty cells and rows cleaned up aggressively, the writing of modified content improved and markup memory allocations clarified: issue#894.
- Improved: relevant classloader provided for XML input, output and event factories: issue#1054.
- Fixed: “obj” placeholder type considered as “body” in PPTX documents: issue#1129.
- Improved: DOCX: font color smoothing provided: issue#1145.
- Improved: external hyperlinks extraction parameter usage clarified: issue#1176.
- Fixed: the XML input factory configuration from filter parameters clarified: issue#1194.
- Fixed: the shared strings part formation from worksheet inline strings clarified: issue#1199.
- Improved: non-complex script and complex script properties identification and merge clarified: issue#1200.
- Fixed: table elements identification clarified: issue#1301.
- Fixed: relationship id generation improved: issue#1306.
- Fixed: WPML toggle properties handling aligned with tools behaviour: issue#1311.
- Improved: fonts information made available for extraction: issue#1312.
- Improved: numbering level texts made available for translation: issue#1313.
- Fixed: paragraph properties and RTL run property made absent for an RTL target locale in the shared strings part: issue#1314.
- Fixed: worksheet rows and columns identification clarified: issue#1325.
- Fixed: PPTX: styles clarification throughout the whole document performed: issue#1329.
- Improved: XLSX: same cell data in different cells can be copied for extraction optionally: issue#1333.
- Improved: XLSX: the limited form of multilingual translation supported: issue#1334.
- Fixed: empty lastModifiedBy elements exclusion clarified: issue#1335.
-
Markdown Filter
- Fixed: admonition blocks are now correctly indented.
- Fixed: various indentation issues.
- Improved: if there is a blank line between the admonition header and the content, it is kept.
- Improved: whitespace after the opening marker in bullet and ordered lists is kept as is.
- Improved: whitespace in front of the information in fenced code blocks is kept.
-
Regex Filter
- Added: a rule option to collapse newlines into spaces within the source and target groups.
-
TMX Filter
- Improved: relevant classloader provided for XML input, output and event factories: issue#1054.
-
TS Filter
- Improved: relevant classloader provided for XML input, output and event factories: issue#1054.
-
TTX Filter
- Improved: relevant classloader provided for XML input, output and event factories: issue#1054.
-
TXML Filter
- Improved: Woodstox specified as dependency, relevant classloader provided for XML input, output and event factories: issue#1054.
-
XLIFF Filter
- Improved: relevant classloader provided for XML input, output and event factories: issue#1054.
- Added: support for PCDATA subfiltering.
- Fixed a bug which caused the `useTranslationTargetState` option to update other attributes besides `state`.
- Added: an option to generate targets for monolingual files.
- Changed: codefinder will still protect targets in cases of id mismatch.
-
XLIFF 2.0 Filter
- `state` and `subState` attributes on XLIFF 2.0 `<segment>` elements will now be exposed when serializing to XLIFF 1.2 as custom attributes on the `<mrk>` element (`okp:xliff2-state` and `okp:xliff2-subState`, respectively).
- `copyOf` now stored on Okapi `Code` during parsing and updated on `CTag` for writing.
-
HTML Filter
- Added: an option to disable ampersand escaping.
-
JSON Filter
- Fixed: a bug where TUs end prematurely after objects nested under TUs.
- Added: an option to set maxwidthRules to be extracted as maxwidth property in the XLIFF.
- Added: an option to use the entire keypath for the names of ids and metadata.
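
As a rough illustration of what "entire keypath" naming means, here is a minimal sketch. It is not the JSON filter's code: the `KeyPaths` class, the `/` separator and the use of array indexes as path segments are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyPaths {
    // Walks a parsed JSON-like tree and names each string value by its full key path.
    // The "/" separator is an assumption for this sketch, not the filter's actual format.
    static Map<String, String> flatten(Map<String, Object> root) {
        Map<String, String> out = new LinkedHashMap<>();
        walk("", root, out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void walk(String prefix, Object node, Map<String, String> out) {
        if (node instanceof Map) {
            // Objects: extend the path with each key.
            for (Map.Entry<String, Object> e : ((Map<String, Object>) node).entrySet()) {
                walk(prefix + "/" + e.getKey(), e.getValue(), out);
            }
        } else if (node instanceof List) {
            // Arrays: extend the path with the element index.
            List<Object> items = (List<Object>) node;
            for (int i = 0; i < items.size(); i++) {
                walk(prefix + "/" + i, items.get(i), out);
            }
        } else if (node instanceof String) {
            // Only string leaves are treated as translatable text here.
            out.put(prefix, (String) node);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> menu = new LinkedHashMap<>();
        menu.put("title", "File");
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("title", "Home");
        root.put("menu", menu);
        // Prints {/title=Home, /menu/title=File}
        System.out.println(flatten(root));
    }
}
```

With only the last key as the name, both strings above would be called `title`. The full path keeps them distinct.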
Examples
- Add example8, which demonstrates many-to-one and one-to-many event conversions, based on the new stream support.
- Add example9 showing how a subfilter can be created as a step.
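
The many-to-one and one-to-many conversions mentioned above map naturally onto Java 8 stream operations. The sketch below does not use the Okapi API and is not taken from example8: `Event` here is a hypothetical stand-in class, and `splitLines` and `merge` are names invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EventConversions {
    // Hypothetical stand-in for a pipeline event; not the Okapi Event class.
    static final class Event {
        final String text;
        Event(String text) { this.text = text; }
    }

    // One to many: each incoming event becomes one event per line of its text.
    static List<Event> splitLines(Stream<Event> events) {
        return events
            .flatMap(e -> Arrays.stream(e.text.split("\n")))
            .map(Event::new)
            .collect(Collectors.toList());
    }

    // Many to one: all incoming events are merged into a single event.
    static Event merge(Stream<Event> events) {
        return new Event(events.map(e -> e.text).collect(Collectors.joining(" ")));
    }

    public static void main(String[] args) {
        List<Event> parts = splitLines(
            Stream.of(new Event("first\nsecond"), new Event("third")));
        System.out.println(parts.size());               // prints 3
        System.out.println(merge(parts.stream()).text); // prints first second third
    }
}
```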
Libraries
-
Merge Library
- Updated `OriginalDocumentXliffMergerStep` to handle atomic events.
-
Serialization Library
- Updated `OriginalDocumentProtoMergerStep` to handle atomic events.
- Change `Event.proto` to output `Event`s instead of `TextUnit`s (now serializes `GROUP` events).
-
Segmentation Library
- Add SRX segmenter unit tests for many languages
- Update `defaultSegmentation.srx` to handle the new segmentation test cases.
- Change default SRX to use the new `defaultSegmentation.srx`.
-
Terminology Library
- Improved: Woodstox specified as dependency, relevant classloader provided for XML input, output and event factories: issue#1054.
-
XLIFF 2 Library
- Improved: Woodstox specified as dependency, relevant classloader provided for XML input, output and event factories: issue#1054.
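
For readers unfamiliar with the classloader change behind issue#1054, the following is a minimal sketch of the general technique using only the standard StAX API. It is not Okapi's code, and the `StaxFactories` class and its method names are hypothetical. The assumption is that passing an explicit class loader to `newFactory` makes the implementation lookup independent of the thread context class loader; with Woodstox on the class path, that lookup would typically resolve to the Woodstox implementation.

```java
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;

public class StaxFactories {
    // Each method resolves a StAX factory through the given class loader,
    // using the standard factory id (the fully qualified factory class name).
    static XMLInputFactory inputFactory(ClassLoader loader) {
        return XMLInputFactory.newFactory(XMLInputFactory.class.getName(), loader);
    }

    static XMLOutputFactory outputFactory(ClassLoader loader) {
        return XMLOutputFactory.newFactory(XMLOutputFactory.class.getName(), loader);
    }

    static XMLEventFactory eventFactory(ClassLoader loader) {
        return XMLEventFactory.newFactory(XMLEventFactory.class.getName(), loader);
    }

    public static void main(String[] args) {
        // Use the loader that loaded this class rather than the thread context loader.
        ClassLoader loader = StaxFactories.class.getClassLoader();
        // The printed class names depend on which StAX implementation is on the class path.
        System.out.println(inputFactory(loader).getClass().getName());
        System.out.println(outputFactory(loader).getClass().getName());
        System.out.println(eventFactory(loader).getClass().getName());
    }
}
```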
Steps
-
Added TransliterationStep
- Uses the ICU4J `com.ibm.icu.text.Transliterator` to automatically transform text
- See https://unicode-org.github.io/icu/userguide/transforms/general/#compound-ids
- Handy for automatic conversions like Serbian-Cyrillic -> Serbian-Latin, Zawgyi -> Unicode, etc.
- The quality varies. We also plan to allow the use of custom transliterations.
-
Splitting/Joining a TTX File Step
- Improved: Woodstox specified as dependency, relevant classloader provided for XML input, output and event factories: issue#1054.
-
Splitting an XLIFF File Step
- Improved: Woodstox specified as dependency, relevant classloader provided for XML input, output and event factories: issue#1054.
-
XML Validation Step
- Improved: Woodstox specified as dependency, relevant classloader provided for XML input, output and event factories: issue#1054.
-
Regex Code Extraction Step
- Added: a step that uses a defined regex to convert matching text to inline codes. Useful for filters that do not have built-in `InlineCodefinder` support.
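
The idea behind regex-based code extraction can be sketched in plain Java. This is not the step's implementation: the `RegexCodes` class, the `Result` holder and the `<n/>` placeholder format are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCodes {
    // Holds the text with codes replaced by placeholders, plus the original code strings.
    static final class Result {
        final String text;
        final List<String> codes;
        Result(String text, List<String> codes) {
            this.text = text;
            this.codes = codes;
        }
    }

    // Replaces every match of codePattern with a numbered placeholder such as <1/>
    // (a hypothetical format) and records the matched text so it can be restored later.
    static Result extract(String text, Pattern codePattern) {
        List<String> codes = new ArrayList<>();
        Matcher m = codePattern.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            codes.add(m.group());
            m.appendReplacement(sb, Matcher.quoteReplacement("<" + codes.size() + "/>"));
        }
        m.appendTail(sb);
        return new Result(sb.toString(), codes);
    }

    public static void main(String[] args) {
        // Treat {name}-style variables and %s / %d format specifiers as codes.
        Pattern codes = Pattern.compile("\\{\\w+\\}|%[sd]");
        Result r = extract("Hello {name}, you have %d items", codes);
        System.out.println(r.text);  // prints Hello <1/>, you have <2/> items
        System.out.println(r.codes); // prints [{name}, %d]
    }
}
```

Keeping the matched strings in order alongside the placeholder numbers is what lets the original codes be put back after translation.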
TMs
Applications
OSes
-
FreeBSD
- Experimental: applications fully built and run: issue#868.
General
- Integration tests upgraded to use `ThreadSafeFilterConfigurationMapper`
- Integration tests now allow an xliff or serialized golden file to simulate translation.
- Integration tests now use Java 8 streams instead of `PipelineDriver`
- Add XstartOnFirstThread JVM option in superpom for MacOS platforms.