
Bilingual alignment in TermSuite is provided by the class BilingualAlignmentService. You obtain an instance of that class from the BilingualAligner builder. See the TermSuite Javadoc for more information.

Prerequisites

  1. Java 8
  2. An installed external POS tagger
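
Both prerequisites can be checked before running any TermSuite code. The following is a minimal sketch in plain Java, not part of the TermSuite API: the class name PrerequisiteCheck and its two helper methods are hypothetical, "path/to/tagger" is the same placeholder used in the examples on this page, and it assumes that the tagger path is a directory and that any Java version from 8 upward is acceptable.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of TermSuite: sanity-checks the two prerequisites.
public class PrerequisiteCheck {

    /** Parses a "java.specification.version" value: "1.8" gives 8, "11" gives 11. */
    public static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        // Versions up to Java 8 are written "1.x"; later ones start with the major number.
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    /** True when the given tagger path exists and is a directory (an assumption of this sketch). */
    public static boolean taggerInstalled(Path taggerHome) {
        return Files.isDirectory(taggerHome);
    }

    public static void main(String[] args) {
        int major = majorVersion(System.getProperty("java.specification.version"));
        System.out.println("Java major version: " + major + (major >= 8 ? " (ok)" : " (too old)"));

        // Placeholder path; pass the real tagger location as the first argument.
        Path taggerHome = Paths.get(args.length > 0 ? args[0] : "path/to/tagger");
        System.out.println("Tagger directory found: " + taggerInstalled(taggerHome));
    }
}
```

Running the class with the tagger location as its only argument reports both checks. It does not verify that the tagger itself works, only that something is installed at that path.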

Preprocessing

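// Raw text corpus in French, located at the given path
TXTCorpus txtCorpus = new TXTCorpus(Lang.FR, Paths.get("path/to/corpus"));

// Preprocess the corpus with the external POS tagger to build an indexed corpus
IndexedCorpus indexedCorpus = TermSuite.preprocessor()
  .setTaggerPath("path/to/tagger")
  .toIndexedCorpus(txtCorpus, 500000);

// Terminology attached to the indexed corpus
Terminology terminology = indexedCorpus.getTerminology();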