STSFineGrain is a Java package that contains a collection of semantic textual similarity (STS) models and a framework for evaluating them on STS corpora with fine-grained similarity scores. Seven STS models are implemented: three unsupervised and four supervised. The supervised models include previously presented algorithms, such as LInSTSS and POST STSS, as well as the new POS-TF STSS model, which outperforms them. Evaluation can be performed either on an entire dataset or via cross-validation.
STSFineGrain currently supports the POST STSS and POS-TF STSS models for texts in Serbian and English. The other models have no such language restrictions. This package was presented in the LREC 2018 paper.
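
The two evaluation modes mentioned above can be illustrated with a minimal, self-contained sketch. It does not use the STSFineGrain API. The class name `StsEvaluationSketch`, the methods `pearson` and `testFolds`, the toy scores on a 0 to 5 scale, and the choice of Pearson correlation as the metric are all assumptions made for illustration (Pearson correlation is a common metric for STS, but the package may compute something different).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: none of these names come from STSFineGrain.
public class StsEvaluationSketch {

    // Pearson correlation between predicted and gold similarity scores.
    static double pearson(double[] predicted, double[] gold) {
        int n = predicted.length;
        double meanP = 0.0, meanG = 0.0;
        for (int i = 0; i < n; i++) {
            meanP += predicted[i];
            meanG += gold[i];
        }
        meanP /= n;
        meanG /= n;
        double cov = 0.0, varP = 0.0, varG = 0.0;
        for (int i = 0; i < n; i++) {
            double dp = predicted[i] - meanP;
            double dg = gold[i] - meanG;
            cov += dp * dg;
            varP += dp * dp;
            varG += dg * dg;
        }
        return cov / Math.sqrt(varP * varG);
    }

    // Splits indices 0..n-1 into k contiguous test folds of near-equal size.
    static List<int[]> testFolds(int n, int k) {
        List<int[]> folds = new ArrayList<>();
        int start = 0;
        for (int f = 0; f < k; f++) {
            int size = n / k + (f < n % k ? 1 : 0);
            int[] fold = new int[size];
            for (int i = 0; i < size; i++) {
                fold[i] = start + i;
            }
            folds.add(fold);
            start += size;
        }
        return folds;
    }

    public static void main(String[] args) {
        // Made-up gold scores and model predictions for six sentence pairs.
        double[] gold = {0.5, 1.0, 2.5, 3.0, 4.5, 5.0};
        double[] predicted = {0.7, 1.4, 2.0, 3.3, 4.1, 4.8};

        // Mode 1: evaluate on the entire dataset.
        System.out.println("Whole dataset: " + pearson(predicted, gold));

        // Mode 2: cross-validation. A supervised model would be trained on
        // the pairs outside each test fold; here predictions are fixed.
        for (int[] fold : testFolds(gold.length, 2)) {
            double[] p = new double[fold.length];
            double[] g = new double[fold.length];
            for (int i = 0; i < fold.length; i++) {
                p[i] = predicted[fold[i]];
                g[i] = gold[fold[i]];
            }
            System.out.println("Fold: " + pearson(p, g));
        }
    }
}
```

In a real cross-validation run, a supervised model would be retrained for every fold, and the per-fold scores would typically be averaged or the held-out predictions pooled before computing the metric.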