The Serbian Paraphrase Corpus (paraphrase.sr)

Go to Dataset/Tool Site

The Serbian Paraphrase Corpus – paraphrase.sr (ISLRN 192-200-046-033-9) consists of 1194 pairs of sentences gathered from news sources on the web. Each sentence pair was manually annotated with a binary similarity score that indicates whether the sentences in the pair are semantically similar enough to be considered close paraphrases. The construction of this corpus is described in the 2011 TELFOR paper and the 2013 Decision Support Systems paper.