Sagan Textual Entailment Test Suite

This textual entailment test suite aims to provide developers of Textual Entailment systems with additional test and training datasets.

We followed the algorithm proposed in [1], which increases the size of a Textual Entailment corpus by using Machine Translation systems to generate additional <t,h> pairs.

We used this algorithm to generate an additional training dataset, starting from RTEx and following a double translation process. We chose Spanish as the intermediate language and Microsoft Bing Translator as the only Machine Translation system in this process. We will provide additional datasets soon.
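The double translation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from [1]: the names Pair, round_trip, expand and toy_translate are hypothetical, the choice to round-trip both the text and the hypothesis is an assumption, and toy_translate is a tiny word table standing in for a real Machine Translation service.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical translator signature: (sentence, source_lang, target_lang) -> sentence.
Translator = Callable[[str, str, str], str]


@dataclass(frozen=True)
class Pair:
    text: str
    hypothesis: str
    entailment: str  # label carried over unchanged from the original pair


def round_trip(sentence: str, translate: Translator, pivot: str = "es") -> str:
    """Translate English -> pivot language -> English."""
    return translate(translate(sentence, "en", pivot), pivot, "en")


def expand(pairs: Iterable[Pair], translate: Translator, pivot: str = "es") -> List[Pair]:
    """Return one round-tripped copy of each pair, keeping its label.

    Assumption: both the text and the hypothesis are round-tripped.
    """
    return [
        Pair(
            round_trip(p.text, translate, pivot),
            round_trip(p.hypothesis, translate, pivot),
            p.entailment,
        )
        for p in pairs
    ]


def toy_translate(sentence: str, source: str, target: str) -> str:
    """Stand-in for a real MT system: word-by-word lookup, illustration only."""
    table = {
        ("en", "es"): {"big": "grande", "house": "casa"},
        ("es", "en"): {"grande": "large", "casa": "house"},
    }
    words = table[(source, target)]
    return " ".join(words.get(w, w) for w in sentence.split())


original = [Pair("the big house", "the house", "YES")]
augmented = expand(original, toy_translate)
```

In practice toy_translate would be replaced by a call to an actual translation service. The round trip is useful because it tends to return a paraphrase ("big" comes back as "large" in the toy table) while the entailment label is kept.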

Additionally, we provide a Cross-Lingual Textual Entailment (CLTE) dataset based on the monolingual RTE3 dataset used in the Third PASCAL Recognizing Textual Entailment Challenge. The texts (T) are written in English and the hypotheses (H) are written in Spanish. The procedure used to generate this dataset can be found in [2].
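As an illustration of what a cross-lingual pair looks like, the sketch below reads pairs from an RTE-style XML snippet. It rests on assumptions: the layout (pair elements with id and entailment attributes and t/h children) follows the usual RTE challenge convention and may differ from the files distributed here, and both the sample sentence pair and the read_pairs helper are invented for this example.

```python
import xml.etree.ElementTree as ET

# Invented sample in an assumed RTE-style layout: English text, Spanish hypothesis.
SAMPLE = """<entailment-corpus>
  <pair id="1" entailment="YES" task="IE">
    <t>The cat sleeps on the sofa.</t>
    <h>El gato duerme.</h>
  </pair>
</entailment-corpus>"""


def read_pairs(xml_text):
    """Return a list of dicts, one per <pair> element."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": pair.get("id"),
            "entailment": pair.get("entailment"),
            "text": pair.findtext("t").strip(),        # English
            "hypothesis": pair.findtext("h").strip(),  # Spanish
        }
        for pair in root.iter("pair")
    ]


pairs = read_pairs(SAMPLE)
```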

Several datasets are provided, and a description of each can be found below:

Downloads

This test suite may be downloaded and used without restriction. An acknowledgement would be appreciated if you publish results using it, and we would also be interested to hear what performance you get.

How to cite this Textual Entailment Test Suite:

1. The algorithm to generate the Monolingual corpus and a description can be found in the following paper:

2. The algorithm to generate the Bilingual corpus and a description can be found in the following paper:

Papers that used this Textual Entailment Test Suite:

J. Castillo, M. Cardenas, "Using Sentence Semantic Similarity Based on WordNet in Recognizing Textual Entailment". 12th Ibero-American Conference on AI, IBERAMIA 2010, Bahía Blanca, Argentina, November 1-5, 2010, Springer LNAI, in press.

J. Castillo, "A Semantic Oriented Approach to Textual Entailment using WordNet-based Measures". 9th Mexican International Conference on Artificial Intelligence, MICAI 2010, Pachuca, Mexico, November 8-13, 2010, Springer LNAI, in press.

J. Castillo, M. Cardenas, "An Approach to Cross-Lingual Textual Entailment using Web Machine Translation Systems". 10th Mexican International Conference on Artificial Intelligence. Polibits, Research journal on computer science and computer engineering with applications, ISSN 1870-9044, Issue 44, December 2011.

Contact

For questions, please send an e-mail to: Julio Castillo (jotacastillo A T gmail.com)
