Universitat Pompeu Fabra Institut Universitari de Lingüística Aplicada



IULA Resources. Corpus & Tools

Linguistic resources developed at IULA within the framework of the common project Corpus Tècnic

As part of its basic and applied research objectives, the Institut Universitari de Lingüística Aplicada (IULA) designs and develops linguistic resources as well as language processing and extraction tools. The Corpus Tècnic project is shared by all members of IULA, and corpus exploitation tools continue to be developed around it.

 

Bwananet
Access: online
Description: The query interface for the Corpus Tècnic. The corpus contains written texts from the fields of Law, Economics, Genomics, Medicine and Environment, as well as a contrastive corpus of press texts. The texts are in Catalan, Spanish, English, French and German.
Tools for Catalan and Spanish corpus processing (New!)
Access: online demo (provisional address)
Description: A package of tools for Catalan and Spanish corpus processing. It includes a text handling module and a probabilistic POS tagger. It also lets users consult the POS tagger's dictionary data.
Treebank-IULA
Access: under development
Description: A new tool for the processing of the Corpus Tècnic in Catalan, Spanish and English.
PALIC
Access: online demo
Description: A package of tools for the processing of the Corpus Tècnic in Catalan and Spanish. It includes a preprocessor, a PosTagger and a linguistic disambiguator.
Desambigua (in development)
Access: online demo available soon
Description: A bank of linguistic disambiguation rules for Catalan and Spanish.
Jaguar (New!)
Access: temporary online access (test phase)
Description: A tool for statistical corpus exploitation. It produces concordances, counts n-grams, extracts collocations and computes association, distribution and similarity measures.
COLDIC
Access: restricted access
Description: A management tool for the collection of dictionaries used in the linguistic processing of the Corpus Tècnic and other IULA projects. The dictionaries, all in electronic format, have either been lent by institutions collaborating with IULA or developed by IULA's research groups.
Syntactic analyser for Spanish (in development)
Access: online demo available soon
Description: An open-source HPSG grammar for Spanish implemented within the LKB system.
Alinea
Access: online demo
Description: A tool for aligning translated texts, designed especially for specialized corpora and also usable as a translation validator.
Wüska (in development)
Access: test phase
Description: A package of tools for the automatic retrieval and classification of specialized texts.
Poppins
Access: temporary online access (test phase)
Description: An experimental document classifier based on supervised learning.

 

© INSTITUT UNIVERSITARI DE LINGÜÍSTICA APLICADA - UNIVERSITAT POMPEU FABRA, Roc Boronat 138, 08018 Barcelona