Coordinator | UNIVERSITAT POLITECNICA DE VALENCIA
Organization address: CAMINO DE VERA S/N |
Coordinator nationality | Spain [ES] |
Total cost | 3,005,570 € |
EC contribution | 2,399,739 € |
Programme | FP7-ICT
Specific Programme "Cooperation": Information and communication technologies |
Call code | FP7-ICT-2011-9 |
Funding scheme | CP |
Start year | 2013 |
Period (year-month-day) | 2013-01-01 - 2015-12-31 |
# | Organization | Address | Country (city) | Role | |
---|---|---|---|---|---|
1 | UNIVERSITAT POLITECNICA DE VALENCIA | CAMINO DE VERA S/N | ES (VALENCIA) | coordinator | 0.00 |
2 | INSTITUUT VOOR NEDERLANDSE LEXICOLOGIE | Matthias de Vrieshof 2-3, 2311 BZ | NL (LEIDEN) | participant | 0.00 |
3 | NATIONAL CENTER FOR SCIENTIFIC RESEARCH "DEMOKRITOS" | Patriarchou Gregoriou Str. | EL (AGHIA PARASKEVI) | participant | 0.00 |
4 | UNIVERSITAET INNSBRUCK | INNRAIN | AT (INNSBRUCK) | participant | 0.00 |
5 | UNIVERSITY COLLEGE LONDON | Gower Street | UK (LONDON) | participant | 0.00 |
6 | UNIVERSITY OF LONDON | Malet Street, Senate House | UK (LONDON) | participant | 0.00 |
Huge numbers of handwritten historical documents are being published by online digital libraries worldwide. However, for these raw digital images to be really useful, they need to be annotated with informative content. The tranScriptorium project aims to develop innovative, efficient and cost-effective solutions for the indexing, search and full transcription of historical handwritten document images, using modern, holistic Handwritten Text Recognition (HTR) technology.

For typical handwritten text images of historical documents, currently available text image recognition technologies are not suitable. Traditional Optical Character Recognition (OCR) is simply not usable, since characters cannot be isolated automatically in these images. Therefore, holistic, segmentation-free HTR techniques, often borrowed from the field of Automatic Speech Recognition, are needed. Yet state-of-the-art holistic HTR approaches still lack the required accuracy, mainly because of the typically poor quality, degradation and writing-style variability of historical document images.

To cope with this lack of recognition accuracy for handwritten text images of historical documents, three actions are planned in tranScriptorium: i) improve basic image preprocessing and holistic HTR techniques; ii) develop novel indexing and keyword searching approaches, mainly based on byproducts of holistic HTR decoding and word spotting techniques; and iii) capitalize on new, user-friendly interactive-predictive HTR approaches for computer-assisted operation, which minimize the user intervention needed to achieve full, high-quality transcripts.

HTR tools based on tranScriptorium techniques will be incorporated into HTR web platforms that will be accessible to users through two different means: i) a content provider portal that provides access to handwritten historical documents for casual, individual researchers; and ii) a specialized HTR web portal for structured crowd-sourcing transcription projects.