Opendata, web and dolomites

LatinOCR

Digital Bridge: Optical Character Recognition for Early Printed Books in Latin

LatinOCR project word cloud

[Figure: word cloud built from the LatinOCR project description, giving a rough idea of what the project is about.]

Project "LatinOCR" data sheet

The following table provides information about the project.

Coordinator	UNIVERSITY OF DURHAM (address: STOCKTON ROAD, THE PALATINE CENTRE, DURHAM, DH1 3LE; website: www.dur.ac.uk)
Coordinator Country	United Kingdom [UK]
Total cost	148,178 €
EC max contribution	148,178 € (100%)
Programme	1. H2020-EU.1.1. (EXCELLENT SCIENCE - European Research Council (ERC))
Code Call	ERC-2014-PoC
Funding Scheme	ERC-POC
Starting year	2015
Duration (year-month-day)	from 2015-03-01 to 2016-08-31

Partnership

Take a look at the project's partnership.

#	participants	country	role	EC contrib. [€]
1	UNIVERSITY OF DURHAM (address: STOCKTON ROAD, THE PALATINE CENTRE, DURHAM, DH1 3LE; website: www.dur.ac.uk)	UK (DURHAM)	coordinator	148,178.00

Project objective

This project aims to provide the first viable and accurate solution for digitising early printed books in Latin using Optical Character Recognition. Our basic OCR package will be free and open-source, in order to ensure affordability, longevity, and openness for improvement (three failures of our commercial competitors). Our Company Limited by Guarantee will market customisation, training, support, and further development tailored to specific collections of books (the standard failure of open-source solutions). Customisation services are essential in our market. Early printed Latin cannot be successfully digitised using standard OCR packages (whether open-source or commercial): these currently have an accuracy of no more than 15%. We plan to modify the open-source Tesseract engine by training it to account for Latin grammar and early typography: this will increase its accuracy of recognition to about 80%. Customisation tailored to specific collections of books will further improve accuracy to about 95% to 98%.
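
The project description does not say how its accuracy percentages are measured. One common convention for OCR is character accuracy: one minus the edit distance between the OCR output and a ground-truth transcription, divided by the length of the ground truth. The sketch below illustrates that convention only. The function names and the sample strings are hypothetical and are not taken from the project.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # prev[j] is the distance between the already-processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Character accuracy = 1 - (edit distance / reference length), floored at 0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = levenshtein(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical ground truth and a hypothetical OCR reading with two wrong characters.
reference = "In principio erat verbum"
ocr_output = "In prineipio erat uerbum"
print(f"{character_accuracy(reference, ocr_output):.3f}")  # prints 0.917
```

Under this convention, an accuracy of 15% means that most characters on a page need correcting, while 95% to 98% leaves roughly two to five errors per hundred characters.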

Our company will address the needs of libraries, digital publishers, researchers, learned societies, and private collectors of early books. Our commercialisation plan is modelled on that of other successful businesses based on open-source software.

The demand for Latin OCR is strong, as publishers and libraries switch to digital publication and storage. From the invention of printing in the Renaissance until well into the 19th century, Latin was the European language of every intellectual discourse: the natural sciences, mathematics, philosophy, theology, law, literary criticism, geography, archaeology, music, medicine. The subsequent shift to using the vernacular languages was a seismic event. We are now experiencing a revolution of similar proportions: the advent of digital publication is bringing opportunities and risks whose outlines are still unclear. This project aims to offer a solid technical bridge between the digital future and the Latin past.

Are you the coordinator (or a participant) of this project? Please send me more information about the "LATINOCR" project.

For instance: the website URL (it has not been provided by EU-opendata yet), the logo, a more detailed description of the project (in plain text, as an RTF or Word file), some pictures (as picture files, not embedded in a Word file), Twitter account, LinkedIn page, etc.

Send me an email (fabio@fabiodisconzi.com) and I will put them on your project's page as soon as possible.

Thanks. And then put a link to this page on your project's website.

The information about "LATINOCR" is provided by the European Opendata Portal: CORDIS opendata.

More projects from the same programme (H2020-EU.1.1.)

CohoSing (2019)

Cohomology and Singularities

CHIPTRANSFORM (2018)

On-chip optical communication with transformation optics

CARBYNE (2020)

New carbon reactivity rules for molecular editing
