Coordinator | MAX PLANCK GESELLSCHAFT ZUR FOERDERUNG DER WISSENSCHAFTEN E.V.
Organization address: Campus E1.4 8 |
Coordinator nationality | Germany [DE] |
Total cost | 3˙731˙952 € |
EC contribution | 2˙797˙868 € |
Programme | FP7-ICT
Specific Programme "Cooperation": Information and communication technologies |
Code Call | FP7-ICT-2009-5 |
Funding Scheme | CP |
Start year | 2010 |
Period (year-month-day) | 2010-09-01 - 2013-08-31 |
# | Participant | Address | Country (city) | Role | |
---|---|---|---|---|---|
1 | MAX PLANCK GESELLSCHAFT ZUR FOERDERUNG DER WISSENSCHAFTEN E.V. | | DE | coordinator | 0.00 |
2 | HANZO ARCHIVES LIMITED | CLIFTON STREET | UK (LONDON) | participant | 0.00 |
3 | MAGYAR TUDOMANYOS AKADEMIA SZAMITASTECHNIKAI ES AUTOMATIZALASI KUTATOINTEZET | Kende utca 13-17 | HU (BUDAPEST) | participant | 0.00 |
4 | STICHTING INTERNET MEMORY FOUNDATION | PRINSENGRACHT 707 | NL (AMSTERDAM) | participant | 0.00 |
5 | THE HEBREW UNIVERSITY OF JERUSALEM. | GIVAT RAM CAMPUS | IL (JERUSALEM) | participant | 0.00 |
6 | UNIVERSITY OF PATRAS | University Campus- Rio | EL (PATRAS) | participant | 0.00 |
To understand what is required to support new, innovative Internet applications, a solid understanding of Internet content characteristics (size, distribution, form, structure, evolution, dynamics) is necessary. The LAWA project (Longitudinal Analytics of Web Archive data) will build an Internet-based experimental testbed for large-scale data analytics. Its emphasis is on developing a sustainable infrastructure, scalable methods, and easily usable software tools for aggregating, querying, and analyzing heterogeneous data at Internet scale. For decades, compute power and storage have become steadily cheaper, while network speeds, although increasing, have not kept up. The result is that data is becoming increasingly local and thus distributed in nature. It has become necessary to move the analysis to the data, not the reverse. The Internet is already a large-scale, heterogeneous, complex system.

LAWA will federate distributed FIRE facilities with the rich centralized Web repository of the European Archive to create a Virtual Web Observatory, and will use Web data analytics as a use-case study to validate our design. The outcome of our work will enable Web-scale analysis of data, unlock large-scale study of the content aspect of the Internet, and bring this dimension onto the roadmap of Future Internet Research. In four work packages we will extend the open-source Hadoop parallel query management software with novel methods for data access and import, develop new methods of distributed storage with indexing, offer scalable aggregation, mine metadata and text along the time dimension, and advance the art of automatic classification of Web content.

LAWA adds value to the FIRE community by offering access to very large datasets across thousands of storage and processing nodes, with advanced methods and open-source tools for intelligent analysis at Internet scale, enabling research for the Future Internet to take into account the challenge of content explosion.