Authors

Abstract

The project aims to build a distributed web-crawling infrastructure that fetches publication metadata from the APICe online repository, given a set of search keywords.

As far as non-functional properties are concerned, the infrastructure should be:

  • distributed and open, that is, any number of web crawlers may be deployed on any number of networked machines, possibly even at run-time
  • fault-tolerant to disconnections and crashes, that is, both disconnections and crashes should be (i) detected as soon as possible and (ii) properly managed, e.g., by re-assigning the crawling tasks of the disconnected/crashed crawler
  • resource-efficient, that is, the infrastructure should be able to execute smoothly on resource-constrained devices, e.g., RasPi systems
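
The fault-tolerance requirement above (detect a silent crawler, then re-assign its tasks) could be approached with a heartbeat timeout. The following is only a minimal sketch under stated assumptions: the class `CrawlerMonitor`, its method names, and the idea of crawlers reporting explicit heartbeats are hypothetical and are not taken from the project, TuCSoN, or JADE. Timestamps are passed in explicitly so the logic stays deterministic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical coordinator-side bookkeeping; not a TuCSoN or JADE API.
class CrawlerMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();
    private final Map<String, List<String>> assigned = new HashMap<>();
    private final Deque<String> pending = new ArrayDeque<>();

    CrawlerMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Queue a keyword to be crawled.
    synchronized void submit(String keyword) {
        pending.addLast(keyword);
    }

    // Record that a crawler is alive; the first heartbeat also registers it.
    synchronized void heartbeat(String crawlerId, long nowMillis) {
        lastHeartbeat.put(crawlerId, nowMillis);
        assigned.computeIfAbsent(crawlerId, id -> new ArrayList<>());
    }

    // Hand the next pending keyword to a registered crawler, or return null
    // if the crawler is unknown or nothing is pending.
    synchronized String assign(String crawlerId) {
        List<String> tasks = assigned.get(crawlerId);
        if (tasks == null || pending.isEmpty()) {
            return null;
        }
        String task = pending.pollFirst();
        tasks.add(task);
        return task;
    }

    // Mark a keyword as finished so it is not re-assigned later.
    synchronized void complete(String crawlerId, String keyword) {
        List<String> tasks = assigned.get(crawlerId);
        if (tasks != null) {
            tasks.remove(keyword);
        }
    }

    // Drop crawlers whose last heartbeat is older than the timeout and put
    // their unfinished keywords back at the front of the pending queue.
    synchronized List<String> reapFailed(long nowMillis) {
        List<String> failed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                failed.add(entry.getKey());
                it.remove();
                List<String> orphaned = assigned.remove(entry.getKey());
                for (int i = orphaned.size() - 1; i >= 0; i--) {
                    pending.addFirst(orphaned.get(i));
                }
            }
        }
        return failed;
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

A shorter timeout detects failures sooner but risks declaring a slow crawler dead; on resource-constrained devices the heartbeat period would have to be tuned with that trade-off in mind.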

Using either (i) the TuCSoN middleware for coordinating crawlers or (ii) the JADE framework for programming crawlers is mandatory. Using both is considered a plus.
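
TuCSoN coordinates processes through tuple-based communication, so one way to picture crawler coordination is a shared bag of keyword tasks: a producer puts keywords in, and each crawler takes one at a time. The sketch below only imitates that idea in plain Java with a blocking queue. It does not use the TuCSoN or JADE APIs, and `TaskBag`, `out`, `in`, and `crawlAll` are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// In-memory stand-in for a shared task space; not the TuCSoN API.
class TaskBag {
    private final BlockingQueue<String> tuples = new LinkedBlockingQueue<>();

    // Put a keyword task into the bag.
    void out(String keyword) {
        tuples.add(keyword);
    }

    // Take a keyword task, waiting up to timeoutMillis.
    // Returns null on timeout or if the calling thread is interrupted.
    String in(long timeoutMillis) {
        try {
            return tuples.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // Start crawlerCount threads that drain the bag and return the set of
    // keywords they processed. The actual fetch is left as a placeholder.
    static Set<String> crawlAll(List<String> keywords, int crawlerCount) {
        TaskBag bag = new TaskBag();
        keywords.forEach(bag::out);
        Set<String> done = new ConcurrentSkipListSet<>();
        Thread[] crawlers = new Thread[crawlerCount];
        for (int i = 0; i < crawlerCount; i++) {
            crawlers[i] = new Thread(() -> {
                String keyword;
                while ((keyword = bag.in(100)) != null) {
                    done.add(keyword); // placeholder for fetching metadata
                }
            });
            crawlers[i].start();
        }
        for (Thread crawler : crawlers) {
            try {
                crawler.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return done;
    }
}
```

Because crawlers pull work instead of having it pushed to them, crawlers can be added at run-time without the producer knowing how many exist, which matches the openness requirement stated earlier.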

Course

Distributed Systems

— a.y.

2015/2016

— credits

9

— cycle

2nd Cycle

Teachers

— professor

Andrea Omicini

Context

— university

Alma Mater Studiorum-Università di Bologna

— campus

Cesena

— department / faculty / school

DISI

— 2nd-cycle course

8614 Ingegneria e scienze informatiche 

URLs & IDs

AMS page
official schedule

— course ID

58260
