A lightweight architecture for RSS polling of arbitrary Web sources / S. Bossa, G. Fiumara, A. Provetti (CEUR WORKSHOP PROCEEDINGS). - In: CEUR Workshop Proceedings / edited by F. De Paoli, A. Di Stefano, A. Omicini, C. Santoro. - [s.l.] : CEUR-Workshop, 2006. - pp. 118-123 (Paper presented at the 7th WOA workshop, held in Catania in 2006).

A lightweight architecture for RSS polling of arbitrary Web sources

A. Provetti
Last author
2006

Abstract

We describe a new Web service architecture that collects data from traditional plain-HTML Web sites, aggregates them, and serves them in more advanced formats, e.g. as RSS feeds. To locate the relevant data in the plain HTML pages, the architecture requires that a few meta tags be inserted inside HTML comments; the extra markup therefore remains fully transparent to users and to programs. The annotated HTML documents are routinely pulled by our Web service, which aggregates the data and serves them over several channels, e.g. RSS 1.0 or 2.0. In addition, a REST-style Web service lets users submit XQuery queries to the feeds database. Finally, we discuss scalability issues with respect to polling frequency.
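
The annotate-then-extract idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual tag syntax, so the `feed:item`, `feed:title`, `feed:link` and `feed:description` comment markers below are hypothetical, as are the function names `extract_items` and `to_rss` and the sample page; the sketch only shows how fields delimited by HTML comments might be pulled out of a page and re-served as an RSS 2.0 document.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape

# Hypothetical annotation syntax (an assumption, not taken from the paper):
# every item and every field is delimited by a pair of HTML comments, e.g.
#   <!-- feed:title -->Some headline<!-- /feed:title -->
# Browsers ignore comments, so the page renders exactly as before.
ITEM_RE = re.compile(
    r"<!--\s*feed:item\s*-->(.*?)<!--\s*/feed:item\s*-->", re.S)
FIELD_RE = re.compile(
    r"<!--\s*feed:(title|link|description)\s*-->(.*?)<!--\s*/feed:\1\s*-->",
    re.S)
TAG_RE = re.compile(r"<[^>]+>")


def extract_items(html):
    """Return one dict per annotated item found in the HTML text."""
    items = []
    for block in ITEM_RE.findall(html):
        fields = {}
        for name, raw in FIELD_RE.findall(block):
            # Drop visible markup, decode entities, collapse whitespace.
            fields[name] = " ".join(unescape(TAG_RE.sub("", raw)).split())
        if fields:
            items.append(fields)
    return items


def to_rss(items, channel_title, channel_link):
    """Serialize the extracted items as an RSS 2.0 string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = (
        "Feed generated from annotated HTML")
    for fields in items:
        item = ET.SubElement(channel, "item")
        for name in ("title", "link", "description"):
            if name in fields:
                ET.SubElement(item, name).text = fields[name]
    return ET.tostring(rss, encoding="unicode")


# Made-up sample page used only to exercise the sketch.
SAMPLE_PAGE = """
<html><body>
<!-- feed:item -->
<h2><!-- feed:title -->Workshop <b>programme</b> published<!-- /feed:title --></h2>
<p><!-- feed:link -->http://example.org/news/1<!-- /feed:link --></p>
<p><!-- feed:description -->The programme is now &amp; online.<!-- /feed:description --></p>
<!-- /feed:item -->
</body></html>
"""

if __name__ == "__main__":
    print(to_rss(extract_items(SAMPLE_PAGE), "Example feed", "http://example.org/"))
```

A polling service along these lines would fetch each registered page on a schedule, run the extraction step, and store or serve the resulting feed; fetching, scheduling and the XQuery interface are deliberately left out of the sketch.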
Sector INF/01 - Computer Science
https://ceur-ws.org/Vol-204/P14.pdf
Book Part (author)
Files in this item:
P14.pdf (restricted access). Type: Publisher's version/PDF. Size: 126.12 kB. Format: Adobe PDF.
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/963840