Live Monitoring 4chan Discussion Threads / Y. Prifti, I. Pozzana, A. Provetti. - (2021 Jul 20). (Paper presented at the 6th International Conference on Computational Social Sciences, held in Amsterdam in 2021).

Live Monitoring 4chan Discussion Threads

A. Provetti
Last author
2021

Abstract

The 4chan portal has been known for several years as a "fringe" internet service for sharing and commenting on pictures. Because users can post anonymously, with no registration or identification mechanism at all, the portal has evolved into a global, if mostly US-centred, locus for the posting of extreme views, including racism and all sorts of hate speech. A pivotal role in the emergence of the website as a bastion of "free speech" has been played by the /pol/ board (https://boards.4chan.org/pol/), which declares its commitment to hosting "politically incorrect" discussions. Several research groups have intensively studied 4chan's structure, dynamics and contents. Thanks to works such as [4, 12], we now have a fairly clear description of how 4chan works and what type of discussion dynamics the site supports. In particular, the latter work shed light on the extremely ephemeral nature of discussions: threads last on the website for a few hours at most, and often just for minutes, depending on the traffic they generate, before being removed to make room for new discussions. Given how quickly the content of the boards evolves, and especially given how such ephemerality shapes the tone and the content of the discussion itself [4, 14], it is extremely important for researchers to be able to capture the content of the threads at various points over the course of their short lives. To the best of our knowledge, the existing 4chan literature has relied either on first-hand (autoptic) exploration by the scholars themselves [14], or on large-scale data collection campaigns that drew their content from the archived versions of the threads [12], i.e. on copies of the threads as they appeared at the time of their closure, and at that time only. In order to observe the content of the website at a more fine-grained level, we devised a "scraping" architecture, summarised in Figure 2, which is based on the OXPath platform [9]. It enables the retrieval of the threads posted on a board at various points while they are still live.
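The idea of capturing threads at several points while they are still live, rather than once at closure, can be illustrated with a minimal Python sketch. It is not the authors' OXPath-based architecture: the `LiveMonitor` class, the `fetch_board` callable and the in-memory fake board states are all hypothetical names introduced here for illustration, and a real deployment would plug in an actual scraper in place of the fake fetcher.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical representation of a board: thread id -> list of post texts.
Board = Dict[int, List[str]]


@dataclass
class LiveMonitor:
    """Keeps timestamped snapshots of every thread seen while it is live."""

    fetch_board: Callable[[], Board]  # assumed scraper hook, injected by the caller
    history: Dict[int, List[Tuple[float, List[str]]]] = field(default_factory=dict)
    closed: Set[int] = field(default_factory=set)

    def poll(self, now: float) -> None:
        board = self.fetch_board()
        for thread_id, posts in board.items():
            snapshots = self.history.setdefault(thread_id, [])
            # Store a new snapshot only when the thread content has changed.
            if not snapshots or snapshots[-1][1] != posts:
                snapshots.append((now, list(posts)))
        # Threads seen earlier but no longer on the board are treated as closed.
        for thread_id in self.history:
            if thread_id not in board:
                self.closed.add(thread_id)


def make_fake_fetcher(states: List[Board]) -> Callable[[], Board]:
    """Returns a fetcher that replays canned board states (illustration only)."""
    iterator = iter(states)
    return lambda: next(iterator)


# Three successive views of a made-up board, polled one minute apart.
states = [
    {1: ["op"]},
    {1: ["op", "reply"], 2: ["op2"]},
    {2: ["op2"]},  # thread 1 has been removed from the board
]
monitor = LiveMonitor(make_fake_fetcher(states))
for timestamp in (0.0, 60.0, 120.0):
    monitor.poll(timestamp)
```

In this sketch, thread 1 ends up with two snapshots taken during its life, whereas a collection based only on the archived copy would have seen a single final state. Storing a snapshot only on change is a design choice made here to keep the history compact; it is not taken from the paper.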
4chan; online conspiracy theories; social web; data retrieval; distributed systems; web communities
Sector INF/01 - Computer Science
20 Jul 2021
https://easychair.org/publications/preprint/LLBs
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/890320