Repositorio Institucional CONICET Digital
Research Data
Research data

SpanishTweetsCOVID-19: A Social Media Enriched Covid-19 Twitter Spanish Dataset

Authors: Tommasel, Antonela
Contributors: Rodriguez, Juan Manuel
Publisher: Consejo Nacional de Investigaciones Científicas y Técnicas
Deposit date: 12 May 2023
Collection period: 1 March 2020 to 31 October 2020
Subject classification:
Other Computer and Information Sciences

Abstract

This dataset presents a large-scale collection of millions of Twitter posts in Spanish related to the coronavirus pandemic. The collection was built by monitoring public posts written in Spanish that contain a diverse set of hashtags related to COVID-19, as well as tweets shared by official Argentinian government offices, such as ministries and secretariats at different levels. Data was collected between March and August 2020 using the Twitter API. In addition to tweet IDs, the dataset includes information about mentions, retweets, media, URLs, hashtags, replies, users and content-based user relations, which allows the dynamics of the shared information to be observed. Data is presented in different tables that can be analysed separately or combined. The dataset aims to serve as a source for studying several effects of the coronavirus on people through social media, including the impact of public policies, the perception of risk and related disease consequences, the adoption of guidelines, the emergence, dynamics and propagation of disinformation and rumours, the formation of communities and other social phenomena, and the evolution of health-related indicators (such as fear, stress, sleep disorders, or changes in children's behaviour), among other possibilities. In this sense, the dataset can be useful for multi-disciplinary researchers in fields such as data science, social network analysis, social computing, medical informatics and the social sciences.

Technical information

The raw data for the Twitter posts were retrieved from the Twitter API using our own tool, Faking it!, which internally uses Twitter4J to simplify integration with the Twitter API. Faking it! can also be used to rehydrate the data collection. In all cases, longs are encoded as radix-32 strings. The code for processing and analysing the raw data and the shared tables is also available in the Faking it! repository at https://github.com/knife982000/FakingIt.
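
Since identifiers stored as longs are encoded as radix-32 strings, they need to be decoded before they can be compared with numeric tweet or user IDs. The following is a minimal sketch, assuming the encoding follows the convention of Java's `Long.toString(value, 32)` (digits `0-9` followed by lowercase letters `a-v`, with a leading `-` for negative values). That convention is an assumption based on the tool being Java-based, not something the dataset description states. The function names and the sample ID are hypothetical.

```python
# Assumption: radix-32 strings use the digits 0-9 followed by a-v,
# as produced by Java's Long.toString(value, 32).
DIGITS = "0123456789abcdefghijklmnopqrstuv"


def decode_radix32(text: str) -> int:
    """Decode a radix-32 string into an integer."""
    # Python's int() accepts bases from 2 to 36 and uses the same
    # 0-9, a-z digit ordering, so base 32 maps to 0-9 and a-v.
    return int(text, 32)


def encode_radix32(value: int) -> str:
    """Encode an integer as a radix-32 string (inverse of decode_radix32)."""
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    out = []
    while value:
        value, remainder = divmod(value, 32)
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))


if __name__ == "__main__":
    sample_id = 1234567890123456789  # hypothetical ID, not taken from the dataset
    encoded = encode_radix32(sample_id)
    # Round trip: decoding the encoded string returns the original number.
    print(encoded, decode_radix32(encoded) == sample_id)
```

Checking a few decoded values against known tweet IDs before processing a whole table would confirm whether this assumed convention matches the files.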
Keywords: SOCIAL SCIENCES, SOCIAL MEDIA, MEDICAL INFORMATICS, SOCIAL NETWORKS ANALYSIS, SPANISH LANGUAGE, TWITTER, COVID-19
Resource identifier
URI: http://hdl.handle.net/11336/197411
Collections
Research Data (ISISTAN)
Research Data of the INSTITUTO SUPERIOR DE INGENIERIA DEL SOFTWARE
Citation
Tommasel, Antonela (2023): SpanishTweetsCOVID-19: A Social Media Enriched Covid-19 Twitter Spanish Dataset. Consejo Nacional de Investigaciones Científicas y Técnicas. (dataset). http://hdl.handle.net/11336/197411
Terms of use
Good scientific practice expects proper credit to be given through citation. Use a citation format and apply these reuse rules.
info:eu-repo/semantics/openAccess
Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Unported (CC BY-NC-SA 2.5)
Dataset files

File: CONICET_Digital_Nro.c1ac497b-c0d9-42b2-b267-fc8e031b6034_A.csv
Usage notes: Relation between the tweets and the users mentioned in them. RTs and tweets without mentions are not included in the analysis.
Size: 590.5 MB

File: CONICET_Digital_Nro.c1ac497b-c0d9-42b2-b267-fc8e031b6034_D.zip
Usage notes: Directed edges for each of the users sharing tweets in the dataset. Edges are based on the relations established by the shared content, namely: retweeted, replied to, quoted or mentioned at least one tweet of the related user.
Size: 806.5 MB

File: CONICET_Digital_Nro.c1ac497b-c0d9-42b2-b267-fc8e031b6034_G.csv
Usage notes: Profile information for each of the users sharing any of the collected tweets.
Size: 565.8 MB

File: CONICET_Digital_Nro.c1ac497b-c0d9-42b2-b267-fc8e031b6034_J.csv
Usage notes: Relation between the tweets and the replies they received. Replies are included for tweets shared by verified users.
Size: 8.833 MB

File: CONICET_Digital_Nro.c1ac497b-c0d9-42b2-b267-fc8e031b6034_M.csv
Usage notes: Relation between the tweets and the URLs included in them. RTs and tweets without URLs are not included in the analysis.
Size: 376.5 MB

File: CONICET_Digital_Nro.c1ac497b-c0d9-42b2-b267-fc8e031b6034_P.csv
Usage notes: Relation between the tweets and the hashtags used in them. RTs and tweets without hashtags are not included in the analysis.
Size: 347.3 MB

File: CONICET_Digital_Nro.c1ac497b-c0d9-42b2-b267-fc8e031b6034_S.csv
Usage notes: Analysis of several of the principal features of Twitter posts for either sharing additional content or involving other users. RTs are not included in the analysis.
Size: 650.2 MB

File: CONICET_Digital_Nro.7855ffc5-7d97-4ba4-ac78-1f6726b2251d_A.zip
Usage notes: Tweets were classified according to their characteristics into three categories: Original tweet, Retweet (RT) or Reply. For Original and Reply tweets, it is also indicated whether the tweet included a quoted tweet.
Size: 1.389 GB

File: CONICET_Digital_Nro.0f879973-6165-4981-9c4d-f56ac750229e_G.rar
Usage notes: Relation between the user who shared the tweet and the date and time each tweet was created. Where available, the place from which the tweet was posted is included.
Size: 700 MB

File: CONICET_Digital_Nro.0f879973-6165-4981-9c4d-f56ac750229e_D.rar
Usage notes: Relation between the user who shared the tweet and the date and time each tweet was created. Where available, the place from which the tweet was posted is included.
Size: 700 MB

File: CONICET_Digital_Nro.0f879973-6165-4981-9c4d-f56ac750229e_A.rar
Usage notes: Relation between the user who shared the tweet and the date and time each tweet was created. Where available, the place from which the tweet was posted is included.
Size: 271.4 MB
Related publications

  • Forecasting mental health and emotions based on social media expressions during the COVID-19 pandemic
  • Article: Tracking the evolution of crisis processes and mental health on social media during the COVID-19 pandemic
    Tommasel, Antonela ; Diaz Pace, Jorge Andres ; Godoy, Daniela Lis ; Rodriguez, Juan Manuel (Taylor & Francis Ltd, 2021-11)

CONICET content is licensed under the Creative Commons Attribution 2.5 Argentina License

https://www.conicet.gov.ar/ - CONICET

Godoy Cruz 2290 (C1425FQB) CABA – República Argentina – Tel: +5411 4899-5400 repositorio@conicet.gov.ar