A Study on NLP Model Ensembles and Data Augmentation Techniques for Separating Critical Thinking from Conspiracy Theories in English Texts

Conspiracy theories claim that significant events are orchestrated by secret and powerful groups; they gain particular traction in times of social upheaval and spread rapidly through societies. The transformation in how information is disseminated since the emergence of the Internet, with the rise of blogs, digital media and especially social networks, has greatly amplified the propagation of such theories and increased their effects in the real world. In particular, the preference-based content filters of social networks can lead individuals to reinforce their own beliefs and isolate themselves in like-minded communities, which is of particular concern when it comes to conspiracy thinking. The real-world impact of these theories can be seen in incidents such as Pizzagate, where false accusations in the context of the 2016 U.S. presidential election led to a violent attack on a pizzeria in Washington, D.C., and in the conspiracy theories about COVID-19 vaccines that posed an additional challenge to vaccination campaigns.
To combat the proliferation of such theories, one of the main approaches is content moderation on social networks: reducing the virality of certain content and increasing the informational diversity users are exposed to. Correctly distinguishing between conspiracy theories and critical thinking is crucial for accurate moderation, since misidentification can push rational critics into conspiracy communities; this underlines the importance of developing effective methods for identifying conspiratorial content. However, the sheer volume of content posted on social networks makes purely manual moderation unfeasible and requires, if not a fully automatic pipeline, at least a first stage of automatic detection. Our study addresses this challenge, differentiating as accurately as possible English texts that exhibit conspiratorial thinking from those that exhibit critical thinking, by leveraging advanced artificial intelligence models for natural language processing.
To train and evaluate these models, a previously labeled dataset of adequate size and quality is required; for this master’s thesis we used the dataset provided by the PAN 2024 data science challenge. Our experiments cover both individual models and ensembles, using variants of the BERT model such as BERT-base, BERT-large and RoBERTa. We tested different loss functions, including cross-entropy, Mix-Up and a hybrid of supervised contrastive loss and cross-entropy, and we experimented with data augmentation techniques such as synonym substitution and random word insertion and substitution. Finally, we explored optimizing the MCC by defining a prediction range within which the classifier’s outputs are labeled as unknown. Our best performing model was evaluated in the English edition of Task 1 of the PAN 2024 competition, where it achieved a Matthews Correlation Coefficient (MCC) of 0.8149, securing eighth place in the ranking and demonstrating a considerable level of effectiveness in identifying conspiratorial content.
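As a minimal illustration of the last point, the sketch below shows one way to map low-confidence predictions near the decision boundary to an unknown label and tune the width of that band on a validation split to maximize the MCC. This is a hedged example rather than the thesis code: the label names, the y_val and probs_val arrays, and the sweep range are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def label_with_unknown(probs, delta):
    """Assign class labels from conspiracy-class probabilities, marking
    predictions within `delta` of the 0.5 decision boundary as unknown.
    Label names are assumed for illustration."""
    labels = np.where(probs >= 0.5, "CONSPIRACY", "CRITICAL")
    labels[np.abs(probs - 0.5) < delta] = "UNKNOWN"
    return labels

# Hypothetical validation split: gold labels and predicted probabilities.
y_val = np.array(["CONSPIRACY", "CRITICAL", "CRITICAL", "CONSPIRACY"])
probs_val = np.array([0.91, 0.12, 0.48, 0.55])

# Sweep the width of the unknown band and keep the value that maximizes MCC.
best_mcc, best_delta = max(
    (matthews_corrcoef(y_val, label_with_unknown(probs_val, d)), d)
    for d in np.linspace(0.0, 0.3, 31)
)
print(f"best delta = {best_delta:.2f}, MCC = {best_mcc:.3f}")
```

The chosen band width would then be frozen and applied unchanged to the test predictions, so that the unknown label absorbs only the cases the classifier is genuinely uncertain about.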
