Tuning in to Terrorist Signals

Shaun Wright

Department of Biological Sciences

Research output: Thesis › Doctoral Thesis

Abstract

Twitter, social media and big data promise many terrorist signals amenable to analysis. Because these signals are noisy, subjectively ambiguous and new, however, this thesis addresses four questions that are key to reliably ‘tuning in’ to them. Each chapter uses big data to investigate patterns too subtle to have been amenable to prior study, and the importance of controlling for the noise associated with big data is a central theme running through the thesis.

Chapter 1 introduces the work, Chapter 2 reviews the relevant literature and Chapter 3 introduces and discusses the overarching methodology.

Chapter 4 considers the validity of inferring information about users from their Twitter language and tweets. I demonstrate that language can be horizontally transmitted and inherited, with behaviour and interactions leading to, and predicting, changes in language. This extends previous small-sample work that did not exclude imitation.

In Chapter 5, I characterise jihadist-linked accounts that resurge after suspension, as identified with novel methods. I show that suspension is less disruptive than previous case studies implied, but that pseudoreplication has been underestimated (Wright, 2016).

Having demonstrated the scale of resurgence, Chapter 6 tests whether automated methods can improve identification. I develop a model based on text similarity and validate it against human-annotated data.

The final research chapter, Chapter 7, tackles noise in big data when inferring information about events in the offline world. Extending similar work, I evaluate computational and human-coded predictions of how positive geopolitical events are for Daesh. I demonstrate that while the Baqiya family tweets differently on different types of day, most patterns emerge as easily by chance in the negative control data.

The work is novel: although some attempts have been made to address the questions in this thesis, or similar ones, using case studies, small samples and laboratory studies, all of these suffer limitations. Some studies have not asked exactly the same question, some conclusions have been insufficiently supported by evidence and other questions have simply been beyond the reach of existing methods.

Together, the pieces of work in this thesis show that computational analysis of big data enables tuning in to subtle signals and sometimes reveals conclusions that contradict less developed research. Control noise, however, often contains as many patterns, and thus future studies should pay particular attention to their methodologies when using noisy, subjective social media data.

Original language	English
Qualification	Ph.D.
Awarding Institution	Royal Holloway, University of London
Supervisors/Advisors	Jansen, Vincent A.A. (Supervisor); Denney, David (Supervisor); Pinkerton, Alasdair (Supervisor); Adey, Peter (Supervisor); Bryden, John (Supervisor)
Award date	1 Jun 2017
Publication status	Unpublished - 2017

Keywords

Terrorism
Twitter
Social Media
Language
Language and Linguistics
Security

Access to Document

Wright, S., 2017. Tuning in to Terrorist Signals [PhD Thesis]. Royal Holloway, University of London. Other version, 5.59 MB. Licence: CC BY-NC-SA
