How did Pew Research Center identify Twitter bots?

Bots. They’re all around us, and they can serve a number of purposes, both good and bad. They’re in our homes, they’re on our phones, and they’re common on social media platforms like Twitter. But just how common are they? Pew Research Center estimates that as much as two-thirds of tweeted links to popular websites are generated by bots, not humans. Now, that’s only tweeted links to popular websites. But how did we determine that bots are responsible for sharing so much content on Twitter, and what exactly is a bot?

Broadly speaking, a bot is a software application that can complete automated tasks, often replacing the need for humans to do them. Ever ask the voice assistant on your phone to play that song you’ve been super into lately? Yup, that’s a bot. Bots appear in all sorts of digital spaces, and people interact with them all the time, sometimes without even realizing it.

It’s no secret that there are bots on Twitter. There are Twitter bots that can send alerts after an earthquake or post what’s new on Netflix. And then, there may be bots that act with malicious intent, trying to spread misinformation or sow confusion. Our study of Twitter bots focused on automated accounts that tweet or retweet links to content around the web without human intervention. This is part of a larger research agenda at Pew Research Center to better understand the flow of information online and the impact it has on society.

To measure how pervasive bots are in sharing links to external sites on Twitter, we started by collecting 1.2 million English-language tweets containing links using Twitter’s streaming API over a 47-day period in the summer of 2017. This gave us a random sample of public tweets, up to 1% of all public posts each day. We wanted to know both how many of these links were shared by bot accounts and what kind of content bots were sharing. Was it mostly news? Was it sports? Or was it something else entirely?

To figure this out, we wrote a computer program that followed each tweeted link to its destination and then saved the location of that page in a database. We then isolated nearly 3,000 websites that were shared most frequently. After we had a list of tweets with links to these most popular sites, we counted how many came from bots. Sounds easy, right?

Well, in practice, it’s fairly complicated. First, classifying a million tweets by hand takes a long time. Second, relatively few automated Twitter accounts identify themselves as bots, and it’s not easy to know if an account is a bot or not. So, in order to determine which of the accounts in our sample were, we turned to a tool called Botometer.

Now, if you’re wondering what Botometer is, it’s a machine-learning algorithm developed by researchers at Indiana University and the University of Southern California. It uses over 1,000 pieces of information about a Twitter account to determine if that account is likely a bot. This information includes the age of the account, who they follow, the content of their tweets, among other things. Also, rather than simply declaring a Twitter account to be a bot or a human, Botometer actually gives each account a score between 0 and 1. But researchers like ourselves are the ones to decide where to draw the line between human and bot by choosing a threshold score.

So to ensure Botometer was a reliable tool for our study, we tested it using data from human coders who classified over 300 accounts as automated or not. We then had Botometer classify the same accounts at various thresholds to determine the most accurate score to use for our full sample of tweets. By the way, you can see all of these tests in the methodology section of our report.

In the end, we estimated that automated accounts are responsible for sharing about two-thirds of tweeted links in our sample. Comparatively, two-thirds of tweeted links to popular news and current events sites were also shared by bots. And overall, bots were most likely to share links to sites that focused on sports or adult content.

So, now you know how we identified bots on Twitter. But what we can’t say from this study is whether the links shared by bot accounts contain truthful information or not, or the extent to which human users interacted with content posted by suspected bots. It is important to remember that not all bots are nefarious, and some of them may even play a valuable role in the social media ecosystem. But we’ll let you decide that for yourself. To learn more about our methods and about what types of content bots are sharing on Twitter, check out our report “Bots in the Twittersphere” at
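
For readers who want to see the threshold step in concrete terms, here is a minimal Python sketch of the two ideas described in the video: picking the score cutoff that best matches a hand-coded sample, then using that cutoff to estimate what share of links to each site came from suspected bots. Everything in it is an assumption made for illustration: the scores, labels, candidate thresholds and example URLs are invented, the function names are hypothetical, and nothing here calls Botometer or reproduces the code used in the study.

```python
from collections import Counter
from urllib.parse import urlparse

# Invented hand-coded sample: (bot score between 0 and 1, human coder said "bot").
HAND_CODED = [
    (0.10, False), (0.25, False), (0.35, False),
    (0.45, True), (0.70, True), (0.90, True),
    (0.15, True),  # a bot that happens to receive a low score
]
# Invented candidate cutoffs, listed in ascending order.
CANDIDATES = [0.2, 0.4, 0.6, 0.8]

# Invented tweets: (already-resolved link, score of the account that shared it).
SAMPLE_TWEETS = [
    ("https://example.com/a", 0.9),
    ("https://example.com/b", 0.1),
    ("https://Example.com/c", 0.5),
    ("https://example.org/x", 0.8),
]


def accuracy_at(threshold, scored_labels):
    """Share of hand-coded accounts that this threshold classifies correctly."""
    correct = sum((score >= threshold) == is_bot for score, is_bot in scored_labels)
    return correct / len(scored_labels)


def best_threshold(scored_labels, candidates):
    """Candidate with the highest accuracy; on a tie, the first one listed wins."""
    return max(candidates, key=lambda t: accuracy_at(t, scored_labels))


def bot_share_by_site(tweets, threshold):
    """Map each site (lowercased host name) to the share of its links from suspected bots."""
    totals, from_bots = Counter(), Counter()
    for url, score in tweets:
        site = urlparse(url).netloc.lower()
        totals[site] += 1
        if score >= threshold:
            from_bots[site] += 1
    return {site: from_bots[site] / count for site, count in totals.items()}


if __name__ == "__main__":
    cutoff = best_threshold(HAND_CODED, CANDIDATES)
    print("chosen threshold:", cutoff)
    print("accuracy on hand-coded sample:", round(accuracy_at(cutoff, HAND_CODED), 3))
    for site, share in sorted(bot_share_by_site(SAMPLE_TWEETS, cutoff).items()):
        print(site, round(share, 3))
```

This sketch scores thresholds by plain accuracy only. The tests reported in the methodology section of the report may weigh errors differently, so treat the selection rule here as one simple option.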

Reader Comments

  1. I noticed some news accounts posting the same article link at different times over one or several days. I think it is a bot, and that is OK.

  2. Bots are everywhere on Twitter. These bots are killing Twitter. Identifying bots is important if Twitter is to survive for years. Unless this problem is solved, Twitter will be deserted to bots.

  3. I hear more and more about bots these days. The first time I ever communicated with a custom bot was in Quake 3. It sometimes responded with questions or just did a good job mimicking a person. I was simply trying to play a game and the server just happened to have custom bots on it. My computer got hacked back in the days of the BSOD and no reboot was saving it. I had no idea what I had stumbled across, but I actually gave up gaming for years after that behavior. Someone got pretty malicious over a video game.
