A year ago, Twitter began testing a feature that asked users to pause and reconsider before replying to a tweet using “harmful” language – meaning language that was insulting, trolling or otherwise offensive in nature.
Today, the company says it is rolling out improved versions of these prompts to English-language users on iOS and, soon, Android, having adjusted its systems to better understand when the language used in a reply is actually harmful – and therefore when to send the reminder. The idea behind these nudges comes from behavioral psychology: giving people a moment to pause helps them make better decisions about what they post. Studies have shown that a nudge like this can lead people to edit, or even delete, posts they would otherwise regret.
This has also borne out in Twitter’s own tests. The company says 34% of people revised their initial reply after seeing the prompt, or chose not to send the reply at all. And after being prompted once, people composed on average 11% fewer offensive replies in the future.
That suggests that, at least for some small group of users, the prompt has a lasting effect on behavior. (Twitter also found that users who were prompted were less likely to receive offensive replies in return, but it did not quantify this metric.) However, Twitter’s initial tests ran into some problems: its systems and algorithms often struggled to understand the nuance that occurs in conversations. For example, they couldn’t always distinguish between offensive replies and sarcasm, or sometimes even friendly banter.
The systems also struggled to account for situations in which language had been reclaimed by underrepresented communities and was being used in a non-harmful way. The improvements rolling out today aim to address these problems. Twitter says it has adjusted its technology in these areas and others. Now, it will take into account the relationship between the author and the replier: if the two follow each other and reply to each other often, the system assumes there’s a better chance they understand each other’s preferred tone of communication, and a prompt may not be needed.
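Twitter hasn’t published how this relationship check actually works; as a rough illustration only, the heuristic described above might look something like the following sketch, where every name and threshold (`should_skip_prompt`, `min_replies`, the data shapes) is hypothetical:

```python
def should_skip_prompt(author, replier, follows, reply_counts, min_replies=3):
    """Illustrative sketch of a mutual-relationship heuristic.

    follows: set of (follower, followee) tuples
    reply_counts: dict mapping (from_user, to_user) -> past reply count
    Returns True when the pair's relationship suggests they already
    understand each other's tone, so a prompt could be suppressed.
    """
    # Do the two accounts follow each other?
    mutual = (author, replier) in follows and (replier, author) in follows
    # Has the replier interacted with the author often before?
    frequent = reply_counts.get((replier, author), 0) >= min_replies
    return mutual and frequent


# Example: two mutuals who reply to each other regularly.
follows = {("alice", "bob"), ("bob", "alice"), ("carol", "alice")}
reply_counts = {("bob", "alice"): 5, ("carol", "alice"): 1}

print(should_skip_prompt("alice", "bob", follows, reply_counts))    # mutuals, frequent
print(should_skip_prompt("alice", "carol", follows, reply_counts))  # one-way follow
```

In this toy version, the prompt is skipped only when both signals – a mutual follow and a history of replies – are present; either alone isn’t treated as enough.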
Twitter also says it has improved its technology to more accurately detect strong language, including profanity. And it has made it easier for those who see the prompts to tell Twitter whether the prompt was helpful or relevant – data that can help the systems improve further. We’ll have to see how well all of this works in practice.