The science of going viral (and other observations from a media data scientist)

Ronen Ben-Hador has a background in medical research. So when he started his new role at Sizmek, he was startled to discover that 'going viral' wasn't simply an analogy.

The origin of the phrase ‘going viral’ is a little vague, but the internet tells me that the Oxford English Dictionary’s earliest citation for viral in the sense of ‘involving the rapid spread of information’ dates from 1989.

Since then it has entered everyday language, accelerated by the launch of YouTube in 2005, which made the sharing of video content much easier – as did the subsequent launch of social media channels. Consequently, pretty much everyone knows that ‘going viral’ means a piece of content that spreads quickly through the population like a virus. But fewer might realise just how alike an actual virus and a viral piece of content are.

To be honest, I’d assumed it was just a great analogy. Until I accidentally discovered there was more to it.

And this is how.

The discovery

Before I joined Sizmek four years ago, I worked across several industries including medical research, meteorology, audio technology and software. Vastly different fields, but my core purpose has remained the same – to find patterns in the data that can be harnessed for knowledge and advancement.

So, one of my first projects in my new media job was to look at social media channel traffic. I wasn’t briefed to look at viral content, simply to see where the best bang for an advertiser’s buck across different categories was going to come from at any particular time. The idea was for me to put together a ranking of popularity.

To find out what subject was going to be popular at a certain time, I had to look at past popularity. And when I looked at the data points on the graph I recognised the curves and patterns; I was looking at a colony of viruses. In a previous role, I had researched the way viral infections spread in order to find out how to treat them optimally. The two phenomena – biological viral spread and social viral spread – are incredibly similar.

The similarities

Take the example of a generic type of stomach virus. It starts off with one or two you might have caught from your kid, which sit in your stomach and, with the right ‘food’ and resources, begin to multiply – each one giving rise to another until they get to a peak.

Then at a certain time, the viruses start to decrease. There are many possible reasons: it could be that you’ve taken medicine, that your body is producing antibodies to fight the disease, or that the resources which were making the virus thrive have been exhausted. Either way, it’s always the same pattern. Viruses start, they inflate themselves and then they deflate themselves.

But whilst the pattern is the same, the acceleration or deflation can happen at different rates. My previous research in the medical world was trying to pinpoint the right moment to intervene with treatment.

It’s the same mechanism with viral content. The content appears somewhere and sits on that site. With the right resources (usually a site that already has a high degree of socialisation) one person shares it with six people in their network, then these six people each share with another six – and so on and so forth. Then at a certain time, traffic to the webpage begins to decrease as the number of people interested in the topic is exhausted. Just like a virus, it deflates itself as the resource it needs to thrive (interest) is depleted.
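As a rough illustration of that mechanism, here is a minimal Python sketch. It is a toy model, not anything taken from Sizmek: the function name, the audience size of 100,000 and the simple rule that shares only count when they land on someone who hasn’t yet seen the content are all assumptions made for the example. Only the ‘six shares per person’ figure comes from the text above.

```python
def simulate_spread(audience=100_000, shares_per_person=6, seeds=1, max_steps=50):
    """Return the number of new viewers at each step of a toy sharing model.

    Assumptions (illustrative only): a fixed pool of interested people, and
    every new viewer shares with `shares_per_person` random members of it.
    """
    remaining = audience - seeds      # interested people who have not seen it yet
    new_viewers = float(seeds)
    history = [new_viewers]
    for _ in range(max_steps):
        # A share only produces a new viewer if it lands on someone who has
        # not already seen the content, so growth slows as interest runs out.
        reached = new_viewers * shares_per_person * (remaining / audience)
        new_viewers = min(reached, remaining)
        remaining -= new_viewers
        history.append(new_viewers)
        if new_viewers < 1:           # interest is exhausted
            break
    return history


if __name__ == "__main__":
    for step, viewers in enumerate(simulate_spread()):
        print(f"step {step}: {viewers:,.0f} new viewers")
```

Run as a script, it prints new viewers per step: the numbers multiply roughly sixfold at first, reach a peak, then collapse once the pool of interested people has been used up. The real curves will be smoother and messier, but the shape is the point.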

And, give or take, they have pretty much the same life cycle. So much so that when I looked at the graph, I knew that we could develop a predictive algorithm to identify when a piece of content is going viral and, correspondingly, when advertisers should be placing their ads in order to reach a wide audience. It’s the same curve as in medical research. You know in the first few hours.
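To make ‘you know in the first few hours’ a little more concrete, here is one hedged way such an early signal could be computed. This is not the algorithm described in this article, whose details aren’t given here; it is a generic sketch that fits an exponential growth rate to the first few hourly view counts. The function names, the sample numbers and the threshold of 2.0 are all hypothetical.

```python
import math


def hourly_growth_rate(hourly_views):
    """Estimate the multiplicative growth per hour from early view counts.

    Fits a straight line to log(views) by least squares; the slope is the
    log of the hourly growth factor. Needs at least two positive counts.
    """
    logs = [math.log(v) for v in hourly_views]
    n = len(logs)
    mean_x = (n - 1) / 2
    mean_y = sum(logs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(logs))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return math.exp(cov / var)


def looks_viral(hourly_views, threshold=2.0):
    """Flag content whose early views grow by at least `threshold` per hour.

    The threshold is an arbitrary illustrative value, not a measured one.
    """
    return hourly_growth_rate(hourly_views) >= threshold


if __name__ == "__main__":
    print(looks_viral([10, 30, 90, 270]))      # views tripling every hour
    print(looks_viral([100, 105, 110, 116]))   # views creeping up slowly
```

In practice a threshold like this would have to be learned from past traffic for each category, which is where the historical popularity data comes in.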

As I’d discovered this during research for another project (my first at Sizmek), I was pretty cautious about sharing my findings. Although I was confident that the data was telling the same story, I wanted to triple check. After all, viral content isn’t a physical process – was it really mirroring an actual virus that exactly?

The answer was yes. Fast forward a few years and my predictive algorithm is now a product.

This is notable because, in the advertising technology industry, it isn’t always the case. Proofs of concept can sometimes take two to three years to develop once they are passed over to engineers and the product team. In many industries I’ve worked in, this isn’t too much of a problem, but in the media industry the sheer pace of change can mean something quickly becomes obsolete. I can discover something pretty significant at the time, but then the industry moves on.

This is what has kept me in advertising for the last few years. As a data scientist you’re excited by identifying patterns and meaning in the numbers, and when they are constantly on the move, it adds an extra layer of complexity to the discovery process.

So next time you read about the rapid pace of change in the media industry, know that the data I see from across the globe each and every day absolutely backs this statement up.

Ronen Ben-Hador is head of research for Peer39 at Sizmek.

