Back in March we presented at WARC’s Online Research Now & Next Conference, introducing what we then called Augmented Research.
The idea is simple: powering traditional qualitative and quantitative research with real-time data.
When we were invited to speak at WARC’s Datacentric Conference, we thought it would be interesting to discuss one of the latest research pilots we have been running in the area of augmented research.
The objective of the O2 Brand Graph pilot was to mine social media data in a way that would allow us to connect it to audience studies.
What follows is an initial exploration of how we can use social media to augment a segmentation model with real-time data.

***
Many companies are learning to listen to conversations related to their brands and competitors.
However, there’s more to social media intelligence than tracking conversations by keywords.
Current social media research focuses on opinion mining and declares itself unable to map audiences. But I think we are giving up too soon.
This inability appears to be born from an assumption in the research industry that you can’t use social media to map audiences because you don’t have access to demographics.
Far from being reality, this assumption is mostly due to three reasons:
- The architects of social media mining platforms are often not led by a research agenda, but by a tech agenda – this leads to a tendency to productise and mass sell platforms, which can run in counterpoint to an openness to experimentation;
- Researchers are often not makers or technologists – therefore, they are often lazily happy with what they are given in terms of tools;
- Researchers do not always know what can be done with existing social media data streams, such as basic machine learning to figure out gender and age groups.
However, mapping audiences through social media IS possible. It’s just not in the way we used to research audiences before.
It’s all in the way you screen and sample your audience, and in social media, sampling via demographics doesn’t work. But there are many other ways of defining and screening an audience. In this study we explored one of them.
Instead of tracking contents by keywords (“horizontal” tracking – any content mentioning specific keywords and keyphrases), we looked into mining social media contents and behaviours by audiences (“vertical” tracking – any content generated from a set of sources, regardless of the features of the content).
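To make the distinction concrete, here is a minimal sketch of the two filters. The `Post` record, the function names and the sample posts are all hypothetical and only there for illustration; this is not the tooling used in the pilot.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record: who wrote the post and what it says.
    author: str
    text: str


def track_horizontal(posts, keywords):
    """Keep any post that mentions one of the keywords, whoever wrote it."""
    lowered = [k.lower() for k in keywords]
    return [p for p in posts if any(k in p.text.lower() for k in lowered)]


def track_vertical(posts, sources):
    """Keep every post written by a tracked source, whatever it says."""
    tracked = set(sources)
    return [p for p in posts if p.author in tracked]


# Toy data, invented for this example.
posts = [
    Post("alice", "Loving the new O2 tariff"),
    Post("alice", "Off to the rugby this weekend"),
    Post("bob", "O2 signal is down again"),
    Post("carol", "New album out today"),
]

by_keyword = track_horizontal(posts, ["o2"])
by_audience = track_vertical(posts, ["alice", "carol"])
```

The horizontal filter only ever sees what people say about the brand. The vertical filter also picks up the rugby and the new album, which is exactly the material you need to describe an audience.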

Whilst tracking social media by keywords allows us to get an understanding of how a specific topic is discussed online, tracking social media by users allows us to build a map of an audience, its hubs, its behaviours and its interests.
We called it the Brand Graph: the conjunction of the Social Graph (defined here as the network of people who are within 2 degrees of separation from the brand through social media channels) and the Interest Graph (the network of interests, topics, activities and behaviours associated with the nodes of the social graph).
***
What can you do with it?
- Dynamically understand who your audience is and how it is changing, in real-time;
- Dynamically understand what your audience is about, what makes an interesting topic and how broader cultural conversations affect it;
- Segment your audience in clusters based on topics of interest, passions, life stages, professions, online behaviours etc.;
- Plan and fine tune the content of your social media strategy;
- Engage with your audience in the right way (channels, mechanics, times of the day, tone of voice etc.);
- Assess the impact of your strategies in real-time.
Going forward, we see the brand graph becoming one of the key tools to build a seamless connection between your brand and its audience, networking it with its passions and synching it with its behaviours to maximize relevance and impact.
***
So, how did we go about building the O2 Brand Graph?
First of all we had to identify a specific pool of social media users and then analyse their public activity.
For the purpose of this pilot we limited the online audience to one channel – Twitter. We focussed on Twitter because of the granularity of the data publicly available around contents and behaviours.
Sample: We defined our sample as the entire audience of O2 on Twitter, i.e. 58,339+ Twitter users who were following @O2 (as of November 2011).
Methodologies: Statistical analysis, Semantic analysis, Network analysis, Netnography and Content analysis.
By looking at the profiles and the activity of this audience we were able to map the O2 Brand Graph on Twitter.
***
We grouped the findings in three areas:
Mapping the Social Graph

We wanted to identify sub-communities within the O2 audience on Twitter.
Because Twitter is an interest graph, we assumed that following someone implied sharing the interest of the followed user.
Therefore, a sub-community would be identified by a high concentration of horizontal connections within the graph.
To get this information we had to map:
- 58,339 users following @O2;
- Who was following each of the 58,339 users;
- Who else in the graph any of the users was following other than O2 or the primary O2 follower.
For the sake of this exercise we looked at a sample of 1,000 users. We selected the top users with fewer than 2,000 followers, mapped their connections to O2, and then mapped who was following them.
Finally, we mapped how the primary and secondary followers were connected to every other user in the graph.
We ended up plotting a graph of 1 million nodes, 1 million primary connections and 574,278 horizontal connections within the graph.
The blue links represent how primary and secondary followers are connected to each other within the graph.
By looking at the density of the connections we could identify hubs within the audience and points of high concentration of similar interests.
Once we knew where the hubs were, we then isolated them and looked into the clusters.
We spotted 10 clusters and profiled them, identifying sub-communities around topics such as fashion, music, rugby, technology and marketing.
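As a rough illustration of the idea, the sketch below separates horizontal connections from connections to the brand, counts how many horizontal connections each user touches, and groups users into clusters. Two assumptions to flag loudly: any follow edge that does not touch the brand is treated as horizontal, and plain connected components stand in for the clustering step, since the method used in the pilot is not described here. The node names and edges are toy data.

```python
from collections import Counter, defaultdict

BRAND = "O2"  # hypothetical node id for the brand account

# (follower, followed) pairs; toy data, not from the pilot.
follows = [
    ("ann", BRAND), ("ben", BRAND), ("cat", BRAND), ("dan", BRAND), ("eve", BRAND),
    ("ann", "ben"), ("ben", "ann"), ("cat", "ann"),  # one tight group
    ("dan", "eve"),                                  # a second, smaller group
]

# Assumption: every edge that does not touch the brand counts as horizontal.
horizontal = [(a, b) for a, b in follows if BRAND not in (a, b)]

# Undirected adjacency over horizontal edges only.
neighbours = defaultdict(set)
for a, b in horizontal:
    neighbours[a].add(b)
    neighbours[b].add(a)

# Hubs: users touching the most horizontal edges.
degree = Counter()
for a, b in horizontal:
    degree[a] += 1
    degree[b] += 1


def connected_components(adjacency):
    """Group nodes that can reach each other through horizontal edges."""
    seen, components = set(), []
    for start in list(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        components.append(group)
    return components


clusters = sorted(connected_components(neighbours), key=len, reverse=True)
top_hub = degree.most_common(1)[0][0]
```

On a real follower graph connected components would be far too coarse, and a proper community detection method would be the natural replacement. The principle stays the same: density of horizontal connections is what marks out a sub-community.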
Mining the interest graph / profiles and behaviours

We then analysed the static data of 58,339 profiles on Twitter gathering insights around 10 key dimensions:
- Who are they (life stage, profession, passions that define them etc.)?
- When did they join Twitter?
- Where are they based?
- Where do they tweet from?
- How often do they tweet?
- When do they tweet during the day?
- How many people are following them?
- How many people are they following?
- How often are they engaging in conversation with fellow users?
- How influential are they?
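Several of these dimensions can be derived from a handful of static counts. The sketch below is one way to do it; the `Profile` fields, the `summarise` helper and the sample values are hypothetical and not taken from any particular API or from the pilot data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Profile:
    # Field names are hypothetical, chosen for this example only.
    handle: str
    joined: date
    tweets: int
    replies: int      # tweets addressed to another user
    followers: int
    following: int


def summarise(profile, as_of):
    """Derive a few per-profile measures from static counts."""
    days_active = max((as_of - profile.joined).days, 1)
    return {
        "handle": profile.handle,
        "tweets_per_day": profile.tweets / days_active,
        "reply_share": profile.replies / profile.tweets if profile.tweets else 0.0,
        "follower_ratio": profile.followers / max(profile.following, 1),
    }


# Invented profile: joined 300 days before the reference date.
p = Profile("ann", date(2011, 1, 1), tweets=600, replies=150,
            followers=400, following=200)
summary = summarise(p, as_of=date(2011, 10, 28))
```

Here `reply_share` is a crude proxy for how conversational a user is, and `follower_ratio` a crude proxy for reach. Influence in particular deserves a richer measure than either of these.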

Mining the interest graph / interests and passions

Finally, we analysed 3,120,371 public tweets (122,220 tweets/day on average) generated by the @O2 followers over one month (November 2011).
Based on this corpus we were able to gather real-time insights around a series of questions such as:
- What does the audience talk about?
- How and why do the topics change over time?
- Which contents are the most engaging (i.e. generate the highest number of reactions)?
- Which contents get shared the most?
- Which social media channels are the most popular amongst the audience?
- Which news sites are referred to more often?
- Which brands and products do they talk about?
- Which adverts do they mention?
- What movies are they into?
- Where does the brand fit in this landscape?
- How do they talk about the brand’s main competitors?
All this information is constantly updated to the second and can be sliced according to any timeframe, audience segment, audience location and basically any dimension of the audience profile or of the audience social graph.
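Slicing of this kind comes down to grouping the same records by a different key. A minimal sketch, assuming each tweet already carries a timestamp, an audience segment and a topic label from earlier steps (the record layout, the `slice_topics` helper and the data are all made up for illustration):

```python
from collections import Counter, defaultdict
from datetime import datetime

# Toy records: (timestamp, segment, topic). None of this is pilot data.
tweets = [
    (datetime(2011, 11, 1, 8, 5), "music", "commute"),
    (datetime(2011, 11, 1, 8, 40), "rugby", "commute"),
    (datetime(2011, 11, 1, 8, 55), "music", "gigs"),
    (datetime(2011, 11, 1, 21, 10), "music", "tv"),
    (datetime(2011, 11, 2, 21, 30), "rugby", "tv"),
    (datetime(2011, 11, 2, 21, 45), "rugby", "rugby"),
]


def slice_topics(records, key):
    """Count topics within each slice, where `key` maps a record to its slice."""
    slices = defaultdict(Counter)
    for record in records:
        slices[key(record)][record[2]] += 1
    return slices


# Same corpus, two different cuts.
by_hour = slice_topics(tweets, key=lambda r: r[0].hour)
by_segment = slice_topics(tweets, key=lambda r: r[1])
```

Swapping the `key` function is all it takes to move from time of day to segment, location or any other profile dimension.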

***
The deck above outlines some of the initial data gathered and the insights uncovered. But as you can imagine, this is only a glimpse of what we could learn with this kind of study. An example? Slice the topics of conversation of your audience by time of the day and you will know who you would be talking to and what you should be talking about at what time of the day.
As the last image in the deck – “The Measurers” – alludes to, with social media data we are at the very beginning of a new era of audience understanding powered by a new science of measurement.
Pilots like the Brand Graph are initial attempts at defining the boundaries of what can be measured, what could and SHOULD be measured and what we can learn from it to do a better job.
Feedback and questions welcome, belligerent challenges even more so.
