Hello. I am a social media specialist, web designer and developer. When not hard at work, I am probably travelling the world!

Entries tagged to 'Measurement'

Facebook Comment Deep-Dive: Analyzing an HTC Comment Thread

Posted 7 August 2011

HTC recently posted a simple question to their Facebook wall: “How many mobile phones have you owned?” Within a day they received about 2000 answers. I wondered how hard it would be to automate the analysis with Facebook’s Graph API and find the average number. Here is the result:

Note: I removed about a dozen responses that stated ownership of more than 500 mobile phones. While this is probably feasible, it was a small minority and skewed my results too much.
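
A minimal sketch of that averaging step, assuming the numeric answers have already been extracted into a list. The cutoff of 500 comes from the note above; the function name and the sample values are invented for illustration and are not the real HTC responses.

```python
from statistics import mean


def average_phones_owned(answers, cutoff=500):
    """Average the answers, ignoring any that claim more than `cutoff` phones."""
    kept = [n for n in answers if n <= cutoff]
    if not kept:
        raise ValueError("no answers at or below the cutoff")
    return mean(kept)


# Invented sample data: one implausible answer gets dropped before averaging.
sample = [3, 5, 8, 12, 1000, 4]
print(average_phones_owned(sample))
```

Without the cutoff, the single answer of 1000 would dominate the mean, which is the kind of skew the note describes.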

Previous Manufacturer

In most cases, the owner upgraded from a phone by one of these makers to an HTC. As such, these are the “losers”, with Nokia coming out worst.

Current Model

Commenters more readily shared their current model. As expected, most of these are HTCs, though some commenters left HTC for a new maker.

Comments by User Gender

While expecting a male bias, I didn’t expect it to be so one-sided. I’m interested to know if the page demographics are similarly skewed.

Comments by Locale

Facebook doesn’t provide location for un-authenticated requests. Fortunately, locale is a useful proxy. (EN includes US and GB)
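
As a rough sketch of how locale can stand in for location: Facebook locale strings take the form language_COUNTRY (for example en_US or en_GB), so grouping on the part before the underscore folds US and GB into a single EN bucket. That the chart was grouped this way is my reading of the note above, and the function names and sample values are hypothetical.

```python
from collections import Counter


def language_of(locale):
    """Return the lowercase language part of a locale such as 'en_US'."""
    return locale.split("_", 1)[0].lower()


def count_by_language(locales):
    """Tally commenters by language, skipping users with no locale."""
    return Counter(language_of(loc) for loc in locales if loc)


# Invented sample locales, including one missing value.
sample = ["en_US", "en_GB", "de_DE", None, "fr_FR", "en_US"]
print(count_by_language(sample).most_common())
```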

Comments by Sentiment

This measures the sentiment toward HTC and their products, not the sentiment of the comment. The sentiment was reviewed manually, so the accuracy is quite high. However, the small sample size should discourage reading too much into this set of results.

Methodology

The process for this was actually quite straightforward. First I used the Facebook Graph API to download all the comments into a database. Because I didn’t want to manually review 2000 comments, my first pass automatically checked for comments containing only numbers. This took only a few minutes to code and knocked out 400 comments.

I then selected very short messages with the assumption that they were just numerical answers with punctuation. This proved prescient since many people noted the number of phones they’d owned and appended smiley faces. This took care of another 250 or so messages. To review the final set, I coded a small web application to manually step through each comment. Through this application I could also note the comment sentiment, past and current phones, and other facets. After some refinement to the app, I was able to review about a dozen messages a minute and complete the entire review in less than two hours.
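
The two automated passes could look something like the sketch below. It is an illustration under assumptions, not the code I actually ran: the 15-character limit for a "very short" message is a guess, and the function and stage names are made up.

```python
import re

DIGITS_ONLY = re.compile(r"\d+")


def triage(message, short_limit=15):
    """Return (phone_count, stage); phone_count is None when manual review is needed."""
    text = message.strip()
    # Pass 1: the comment is nothing but digits.
    if DIGITS_ONLY.fullmatch(text):
        return int(text), "digits_only"
    # Pass 2: a very short comment containing exactly one number, e.g. "7 :)".
    numbers = re.findall(r"\d+", text)
    if len(text) <= short_limit and len(numbers) == 1:
        return int(numbers[0]), "short_with_number"
    # Everything else goes to the manual review queue.
    return None, "manual_review"


for comment in ["12", "7 :)", "I had 3 Nokias and then 2 HTCs"]:
    print(comment, "->", triage(comment))
```

Requiring exactly one number in the second pass keeps ambiguous comments (two models, a model number plus a count) in the manual queue rather than guessing.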

Notes

I looked at several other Facebook posts with a large number of comments, but chose this one because many of the answers would be uniquely easy to parse automatically. If the question required a prose response, the analysis would need to be done more manually, which would be quite time-consuming. Crowdsourcing could provide a solution, perhaps through something like Amazon’s Mechanical Turk.

Because I coded the web application solely for this topic, I was able to be very specific in my search parameters, which allowed for some interesting insights. For example, it was fascinating to see people’s ownership history (especially how their brand loyalty has evolved over time). Similar analysis using an off-the-shelf tool (Sysomos, Radian 6, etc.) would not have provided this level of customization and granularity.

Seattle Twitter Technorati: Who’s the best networked?

Posted 18 July 2011

After mapping the best-networked DC Twitter technorati, I figured I’d try it out on an environment I’m not familiar with: Seattle.

The results

Here are the results of the best-networked Seattle tech Twitter users:

| Rank | Handle | Name | Relationships | Followers |
| --- | --- | --- | --- | --- |
| 1 | ShaunaCausey | Shauna Causey | 867 | 25530 |
| 2 | moniguzman | Monica Guzman | 602 | 13516 |
| 3 | BillGates | Bill Gates | 564 | 2999410 |
| 4 | jennihogan | Jenni Hogan | 529 | 33243 |
| 5 | TheNewsChick | Linda Thomas | 527 | 15648 |
| 6 | KevinUrie | Kevin Urie | 524 | 3837 |
| 7 | briancrouch | Brian Crouch | 520 | 11354 |
| 8 | ChrisPirillo | Chris Pirillo | 492 | 96065 |
| 9 | jshuey | Jeff Shuey | 485 | 16870 |
| 10 | LilyJang | Lily Jang | 483 | 15718 |
| 11 | seattlewinegal | Barbara Evans | 467 | 12300 |
| 12 | Shih_Wei | Veronica Wei Sopher | 455 | 3670 |
| 13 | Ryanintheus | Ryan Hodgson | 440 | 2777 |
| 14 | JessEstrada | Jess Estrada | 433 | 4543 |
| 15 | BMW | Brian M. Westbrook | 429 | 7269 |
| 16 | ColinAC | Colin Christianson | 415 | 5505 |
| 17 | JenniferCabala | Jennifer Cabala | 404 | 3196 |
| 18 | johnhcook | John Cook | 403 | 5740 |
| 19 | thinkmaya | thinkmaya | 397 | 6218 |

Shauna Causey, who manages social media at Nordstrom, ranks the highest, by quite a margin over second-placed Monica Guzman, a tech journalist. Unsurprisingly, Bill Gates, perhaps the world’s foremost geek (and I use that approvingly), ranks highly in third. Two broadcast journalists, Jenni Hogan and Linda Thomas, follow in fourth and fifth. Kevin Urie, founder of Social Media Club Seattle, ranks sixth.

Rounding out the top ten are: Brian Crouch (7th), social media manager for the Music Group; Chris Pirillo (8th), a social media strategist; Jeff Shuey (9th), board member of the Seattle Social Media Club; and, finally, Lily Jang (10th), a local TV anchor.

For the methodology and additional information, check out the post I did on my DC analysis.

A few notes

  • This analysis mapped 1910 individual Twitter accounts, resulting in 1.4 million relationships to crunch (corrected from the 4.2 million originally reported; check out the update below). That is fewer accounts than the DC analysis (2700 accounts). My original remark that Seattle had more than twice as many relationships as DC (1.9 million) no longer holds after the correction.
  • There were more local journalists in my Seattle results, likely due to the wider network of relationships indexed as compared to the DC analysis. I debated removing these accounts to focus just on people self-identified as involved in the tech scene, but I didn’t want to adjust the results too much.
  • There were a good number of “institutional” Twitter accounts that I filtered out. Check out the complete list of results to see how they scored.
  • It’s interesting to see how well Kevin Urie was ranked. Despite having a relatively small following, he was very well connected within this network, no doubt through his work with the Seattle SMC.
  • Oh, and why Seattle? My girlfriend is working there and can help validate the results.

Know the Seattle tech scene well? How do my results look to you? Anyone I missed?

Update

For some reason the app created multiple entries in the relationships database table, which inflated the numbers above considerably. Since a “relationship” means someone within the network analyzed is following that account, you cannot have more relationships than there were accounts analyzed. I should have noticed this when writing up the results (alas, late night hacking). This doesn’t change the ranking order much, though JenniferCabala and johnhcook did swap places at 17 and 18 (sorry John). Monica Guzman’s follow-up questions brought the error to my attention!
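
To illustrate the bug and the sanity check that would have caught it, here is a small sketch. It assumes the relationships table can be read as (follower, followed) pairs; the function names and sample rows are hypothetical.

```python
from collections import Counter


def in_network_follower_counts(rows):
    """Count distinct followers per followed account from (follower, followed) rows."""
    unique_pairs = set(rows)  # collapses duplicate table entries
    return Counter(followed for _follower, followed in unique_pairs)


def impossible_counts(counts, accounts_analyzed):
    """Return accounts with more relationships than there were accounts analyzed."""
    return [acct for acct, n in counts.items() if n > accounts_analyzed]


# Invented rows: the pair ("a", "b") was stored three times by mistake.
rows = [("a", "b"), ("a", "b"), ("a", "b"), ("c", "b"), ("b", "a")]
raw = Counter(followed for _follower, followed in rows)
clean = in_network_follower_counts(rows)

print(raw["b"], clean["b"])
print(impossible_counts(raw, accounts_analyzed=3))
print(impossible_counts(clean, accounts_analyzed=3))
```

A unique constraint on the (follower, followed) pair in the database table would prevent the duplicates from being stored in the first place.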

I should also say that this is pretty experimental and involves hacking on nights and weekends. Nonetheless, it seems to have worked well on the DC network.

Seattle skyline photograph by Bala

DC Twitter Technorati: Who’s the best networked?

Posted 11 July 2011

When it comes to social media for business, there is one question on everyone’s mind: Who are the influential people in my area? Unfortunately, answering this is easier said than done. Take Twitter, for example. You could look at a user’s total followers or the number of lists they are on, but those are blunt instruments at best. When you’re focused on a specific topic, those numbers can be downright misleading.

After mulling this over, I figured a good measure of potential influence would be how well networked a person is in a particular topical environment. To test this hypothesis I decided to look at an area I know pretty well: the Washington DC tech scene. Since I already have a good sense of this community, I could verify the analytical results from my own knowledge.

The Results

After doing my analysis, here is my ranking of the top ten most networked individuals:

| Rank | Handle | Name | Relationships | Followers |
| --- | --- | --- | --- | --- |
| 1 | corbett3000 | Peter Corbett | 671 | 7,980 |
| 2 | dcconcierge | Shana Glickfield | 644 | 5,979 |
| 3 | FrankGruber | Frank Gruber | 571 | 27,172 |
| 4 | cheeky_geeky | Mark Drapeau | 539 | 19,652 |
| 5 | DCeventjunkie | Lisa Byrne | 492 | 5,755 |
| 6 | shashib | Shashi Bellamkonda | 482 | 14,287 |
| 7 | digiphile | Alex Howard | 462 | 78,433 |
| 8 | alexpriest | Alex Priest | 427 | 4,940 |
| 9 | SteveCase | Steve Case | 414 | 416,114 |
| 10 | digitalsista | Shireen Mitchell | 408 | 7,562 |

Overall, this squares pretty well with my knowledge of this community. Number one, @corbett3000, belongs to none other than Peter Corbett, the CEO of iStrategyLabs, a leading DC technology firm and organizer of many DC tech conferences (including the upcoming 10-day DC tech festival). Coming in second is Shana Glickfield, who goes by @dcconcierge and is DC’s consummate networker and FOMO sufferer. Rounding out the top three is Frank Gruber, @frankgruber, CEO of TechCocktail.com. Number four is everyone’s favorite Microsoft staffer, Mark Drapeau, who tweets at @cheeky_geeky.

It is particularly interesting to see how a large Twitter following does not necessarily translate into a significant number of relationships within this particular network.

For kicks, I threw the top forty accounts into NodeXL to see what the network looks like:

Network of the top forty DC Twitter technorati. Each line represents a single follower-followed relationship. It is interesting to see which accounts are more central within this small network and which are more peripheral.

Methodology

And the big question: how did I arrive at these results? Here is the process I used:

  1. My starting point was trying to figure out how to measure the number of connections within a particular geographic region and subject area. For this, I needed a good index of who was active in the DC tech sphere on Twitter. Fortunately, this part of the job has been done by the community in the form of Twitter lists. I trawled through a large selection of Twitter lists looking for ones that had “DC” and either “social media” or “technology” and entered those into my database.
  2. I then went through each list and saved all the individual accounts on that list.
  3. From the users on these lists, I ended up with a database of about 2700 Twitter accounts that, through the Twitter lists, were related to the DC tech scene.
  4. In the most time-consuming part of the analysis, I set up a system to download all the people these Twitter accounts followed. Since a follow is an expression of interest in the followed account, I counted that as a “vote”. This is much the same way Google considers a link a vote of confidence in the linked page.
  5. After a few days of downloading data from Twitter, I had a database of nearly 2 million follower-to-followed relationships. Using this index, I checked which accounts were most frequently followed and ranked them accordingly.
  6. I was able to automate nearly every step after selecting the Twitter lists, but the final step requires a good deal of “eyes on screen”. The final list included a number of very widely followed accounts, but not ones I was interested in. Since I was only focusing on individuals in DC, I removed a lot of institutions and people outside of the city. For example, the top three, @mashable, @barackobama, and @techcrunch, are all widely followed but are not members of the DC tech scene. As such, I excluded these from the final results. (It is interesting that Mashable has a much wider following in DC than TechCrunch. I imagine if I were looking at Silicon Valley the ranking would be inverted.)
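
The vote counting in steps 4 to 6 can be sketched roughly as follows. This is a simplified in-memory illustration rather than the app I built: it assumes the downloaded data is a mapping from each indexed account to the accounts it follows, and the function name, parameters, and sample handles (other than @mashable) are hypothetical.

```python
from collections import Counter


def rank_by_votes(following, excluded=frozenset(), top=10):
    """Rank accounts by how many of the indexed accounts follow them.

    `following` maps each indexed account to the accounts it follows.
    `excluded` holds the manually removed accounts (institutions, non-locals).
    """
    votes = Counter()
    for follower, followed_accounts in following.items():
        # set() so each follow counts once; a self-follow is not a vote.
        for followed in set(followed_accounts):
            if followed != follower:
                votes[followed] += 1
    ranked = [(acct, n) for acct, n in votes.most_common() if acct not in excluded]
    return ranked[:top]


# Invented sample network of three indexed accounts.
following = {
    "alice": ["mashable", "bob"],
    "bob": ["mashable", "alice"],
    "carol": ["mashable", "bob"],
}
print(rank_by_votes(following, excluded={"mashable"}))
```

The `excluded` set stands in for the manual "eyes on screen" pass in step 6, which is the part that cannot be derived from the data alone.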

I was able to automate most of the above steps using a small app I coded, so much of the data collection took place while I was fast asleep in bed. Due to Twitter API call limits, most of this was done using automated cron jobs. Steps 1 and 6, though, were not automated and required diligently going through the data.

Final Thoughts

So that’s it? Certainly not. This only measures who is widely followed within this topical subset, not who is actually influential. You’d need to combine this with other measures (frequency of retweets, ability to drive conversations, and so on) to get a real sense of influence.

Fortunately the results do square pretty well with my understanding of the DC tech scene, which helps validate the approach. Most likely I’ll do some more playing around with this technique, so stay tuned.

Public Media Camp: Hubs and Spokes and a Look at Measurement

Posted 27 October 2009

Recently I had the pleasure of participating in the Public Media Camp, an unconference focused on strengthening local and national public broadcasting. A good portion of the discussion focused on the disruption and new opportunities presented by Internet-based dissemination and social media.

Of Hubs and Spokes

While the focus on social media related well to my work in public diplomacy, the very structure of public media actually seems quite similar to the hub and spoke model of the central State Department in Washington and the various embassies, consulates and missions scattered around the world. As with public broadcasting, content is produced and disseminated in Washington and the very diverse missions overseas. Just as NPR or PBS in Washington balances the needs of their direct national audience with the needs of their affiliate stations, the State Department also has to support an international audience for its America.gov properties while meeting overseas mission needs.

Additionally, most public media outlets focus more on informing audiences and social change than on increasing profits. Public diplomacy has similar goals: changing perceptions about the United States and its policies and creating a better environment for U.S. goals, such as democratization, improving religious freedoms and so on. Without profits as a baseline metric, both organizations aim for more intangible goals, such as those elucidated above. This makes measurement more challenging, with related knock-on effects.