What goes into a del.icio.us post? How do people choose the tags to apply? How does the interface affect tag choice? How does this influence flocking behavior? I’d like to take a stab here at these questions and more.
Throughout this post I’ll be using the example of 23 Photo Sharing. I found it through “nontraditional” means (not a web search, a physical advertisement, or word of mouth), and while the front page of the site is rich, it does not show exactly how the site works. The combination of information and ambiguity present on the tagged page is good for this discussion.
I’ll go ahead and tag this entry — hold on a second.
Okay, I’m back. I’ve posted it to my general bookmarks account, del.icio.us/phyzome.
Needless to say, everyone has their own unique tagging style, so one model isn’t going to cover everything. But I have noticed a few variables that seem to be fairly independent of each other and which can be used to describe a large portion of the del.icio.us population. There are a few tagging style continua to consider. Where are you on each continuum?
- fringe vs. core (tail vs. head)
- When tagging, do you tend to use the obvious, central ideas as tags, or do you also include the more tangentially related items? I’m the latter, a fringe tagger. I include any and every descriptive aspect of an object in its tags. Others, core taggers, tend to stay with the main concepts. Maybe they don’t like the clutter.
- implicit vs. explicit
- Do you use redundant tags? Explicit taggers might tag 23hq.com as ‘webservice’, ‘service’, and ‘web’, whereas implicit taggers might choose only ‘service’ and ‘web’. In fact, implicit taggers would likely drop the ‘web’ as well, since it is implied and not the topic of the site. Would you be more likely to tag a post both ‘linux’ and ‘computer’, or just ‘computer’? I vary in my position on this spectrum.
- innovated vs. suggested
- When I tag a post, I try to avoid looking at the suggestions until I’ve exhausted all of my own associations. Then I check out the recommended tags. This way, I can avoid getting distracted by similar terms, which would wipe out the better-fitting ones I come up with unaided. That makes me a highly innovated tagger — and I suspect we are rare.
Tagging on the edge
Have a look at the URL page for a recent news item. There are several things to notice here.
First, the tag ‘absurd’ appears twice — an interesting tag to use, and apparently a shared sentiment. This shows a move away from strict subject-based tagging and towards reaction-subject-context tagging. More on this later.
Second, the tags ‘RIAA’ and ‘MPAA’ both show up, as well as ‘DRM’. They aren’t actually mentioned in the article, which does mention the MPA, a completely different organization. DRM is also not a subject of the article. While the ‘MPAA’ tag is likely an honest mistake, the ‘DRM’ and ‘RIAA’ tags are indicative of “associative tagging”. The article describes a situation very similar to the current DRM, MPAA, and RIAA controversies, and taggers are either assuming there is a relationship or consciously tagging the similarity itself.
Del.icio.us gives you a list of recommended tags, but I prefer to first write down all of the tags that spontaneously pop into my head. For comparison, I’ve put together a table of several tag lists for 23hq.com: the popular tags, the tags I spontaneously used, the tags that del.icio.us recommends, and the recommendations I accepted.
|Popular||Most commonly-used tags for a link, often misleading|
|Spontaneous||I thought of these as I looked at the site|
|Recommended||Del.icio.us recommended these, I know not the algorithms|
|Accepted||These are tags I didn’t think of until I looked at the Recommended list|
Observations & Ruminations
The first thing that really caught my attention was the presence of the tag ‘flickr’. I assume people are using “flickr” to indicate that 23 is a flickr-like site (simultaneously genericizing flickr and promoting it). I wonder how the folks at flickr feel about this? In fact, I wonder how I feel about this. Perhaps it is a sign of things to come: tagging sites with the names of similar sites. Maybe I’ll start doing that as well. It can’t hurt; the tag ‘flickr’ can only otherwise be legitimately applied to flickr itself, a few galleries, and perhaps an article on flickr — that’s a pretty tight niche. Tagging “flickr-homologous” sites with ‘flickr’ shouldn’t interfere greatly with actual flickr-related sites — those should have ‘flickr’ in their core tagset, not their tail.
Remember how several people had tagged an article with ‘absurd’? I am extremely interested in this. People are beginning to use tags that go beyond content to the realm of reaction and context. This “contextual tagging”, as I call it, gives an intriguing window into taggers’ minds. Just as subject tagging on del.icio.us shows how interest waxes and wanes over time in certain subject areas, contextual tagging shows attitudes over time as well.
I use a wonderful program called KimDaBa to organize my photos, and I started using contextual tagging there long before I used it for link tagging. I’m not exactly sure where the difference lies. Perhaps, because the photos are designed to evoke feelings, reaction tags are more important. Perhaps also context is sometimes lacking in the photo itself, and needs to be added more explicitly.
I confess that I am merely fascinated by contextual tagging, and don’t have much more analysis to present on the topic. I do feel that it will be important in some way, though.
Innovated vs. Suggested
Suggested taggers create a swarm effect, scanning the information less critically and letting more crap through. They can skew the core of the link’s tags by over-emphasizing an “incorrect” set of tags. If a poor tag gets established early on, it can rocket up to the top (though this isn’t always bad — see the ‘flickr’ example). Innovated taggers tend to balance the link’s tags toward a more appropriate description of the link.
Fringe vs. Core
Fringe taggers have a tendency to add cruft to the search results, especially in an environment like del.icio.us. The stochastic effect of thousands of taggers generally smooths this out, but in smaller tagsonomies, this can overpower a user browsing the tags. That’s why tag clouds are so handy — they visually filter out the fringe tags. Fringe taggers also tend to reduce the height of the tag graph — the core is not as distinct from the fringe. I call this flattening.
Core taggers strengthen the best result for a particular search (where a clearly best result exists at all). This does not work for searches where many unique but overlapping good results exist — and the core tags lack the power to distinguish between variants on the desired results.
I would like to add that fringe taggers aid fringe searchers. The effectiveness of a tagging style is directly related to the search styles. I tend to be a fringe searcher — I use less common tags to home in on the perfect result.
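To make “flattening” a little more concrete, here is a rough Python sketch. It is only my illustration: the tag lists are invented, and `core_share` is a measure I made up for this example, not something del.icio.us computes. It reports what fraction of all tag applications on one link land on the few most common tags; the lower the number, the flatter the tag graph.

```python
from collections import Counter

def tag_counts(posts):
    """Count how often each tag was applied to one link across all posts."""
    return Counter(tag for tags in posts for tag in tags)

def core_share(posts, core_size=2):
    """Fraction of all tag applications that fall on the `core_size` most common tags."""
    counts = tag_counts(posts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(core_size))
    return top / total

# Hypothetical posts for a single link: four core taggers...
core_only = [["photos", "sharing"]] * 4

# ...and the same four plus two fringe taggers who add tangential tags.
with_fringe = core_only + [
    ["photos", "sharing", "gallery", "free", "blog"],
    ["photos", "hosting", "community", "flickr"],
]

print(core_share(core_only))    # every application goes to the core
print(core_share(with_fringe))  # the core now holds a smaller share
```

With only core taggers, every tag application goes to the two core tags. Add the two fringe taggers and the core still leads, but it no longer stands as far above everything else.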
Implicit vs. Explicit
Implicit and explicit tagging styles have an extremely powerful effect on search results. Without even an attempt at a term hierarchy in place, as with del.icio.us, a search for ‘technology’ will turn up none of the results tagged only with ‘computer’, though they are obviously relevant. At the same time, explicit tags can cause major clutter. Some tagsonomies support semi-hierarchies, where tags can have parents or even ancestries. Potentially, the tag bundle feature on del.icio.us could provide rough tag-similarity guessing to improve implicit search results.
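A small Python sketch of the difference a semi-hierarchy would make. Everything here is assumed for illustration: the bookmarks, the `PARENTS` table, and the function names are hypothetical, and del.icio.us offers nothing like this as far as I know.

```python
# Hypothetical bookmarks: URL -> set of tags (invented for illustration).
BOOKMARKS = {
    "example.org/linux-howto": {"linux"},
    "example.org/pc-review": {"computer"},
    "example.org/gadgets": {"technology", "gadget"},
}

# Hypothetical parent links: tag -> broader tag.
PARENTS = {"linux": "computer", "computer": "technology", "gadget": "technology"}

def flat_search(tag):
    """Return only the bookmarks that carry the exact tag (no hierarchy)."""
    return sorted(url for url, tags in BOOKMARKS.items() if tag in tags)

def ancestors(tag):
    """Walk up the parent links, stopping if a cycle is detected."""
    seen = []
    while tag in PARENTS and PARENTS[tag] not in seen:
        tag = PARENTS[tag]
        seen.append(tag)
    return seen

def hierarchical_search(tag):
    """Return bookmarks carrying the tag itself or any tag that descends from it."""
    return sorted(
        url
        for url, tags in BOOKMARKS.items()
        if any(t == tag or tag in ancestors(t) for t in tags)
    )

print(flat_search("technology"))          # misses the 'linux' and 'computer' bookmarks
print(hierarchical_search("technology"))  # finds all three
```

The implicit tagger’s bookmarks only show up in the second search, because the parent table supplies the tags they left out.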
There are several types of del.icio.us posts that I have noticed, and perhaps I should have handled these separately to some extent:
- I have an account dedicated solely to news alerts and important events. Each post is outdated within the week, so I like to keep them separate. The other type of spread-the-word post is the maybe-someone-else-will-need-this post.
- These make up the largest chunk of del.icio.us posts, I believe. Perhaps the trend is shifting slightly towards the previous category, but this still seems to be the reason for most posts.
- Does anybody actually go back and look at their to-read list? It’s rare that I do.
- What if you don’t have a tagging system for your website? Just create a del.icio.us account for your site, and tag all your pages. Then use the del.icio.us engine to retrieve and search. (I lied a bit. This is somewhat theoretical: I haven’t actually seen any of these, but I was tempted to make an account to do this before I found a tagging plugin for WordPress.) I’ve also tagged a few obvious resources like Google even though no one would ever need the link. It’s just an obsessive-compulsive thing, I guess.
Raise your hand if you’d prefer comma-delimited tags instead of space-delimited tags. Uh-huh, that’s what I thought. No more CamelCase and dropped punctuation, more coherent tagging!
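Here is what I mean, as a tiny Python sketch. Both parsers are my own guesses at how such a tag field could be split, not the actual del.icio.us code, and the function names are made up.

```python
def parse_space_delimited(field):
    """Split on whitespace: every word becomes its own tag."""
    return field.split()

def parse_comma_delimited(field):
    """Split on commas: tags may contain spaces; empty entries are dropped."""
    return [tag.strip() for tag in field.split(",") if tag.strip()]

print(parse_space_delimited("photo sharing web service"))
print(parse_comma_delimited("photo sharing, web service"))
```

With spaces as the delimiter, the only way to keep ‘photo sharing’ together is to mash it into something like ‘PhotoSharing’. With commas, the phrase survives as typed.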
Sometimes I want to smack someone for the way they’ve used a tag. But that impulse runs counter to the nature of these emergent taxonomies. I can only manage my own tagging, and what happens to the rest of the tagosphere is out of my hands. (That’s probably a good thing.)
You can give people the technology, and at most suggest what they might use it for, but that’s it. You can’t control what end-users will do with a product or service. Spam is a classic example, as is Google Maps. One has a bad ending, the other a good one.
The tag is an elemental structure, just like a collection or a look-up table. Whatever bells and whistles you put around it, the essence doesn’t change — shared brief descriptors in a many-to-many relationship with discrete data nodes.
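Stripped of the bells and whistles, that essence fits in a few lines of Python. This is a minimal sketch under my own assumptions: the class name `TagIndex` and its methods are hypothetical, and real tagging systems surely store things differently.

```python
from collections import defaultdict

class TagIndex:
    """A many-to-many mapping between short descriptors (tags) and data nodes (items)."""

    def __init__(self):
        self.tags_by_item = defaultdict(set)
        self.items_by_tag = defaultdict(set)

    def tag(self, item, *tags):
        """Attach any number of tags to an item, recording both directions."""
        for t in tags:
            self.tags_by_item[item].add(t)
            self.items_by_tag[t].add(item)

    def items(self, tag):
        """All items carrying the given tag (empty set if the tag is unused)."""
        return set(self.items_by_tag.get(tag, ()))

    def tags(self, item):
        """All tags attached to the given item (empty set if untagged)."""
        return set(self.tags_by_item.get(item, ()))

index = TagIndex()
index.tag("23hq.com", "photos", "sharing")
index.tag("flickr.com", "photos")
print(index.items("photos"))
print(index.tags("23hq.com"))
```

One item can have many tags and one tag can point at many items; everything else is interface.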
The power and meaning that people choose to give a tag evolve beyond anyone’s control.