Are Tags Vannevar Bush's "Trails"?
Posted 4/29/2006 07:04:00 PM

Bush's article is a fun read, full of postwar optimism and a sense that all can be accomplished. As it turns out, Bush was right - the Memex could be accomplished. Microsoft's MyLifeBits (see the 2006 CACM writeup for the most recent literature) is a Memex of sorts, a relational database of multimedia life. And if form factor wasn't an issue, the Memex would be more than a laboratory reality; for the most part all facets of Bush's system exist and work relatively well.
Well, except for one - that's the notion of "trails". Introducing trails, Bush describes a highly adaptive system of pathfinders back to stuff you care about. If you're going to record an entire life, you need very effective ways to re-find your information. This is not a trivial problem, however - text can be searched effectively, but what about sound and video, pictures and places? We're still figuring that out - and there's a lot of work to do. What's more, media alone isn't good enough; we need context to be able to understand why a kept picture or sound recording is interesting. Microsoft addresses this with textual annotation in the MyLifeBits/Stuff I've Seen work.
I think Microsoft is on to something - annotation is important. In the papers, the authors describe systems to collect story annotation about events. While there is a bias against loose hierarchy, I couldn't help but think how much this approach reminds me of tagging. The hard problems of the Memex are in dealing with material that can't be "indexed", so instead we annotate the material with our stories. The only problem is that stories move completely away from any notion of controlled vocabulary; association becomes a free-text search problem instead of a categorization problem. Sure, we've got the tools to deal with that, but it just doesn't feel efficient.
Enter tagging. For the most part I believe tagging systems work best for personal re-finding. At sufficient scale, our collective personal re-finding becomes a collective knowledge, but this is an ancillary effect, rather than an intended purpose. Tagging systems don't work because a ton of people use them; they work because tags are valuable to us. In a recent, much-discussed article on social tagging in the enterprise, Raytheon employees described a system where librarians would gate-keep classification tags; that is, people would tag items, and the librarians would allow specific tags to be integrated into the document's classification if they felt the tag was correct. Stowe Boyd raised the right issue: this is tagging mis-implemented. Tagging is not a top-down expert system; rather, it is a system for personal use.
Let me diverge for just a second to provide an example. After putting 600-odd items into del.icio.us, I finally "got" del.icio.us. That is to say, I finally figured out how to use del.icio.us in the way that's right for me. When I'm planning travel, I'll tag all of the things related to my travel with the name of the place I'm visiting. So that means I'll tag the conference website, the hotel, the ground transport and places around town I want to check out with the city name. I'll also tag my airline with the city. As you can see, one of these things is not like the other. Tagging a hotel in Vancouver with "vancouver" makes sense - but tagging Orbitz.com with "vancouver"? Would an expert ever allow this tag? No, they'd tell me to tag it with something more general, like airline or travel.
However, in tagging Orbitz.com with vancouver, I've made it personally relevant and re-findable. I can go to del.icio.us/fstutzman/vancouver and find everything I need related to my Vancouver trip. It's a fantastic re-finding system, and to that extent, it is facilitated by tags. But didn't I just break the system by giving Orbitz a tag of vancouver? No, because tags alone are really only valuable to the tagger. It is only when the number of taggers grows to a certain size that you get the notable ancillary effect of collective meaning.
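To make the re-finding idea concrete, here is a minimal sketch of a personal tag index. It is not how del.icio.us works internally; the class name `PersonalTagIndex`, its methods, and the example.org URLs are all made up for illustration. The point is only that any tag, however "wrong" to an expert, becomes a lookup key for its owner.

```python
from collections import defaultdict


class PersonalTagIndex:
    """Hypothetical in-memory stand-in for one person's bookmarks."""

    def __init__(self):
        # Maps a lowercase tag to the set of URLs carrying that tag.
        self._by_tag = defaultdict(set)

    def add(self, url, *tags):
        """File a URL under every tag the owner chooses, with no gatekeeping."""
        for tag in tags:
            self._by_tag[tag.lower()].add(url)

    def find(self, tag):
        """Return all URLs filed under a tag, sorted for stable output."""
        return sorted(self._by_tag.get(tag.lower(), set()))


index = PersonalTagIndex()
# The "improper" tag: an airfare site filed under a city name.
index.add("https://www.orbitz.com", "vancouver")
# Placeholder URLs standing in for the conference site and the hotel.
index.add("https://example.org/conference", "vancouver", "conference")
index.add("https://example.org/hotel", "vancouver", "hotel")
index.add("https://example.org/ohio-diner", "ohio")

vancouver_trip = index.find("vancouver")
print(vancouver_trip)
```

Looking up "vancouver" returns the airfare site alongside the hotel and the conference, which is exactly the trip-planning view I want. A lookup for "airline" returns nothing, and that's fine, because I never think of it that way.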
Tags are characterized in systems by a "cloud" - the cloud represents the tags most commonly applied to an item. Consider 100 people tagging Orbitz.com; certainly, some will tag Orbitz like I do - but many will "properly" tag Orbitz with "travel" or "airfare". Tags like "vancouver" and "ohio" will languish at the obscure end of the tag cloud, whereas the common, general tags, like travel and airfare, will get pushed to the top by the group. It's a nice side-effect, but it's important to remember that's only what it is - a side-effect of the system's scale. Raytheon breaks the model doubly by 1) implementing tags as a "public" finding system, rather than a personal re-finding system and 2) preventing the tag cloud from naturally working itself out. Imagine if I wanted to tag Orbitz with vancouver, and the system rejected my tag as "out of scope." This would completely break the system for me. In fact, if I wanted to tag Orbitz with gibberish, as long as that gibberish was meaningful to me, it would be completely fine. The opportunistic tag cloud is only a side-effect of scale, and by concentrating on that, rather than personal re-findability, we break the model.
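The 100-people thought experiment can be sketched in a few lines. The numbers below are invented purely for illustration (they are not real del.icio.us data), and `tag_cloud` is a hypothetical helper, but they show how the general tags float to the top with nobody rejecting the personal ones.

```python
from collections import Counter


def tag_cloud(taggings):
    """Count how many people applied each tag to a single item.

    `taggings` holds one list of tags per person; a tag repeated by the
    same person is only counted once.
    """
    counts = Counter()
    for person_tags in taggings:
        counts.update({tag.lower() for tag in person_tags})
    return counts


# Made-up data: 100 people tagging Orbitz.com.
taggings = (
    [["travel", "airfare"]] * 60
    + [["travel"]] * 25
    + [["airfare", "flights"]] * 12
    + [["vancouver"]] * 2   # personal re-finding tags
    + [["ohio"]] * 1
)

cloud = tag_cloud(taggings)
top = cloud.most_common(2)
print(top)
print(cloud["vancouver"], cloud["ohio"])
```

With these counts, "travel" and "airfare" dominate while "vancouver" and "ohio" sit in the long tail. Nothing was filtered; the cloud sorted itself out, and my personal tags still work for me.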
So this gets me back to the Memex. In a tag-aware Memex, I can easily tag things - whether they be PDFs, MP3s or video files - in a way that I'll be able to recall them later. I can tag a video "2006 birthday" and be reasonably sure that 50 years later I'll know that video clip was from my 2006 birthday. Here's the thing, though - I'm not naturally good at tagging. Earlier, I said something to the effect that it took me 600 tagged items to figure out how to use del.icio.us in a way that's right for me. We can do better. It makes me think there's a best practice for tagging for personal re-findability. This best practice is obviously a mix of self-awareness (how do I think about things so I can find them at a later date) and good vocabulary/classification skills; put simply, we can find ways to teach tagging so that more users can embrace it. To this extent, librarians could relinquish their gatekeeper positions and come to understand and teach this folksonomic best practice; by throwing down the gates and "embracing the messiness", a populace of taggers could put together fantastic indices.
The most important thing, though, is that we continue to understand that the collective knowledge in these systems is an effect of scale; personal re-findability is what the systems do best. Through better understanding of the practice of tagging, we can teach people to be better taggers (better being a very quantitative measure in which a person is more successful at re-finding items using different tagging strategies); with better taggers, we can have better collective knowledge. This better collective knowledge won't come from the gatekeepers - instead it will come from the bottom up - from self-aware, passionate taggers.
Postscript: I believe tags are Bush's trails. The Memex only makes sense when people can easily and successfully use it, and a system of folksonomy is the best to-date innovation in this area. We often approach tags as if they should exist in a world without laws and control; to a certain extent, they should - but simply because tags are lawless, this doesn't mean we can't do tags better. We don't need to impose laws, but we can provide education that helps people tag better. Where are the studies of tagging and re-finding? Where is the tag-oriented update of the classic Barreau and Nardi paper "Finding and Reminding"? There's a world of opportunity for forward-thinking library and information scientists. Tags clearly work - so let's help people tag better; it will only benefit us all.
8 Comments:
- At April 30, 2006 9:40 AM, Kevin Farnham said...
In the book my wife and I have written about MySpace.com ("MySpace Safety: 51 Tips for Teens and Parents," to be self-published in a few weeks) our last "safety tip" and a kind of epilogue to the book is titled "Can You Ever Really Leave?" It's a warning to MySpace users, especially the teens, that what they post today may well become a permanent record of their lives, that will be searchable 15 years from now by anyone who wishes to perform the search.
The example I use is a job search. The employer is screening potential candidates using a tool that allows entering keywords or categories, and a top-level summary is returned, with links that let you dig down to another level of categorized documents. From there you can select subcategories and/or documents -- kind of like a free-form XML or DOM web page tree structure. At the end of each branch is a node that is an actual document.
Actually, now that I think of it, and considering your post, individual documents will be broken into topics because the documents themselves will have been analyzed and tagged. (I may have to add that to the book, or at least to a follow-up book). So the last node on a branch won't be a document, but rather a tagged topic area within a document.
The point of the story in our book is "post with care, because the FUTURE is watching" -- so the employer browses the candidate's life record. Linking downward through topic areas related to "risk" the employer sees that the candidate has seemed to enjoy somewhat dangerous activity that flouts "common sense". For example, there's the 100-mph drive he took with friends when he was 17, about which a friend posted a comment on his MySpace site...
Thanks for sharing your very interesting ideas on this blog. I'm really glad I found it!
- At April 30, 2006 9:50 PM, Fred Stutzman said...
A frightening vision of the future. If we're to be penalized for everything we do and say, for the mistakes our friends make in disclosure, for the simple fact of living online, what's the purpose? If we're all to be prosecuted for our first-degree actions, and the actions of second- and third-degree participants, doesn't network effect practically stipulate that we'll all eventually be guilty of something?
Your paranoia is founded in logic, and I don't dispute it. There *is* information out there about us. There are harvesters, and companies willing to sell the data. Our government is extremely interested in our social behavior, and they'll spare no expense to analyze it. But under all of this, we're still human, and by our nature we'll err. And by our nature we'll also forgive, because we have to. The Ministry of Love runs the future you describe, and if that's a future we're looking at, I recommend that students enjoy their lives online, because the future doesn't hold much for them.
The thing is, and this is getting seriously off the topic of my post, but we're adaptive. We know we're not perfect, and we know we'll not always make the right decision. We'll adapt to the long-term consequences of social networks just like we adapt to anything else, believe it or not. And there's nothing particularly virtuous about being a ghost in the system (Gary Marx's theories on identity are very interesting), and even if we were perfect angels - look who got elected our president! Someone with a history of DWI - someone who wouldn't go on the record saying they *didn't* use cocaine! Like it or not, we accept and deal with our flaws.
Frightening children with this virtual panopticon isn't the answer. Why? Because they don't want to listen to that. It's hopelessly out of touch. They lack the context of age to appreciate such a theory - and even if you're right, if the audience doesn't understand the message, it's a lost cause. You've got to give them something they can grasp, something at their level that makes sense to them. The government ditched scare tactics in the drug war a while ago ("This is your brain on drugs"), and I think we're well advised to follow suit.
- At April 30, 2006 11:11 PM, Kevin Farnham said...
Hmm... as usual, your ideas make me think. First of all, my background may have something to do with this vision. Most of my work has been mathematical modeling and analysis of huge measured data sets. On the side I've done a significant amount of toying with automated analysis of text documents, identification of patterns, etc. Much of the latter has been theoretical, not actually brought to realization in an actual program (Google does such things for us today, I guess).
Also, I'm older than you, have watched Vietnam, dozens of proxy U.S. / Soviet wars fought in third world countries, the student in front of the tank in Tiananmen Square, the collapse of the Soviet Union,... from there on you're probably aware of it all. Not only did I see it, a purpose of my work was to make sure something like 9/11 never happened. But it did.
I do believe the Internet, in its openness, is one of the brightest hopes we have for a future where international atrocities and even horrors like Darfur no longer happen. "Places" like MySpace, and the Internet in general, link people worldwide. And, the more communication that exists between peoples/nations, the less likely there is to be war between them.
I have an example of changes by a Government institution to adapt to the "imperfect" nature of humans (or, you could say, to adapt to "reality"). Our current and previous presidents were born in 1946. They were in college in the 1960s, a time when a majority of students probably used drugs. So, one president apparently had some fumes somewhere in the vicinity of his lungs (but not quite inside them) and the other apparently sees no provable public evidence that he actually used coke (so therefore we should assume he didn't). Presidents can get away with such things.
Now, the Department of Defense asks people who need a clearance about their drug history. In the 1970s and early 1980s, they used to ask "have you ever used an illegal drug?" That was a troubling question for many people who were in their 20s or early 30s at that time. Some people were concerned that if they told the truth they would not get their clearance.
Later on, the question was changed to: "have you used any illegal drug in the past 7 years" or something like that. So, the organization "forgave" the past, so long as you were "clean" in your recent life.
The society of the future may be like that. But, spam, scams, spyware are all indications that there will always be thieves and opportunists, I think.
Corporations scare me more than governments, in my thinking lately. Their goal is maximizing profit for their shareholders. It's an "us versus them" attitude by definition. If one corporation doesn't use the available technology, their competitor will...
Your last paragraph is most relevant to our current book. My scare story is 2 pages at the end of a much more conventional set of online safety lessons applied to MySpace features and data structures. Though the book is titled "MySpace Safety: 51 Tips for Teens and Parents", I'm actually assuming it will be read 99% by parents (who will then advise their teens). What we're hoping to do, at minimum, is to convert the current irrational "I don't want to know about MySpace I just want to ban it" fear into a realization that this is a new world, banning it just bans you from having contact with your kid, so let's sit down and look at this thing (MySpace) calmly and logically... I'll consider your advice as we do our final read through the book before sending it off to the printer. Thanks!
- At May 01, 2006 3:12 AM, Fred Stutzman said...
Kevin - another interesting reply. And I read your MySpace book (the 32 page version) and I found it very balanced, written in the appropriate language, and very common-sensical. In other words, a very solid product.
One of the outcomes of my Facebook research was a sense of realization that anything we do can be recorded, documented and retrieved. And that's scary. I've talked about this with people, and I get a lot of blank stares. The truth is - no one knows what the long term on this is...but I think we can draw on what we know works, we can bridge that gap in a reasonable way.
- At May 01, 2006 12:38 PM, Jason Griffey said...
And to talk about tagging for a moment...
Fred, I'm hitting an instructional wall concerning tagging in my current position. I'm trying very hard to introduce tech that I've seen make things easier/faster/more efficient, and in doing so I've shown the reference department del.icio.us. I created a group account, in hopes that we could build a sort of Reference reference...an ongoing group of links to things that are relevant to reference/instruction. When you find something interesting...add it!
But tagging, I'm finding, isn't obvious to most people. Everyone I've shown this has said "What do I use to tag it?" and when I try to describe the process (anything...just whatever you want that will help you find it later) it's very non-intuitive...perhaps especially for a librarian! Do you have any thoughts on overcoming this sort of initial WTF? barrier?
With that said: delicious is my second brain. I'm mad for it...and use it for everything from travel to projects to ongoing responsibilities, all with tags that mean something to me. Once you "see" it, it's amazing.
- At May 01, 2006 12:47 PM, Kevin Farnham said...
Can I quote you about the "MySpace Primer" concise edition?
It really is almost eerie in ways, what's happening, and how rapidly it's changing, the way people interact and form new types of communities. Young people just step right into it, having never seen a different world.
But, how does one define "privacy" in this new age? It can't be said that privacy is now irrelevant. Facebook and many other "closed" (to certain people) online communities demonstrate that... There, one group of people shares a communal privacy with respect to the ineligible, excluded "others".
- At May 01, 2006 1:06 PM, Fred Stutzman said...
Jason - I absolutely know your pain. So many people I know (who are technically proficient, forward-thinking, etc) scowl at me about del.icio.us. They don't use it, or they don't know how to use it, or they don't know why they should use it. The education gap is real, and if it took me x months of using the site to really *get it*, I worry about less technical people. But to be on the flip side a little, the thing that made me understand del.icio.us wasn't any sort of gained technical knowledge...rather, it was introspection. Once I figured out how to get a mental model, how the tags worked for me...I got it. But that was looking inside rather than looking at documentation. It fascinated me. And to that extent, I was struck by a bolt of lightning and I'm going to head back to SILS and set up a very interesting study of tagging....once we start to grasp how tagging works *for* people, we can start talking about how better to educate users. But right now we're missing that building block that we can build this theory on, and I think our study might be able to help provide that.
Kevin - Sure on the quote. My blog is CC with attribution required, so feel free to reuse! And the privacy issue...It's fascinating. There is no privacy, right, in the sense that anything we say and do online can be copied and shared. But to abstract down a level, isn't privacy just an abstract structural concept? That is to say, don't we create our notions of privacy? I think we'll evolve this as new actors in our privacy equation emerge...we'll understand new norms of privacy.
- At May 01, 2006 3:44 PM, Jason Griffey said...
Fred: I agree...it didn't take me 600 links, but I was well into the hundreds before I really "got" the power of tagging in regards to findability. My favorite aspect of del.icio.us has to be the ability to divide the tags up into several spheres, and have each sphere be as inclusive as you want. I kind of think about it like a series of Venn diagrams, where I can throw Link X into a pre-existing circle, or create a new circle for it, or (and this is the brilliant part) do both. It's the both that is the really phenomenal part. I can decide at any time which parts of the diagram I want to overlap or separate.
I'd love to know what you've got in mind for the paper, whenever you get to that point. Email me! I'd love to be involved, or just be able to see your take on it.



