Comment: vanderwal (Jan 7, 2005)
Lou, I agree there will be scaling issues with the current generation of folksonomy tools and that the current batch does not lend itself to searching all that well. I don't think that development will stop with the current state.
With Flickr I do see it as a breakthrough, not so much because of the folksonomy, but because of the tagging of objects that, without any metadata, currently have no means of being searched. With objects some is better than none, and Flickr is open to others adding tags to augment those that the user provides, so as to provide richness of information.
Del.icio.us seems to be providing something similar to Flickr with the tagging of external digital objects, but its use as a thesaurus development tool that is instantly cross-cultural and cross-discipline is not perceived as such by the people using it. A cross-cultural and cross-discipline thesaurus, particularly in fluid/adaptive/emergent vocabulary fields, is insanely tough to build, let alone keep current. Not only can tools like these build from the bottom up, they can wrap around and incorporate the top-down. They can close the darn gap.
Increasingly in user testing I run across people who do not use the site navigation to find information on a site, nor do they use in-site search. I regularly see people go out to their favorite search engine and put in the term for what they believe they are looking for. The search engines often (and increasingly so) return the information they were looking for using the user's term. The term may not be what the site uses in its controlled vocabulary and may be a term that is rarely used on the site. How did the search find it? Outside links using the term pointing to the content. The original folksonomy is the web. Search engines leverage this metadata in very useful manners. Similar uses will emerge around the new Flickr and del.icio.us tools to leverage tagging that did not previously exist for these objects.
I really like that you have included the information ecology, as our tools and means of categorizing and structuring information change and grow as we learn. Taxonomies and ontologies do the same, but there have been tools lacking for areas that morph at less than glacial paces. The sciences are wonderful places for top-down and formal classification. Other information areas are more transient. I agree that the information architect's job is to find the tools that best fit the environment and the need. It is increasingly important to keep an open tool in our tool belt to ease capturing emergent vocabulary, but also to provide a means of keeping that information attracted to the people who desire to find and use it again. People best recognize and are attracted to terms they know and understand, and there seems to be no better way to let users do just that than their own metadata tool.
There is a future use as well for the folksonomy tools. Digital annotation of physical objects is not just around the corner; it is happening now. Projects like Urban Tapestries are working to make the physical digital as well. It would be good to grasp and understand how regular people will interact with these concepts as they tag and use the tagged objects and space. I am a believer that the early education systems must teach information organizing methods and techniques (I learned Dewey in the first grade and it stuck, how difficult could it be), but our systems have not and we have the current state we are in today.
Will people wrap their minds around developing and applying metadata? Will they see a need to? I am increasingly believing that we are building at least two classes of people, those that are information and technology savvy and those who are not. This chasm will be the barrier to jobs and even education in the not too distant future. Can we design across this chasm?
Comment: helge (Jan 7, 2005)
I'm not sure your "summer"-example convinces me. ok, say I apply "summer" to my photos (which, as a sidenote, I wouldn't upload to del.icio.us but to flickr), which indeed would drown them in a crowd too large. But as a searcher I would rather search for "summer beach surfing" and get exactly what I want. It wouldn't matter too much that I'd miss out on some photos that are just tagged "summer" and not "surfing". And next time I upload photos I would take care to give the better ones more, and more exact, tags. I would also learn that well tagged photos of mine get a lot more feedback from other users. So the key is (social) feedback. Scaling the model, I'm sure photos tagged "summer" would just come from first-timers. That would ruin the keyword "summer", yet the whole model would still work.
Comment: Mark Cross (Jan 7, 2005)
Louis, I feel you are right on the one hand, but perhaps wrong on the other.
Correct on the scalability of "value", but it looks like they could remain good personal tools.
You can also factor out the spammers or noise if you wish.
del.icio.us is unique among the bookmark managers I could find in that you can actually export your records.
But I'm looking to run the next major release of sitebar.org to provide traditional categorisation for mine, as the new release will import RSS/ATOM/XBEL etc. and export as well.
I have kept every bookmark since Feb 1994, so I have quite a few problems!
Comment: Zes (Jan 7, 2005)
How about RSS based metadata that each flickr, furl, etc. user can tap into to manage their content? That local applications can also poll?
I have a dream -- create metadata once, use many times.
Comment: Lou (Jan 7, 2005)
Woops; Helga, thanks for pointing that out. Have replaced the reference from del.icio.us to Flickr. And great point about the social feedback, but still, even today's well-conceived, precise tags could possibly be too broad down the road, and I'm not sure what incentives there will be to re-tag two years from now.
Comment: helge (Jan 7, 2005)
lou, i'm not counting on retagging, i'm counting on an individual user's learning curve based on social feedback they receive. (and may i point out one more thing? it's helge, male, not helga, female. :-)
Comment: marianne (Jan 7, 2005)
I share Lou's concern about the salvation nature of Folksonomies. While it would be nice if we as individuals could organize the entire Web to our individual information behavior, it is highly unlikely to improve our ability to find our own content, let alone the content of another that might be just what we're looking for but tagged with terms that are meaningful only to that person. For me, the development of refined search algorithms that combine content with search term offers more hope for Web search. The development of a realistic Keyword in Context methodology that allows the content creator to develop meaningful tags based on CV or free association from the content will make headway in the Enterprise search world. Keywords aren't bad...they're merely drawn that way and unfairly so in certain circumstances. I also find promise in the Taxonomy on the Fly methodology that was presented at the IA Summit in Portland. Here the taxonomist is drawing terms from search logs and associating them with the existing taxonomy.
Comment: kael (Jan 7, 2005)
Folksonomy should be conventional, even implicitly.
Comment: Susanna (Jan 8, 2005)
Very timely post, as we've been discussing this at work recently. The research group I work with has developed a metadata generation tool that seeks to provide the "best of both worlds" you're talking about for physical oceanographers. It helps users generate standard metadata (though in some cases, what's standard is still being debated) and also allows research groups to use a shared data dictionary for defining metadata terms.
Ideally, the data dictionary defines common metadata using some sort of widely-used standard (e.g. that defined by NOAA or the Navy) and then allows research groups to add and define their own metadata for terms that might not be as widely used. The original idea was to get entire research groups using the same metadata, and it's blossomed into a way to help researchers easily adopt metadata standards without having to go looking for them and evaluating them, taking time away from doing actual research.
You can read more about the software, named Meta-Door, here: http://nautilus.baruch.sc.edu/carocoops_website/data_metadata.htm
Comment: vanderwal (Jan 8, 2005)
Lou, I don't know if you have used del.icio.us much, but this broad folksonomy tool (many users tagging one object) gets to synonyms very quickly and does a good job of providing emergent vocabularies very quickly. I don't think you would have made your synonym comment if you had more than dabbled in del.icio.us. Even looking at the del.icio.us page for Clay Shirky's folksonomy article could easily provide a darn good start without doing much work.
The focus on preferred terms I continually find problematic in user testing, particularly on sites that have broad audiences. I don't think the folksonomy tools are meant to provide a preferred term. In fact it seems ironic to think they would, as they are tools for people to work in terms the individual understands. This is counter to the preferred term mentality that is pushed down on people. People can have a tough time finding information and the folksonomy provides an easy means for these people to keep found information found. They have a tough time attracting the information they want on to their screen, possibly because of vocabulary problems (cultural or discipline related). The folksonomy tool not only allows people to keep the information found by applying terms that they are attracted to (because they understand them), but also allows others with similar vocabularies to more easily attract that information to them as well.
Sure, with preferred terms we can hit 80 percent with these terms (in the 80/20 metaphor), but that is a low B in school, if not much lower on a curve. We can do better than that. Why not aim far closer to the 100%?
Comment: Dennis Moser (Jan 10, 2005)
"Sure, with preferred terms we can hit 80 percent with these terms (in the 80/20 metaphor), but that is a low B in school, if not much lower on a curve. We can do better than that. Why not aim far closer to the 100%"
An old axiom that I know Lou will recognize from our classes together: When you're searching online, we can provide you 100% relevance or 100% recall. Which one would you like?
The numbers you've chosen to indicate our ability to enhance searching sound like the Pareto numbers, and we know how hard that last 20% is...
Comment: Simon Forrest (Jan 10, 2005)
You might be interested in taking a look at what Headshift are doing in this area, combining more formal faceted classifications with bottom-up, keyword-based tagging ('social tagging'). For example, see Lee Bryant's post "can social tagging overcome barriers to content classification?" (http://www.headshift.com/archives/002085.cfm).
Comment: Rob (Jan 12, 2005)
I've never used flickr, but delicious has a wonderful piece of metadata that lets you find your own stuff easily - your username.
You can easily view everything you've posted, along with all the tags you've used, and if something isn't right it stands out.
This means that if you end up using a synonym because you can't remember the correct tag, a couple of clicks can fix it easily.
Where appropriate, tagging schemes like these enhance the experience - to me, gmail has it right with the ability to recall previously used tags or create new ones as required.
Comment: Sam Rose (Jan 12, 2005)
Also, delicious has proven highly useful for collaborative research among small and medium size groups.
Check out the cooperation tag for an example.
Comment: Lou (Jan 13, 2005)
Thomas, I agree with most of what you've said here, especially regarding tagging of objects like images where there was no metadata before. But I'm really not clear on how del.icio.us is nailing synonyms. Here are the current common tags for Shirky's article:
Sure, there are some synonyms in there. But it takes some work to pull apart the synonyms from terms that should be connected with broader, narrower, and related term relationships. (And some are altogether unrelated, like "toread;" perhaps an inkling of a different facet?)
Obviously, this can be a helpful step toward building a more formalized vocabulary--maybe 80% of the way as Dennis is suggesting--but surely there are settings where that really hard remaining 20% is also a really critically important 20%. Do you see folksonomies evolving functionality to make up that difference? Or do you feel it's good enough for most situations?
Comment: Reg Cheramy (Jan 14, 2005)
One of the key differences between tagging in del.icio.us and flickr is:
In flickr: a photo is only tagged by its user/uploader. If the user 'gets it wrong' or doesn't add enough tags, it's pretty much going to stay that way (no one is going to retag.)
In delicious: the same bookmark is tagged by many people, adding multiple overlaps of tags that guarantee a good cross section of tags (depending on the page's popularity.)
Flickr could improve by implementing something like http://www.espgame.org/ where users compete to tag an image similarly.
I strongly believe tagonomies will be rampant in the coming years and that we've barely scratched their coolness factor.
Comment: vanderwal (Jan 17, 2005)
Lou, thanks for pulling out the numbers and tags for Clay's post. Today's tools will still need a human mind to pull together synonyms from this list. But, the tool has provided us with a cheap and easy capturing of these terms and their usage. Keep in mind people did this for their own benefit and to some degree for fun. When people are providing vocabulary for their own retrieval they will tend to be more honest and may put more time into it (as opposed to things like the ESP Game).
In what you have pulled together we are seeing more than one facet, which is fine as we are seeing varied uses of not only tagging, but we can also see varied uses for the information or object. Those of us who are working to design beyond the first use of information find the varied facets helpful for learning information reuse. Seeing the author in the information would make a bibliography easier to make given one of the metadata or folksonomy terms. Having this type of metadata available will make harvesting the information for varied uses much easier.
One of the problems I have run into on most projects or jobs I have been on is cross-cultural or cross-discipline vocabulary. The folksonomy is currently cultural and discipline agnostic, which has some benefits. It allows those from a formal education to find this article through the ethnoclassification tag, which may be the preferred term in a formal top-down taxonomy for cultural anthropologists. More parochial vocabularies are used as well, such as tagging and metadata, which may be part of another discipline's top-down taxonomy. Lastly we have emergent terms (folksonomy) mixed in that are not on anybody's top-down taxonomy this week as they have not been captured.
I see tools being built to augment folksonomies and to harvest the information from folksonomies. Even though Google was a forerunner in folksonomy as it gave a weighted value to the metadata in a hyperlink that pointed to a URL, folksonomies are relatively new. We need to figure out how to best bring folksonomies into our toolbelt and how to capture who is using them and how widely they will spread out the rings of technology adopters (right now it is the more technically savvy that are eager to tag). I also hear many saying that the pain of tagging is relatively low for the high value return they get by just being able to easily have access to information they already found or to objects that have no addressable information to search or normally draw close to themselves.
Currently I don't think it is good enough for most situations, but I think as people realize they have a need to improve how they tag and add metadata, they will be more eager to learn and embrace better methods. I continually see high frustration levels in many regular users of desktop systems and general websites along the lines of not being able to find information they know is there. The frustration grows higher when people know they have found the information before and they are sure they know where they found it, but can not get it on their screen when they need it. More often than not I see vocabulary as a variable in their troubles as well as unfamiliar hierarchies.
I also see folksonomies as where file storage in operating systems will go, at least as an option to users. Getting to that euphoric dream state where we have the right information at our finger tips when we need it will take other solutions than the ones we have been playing with up to now. Tagging can start getting people over that hurdle as people can tag digital objects with terms that relate to projects, meetings, research, etc. and things can be found more easily. Information can be brought closer to the person who may require that information more easily, in a predictive manner, if related information and objects have a layer with the hooks to allow them to be drawn toward the user.
Folksonomies are akin to a personal semantic web, where one is giving up a broadly used structure that works fairly well globally and needs a lot of work for one that works darn well personally with much less work. I do think (again) that people will see a need to learn better methods of building their metadata and they will see a good return for learning it.
Comment: Lou (Jan 20, 2005)
Susanna, I took a short look at MetaDoor; wondering if you could describe the process of how user-supplied tags influence/enter the controlled vocabulary? (Didn't see this detailed on the site.) Thanks.
Comment: Bruno (Jan 21, 2005)
What if the descriptive taxonomy (what this thing is) was open-ended (a folksonomy), but the functional taxonomy (what would you do with this thing) was controlled?
So, say I was bookmarking this post: I could tag it with any words I wanted - tech, library, cataloging, and so on. Those words describe what this item is about in ways that are primarily relevant to me. If they also happen to make sense for someone else, fine.
Then I would also have to choose one or more verbs, words that describe what I want to do with this item. Do I want to read it, save it, comment on it, disagree with it, build something with it, etc.
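As a rough illustration of that split, here is a minimal Python sketch in which descriptive tags are free-form while the verbs must come from a fixed list. The `Bookmark` class, the `ALLOWED_ACTIONS` set, and the example URL are all hypothetical; the verbs are simply the ones named above, not a proposal for a real vocabulary.

```python
from dataclasses import dataclass, field

# Hypothetical controlled list of verbs, taken from the examples in the comment.
ALLOWED_ACTIONS = {"read", "save", "comment", "disagree", "build"}


@dataclass
class Bookmark:
    url: str
    tags: set = field(default_factory=set)     # open-ended: any words at all
    actions: set = field(default_factory=set)  # controlled: ALLOWED_ACTIONS only

    def __post_init__(self):
        # Descriptive tags are only tidied up (case, whitespace), never rejected.
        self.tags = {t.strip().lower() for t in self.tags if t.strip()}
        # Functional tags are validated against the controlled list.
        self.actions = {a.strip().lower() for a in self.actions}
        if not self.actions:
            raise ValueError("choose at least one action")
        unknown = self.actions - ALLOWED_ACTIONS
        if unknown:
            raise ValueError(f"unknown actions: {sorted(unknown)}")


post = Bookmark("http://example.com/post", {"tech", "Library", "cataloging"}, {"read", "comment"})
```

Any word is accepted as a tag, but an action outside the list (say, "skim") is rejected when the bookmark is created.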
Comment: Lou (Jan 21, 2005)
Bruno, that's a great idea, though if I understand it, I think it'd be best for personal use rather than broader use. I wonder if the verbs could act as something of a "workflow" facet, separate from the descriptive (nouns and adjectives) facet?
Comment: Brian Del Vecchio (Jan 21, 2005)
Lou, in response to your list of tags as an illustration that the tags are diverging rather than converging, there is one important piece that you're missing.
First, notice that the list contains separate counts for tags 'taxonomy' and 'taxonomies'. Joshua has said on the delicious-discuss mailing list that he applies a stemming algorithm to the search terms, so that they are considered equivalent when searching. So in this basic way, the system can blur the classification a bit in order to get better results.
Further, del.icio.us has a feature which suggests 'similar tags' for a given tag. This finds other tags in the system with a close correspondence to the currently suggested tags. (That feature has been disabled for a few months during a rewrite of the underlying data model).
The point here is that through analysis of the dataset, the system can identify relationships between tags which, while not strictly synonymous, are very useful when searching. As a result, the requirements on tagging discipline are loosened. So in the end it won't matter if you use 'humor' or 'humour' or 'humorous' or 'funny'.
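To make the two ideas concrete, here is a small Python sketch under stated assumptions. It is not del.icio.us's actual code: `crude_stem` is a deliberately naive plural-folding stand-in for a real stemming algorithm, and `related_tags` guesses at "similar tags" simply by counting which tags appear on the same post. The sample posts are invented.

```python
from collections import Counter


def crude_stem(tag):
    """Very rough plural folding; a stand-in for a real stemming algorithm."""
    t = tag.lower()
    if t.endswith("ies") and len(t) > 4:
        return t[:-3] + "y"          # taxonomies -> taxonomy
    if t.endswith("s") and not t.endswith("ss") and len(t) > 3:
        return t[:-1]                # jokes -> joke
    return t


def related_tags(posts, tag, top=3):
    """Rank tags by how often they appear on the same post as `tag`."""
    target = crude_stem(tag)
    counts = Counter()
    for tags in posts:
        stems = {crude_stem(t) for t in tags}
        if target in stems:
            counts.update(stems - {target})
    return [t for t, _ in counts.most_common(top)]


# Invented sample data: each set is the tags one user put on one bookmark.
posts = [
    {"humor", "funny"},
    {"humour", "funny", "jokes"},
    {"humor", "funny"},
    {"taxonomies", "metadata"},
]
print(crude_stem("taxonomies") == crude_stem("taxonomy"))
print(related_tags(posts, "funny"))
```

The stemmer alone does not connect 'humor' with 'humour', but both turn up as neighbours of 'funny' in the co-occurrence counts, which is the kind of looser relationship described above.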
I think even del.icio.us has a long way to go here in developing a query syntax that leverages what the system knows about the tags. Currently you can do straight tag intersections, but that's it. As Joshua put it back in September:
"Seriously, I'm not adding anything else to the tag calculus until I figure out how to make the interface to what exists already better..."
Comment: Eric Scheid (Jan 21, 2005)
could tags be the gateway drug to better metadata discipline?
everyone is excited, even folks that would never be interested in metadata, and they are getting involved, slowly getting addicted to the idea of applying metadata-lite to content ... and at the same time are (re-)discovering all the woes inherent in a light-weight metadata system (synonyms, lack of hierarchy, context, etc).
Maybe the masses need to work this out for themselves.
Comment: Eric Rodenbeck (Jan 23, 2005)
> They're highly unlikely to
> develop beyond flat lists and
> accrue the broader and narrower
> term relationships that we see
> in thesauri.
This is probably right, in that folksonomies themselves will likely remain flat and non-hierarchical. But this doesn't necessarily mean that they won't become incredibly useful as sources for other kinds of organizational systems.
http://mappr.com/ (disclaimer - I'm one of the developers) is taking placename tags added to photos on flickr.com, and making best guesses as to where to place them on a map (of the US and environs only, for now). What's happening is that an explicit hierarchical taxonomy (city and state names, along with zip codes) is being intersected with a folksonomy (the seething mass of user-determined flickr tags). The guesses it makes can be wrong - for example photos tagged with "duck" are being placed in Duck, West Virginia - but in these cases a low level of confidence is indicated. Photos tagged with "sanfrancisco california 94103" are very likely meant to be associated with the Mission District in San Francisco, so a high level of confidence is indicated.
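The general idea of scoring a guess by how many clues corroborate it can be sketched in a few lines of Python. This is only an illustration, not mappr's implementation: the tiny `GAZETTEER`, the `guess_place` function, and the low/medium/high scoring rule are all assumptions made for the example.

```python
# Hypothetical miniature gazetteer: (city, state, zip codes). Not mappr's data.
GAZETTEER = [
    ("duck", "west virginia", set()),
    ("sanfrancisco", "california", {"94103"}),
    ("losangeles", "california", set()),
]


def guess_place(tags):
    """Return (city, state, confidence) for the best match, or None."""
    tags = {t.lower().replace(" ", "") for t in tags}
    best = None
    for city, state, zips in GAZETTEER:
        if city not in tags:
            continue
        # One point for the city name, one more for each corroborating clue.
        score = 1 + (state.replace(" ", "") in tags) + bool(zips & tags)
        confidence = {1: "low", 2: "medium", 3: "high"}[score]
        if best is None or score > best[3]:
            best = (city, state, confidence, score)
    return best[:3] if best else None


print(guess_place(["duck"]))
print(guess_place(["sanfrancisco", "california", "94103"]))
```

A lone "duck" matches a place name with nothing to back it up, so it gets the lowest confidence; a city, state, and zip code that all agree get the highest.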
This application of a taxonomy to a folksonomy (we're trying to find a better phrase for this) allows for the aggregation of disparate photos into more meaningful groupings - for example, photos tagged with "los angeles" and those tagged with "san francisco" will all show up in the RSS feed provided for California:
> And it's a safe bet that
> no one will bother to go back
> and re-tag their photos with
> more precise terms.
I think this makes sense in the abstract - but what we're starting to find is that for the kinds of flickr projects that lend themselves to mapping, people are only too happy to re-tag their photos with data that mappr can better deal with. For example, a guy who's driving around the country taking photos of himself and his car recently went through his set of photos and re-tagged them with city and state names, and "capitol," resulting in this set:
> I'd love to hear of any good
> examples of metadata ecologies
> where user-generated tags and
> controlled vocabularies have
> been successfully combined in
> a single process to knit together
> disparate blobs of content.
I don't know if it's "good," but this pretty much defines what mappr's doing :)
Comment: George Oates (Jan 24, 2005)
> In flickr: a photo is only tagged by it's
> user/uploader. If the user 'gets it
> wrong' or doesn't add enough tags, it's
> pretty much going to stay that way (noone
> is going to retag.)"
(disclaimer: I work on flickr)
Actually, you can specify who you want to be able to add tags to your photos, from anyone to only you. And, the tag 'adder' can add or remove the tag anytime (whether it be the photo owner or not).
There are also other subsets of tagged photos available in Flickr, e.g. you can see all the tags used on photos in a group pool (http://flickr.com/groups/squaredcircle/pool/).
My 2c - The synonyms are the tricky bit, and impressing a taxonomy upon this doesn't necessarily fix the issue, although it depends who or what is responsible for the taxonomy... :)
The idea of tag stability --> survival --> perpetuation is interesting to me personally, particularly when you recognise the importance of social interaction in this mix. That's one of the things that's been fantastic to watch on Flickr... people in groups like Squared Circle (http://flickr.com/groups/squaredcircle/) have propagated the 'desired' tag to use for photos within that group, and hence the tag has 'weight' and stability in the system.
So many ways to slice and dice so many great photos :)
Comment: David Engel (Jan 25, 2005)
I would not normally bother with adding my $.02 from a common man - unschooled in the arts of information architecture, etc. - but I would like to give my example of combining the ideas of controlled taxonomies and folksonomies - my weblog. Unfortunately, it is in redesign, but the reason why is part of the example.
After some time of avoiding having a controlled taxonomy, I became distressed with the breadth of the categories I had created - used in a manner similar to tags, but not *quite* as flexible. Some time during the process I began using http://del.icio.us and found that it helped control my taxonomy - "Does this new link really need a new tag, or does it fit in a group that I have already started?" was often answered with the second choice.
After 700 links, I reviewed my collection of tags and the usage of each, and I realized I had developed my controlled taxonomy. These would be the areas of interest for my weblog.
Initially, I was only influenced by my own tags when adding a new link, so it really wasn't social software as far as I was using it. Del.icio.us now has a new interface which points out the most commonly used tags by other people for links, so the social influence is climbing, and I think that will lead to an improved taxonomy for my own site: I can get a better feel for what category/tag others would consider a link, so I can make it easier to find. It becomes a form of user-feedback.
In that regard, I think folksonomies will always have a use for information architects. Now, before the controlled taxonomy is fully developed, it can be compared with what the users would have developed had they been in control.
Comment: Steve Brooks (Jan 29, 2005)
I see controlled vocabulary as "finding" and tags as "exploring".
Comment: Jim Wilde (Mar 18, 2005)
I cobbled some OSS stuff together that takes advantage of controlled taxonomies, as well as folksonomies (think del.icio.us). LOL. Yes. I'm in my basement working, unshowered, and still in my jammies. I love this stuff! This is my art. Lemme tell ya 'bout it.
Comment: Kent Gibson (May 13, 2005)
I have implemented a taxonomy / folksonomy, later it should develop more into a proper community folksonomy. If you have any comments I would be very interested to hear them.
Maybe I should do more research on what a folksonomy means, but if you can put the categories where they should be in a democratic way, would that not qualify?
Comment: kent (Jun 16, 2005)
well the folksonomy is now up and running if anyone wants to try it.
Comment: karl (Aug 2, 2005)
You said: "Folksonomies aren't likely to organically arrive at preferred terms for concepts, or even evolve synonymous clusters. "
It would be interesting to have your analysis of the notion of "clusters" ala flickr.
Comment: Lou (Aug 4, 2005)
Too soon to say; I need to spend more time looking them over. Rashmi Sinha has some good thoughts that are an excellent starting point for understanding Flickr's clustering: http://www.rashmisinha.com/archives/05_08/flickr-clusters.html