Clay Shirky continues to just totally nail the questions of metadata, authority, and user-created content. Today's installment: why crappy, cheap, user-generated, uncontrolled metadata will win out over expensive, controlled, useful, professionally generated metadata:
Furthermore, users pollute controlled vocabularies, either because they misapply the words, or stretch them to uses the designers never imagined, or because the designers say "Oh, let's throw in an 'Other' category, as a fail-safe" which then balloons so far out of control that most of what gets filed gets filed in the junk drawer. Usenet blew up in exactly this fashion, where the 7 top-level controlled categories were extended to include an 8th, the 'alt.' hierarchy, which exploded and came to dwarf the entire, sanctioned corpus of groups.
The cost of finding your way through 60K photos tagged 'summer', when you can use other latent characteristics like 'who posted it?' and 'when did they post it?', is nothing compared to the cost of trying to design a controlled vocabulary and then force users to apply it evenly and universally.
This is something the 'well-designed metadata' crowd has never understood — just because it's better to have well-designed metadata along one axis does not mean that it is better along all axes, and the axis of cost, in particular, will trump any other advantage as it grows larger. And the cost of tagging large systems rigorously is crippling, so fantasies of using controlled metadata in environments like Flickr are really fantasies of users suddenly deciding to become disciples of information architecture.
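To make the "latent characteristics" point concrete, here is a minimal sketch of narrowing a sloppy tag like 'summer' by who posted a photo and when. It is an illustration only, not how Flickr works: the `Photo` record, the `narrow` helper, and the sample data are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Photo:
    # Hypothetical record; field names are assumptions for illustration.
    title: str
    owner: str        # "who posted it?"
    posted: date      # "when did they post it?"
    tags: frozenset   # free-form, user-chosen tags, no controlled vocabulary


def narrow(photos, tag, owner=None, since=None):
    """Keep photos carrying `tag`, optionally narrowed by poster and post date."""
    hits = []
    for p in photos:
        if tag not in p.tags:
            continue
        if owner is not None and p.owner != owner:
            continue
        if since is not None and p.posted < since:
            continue
        hits.append(p)
    return hits


# Made-up sample data standing in for a much larger pile of photos.
photos = [
    Photo("beach", "alice", date(2005, 7, 4), frozenset({"summer", "beach"})),
    Photo("bbq", "bob", date(2005, 7, 9), frozenset({"summer"})),
    Photo("lake", "alice", date(2004, 8, 1), frozenset({"summer", "lake"})),
    Photo("snow", "alice", date(2005, 1, 2), frozenset({"winter"})),
]

summer = narrow(photos, "summer")
alice_recent = narrow(photos, "summer", owner="alice", since=date(2005, 1, 1))
print(len(summer), [p.title for p in alice_recent])
```

The tag alone is a blunt instrument, but the poster and the date come for free with every upload, so nobody has to be taught a vocabulary before the search gets useful.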