Tags - Omnia mutantur, nihil interit.
March 29th, 2005
01:46 am
Tags

This post inspired by papertygre's earlier post.

flickr and del.icio.us both have metadata systems built around text “tags”, and LiveJournal is considering adding text tags too. None of these sites uses text tags as its only form of metadata: at a minimum, each associates a post (photo, link) with an author and a date, and in some cases the available metadata is significantly richer (LiveJournal additionally has userpics, current mood, current music, protection options, and possibly some other things). On flickr and del.icio.us at least, however, text tags are the primary form of user-defined metadata.

The advantages of tags are that the UI for them is very lightweight (no need for widgets to select them from a list; a single text box will suffice), the data model is similarly lightweight (a tag carries no relevant data other than the set of items that fall under it), and they're created ad hoc (arguably another aspect of the UI, but arguably not). At first glance they seem to have no disadvantages, since, after all, you can just decide not to use them.

However, in my opinion, tags are actually kind of pernicious. I think they really aren't adequate to solve some of the problems to which they will inevitably be applied anyway, thanks to the “when you have a hammer, every problem looks like a nail” fallacy. A system that was minimally more structured would be capable of solving far more problems with only a minimal increase in complexity.

My solution is to replace each tag with a pair of tags: the first one is a tag in the same sense as on flickr or del.icio.us currently (although I would want to allow spaces and pretty much all punctuation characters in them), and the second is another tag that tells what kind of thing the first tag is. Thus if I have a photo taken by me, I can enter ("Kenn Hamm", "Photographer"); if I have one with me as a subject, I can enter ("Kenn Hamm", "Subject"). You can ignore the type tags at query time if you want, in which case you get tag-like behavior. You can also ignore them (by leaving them blank) at data entry time, in which case you force tag-like behavior.
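Here is a minimal Python sketch of the pair idea. The names `TypedTag`, `matches` and `find`, and the sample file names, are made up for illustration; nothing like this exists on flickr or del.icio.us. A blank type stands for a plain tag, and leaving the type out of a query gives the ordinary tag behavior.

```python
from typing import NamedTuple


class TypedTag(NamedTuple):
    value: str          # the tag itself, e.g. "Kenn Hamm" (spaces allowed)
    tag_type: str = ""  # what kind of thing it is; blank means a plain tag


def matches(tags, value, tag_type=None):
    """Return True if any tag has this value.

    If tag_type is None the type is ignored (tag-like behavior);
    otherwise the type must match as well.
    """
    return any(
        t.value == value and (tag_type is None or t.tag_type == tag_type)
        for t in tags
    )


def find(items, value, tag_type=None):
    """Return the sorted names of all items carrying a matching tag."""
    return sorted(
        name for name, tags in items.items() if matches(tags, value, tag_type)
    )


# Hypothetical sample data.
photos = {
    "self_portrait.jpg": [
        TypedTag("Kenn Hamm", "Photographer"),
        TypedTag("Kenn Hamm", "Subject"),
    ],
    "landscape.jpg": [
        TypedTag("Kenn Hamm", "Photographer"),
        TypedTag("sunset"),  # type left blank at entry time
    ],
    "party.jpg": [TypedTag("Kenn Hamm", "Subject")],
}

# Ignoring the type at query time finds every photo tagged "Kenn Hamm".
everything = find(photos, "Kenn Hamm")
# Supplying the type narrows it to photos in which he is the subject.
as_subject = find(photos, "Kenn Hamm", "Subject")
```

The only cost over plain tags is one extra string per tag, which is the "minimal increase in complexity" I have in mind.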

The trick to make the UI for this not much heavier than that for normal text tags is to have predefined sets of tag types that are useful for certain types of items. For example, for photos, subject, photographer and location are all likely to be relevant; some other data such as date can be extracted from EXIF information and need not be manually entered at all. You can still use arbitrary tag types (or none) if you want to. For even more convenience, once a tag type is selected there could be a shortcut for values you've used with that tag type before.
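As a rough sketch of how the predefined types and the shortcut for previously used values might fit together (the `PREDEFINED_TYPES` table, the `TagHistory` class and the sample values are all assumptions for illustration, not part of any existing system):

```python
from collections import defaultdict

# Hypothetical table of tag types offered by default for each kind of item.
PREDEFINED_TYPES = {
    "photo": ["Subject", "Photographer", "Location"],
}


def offered_types(item_kind):
    """Tag types to show up front; arbitrary types can still be typed in."""
    return PREDEFINED_TYPES.get(item_kind, [])


class TagHistory:
    """Remembers which values have been entered under each tag type."""

    def __init__(self):
        self._seen = defaultdict(list)

    def record(self, tag_type, value):
        # Keep first-use order and skip exact duplicates.
        if value not in self._seen[tag_type]:
            self._seen[tag_type].append(value)

    def suggest(self, tag_type, prefix=""):
        """Previously used values of this type that start with the prefix."""
        p = prefix.lower()
        return [v for v in self._seen.get(tag_type, []) if v.lower().startswith(p)]


history = TagHistory()
history.record("Subject", "Kenn Hamm")
history.record("Location", "Kyoto")
history.record("Subject", "Kenn Hamm")  # repeated entry is ignored

subject_shortcuts = history.suggest("Subject", "ke")
```

The suggestions are keyed by tag type, so a value used as a location is not offered when entering a subject.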

This system would allow repeated tag types, in the case of an image that has multiple subjects, for example, but it's not a totally general tool for representing arbitrary data structures—I wouldn't add any way to represent hierarchy, because presenting a UI for that would probably be so complex and annoying that even I wouldn't bother to use it. But I still think the addition of tag types would allow a far broader and more useful range of queries to be fulfilled.

It looks unlikely that anyone will implement this for me, and it's my idea, anyway, so I'm seriously considering implementing it in my photo-management scripts as a test case (since I've already decided to implement my own photo-management software, for a variety of reasons).


Comments
 
From: papertygre
Date: March 29th, 2005 07:53 am (UTC)
You could use a fancy Google autocomplete style text entry box for the UI, in order to keep it lightweight but still let you choose predefined or previously used values.

From: kenoubi
Date: March 29th, 2005 02:52 pm (UTC)
Google-style? I wrote that before I ever saw it in gmail! :-)

I should show you the newer version of my combo-box widget some time.

From: sui66iy
Date: March 29th, 2005 02:39 pm (UTC)
I share your faith in extensible attribute-value pairs. I've been working on a P2P database optimized for storing attr-value bundles for a while... here's an old paper on the subject.

http://www.maya.com/web/what/papers/maya_universal_database.pdf

In theory we're getting close to some sort of public release. I'll be interested in your application; it might be very compatible with what we're doing.

From: kenoubi
Date: March 30th, 2005 06:02 am (UTC)
I read the paper that you linked. The big difference I can see is that I want to allow repeated attributes, whereas the system described in the paper specifically disallows this. Of course, this could be handled either by having a set of attributes with predefined names ("subject0", "subject1", "subject2", etc.) or by having the value of an attribute consist of multiple separated values (probably null-separated, in the case of strings). Is there a standard way to handle this under the paradigm you're working with? The utility of having some way to represent multi-valued attributes really seems too much for me to pass up.
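To make the two workarounds concrete, here is a small Python sketch of each. The function names are invented for illustration, and this says nothing about how the system in the paper actually stores attributes.

```python
def to_numbered(name, values):
    """Encode a list as attributes name0, name1, name2, ..."""
    return {f"{name}{i}": v for i, v in enumerate(values)}


def from_numbered(attrs, name):
    """Collect name0, name1, ... back into a list, stopping at the first gap."""
    values = []
    i = 0
    while f"{name}{i}" in attrs:
        values.append(attrs[f"{name}{i}"])
        i += 1
    return values


def to_joined(values):
    """Encode a list of strings as one null-separated string."""
    if any("\0" in v for v in values):
        raise ValueError("values must not contain the separator")
    return "\0".join(values)


def from_joined(joined):
    """Split a null-separated string back into a list.

    An empty string decodes to an empty list, so a list holding a single
    empty string cannot be told apart from no values at all.
    """
    return joined.split("\0") if joined else []


subjects = ["Kenn Hamm", "papertygre"]
numbered = to_numbered("subject", subjects)
joined = to_joined(subjects)
```

Both round-trip, but the numbered form makes "find everything with this subject" a query over an open-ended set of attribute names, and the joined form hides the individual values inside one string.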

From: sui66iy
Date: March 30th, 2005 06:28 am (UTC)
Yeah, we have a rich type system for values. It includes primitive types like integers, strings, floats, arbitrary-precision numbers, and UUIDs, as well as container types like heterogeneous lists and dictionaries. It also allows media types (arbitrary binary blobs and MIME-type tagged objects). The type system is fully recursive, so if you want a list of lists of lists of dictionaries you can do that (though I think it would be stupid, from an information design standpoint).