Ladling out tag soup: How Microformats can help us discover cool stuff
At Appozite we’re working to make the whole process of product discovery easier. Ultimately we want users to be able to find stuff that fits them perfectly but that they didn’t even necessarily know existed. The only way to make that happen is to make it easier for software to locate and understand what products are available online, what attributes those products have and how other people feel about those products. Once software can do that, it’s not far from being able to help you figure out what you might like from all that stuff floating around out there. The problem is that on today’s web this is really tough.
Generally, people find products by searching for and reading product pages on e-commerce sites. Humans, of course, perform this miraculous feat because we’re very good at looking at a page consisting of images, some bold text and a few lists and determining almost everything we need to know about a product. Software, however, is terrible at this. If you look under the hood at the HTML on these pages, it’s all tag soup and basically impossible for software to understand. It might look good to us, but to a piece of software, it’s total static.
Luckily, people are working on making it possible for software to understand what’s going on. Loosely, these efforts are grouped under the heading of “the semantic web”. This is a pretty broad topic and extends from the complex subject-predicate-object statements of RDF to the much simpler HTML approach of Microformats. Though RDF and its more complex ilk can and probably will aid in product discovery in the future, I want to focus on the Microformats approach. So what’s a Microformat? I think the prize for best description goes to Drew McLellan on the Microformats wiki:
Microformats are a way of attaching extra meaning to the information published on a web page. This extra semantic richness works alongside the information already presented, and can be used for the benefit of people and computers. This is mostly done through adding special pre-defined names to the class attribute of existing XHTML markup.
So, basically it’s a standard way of using HTML to describe a particular thing that you’re displaying in a web page. This thing might be a person, an event or a relationship between people. It might also be… you guessed it, a product.
With that in mind, I wanted to do a quick review of the work being done over at microformats.org on product-related Microformats. We think these can contribute to better shopping, reviewing and recommendation experiences for everyone by making data more available to software that helps you find cool stuff. The formats in question are hReview, hListing and the currently moribund hProduct.
hListing
hListing is a proposal for a format describing items listed by a seller for sale to a buyer. The focus of hListing is on properly describing a classified ad: more of a Craigslist or eBay listing than a product for sale on a traditional e-commerce site. With this frame of reference in mind, hListing requires that the formatted listing include a listing action (such as sell, rent, trade, etc).
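To make the "listing action is required" rule concrete, here is a rough sketch of how a piece of software might check for it. It assumes the listing is wrapped in an element with the class "hlisting" and that the action appears as a class name somewhere inside it; the action set below contains only the three actions named above, and the sample markup and the listing_actions helper are made up for illustration.

```python
from html.parser import HTMLParser

# Assumption: only the three actions mentioned in the text; the draft lists more.
LISTING_ACTIONS = {"sell", "rent", "trade"}
# Elements that never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}

# Made-up sample markup, not taken from any real site.
SAMPLE = """
<div class="hlisting">
  <span class="sell">For sale</span>:
  <span class="item">Used road bike</span>
</div>
<p class="sell">Not part of a listing</p>
"""


class ListingActionFinder(HTMLParser):
    """Collects listing-action class names found inside an hlisting element."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # greater than 0 while inside an hlisting element
        self.actions = set()

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        if self.depth or "hlisting" in classes:
            self.actions |= classes & LISTING_ACTIONS
            if tag not in VOID_TAGS:
                self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1


def listing_actions(html):
    """Return the set of listing actions declared inside hlisting elements."""
    finder = ListingActionFinder()
    finder.feed(html)
    return finder.actions


if __name__ == "__main__":
    print(listing_actions(SAMPLE))
```

A listing for which this returns an empty set would be missing its required action. The class "sell" on the trailing paragraph is ignored because it sits outside the listing.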
hReview
hReview is one of the older Microformats and is in version 0.3 (which passes for old in MF land). Though technically a draft specification, it’s been adopted fairly widely “in the wild”. The concept behind hReview is simple: provide a format for describing what a user thought of something (product, service, event, etc). hReview requires a pointer to the item being reviewed in the form of a URI, hCard (for a person being reviewed) or hCalendar (for an event being reviewed). All other aspects of an hReview are optional, but the key ones are reviewer, rating, dtreviewed and description. Together these describe the pretty standard “3 out of 5 stars + description” review that we see on most sites providing product review capabilities.
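Here is a minimal sketch of what that buys a piece of software. The sample markup uses the hReview class names mentioned above (reviewer, rating, dtreviewed, description) plus "item", "fn" and "url" for the thing being reviewed. The product, reviewer and URL are invented, and parse_hreview is a hypothetical helper rather than part of any Microformats library. It also assumes a single review per page and no void elements such as img inside it.

```python
from html.parser import HTMLParser

# Made-up example review; names and URL are placeholders.
SAMPLE = """
<div class="hreview">
  <span class="item">
    <a class="fn url" href="http://example.com/widget">Example Widget</a>
  </span>
  <span class="reviewer">Pat Example</span>
  <abbr class="dtreviewed" title="2008-06-22">June 22, 2008</abbr>
  <span class="rating">3</span> out of 5
  <p class="description">Does what it says on the box.</p>
</div>
"""

TEXT_FIELDS = {"reviewer", "rating", "dtreviewed", "description"}


class HReviewParser(HTMLParser):
    """Pulls a handful of hReview fields out of a page by class name."""

    def __init__(self):
        super().__init__()
        self.review = {}
        self._stack = []  # one entry per open element: a field name or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        matched = sorted(classes & TEXT_FIELDS)
        field = matched[0] if matched else None
        if "fn" in classes:
            field = "item_name"
        if "url" in classes and attrs.get("href"):
            self.review["item_url"] = attrs["href"]
        if field == "dtreviewed" and attrs.get("title"):
            # The machine-readable date lives in the title attribute.
            self.review["dtreviewed"] = attrs["title"]
            field = None
        self._stack.append(field)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Attribute the text to the closest enclosing element that named a field.
        field = next((f for f in reversed(self._stack) if f), None)
        if text and field:
            self.review[field] = (self.review.get(field, "") + " " + text).strip()


def parse_hreview(html):
    """Return a dict of the review fields found in the given HTML."""
    parser = HTMLParser_ = HReviewParser()
    parser.feed(html)
    return parser.review


if __name__ == "__main__":
    for key, value in parse_hreview(SAMPLE).items():
        print(key, "=", value)
```

The point is that the software never has to guess which bold text is the rating or which paragraph is the review body. The class names tell it, while the page still looks the same to a human reader.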
hProduct
Of the three, hProduct has seen the weakest effort at official specification. In fact, it’s on the Microformats.org list as “moribund” and the Microformats folks would probably say that the term “hProduct” shouldn’t even exist, as it has not officially advanced beyond the research stages according to their process. That said, hProduct has the noble goal of describing a product itself outside the context of a classified-style listing of that product for sale (hListing) or a user’s feelings about that product (hReview). The rough draft proposal for hProduct combines the standard attributes you would expect, such as name, (text) description, image and uri, with a more elaborate product description capability using extensible property-value pairs. Anyone who’s ever worked with Amazon’s extensive product property-value taxonomies will see that this methodology is a fairly reasonable approach to describing products. However, it seems that the Microformats community is not very enthusiastic about hProduct and generally suggests people consider hListing instead.
So where does all this leave those of us who want to discover cool stuff online? Of course, if sites don’t use Microformats, the tag-soup situation doesn’t improve. We’re going to do our part at Appozite by making sure we support hReview and hListing where appropriate in our software. We’re also going to work with the Microformats folks to see if we can help revive hProduct, because we think it has a lot of potential for cleaning up the tag soup of today’s e-commerce sites. We also hope any of you reading out there will join us and the rest of the Microformats community to encourage their use. So, next time you stop into your local e-commerce store to pick something up, see if they’re supporting Microformats on their pages. If they don’t, tell them they should!
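To illustrate the property-value idea, here is a rough sketch of the kind of data a program might hold after reading such a description. This is not the hProduct draft itself: the Product class, its field names beyond those listed above, the matches helper and the example kettle are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Product:
    # The standard attributes named in the text.
    name: str
    description: str = ""
    image: str = ""
    uri: str = ""
    # Extensible property-value pairs: free-form keys instead of a fixed schema.
    properties: Dict[str, str] = field(default_factory=dict)


def matches(product, **wanted):
    """True if every requested property is present with the requested value."""
    return all(product.properties.get(k) == v for k, v in wanted.items())


# Made-up example product.
kettle = Product(
    name="Example Kettle",
    uri="http://example.com/kettle",
    properties={"color": "red", "capacity": "1.7 L"},
)

if __name__ == "__main__":
    print(matches(kettle, color="red"))
```

Because the properties are open-ended, a kettle can carry a capacity and a shirt can carry a sleeve length without the format having to anticipate either, which is what makes this approach attractive for discovery software.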
Hayes @ June 22, 2008