Have you heard of tagging?
It’s a relatively new phenomenon that is broadly marketed by many of the new web services, e.g. Web 2.0 and social media. Tagging is no different than the name implies: it is used to ‘tag’ (meta)data with meaningful (semantic) names that help make the information more ‘searchable’ for humans. When you tag content - video, text, or sound - you give the data meaning in a ‘human sort of way’ (technical term :).
An example would be to tag my personal home page as: roger kondrat, london, uk, personal, home, diary. Now if you went to a site that allows for semantic search, e.g. diigo or del.icio.us, you could use those terms to find my personal home page. In a way Google does this through your site’s metadata, but it also does a few other things, like searching the words on each webpage for ‘keyword density’, which affects how that page is found. Google also takes an approach that is relatively agnostic.
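As a rough illustration of how a tag lookup like this might work, here is a minimal sketch in Python. It is not how diigo or del.icio.us actually implement search; the URLs, the `build_tag_index` and `search` helpers, and the second bookmark are all made up for the example.

```python
from collections import defaultdict

def build_tag_index(bookmarks):
    """Map each lowercased tag to the set of URLs that carry it."""
    index = defaultdict(set)
    for url, tags in bookmarks.items():
        for tag in tags:
            index[tag.strip().lower()].add(url)
    return index

def search(index, *tags):
    """Return the URLs that carry every one of the given tags."""
    sets = [index.get(t.strip().lower(), set()) for t in tags]
    return set.intersection(*sets) if sets else set()

# Hypothetical bookmarks; the URLs are placeholders, not real pages.
bookmarks = {
    "http://example.com/roger": [
        "roger kondrat", "london", "uk", "personal", "home", "diary",
    ],
    "http://example.com/tube-map": ["london", "uk", "travel"],
}

index = build_tag_index(bookmarks)
matches = search(index, "london", "personal")
```

Searching for `london` alone would return both pages, while `london` plus `personal` narrows it down to the home page. The result depends entirely on the words people chose when they tagged.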
That last part is the most important: it means Google doesn’t consider you as an individual, so Google’s results are not distorted by your culture, unique language characteristics, or your (possibly) unique perspective.
Why is this relevant? Well, if you have ever worked for a large company with a vast website and found it hard to find what you are looking for using traditional methods (file search) or newer methods (Google Search Appliance or Google Desktop Search), then you know that sometimes the machine just doesn’t understand the man.
So we have started moving towards semantic search, not for everything but for some things. Developers and business professionals are looking to semantic search to find intranet knowledge that was historically difficult to find, and knowledge that is lost or difficult to find is often revenue lost to a corporation.
However, semantic technologies used alone can carry significant risks, the most important being dated terminology.
For example, if you read my blog you may be tempted to bookmark a post, and maybe it will be about Dabbledb, a startup that is typically tagged with the term Web2.0. Now, in two years, if I am trying to find Dabbledb, what good is searching Web2.0 going to do for me? In five years I probably won’t even remember what Web2.0 is, never mind who was part of it.
Not only that, but was it web2, web2.0, web 2.0, web 2, Web20, etc.?
That is a lot of derivatives, and yet they could all still be projecting the same meaning, don’t you think? This example should make it fairly obvious how semantic tagging can adversely affect relevance, especially in an organic environment such as the Internet. Tagging does, however, have potential as an Enterprise tool, because the Enterprise environment is more controlled, and tagging in this case could herald vastly improved data management.
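To make the ‘more controlled’ point a bit more concrete, here is one way a controlled vocabulary could collapse the Web 2.0 variants into a single tag. This is only a sketch in Python under my own assumptions: the `ALIASES` table and the `normalize_tag` helper are hypothetical, and a real Enterprise would have to maintain its own list of agreed terms.

```python
import re

# Hypothetical controlled vocabulary: maps a cleaned-up variant to the
# one canonical tag the organisation has agreed on.
ALIASES = {"web2": "web20"}

def normalize_tag(tag):
    """Lowercase the tag, drop spaces and punctuation, then apply aliases."""
    key = re.sub(r"[^a-z0-9]", "", tag.lower())
    return ALIASES.get(key, key)

variants = ["web2", "web2.0", "web 2.0", "web 2", "Web20"]
canonical = {normalize_tag(v) for v in variants}
```

Stripping case, spaces, and dots gets most of the way, but `web2` and `web20` still differ, so someone has to decide that they mean the same thing. That kind of decision is easy to enforce inside a company and nearly impossible on the open Internet.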