I had the opportunity to speak with Tantek Celik for a few minutes in London the other day, after he gave a presentation about the work he has been doing with Microformats at Technorati (I have mentioned this in a few other posts, so apologies for the repeat). I happen to like Microformats, but until he spoke about them that day I just didn’t realise there was a name for the idea.
Lucky me. At the end of his presentation he spoke about hCard, a very specific sub-section of the microformats effort, and I asked him a question that, although not original, I was curious to hear his answer to: ‘What about spam? If I put my hCard on the web, my email is now indexable and spammers will have full access to an authentic email address.’ I could tell by the expression on his face that he wasn’t surprised by the question, and I did feel he looked as though it was a concern. However, that’s conjecture, so I shall move on to his comments (paraphrasing, of course).
Tantek said that if hCard were to integrate in some way with a trust mechanism, e.g. XFN (his example, not mine), then that might help; however, at this time he hasn’t come up with a good answer.
To be honest, that has me worried, especially if we as an industry are going to try to move people in this direction. IT people know spammers have ‘bots’ out there searching the web for email addresses to spam. It’s why you go to a website and see a contact address that looks something like roger (at) techwinter (dot) com; some people are even turning their email addresses into images. Myself, if I post my email on a website, it’s usually an address created for a specific purpose and then discarded.
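For anyone who hasn’t seen how that kind of mangled address gets produced, here is a minimal sketch in Python. The function name obfuscate_email is my own invention for illustration, and the exact ‘(at)’ and ‘(dot)’ spelling is just the convention shown above, not any sort of standard.

```python
def obfuscate_email(address: str) -> str:
    """Rewrite 'user@example.com' as 'user (at) example (dot) com'.

    Hypothetical helper for illustration only; it mangles the dots in the
    domain part and leaves the local part (before the @) untouched.
    """
    local, _, domain = address.partition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local} (at) {domain.replace('.', ' (dot) ')}"


print(obfuscate_email("roger@techwinter.com"))
# prints: roger (at) techwinter (dot) com
```

It only raises the bar a little, since anything a person can read back a determined bot can probably read back too.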
I wonder what protections, if any, we are going to put in place should we move to the hCard microformat as a standard. I have some thoughts, but I will post them another day. What are your thoughts?
All I know is that, for now, my hCard is email-free and will stay that way until I am confident I won’t get spammed.
{ 8 comments… read them below or add one }
Devon 01.30.07 at 2:23 am
I don’t see anything in the hCard info @ microformats about whether it’s wrong to mangle the e-mail address in the tag in any way. But I can’t help but think that any spambot could easily scrape whatever’s in the @class=’email’ and then try a few algorithms to figure out what might work. If nothing can be done with it, the bot could just append it to a file and have a human see if he can read it and then shove it into a database. I don’t know, it just seems near impossible to protect, since there’s a specific, defined area to grab.
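To show how little work the ‘specific defined area’ leaves for a bot, here is a rough sketch in Python using only the standard library. The sample markup, the class and function names, and the two un-mangling rules are all assumptions made up for illustration; a real harvester would presumably try many more patterns.

```python
import re
from html.parser import HTMLParser


class EmailFieldParser(HTMLParser):
    """Collect the text of every element whose class list includes 'email'."""

    def __init__(self):
        super().__init__()
        self._depth = 0    # > 0 while inside an 'email' element
        self._chunks = []  # text pieces of the element being read
        self.values = []   # finished field values

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            # Nested tag inside the field (simplification: assumes no
            # unclosed void tags such as <br> appear inside it).
            self._depth += 1
        elif "email" in classes:
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.values.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def demangle(text: str) -> str:
    """Undo the common '(at)' / '(dot)' style of mangling."""
    text = re.sub(r"\s*[\(\[]\s*at\s*[\)\]]\s*", "@", text, flags=re.I)
    return re.sub(r"\s*[\(\[]\s*dot\s*[\)\]]\s*", ".", text, flags=re.I)


def extract_emails(html: str) -> list:
    parser = EmailFieldParser()
    parser.feed(html)
    return [demangle(value) for value in parser.values]


# Hypothetical hCard fragment, mangled the way many sites do it by hand.
SAMPLE = (
    '<div class="vcard"><span class="fn">Roger</span> '
    '<span class="email">roger (at) techwinter (dot) com</span></div>'
)

print(extract_emails(SAMPLE))
# prints: ['roger@techwinter.com']
```

Without the class name a bot has to guess where an address might be hiding; with it, the bot is told exactly where to look.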
Roger Kondrat 01.30.07 at 11:15 am
Devon,
You make a good point and I totally agree; it’s like putting a big ‘harvest me’ sign next to your email.
I have been thinking that it would be nice to put an hCard behind a CAPTCHA-style gateway for semi-private information on my site.
Any thoughts?
Julian 02.03.07 at 12:08 pm
Glad this has been thought about. The idea just dawned on me this morning when reviewing the implications of microformats myself. Totally agree that, more broadly, the semantic web means your page will fundamentally be available for machine consumption, for better or worse. How to protect against the worse seems to be the biggest issue confronting microformats currently.
Daniel Aleksandersen 02.11.07 at 5:21 pm
Short answer: Yes, definitely.
The only way to prevent spam from destroying the hCard (or uploading vCards to the internet!) is by destroying spam first!
There is a second solution: make GnuPG signing and encryption mandatory to successfully email people. This would reduce the amount of spam, as it would be administratively more difficult to send spam and more processor-intensive to send lots and lots of emails.