The Electronic Frontier Foundation is, in their own words:
a donor-supported membership organization working to protect our fundamental rights regardless of technology; to educate the press, policymakers and the general public about civil liberties issues related to technology; and to act as a defender of those liberties. Among our various activities, EFF opposes misguided legislation, initiates and defends court cases preserving individuals’ rights, launches global public campaigns, introduces leading edge proposals and papers, hosts frequent educational events, engages the press regularly, and publishes a comprehensive archive of digital civil liberties information at one of the most linked-to websites in the world: → http://www.eff.org.
They are beyond doubt one of the most important and intelligent privacy watchdogs in the world. Cryptography buffs will recall the massive support they gave to Phil Zimmermann, the inventor of Pretty Good Privacy (PGP), the first widely available military-strength encryption program, in his years-long legal battle with the U.S. government under Clinton.
Hence, it’s not surprising that they have set their minds on offering information on how to blog in public without → losing your job or suffering political persecution, etc. Which is all very well: the advice they are giving on their web page “How to Blog Safely (About Work or Anything Else)” is pretty sound, accurate and well intended.
It’s only when we read the part titled “Don’t Be Googleable” that we were struck by a disturbing degree of naivety. Here’s what it says:
4. Don’t Be Googleable
If you want to exclude most major search engines like Google from including your blog in search results, you can create a special file that tells these search services to ignore your domain. The file is called robots.txt, or a Robots Text File. You can also use it to exclude search engines from gaining access to certain parts of your blog. If you don’t know how to do this yourself, you can use the “Robots Text File Generator” tool for free at → Web Tool Central.
This really sounds too bad to be true, but rub your eyes as you may, there it is: robots.txt, eh? As any SEO worth his or her or its salt could have told them in a jiffy, robots.txt is a voluntary convention, not an access control, and Google has a long, ignominious track record of ignoring it time and again. (Usually trivialized as a “technical glitch” by the Googleguys of this world.) And the other engines are no exception. Granted, it’s not an ongoing violation, but it’s persistent enough to be more than mildly worrying when privacy and anonymity are at stake.
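For the record, here is roughly what the EFF’s recipe amounts to, sketched in Python with the standard library’s urllib.robotparser. The blanket “keep out” file, the helper name polite_crawler_may_fetch and the blog.example URL are our own illustrative assumptions. Note what the sketch shows: the file is merely read and interpreted by the crawler itself, so it only binds spiders that choose to ask.

```python
from urllib.robotparser import RobotFileParser

# A blanket "keep out" file of the kind the EFF advice describes.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def polite_crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return the decision a crawler that honors robots.txt would reach.

    Hypothetical helper: nothing here is enforced by the web server.
    A crawler that never performs this check can fetch the page anyway.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://blog.example/secret-post"  # placeholder URL
    print(polite_crawler_may_fetch(ROBOTS_TXT, "Googlebot", url))
```

In other words, the whole scheme rests on the visitor’s good manners, which is exactly the problem.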
Moreover, putting one’s trust in Google of all unlikely institutions, a company exceedingly well connected with what is euphemistically termed the “intelligence community” (read: snoops) and whose head honchos are notorious for their security clearance with the → NSA and possibly other agencies, is tantamount to having all your sheep herded by that really cool and accommodating wolf next door. If you feel that’s an unfair overstatement, ask any Chinese dissident …
Time to get real, EFF! First, there ain’t no such thing as reliable “privacy” let alone “anonymity” on the Web. While you may get away with a lot of tricks as long as nobody’s looking too hard, don’t bet the farm on actually doing so.
Second, if you don’t want your site Googled (or Yahoo!ed or MSNed or whatever), password-protect your web site or blog! (And don’t even dream of using weak passwords such as your birthday or your mother’s maiden name.)
Ok, so this may not always be feasible if you want to maintain a public blog (or any other web site for that matter) drawing traffic in droves. Still, you might feature only a short excerpt of your posts visibly and require readers to log in for the rest. (And yes, this won’t protect you from trolls, but the issue at hand here is search engines, not human snitches, remember?)
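The excerpt-plus-login idea can be sketched in a few lines of Python. This is a minimal illustration only: the function name visible_text, the 40-word default and the teaser wording are our assumptions, and the actual login check is left to whatever blog software you run.

```python
def visible_text(post_body: str, logged_in: bool, excerpt_words: int = 40) -> str:
    """Return the full post for logged-in readers, a short teaser otherwise.

    Hypothetical helper: anonymous visitors (and therefore search engine
    spiders) only ever receive the first `excerpt_words` words.
    """
    if logged_in:
        return post_body
    words = post_body.split()
    if len(words) <= excerpt_words:
        # Posts shorter than the excerpt are shown whole, so keep
        # anything sensitive out of very short posts.
        return post_body
    # Joining on single spaces collapses the original line breaks.
    return " ".join(words[:excerpt_words]) + " [...] (log in to read the rest)"
```

The point of the design is that the truncation happens on the server: whatever is cut off is never sent to a visitor who has not logged in.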
Further, don’t forget to at least put a “noindex, nofollow” directive in your robots meta tag as well, and make use of “noarchive” to keep your pages from being cached. Download our free → cache buster script and implement it on every page you want to protect.
Again, this is not 100% reliable protection, but it can help you a great deal in staying under the radar unless they’re really after you. (Which, of course, they may well be if you happen to subscribe to a currently unpopular creed, political opinion, color of skin, etc. …)
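As a minimal sketch of the meta tag advice, assuming you generate your pages from some kind of template, the tag can be built like this in Python. The helper name robots_meta_tag is ours, not part of any blog package, and this is unrelated to the cache buster script mentioned above.

```python
from html import escape


def robots_meta_tag(directives=("noindex", "nofollow", "noarchive")) -> str:
    """Build a robots meta tag to place inside each page's <head>.

    Hypothetical helper: like robots.txt, the tag is a request that
    well-behaved spiders honor, not a lock on the page.
    """
    content = ", ".join(directives)
    return f'<meta name="robots" content="{escape(content, quote=True)}">'


if __name__ == "__main__":
    print(robots_meta_tag())
```

Drop the resulting line into the head section of every page you want kept out of the index and the cache.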
If passwords and user registrations are not a viable option, go for → IP delivery (cloaking): serve one set of harmless content to the search engine spiders and the real McCoy to your human readers.
Make sure your list of → search engine spiders (also termed “crawlers” or “searchbots”) is scrupulously up to date! Don’t work from dated, faulty or incomplete lists – that’s like sealing your house’s windows and chimneys and gullies while leaving the front door wide ajar …
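Stripped to its bare bones, IP delivery is a lookup of the visitor’s address against your spider list. The Python sketch below uses the standard ipaddress module; the names SPIDER_NETWORKS, is_spider and page_for are our own, and the address ranges are reserved documentation ranges standing in for a real, current spider list, which you will have to supply yourself.

```python
import ipaddress

# Placeholder ranges reserved for documentation, NOT real spider addresses.
# In practice this list comes from an up-to-date spider database.
SPIDER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_spider(remote_ip: str, networks=SPIDER_NETWORKS) -> bool:
    """Return True if the visitor's IP falls inside a known spider range."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in networks)


def page_for(remote_ip: str, real: str, decoy: str) -> str:
    """Serve the harmless decoy to spiders and the real page to everyone else."""
    return decoy if is_spider(remote_ip) else real


if __name__ == "__main__":
    print(page_for("192.0.2.17", real="the real McCoy", decoy="harmless filler"))
    print(page_for("203.0.113.5", real="the real McCoy", decoy="harmless filler"))
```

Note that any spider address missing from the list gets the real content, which is precisely why a stale list is worse than useless.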
You might also try to do all those dodgy things with your blog site the search engine optimization “white hat” crowd are so happy to get worked up about all the time. The intention being, of course, to actually get banned from the engines’ indices – now is that applied iconoclasm, or what?
However, all you’ll probably stand to gain from this approach is two basic realizations, namely a) that a good many of these people couldn’t even optimize a dictionary if you threatened them with a gilded Google AdWords Professional Certificate, and b) that not all’s as it seems on do-gooder planet, with many purportedly long penalized “spam techniques” still working like a charm, especially if you don’t want them to. So put it down to Murphy’s Law.
But seriously, there’s more, a whole lot more in fact that can be done, but unless you’re an experienced code monkey and sysadmin you probably don’t want to hear about it.
Finally, while many “white hats” may strongly disagree, when it comes to your personal security, privacy and anonymity, always heed that old Chinese proverb:
Be afraid of the search engines – they ain’t nice people and they don’t like you!