The facts about Cloaking
Cloaking is the technique of returning different pages according
to who or what is requesting them. E.g. a surfer would receive the
actual page whereas a search engine spider would receive a
different page, but would assume that it is the actual page that
surfers see.
The purposes of cloaking for search engine optimization are to hide
highly optimized pages from people, so that they can't be stolen,
and to provide search engine spiders with highly optimized pages
that wouldn't look particularly good in browsers.
There are three ways of cloaking. One is "IP delivery",
where the IP addresses of spiders are recognized at the server and
handled accordingly; another is "User-Agent delivery",
where the spiders' User-Agents are recognised, and the third is
the first two combined.
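The three methods can be sketched in a few lines of Python. This is only an illustration: the network range and User-Agent tokens below are placeholder values, the function names are made up for the example, and it assumes that "combined" means both checks must pass.

```python
import ipaddress

# Placeholder values for illustration only. Real spider IP ranges and
# User-Agent strings would have to be collected and kept up to date.
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # documentation range
SPIDER_AGENT_TOKENS = ("googlebot", "slurp")


def matches_ip(remote_ip):
    """IP delivery: is the requester's address in a known spider range?"""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in SPIDER_NETWORKS)


def matches_user_agent(user_agent):
    """User-Agent delivery: does the User-Agent contain a spider token?"""
    ua = user_agent.lower()
    return any(token in ua for token in SPIDER_AGENT_TOKENS)


def is_spider(remote_ip, user_agent, method="both"):
    """Apply one of the three methods: "ip", "user-agent" or "both"."""
    if method == "ip":
        return matches_ip(remote_ip)
    if method == "user-agent":
        return matches_user_agent(user_agent)
    if method == "both":
        # Assumption: the combined method requires both checks to pass.
        return matches_ip(remote_ip) and matches_user_agent(user_agent)
    raise ValueError("unknown method: %r" % method)
```

A User-Agent is simply a header that the requester sends, so it is easy to fake. That is why the sketch treats the combined method as the strictest of the three.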
Is cloaking ethical?
Let me put it this way. Search engines do it all the time. For
instance, Google delivers different pages according to where in
the world the surfer is located. People in the UK receive AdWords
advertisements that are relevant to the UK, and people in the U.S.
receive AdWords that are relevant to the U.S. Google delivers
different pages according to the surfer's IP address. That's IP
delivery, and that's cloaking.
Also, from time to time, search engines prevent certain people
from gaining access to their .com versions. Instead, by checking
the surfers' IP addresses, they redirect people to their localized
versions - even when the surfers really do want to go to the .com
version! Again, that's IP delivery, and that's cloaking.
Because the search engines do it, it is clear that cloaking isn't
intrinsically unethical or wrong. If it's ok for the search
engines to do it, then it must be ok for everyone else to do it.
So what is it about cloaking that some people are dead against?
There is no sensible answer to that. The general idea is that
serving people one page and serving the search engines a different
page is simply wrong, because the engines are ranking the page
according to what they believe it to be and not according to what
it actually is. That idea is purely a matter of principle, and
nothing at all to do with common sense.
The common sense view is that, if a page is ranked correctly,
according to its topic, then surfers will find it in the search
results, click on it and go to exactly where they expect to go to
from reading the listing in the search results. It doesn't matter
how the page came to be ranked in that position, and it doesn't
matter if another page took its place when the engine was
evaluating it. As long as it is ranked correctly, according to its
topic, surfers are perfectly happy.
The fact that cloaking can be used to send people to sites and
topics that they did not expect to go to when clicking on a
listing in the search results is an excellent reason to be
against the misuse of cloaking, but it is no reason at all to be
against the cloaking technique in general.
An example of how cloaking can actually help Google
Google's crawlers won't spider pages that have anything that looks
like Session IDs in their URLs. If they did, they would run the risk of
spidering a potentially infinite number of pages, because each
page that is requested would contain links to other pages, and the
link URLs would contain the current session ID, which makes them
different from the URLs served the last time the page was requested. And so
it would go on and on and on, producing a vast number of unique
URLs to spider and index.
It means that Google won't spider most of the pages on some
websites. But Google actually wants to spider most or all of each
website's pages. The solution is to cloak the pages. By spotting
page requests from the Google spiders, and delivering modified
pages without the normal Session IDs in the link URLs, the site
enables Google to spider all of its pages. This is precisely what
Google wants, it's what the website owner wants and, if asked, it
would be what surfers want. It helps everybody and harms no-one.
This example alone demonstrates that cloaking is not intrinsically
wrong or unethical. The technique can be used unethically, but it
has various perfectly ethical uses.
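As a rough sketch of the Session ID case, the Python below removes session parameters from the link URLs in a page before it is handed to a spider. The parameter names and the function names are assumptions made for the example; a real site would use whatever names its own session mechanism puts into URLs. The regular expression only handles simple double-quoted href attributes.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names, for illustration only.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}

# Simplification: only matches href="..." with double quotes.
HREF_RE = re.compile(r'href="([^"]*)"')


def strip_session_id(url):
    """Return the URL with any session parameters removed from its query."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def page_for_spider(html):
    """Rewrite every link in the page so that it carries no session ID."""
    return HREF_RE.sub(
        lambda match: 'href="%s"' % strip_session_id(match.group(1)),
        html,
    )
```

With the session parameter gone, every request for a page yields the same link URLs, so the spider sees a finite set of pages. Surfers would still be sent the unmodified page with its Session IDs intact.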
How is cloaking done?
A simple way of doing it on an Apache server is by using the .htaccess
file with the mod_rewrite module. With an .htaccess file in place,
every request for a file in its directory, or in any directory
below it, is subject to it.
The .htaccess file has many uses but, for cloaking purposes, it
employs Apache's mod_rewrite module to check for the search engine
spiders' IP addresses, User-Agents, or both. If a spider is
detected, then mod_rewrite is used to return a page that has been
specially designed for spiders. If the requester is not a spider,
then the request goes through as normal and the normal page is
returned. Spiders are not aware of the switch. As far as they are
concerned, they are getting the page that they requested.
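The decision described above can be modelled in Python. On a real Apache server it would be expressed as mod_rewrite directives in the .htaccess file, not as Python; this sketch only shows the logic. The /spiders directory and the function name are hypothetical, and the spider check itself is assumed to have been made already.

```python
import posixpath

# Hypothetical directory that holds the pages designed for spiders.
SPIDER_DIR = "/spiders"


def rewrite_request(path, requester_is_spider, spider_pages):
    """Return the path of the page that should actually be served.

    path                -- the path that was requested, e.g. "/widgets.html"
    requester_is_spider -- result of the IP and/or User-Agent check
    spider_pages        -- set of paths for which a spider version exists
    """
    candidate = posixpath.join(SPIDER_DIR, path.lstrip("/"))
    if requester_is_spider and candidate in spider_pages:
        # Internal switch: the spider still sees the URL it asked for.
        return candidate
    # Everyone else, and any page without a spider version, goes
    # through as normal.
    return path
```

Because the switch happens inside the server and not through a redirect, the requested URL never changes, which is why the spider is not aware of it.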
Cloaking Software
Fantomas is widely recognised as the best cloaking software
around. Information about it can be found here.


