Detecting cloaked web pages

Author: mike  //  Category: xn--g7qx97f.com
  • I suspect that some pages I'm competing against for search engine rankings are cloaked. Other than comparing the source code with the search engine listing (which may match even if cloaked), is there a tool or method I can use to more accurately detect a cloaked page? I'm not looking to cause trouble for anyone. I just want to know what I'm up against. "Yes" answers only please.


  • By the way, if you have a few sets of searches you would like checked for the first 100 in the Google list for cloaking, just post a question with those key phrases and I'll run the program against the checks. It would take some alteration to the code, but certainly not much, and I can grab the first 100. Just post the keyword phrases you would like and I'll set it up and post you the report.
    Thanks,

    webadept-ga


  • Hello,
    I used to use IP cloaking (before it was frowned on by the search engines), and the script was free. This one was free too, http://scriptsmatrix.com/Detailed/558.html, but it has since been removed. There is also an inexpensive IP cloaking service at http://www.improved-ranking.com. However, it is important to point out that I am not recommending these services, simply highlighting their existence and the ease with which unscrupulous webmasters can implement IP cloaking. Conversely, to see these cloaked pages you would need to perform IP spoofing: impersonating one of the search engines by assuming its IP address and user agent, which is legally and ethically questionable in itself. Regards,
    lot-ga


  • Hello tobes-ga

    I didn't find a way to accurately detect a cloaked page (the major search engines have trouble with this too). For any pages you suspect may be cloaked, I can describe some methods to reveal them, but only in a deliberately vague way, as the methods may be illegal and against policies. So you would get an idea rather than a step-by-step guide of how to actually do it. A bit limiting, I know, but if you would find this useful, let me know. Kind regards,
    lot-ga


  • Hi again, Thanks for the response.

    As Hailstorm and Lot have pointed out, there are different types of cloaking, which this program doesn't look at. However, there are acceptable reasons to cloak a page using these methods. For instance, the IP address is known to come from France, so your page is sent out in the French language, rather than English. Or, the IP address was recorded for a registered user, so his preferences are given rather than the basic page. A page is created for IE, Netscape and "others" and the cloaking program sends the pages in the format best seen by those browsers. These are all reasonable means and methods of cloaking which are used by websites. Heck, I use them.
    What Google is miffed about is cloaking to the Googlebot itself, and this is much more difficult to pull off. First off, IP cloaking doesn't work, because, well, they are Google. As Lot pointed out as well, the IP can be spoofed. I can set up my server to have a spoofed Google address and run the program through that, using code to send through that "spoofed" port, and voilà! I'm the Googlebot. I'm not going to do that with this program, as it's not really required.
    Cloaking to the bots is much harder, especially if the bots are checking. It is really easy for them to jump to another IP address, pose as an IE browser, get the page, and then check again as the normal bot. Just about any programmer at my level can do it too. So most of the "hype" on the page Hailstorm has given about the "greatness" of their service is just that: hype.
    So, there is nothing wrong with cloaking a page for user viewing, and the service Hailstorm has given looks really good for that, and like I said, I do it myself. It's very useful for the users and helps keep your site fresh and alive. But cloaking to the bots is not easy to get away with and has huge repercussions. Also, as you say, most companies aren't going to spend the money on IP cloaking for Bots. They may do it for Users and IP blocks, but not for the bots. Anyone that runs a webserver for any length of time knows that the IP addresses change for the bots at random times. So a company could get unlisted simply because they didn't respond to the bot when it showed up, if it wasn't done right.
    Thanks,

    webadept-ga


  • Thanks again. For clarification, how accurate do you think the tool is? I checked the top five Google un-sponsored listings for the highly competitive keywords "data recovery" and (had to do it) "search engine optimization." The tool found only one cloaked page among them (for comparison, the Overture PPC bid for #1 listings on these keywords is 9-10 bucks a pop!). If these results are correct, it would seem that page cloaking is of highly dubious value. So, should I take the cloaking check results with a grain of salt or renew my faith in humanity? Best,
    tobes


  • The tool is pretty accurate. I played with it for a while. First off, it tells the webserver that the first request is coming from an IE 6.0 browser and the second request is coming from Googlebot/2.1. The logs I checked on my servers and a few clients' servers look identical to the real Googlebot, as near as I can tell. I doubt that anyone would be able to tell the difference programmatically. But every day someone is doing something someone else said was impossible. :-)
    Anyway, as far as cloaking goes, it is a lot of work, very high risk, and difficult to maintain. The simple check I created there (simple meaning rather fast to create, not simple in technology) is really easy to add to and make even more devious. Search engines don't do it a lot because it is processor intensive and they need to keep their servers running as fast as they can. But I am sure that they do run checks periodically on various sites. Spot checks, if you will. There are times when I'm asked "Why did my site suddenly vanish after being on top for the last year?" and I find out that they were using a cloaked page. That answers the question. It also brings up the huge risk.
    Once you are out of the index, it takes a very long time to get Google to put you back, and when they do your pagerank is down, far down. They don't like it, and they say so rather bluntly. So if you are a serious company on the Internet with a mind for a future, why risk it? PageRank and Page Relevance are available to anyone willing to put in the time and effort. There are no guarantees, but there is a guarantee that if you do cloak, eventually the bots are going to figure it out and your company is suddenly going to disappear.
    Faith in humanity? Maybe, maybe not. Faith in personal survival.. yes.. definitely. By the way, the one you found you can report to Google using this page here.. :
    ://www.google.com/contact/spamreport.html

    Thanks

    webadept-ga


  • Hi,

    1. What is cloaking?

    The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they'll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings.
    That is what Google calls cloaking. The way to detect this is to request the page twice, once making the server think the Googlebot is looking at it, and a second time telling it that a web browser like Netscape is looking at it. This can be done with a Perl program.
    Since there doesn't appear to be a program out there for public use to do this, I decided to make one since it would only take an hour or so to accomplish that and it would be rather cool to have. You can go to:
    http://www.webadept.net/cloaker/index.html

    and use the program there to check websites.

    Thanks,

    webadept-ga
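
The two-request check described in this post can be sketched roughly as follows. This is not the Perl program behind the tool above (its source is not shown in the thread); it is a minimal Python illustration of the same idea. The exact User-Agent strings, the similarity threshold, and the function names are assumptions made for the example, and the check only catches cloaking that keys on the User-Agent header.

```python
import difflib
import urllib.request

# Assumed User-Agent strings; the thread only names "IE 6.0" and "Googlebot/2.1".
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
BOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"


def fetch(url, user_agent, timeout=10):
    """Fetch a URL while presenting the given User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def similarity(a, b):
    """Return a 0..1 similarity ratio between two page bodies."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def looks_cloaked(browser_html, bot_html, threshold=0.9):
    """Flag the page when the two responses are less similar than the threshold.

    The 0.9 threshold is an arbitrary assumption; pages with rotating ads or
    timestamps differ slightly between any two requests.
    """
    return similarity(browser_html, bot_html) < threshold


def check(url):
    """Request the page once as a browser and once as a bot, then compare."""
    return looks_cloaked(fetch(url, BROWSER_UA), fetch(url, BOT_UA))
```

Calling `check("http://example.com/")` performs the two requests and returns `True` when the responses diverge. A result of `True` is a hint to look closer rather than proof, since dynamic pages can differ between requests for harmless reasons.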


  • According to this site:

    http://www.searchengineworld.com/misc/cloaking_agents.htm

    There are five different types of cloaking:
    1) User Agent Cloaking (UA Cloaking)
    2) IP Agent Cloaking (IP Cloaking)
    3) IP and User Agent Cloaking (IPUA Cloaking).
    4) Referral based cloaking.
    5) Session based cloaking

    Some of these, especially IP cloaking, are extremely advanced. The Fantomaster site, http://fantomaster.com, sells a database of search engine robot IP addresses that is updated several times a day.
    So, I am sorry to say that I don't think webadept's tool can work on advanced cloaking of this nature.
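
To illustrate why a checker that changes only its User-Agent misses the IP-based variants in the list above, here is a toy Python model of the server-side decision for the first three types. Everything in it is hypothetical: the addresses come from the reserved documentation ranges rather than any real crawler list, the function names are invented for the example, and referral-based and session-based cloaking are left out.

```python
# Hypothetical crawler address list (reserved documentation range, not real crawler IPs).
KNOWN_BOT_IPS = {"192.0.2.10", "192.0.2.11"}


def is_bot_ua(user_agent):
    """Crude User-Agent match, as a UA-cloaking script might do it."""
    return "googlebot" in user_agent.lower()


def serve(mode, ip, user_agent):
    """Return which page a cloaking server would send: 'bot' or 'user'."""
    ua_match = is_bot_ua(user_agent)
    ip_match = ip in KNOWN_BOT_IPS
    if mode == "ua":        # User Agent cloaking
        bot = ua_match
    elif mode == "ip":      # IP cloaking
        bot = ip_match
    elif mode == "ipua":    # IP and User Agent cloaking
        bot = ua_match and ip_match
    else:
        raise ValueError("unknown cloaking mode: " + mode)
    return "bot" if bot else "user"


def probe_detects(mode, probe_ip="203.0.113.5"):
    """Simulate a checker that spoofs only the User-Agent from its own IP."""
    as_browser = serve(mode, probe_ip, "Mozilla/4.0 (compatible; MSIE 6.0)")
    as_bot = serve(mode, probe_ip, "Googlebot/2.1")
    # Cloaking is visible to the checker only if the two responses differ.
    return as_browser != as_bot
```

In this model `probe_detects("ua")` is true, while `probe_detects("ip")` and `probe_detects("ipua")` are false, because the checker's own address is never on the server's crawler list no matter which User-Agent it sends.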


  • Thanks for the offer but I think I can check what I need with a few manual entries. I have serious doubts that the companies I need to check have deep enough pockets to buy cloaking that can get around your checker, so I think your method will do for my purposes. Thanks Again.








