How can we prevent search engines from indexing a site?

Viewed 618 times

17

The other day I searched for my domain on Google and it returned both my website and my system.

I would like my system to be hidden from Google and any other search engine.

  • Is that possible?
  • And how do I remove what Google has already indexed?

2 answers

19


Just put a file called robots.txt (this website has good documentation on it) at the root of your site and list in it the paths you do not want indexed.

Search engine crawlers read this file to learn which parts of the site you are asking them to stay out of.

Search engines are not required to honor it, but they usually do. Obviously, if nobody should be able to access the content, you have to take other steps, such as restricting access to authenticated users. The file is only a convention: it is like having a door without a lock and a sign saying "do not enter".
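To make the "other steps" concrete, here is a minimal sketch of checking an HTTP Basic `Authorization` header in Python. The credentials and the function name `is_authorized` are made up for illustration, and a real system would normally rely on the authentication built into its web server or framework rather than hand-rolled code like this.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "admin"
EXPECTED_PASSWORD = "change-me"


def is_authorized(authorization_header):
    """Return True only for a valid 'Basic' Authorization header."""
    if not authorization_header:
        return False
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through timing.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    password_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and password_ok


if __name__ == "__main__":
    token = base64.b64encode(b"admin:change-me").decode("ascii")
    print(is_authorized("Basic " + token))
    print(is_authorized(None))
```

Unlike robots.txt, a check like this is enforced by the server, so a crawler that ignores the convention still gets nothing without valid credentials.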

As you can see in the documentation, you can declare which URLs may or may not be accessed. The rules can differ depending on the type of client (user-agent) accessing the website.
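As a rough illustration of per-client rules, the sketch below evaluates a small robots.txt with Python's standard `urllib.robotparser`. The bot name `ExampleBot`, the path `/system/` and the helper `allowed` are assumptions invented for the example, not something taken from the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ExampleBot is banned everywhere,
# every other client is only kept out of /system/.
RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /system/
"""


def allowed(user_agent, path, rules=RULES):
    """Return True if the rules let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for path in ("/", "/system/login"):
            print(agent, path, allowed(agent, path))
```

This only shows how a well-behaved crawler would interpret the file. Nothing forces a client to run such a check before requesting a page.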

To block the entire website:

User-agent: *
Disallow: /

Some time after you add this file, the search engines that respect it will stop showing the content of your pages. However, the content may already have been archived, and as far as I know it cannot be removed without a court order.

Wikipedia article.

  • Does it depend on the robots.txt itself, or on the search engine "knowing how to read it"?

  • 2

    It depends on the search engine. It does what it wants. As I said in the answer, nothing prevents access to content that is not protected by some form of authentication.

  • @Bigown Do you think that even after I add the robots.txt, Google and the other engines that have already archived my system will keep showing it? I found this robots.txt generator. Should I trust it? http://www.marketingdebusca.com.br/robots-txt/

  • @Marconi I cannot vouch for something I do not know, but I think so. Check whether the generated content is similar to the examples I showed. As for archiving, I think you will just have to experiment. But remember that you will not be preventing any access, only indexing.

  • @bigown I know, bigown. I just want Google to hide it. I am not trying to prevent access.

7

You can put a robots.txt file at the root of your web directory with the following content:

User-agent: *
Disallow: /

Keep in mind that robots.txt is only a hint asking crawlers not to index those pages. The main search engines respect what is indicated in the file, but this does not mean that the content will be invisible or inaccessible.
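A minimal sketch of creating that file and checking that it really asks every crawler to stay out, using only the Python standard library. The `web_root` argument, both function names and the sample paths are assumptions for the example; point it at whatever directory your server actually serves as the site root.

```python
from pathlib import Path
from urllib.robotparser import RobotFileParser

# Same two lines as in the answer above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def write_robots_txt(web_root):
    """Write a block-everything robots.txt into web_root and return its path."""
    target = Path(web_root) / "robots.txt"
    target.write_text(BLOCK_ALL, encoding="ascii")
    return target


def blocks_everything(robots_path, sample_paths=("/", "/system/", "/any/page.html")):
    """Return True if no sample path may be fetched by an arbitrary bot."""
    parser = RobotFileParser()
    parser.parse(Path(robots_path).read_text(encoding="ascii").splitlines())
    return not any(parser.can_fetch("AnyBot", path) for path in sample_paths)


if __name__ == "__main__":
    path = write_robots_txt(".")  # hypothetical web root: current directory
    print(path, blocks_everything(path))
```

The check only confirms what a compliant crawler would conclude from the file. It says nothing about clients that never read it.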

Through Google Webmaster Tools you can remove your site from Google's search results.
