Indexing of pages that may redirect

I have a site where some of the pages are not being indexed by Google.

The pages that Google does not index are ones that cannot be accessed unless a certain option has been chosen beforehand.

For example:

When a user opens /page1, the client's localStorage is checked for a saved configuration. If the configuration is already there, the user proceeds to /page1. If not, the user is redirected to /homepage, sets the configuration there, and is then taken back to /page1.

In this scenario, is that what is causing these pages not to be indexed?
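To make the flow concrete, here is a minimal sketch of the gate described above. The names `CONFIG_KEY`, `KeyValueStore` and `resolveRoute`, and the `next` query parameter, are hypothetical and only for illustration; the real site may implement the redirect differently. The storage is passed in as a parameter so that the case of a visitor with empty storage (presumably what a crawler looks like) can be tried outside a browser.

```typescript
// Hypothetical sketch of the localStorage gate from the question.
// KeyValueStore mirrors the one method of window.localStorage used here.
interface KeyValueStore {
  getItem(key: string): string | null;
}

// Assumed key name; the question does not say what the site stores.
const CONFIG_KEY = "userConfig";

// Returns the path the visitor actually ends up on.
function resolveRoute(requestedPath: string, storage: KeyValueStore): string {
  const hasConfig = storage.getItem(CONFIG_KEY) !== null;
  if (hasConfig) {
    return requestedPath; // configuration present: show the requested page
  }
  // No configuration: send the visitor to /homepage and remember the target.
  // The "next" parameter is an assumption, not something from the question.
  return `/homepage?next=${encodeURIComponent(requestedPath)}`;
}

// A visitor with nothing stored, e.g. a first visit.
const emptyStorage: KeyValueStore = { getItem: () => null };

// A visitor who has already chosen the option.
const configuredStorage: KeyValueStore = {
  getItem: (key) => (key === CONFIG_KEY ? "some-city" : null),
};

console.log(resolveRoute("/page1", emptyStorage));
console.log(resolveRoute("/page1", configuredStorage));
```

With empty storage every request for /page1 resolves to /homepage, so a visitor that never completes the configuration step never sees the content of /page1.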

  • I believe Googlebot behaves like an ordinary user, so if a page is only accessible after, say, a registration, it is to be expected that the bot will not index it: it will not fill in your site's registration form, so it has no access to the page, and there's no way to index it...

  • Yeah, I figured as much, but I haven't been able to find anything to back it up yet. I'm 90% sure that's what it is, though...

  • I don't have a reference source, but I'm sure you'll find something documented out there...

1 answer

I'll give you an answer that I think might clear up a few points. It may not be the exact answer to your problem, but understanding this can help you rule out some hypotheses.

First, Googlebot does in some cases fill in forms and other kinds of input, but only when it detects that there may be content of interest behind them. How Google judges this "interest" is something only Google's engineers can tell you. You can see details in this Google Webmasters video, starting at around minute 29: https://www.youtube.com/watch?time_continue=1751&v=QWL864VlW7I

Keep in mind that even if the bot intends to fill in the fields, those fields need to be bot-friendly. That means they should not, for example, ask for personal data such as credit card numbers, CPF, passwords, etc.

"Google will decide on an individual basis if a FORM element on a page is considered to be useful and then try to fill out that form using a small number of different natural requests, made to simulate an actual user.

Google only crawls forms which use the GET method and do not ask for personal information. Additionally, the form should be made up of no more than two input fields."
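As a rough aid, the three criteria in the quote (GET method, no personal information, at most two input fields) can be turned into a checklist. This is only a sketch of the quoted rules, not Google's actual logic; `FormDescription`, `FormField`, `PERSONAL_HINTS` and `meetsQuotedCriteria` are hypothetical names, and the keyword list is just the examples mentioned in this answer.

```typescript
// Hypothetical description of a form, filled in by hand from your own markup.
interface FormField {
  name: string;
  type: string;
}

interface FormDescription {
  method: string;
  fields: FormField[];
}

// Keywords taken from the examples in this answer (credit cards, CPF, passwords).
// A naive heuristic, assumed for illustration only.
const PERSONAL_HINTS = ["password", "cpf", "card"];

// True when the form satisfies all three criteria from the quoted text.
function meetsQuotedCriteria(form: FormDescription): boolean {
  const usesGet = form.method.toUpperCase() === "GET";
  const atMostTwoFields = form.fields.length <= 2;
  const asksForPersonalData = form.fields.some((field) => {
    const label = `${field.name} ${field.type}`.toLowerCase();
    return PERSONAL_HINTS.some((hint) => label.includes(hint));
  });
  return usesGet && atMostTwoFields && !asksForPersonalData;
}

// Example forms (made up for this sketch).
const citySearch: FormDescription = {
  method: "get",
  fields: [{ name: "city", type: "text" }],
};

const login: FormDescription = {
  method: "post",
  fields: [
    { name: "user", type: "text" },
    { name: "secret", type: "password" },
  ],
};

console.log(meetsQuotedCriteria(citySearch));
console.log(meetsQuotedCriteria(login));
```

Passing this checklist does not mean the bot will fill in the form; per the quote, Google still decides case by case whether a form is useful.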

Source: https://www.sistrix.com/ask-sistrix/google-index-google-bot-crawler/can-the-google-bot-fill-out-and-crawl-forms/

Still on the subject of password fields, see what the Google Help Center says:

"Googlebot and all other web crawlers are unable to access content in password-protected directories."

Source: https://support.google.com/webmasters/answer/93708?hl=en&ref_topic=4598466

To wrap up, here are all the Google Help Center topics on how to block content from Google's bots. Perhaps you can "reverse engineer" them to identify why your content is not being indexed.

https://support.google.com/webmasters/topic/4598466?hl=en&ref_topic=4617736

  • I went through the links and the video you posted, and we did some analysis here with the Google console and in our development environment. We came to the conclusion that this really is it: Google does not fill in our access form because it is a field where the user types the name of their city and a list appears from which they have to select a valid city (a select that is populated according to what the user types), so the Google bot probably can't move on to the following pages. Thanks for the support!

  • @Mslacerda this is theory proving itself in practice. Great that you came back to share the result; at least now we know a bit more about how this bot really behaves and what its limitations are...
