This answer got long, but here are my 2 cents, based on what I have learned so far from observing how Google behaves:
1. How does Google know which search keywords led the user to the site?
Note that when you do a Google search, the links to the sites found are listed as the actual links (for example, if you searched for StackOverflow, the first result points to www.stackoverflow.com).
But each anchor (a) has a callback registered for the onmousedown event. This callback replaces the anchor’s href with one that points to Google’s servers, which in turn redirect the browser to the real site. See the images below:
Before clicking
After clicking on the link
This way, Google associates the typed keywords with the site that was clicked, and can use this information to optimize search results. Note that Google also tailors search results to your browsing patterns, which it tracks through your Google account. I believe Chrome sends history data to Google as well, but I did not stop to analyze that.
Edit:
Note that, after the onmousedown event runs, the new value of href does not include the real site address (www.stackoverflow.com). This indicates that one or more of the query parameters uniquely identify the actual URL on Google’s servers in the context of the search. Thus, Google can confirm that the typed keywords really are related to the URL, and can improve its search results in the next iteration of the ranking algorithm.
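The referential scheme described above could be sketched roughly as follows. This is only an illustration under assumptions: the class `ClickTracker`, its methods, and the `ref` query parameter are hypothetical names, and nothing here reflects how Google actually implements it.

```python
import secrets
from dataclasses import dataclass


@dataclass
class SearchResult:
    keywords: str   # what the user typed
    real_url: str   # the site shown in the result list


class ClickTracker:
    """Hypothetical server-side store; not Google's actual implementation."""

    def __init__(self):
        self._results = {}   # opaque token -> SearchResult
        self.click_log = []  # (keywords, real_url) pairs recorded on click

    def register_result(self, keywords, real_url):
        # Called when the results page is built: remember the pair and
        # hand back a redirect href that does not contain the real URL.
        token = secrets.token_urlsafe(8)
        self._results[token] = SearchResult(keywords, real_url)
        return f"/url?ref={token}"

    def handle_click(self, token):
        # Called when the browser follows the rewritten href: record the
        # association, then return the real URL to redirect to.
        result = self._results[token]
        self.click_log.append((result.keywords, result.real_url))
        return result.real_url


tracker = ClickTracker()
href = tracker.register_result("stackoverflow", "www.stackoverflow.com")
token = href.split("ref=", 1)[1]
destination = tracker.handle_click(token)
```

The point of the sketch is that the href carries only an opaque token, so the keyword/site pair is looked up on the server rather than read from the URL.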
Another, less technical point that also justifies Google storing the search and its results in its database and pointing to them by reference: if it did not, a competitor could tamper with the final URLs and corrupt the statistics Google uses for page ranking. Example:
Imagine that, after onmousedown, Google had placed in the href for StackOverflow something like /url?keyword=stackoverflow&real=www.stackoverflow.com. This means that when the user clicks on the link, Google is told that the keyword typed was stackoverflow and the real site is www.stackoverflow.com.
Now imagine, for example, a virus on the client’s computer (or a MITM attack) changing the URL to /url?keyword=aumente%20seu&real=www.stackoverflow.com. When the user clicks on this link, Google will record that the keywords aumente seu are related to www.stackoverflow.com, degrading the quality of the page ranking.
This could be done not only by competitors, but also by trolls and by people who want to push their own site to the top of the search results.
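To make the tampering scenario concrete, here is a small sketch of a naive handler that trusts the parameters in the link. The `keyword` and `real` parameter names come from the hypothetical URL in the example above (both links use `real` here), and `naive_click` is an invented helper, not anything Google exposes.

```python
from urllib.parse import urlsplit, parse_qs


def naive_click(href):
    # Trusts whatever keyword/site pair the URL itself carries.
    params = parse_qs(urlsplit(href).query)
    return params["keyword"][0], params["real"][0]


honest = "/url?keyword=stackoverflow&real=www.stackoverflow.com"
tampered = "/url?keyword=aumente%20seu&real=www.stackoverflow.com"

honest_pair = naive_click(honest)      # what the search really was
tampered_pair = naive_click(tampered)  # what the attacker wants recorded
```

With this design the server has no way to tell the two links apart, which is why keeping the pair on the server and sending only a reference is the safer choice.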
2. How can Google Analytics verify the origin of the click?
When the Analytics code runs on the visited page, it collects information from the browser and the system. When I visited the website www.nortonconsultoria.com.br, the following information was collected and sent to Analytics via GET parameters:
utmwv:5.6.2
utms:1
utmn:1342588638
utmhn:www.nortonconsultoria.com.br
utmcs:UTF-8
utmsr:1280x1024
utmvp:1265x716
utmsc:24-bit
utmul:en-us
utmje:1
utmfl:-
utmdt:Norton TI
utmhid:952692061
utmr:-
utmp:/
utmht:1423077254452
utmac:UA-41756695-1
utmcc:__utma=223515140.1707534080.1422473707.1422473707.1423077254.2;+__utmz=223515140.1422473707.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
utmjid:986782225
utmredir:1
utmu:qAAAAAAAAAAAAAAAAAAAAAAE~
In addition to this information, the HTTP headers carry more, in particular the User-Agent, which is used to show which platforms access your site the most.
I believe there are more sophisticated ways for Google to collect other information, but I do not know them. This is just the data I saw being sent to Google in Chrome’s developer tools (I also checked in Firefox to see whether there were major differences, but found none).
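The origin of the visit appears to travel in the utmcc parameter, inside the __utmz cookie. Below is a minimal sketch that decodes the value captured above. It assumes that the first four dot-separated fields of __utmz are numeric and that the remainder is a "|"-separated list of key=value pairs, which matches this capture; the function names are my own.

```python
def parse_utmcc(utmcc):
    # Split the cookie string into name -> value pairs.
    cookies = {}
    for part in utmcc.split(";"):
        part = part.strip("+ ")  # "+" is an encoded space between cookies
        if part:
            name, _, value = part.partition("=")
            cookies[name] = value
    return cookies


def parse_utmz(value):
    # Assumption: four numeric fields, then the traffic-source fields.
    *_, campaign = value.split(".", 4)
    return dict(field.split("=", 1) for field in campaign.split("|"))


utmcc = ("__utma=223515140.1707534080.1422473707.1422473707.1423077254.2;"
         "+__utmz=223515140.1422473707.1.1."
         "utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);")

source = parse_utmz(parse_utmcc(utmcc)["__utmz"])
```

In this capture every source field says (direct) or (none), which is consistent with utmr being empty: the page was opened directly, not from a search result.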
I hope this helps a little!
Your comment is very good. So in this case, does Analytics look up the keyword information directly in Google’s database? Or is there another way to find the terms matched to the link?
– lucasDotCom
I started to respond here in the comments, but I ended up noticing more things and thought it better to include them in the answer.
– Vinícius Gobbo A. de Oliveira
Right, so there’s no way for anyone other than Google itself to know which keyword a search visit came from.
– lucasDotCom