Get the source code of a page from a Google Chrome extension

Is it possible to create an extension for Chrome that takes either the source code of the page or its entire text (Ctrl + A, Ctrl + C), submits it to an external website (for data mining), and returns the resulting content from that website? (In this case, a graph with the main terms.)

Note: this is the form I created (it is in popup.js):

    var my_form=document.createElement('FORM');
    my_form.name='entrada';
    my_form.method='POST';
    my_form.action='http://sobek.ufrgs.br/newSobekSite/new-sobek.php';  
    my_form.submit();

1 answer

As already answered on SOen, this is possible:

manifest.json

Note: Replace <all_urls> with something like "*://site1.com/*", "*://*.site2.com/*". These are the sites the extension needs permission to access in order to communicate with them.

Note: You will probably need to add the clipboardRead permission (and maybe clipboardWrite) to the manifest:

{
    "name": "Get pages source",
    "version": "1.0",
    "manifest_version": 2,
    "description": "Gets the page content and sends it to a server",
    "browser_action": {
       "default_icon": "icon.png",
       "default_popup": "popup.html"
    },
    "permissions": [
        "webRequest",
        "tabs",
        "clipboardWrite",
        "clipboardRead",
        "<all_urls>"
    ]
}
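
To narrow <all_urls> as the first note suggests, the host entries can be written as match patterns. A minimal sketch, assuming the servers live on the placeholder hosts site1.com and site2.com; toMatchPattern is a hypothetical helper for illustration, not part of any Chrome API:

```javascript
// Hypothetical helper: turns a bare host name into a Chrome match pattern.
// Pass includeSubdomains = true to also match subdomains of that host.
function toMatchPattern(host, includeSubdomains) {
    var prefix = includeSubdomains === true ? "*." : "";
    // "*://" matches both http and https; "/*" matches any path.
    return "*://" + prefix + host + "/*";
}

// Permissions list that could replace "<all_urls>" in manifest.json
// (site1.com and site2.com are placeholders from the note above).
var permissions = [
    "tabs",
    toMatchPattern("site1.com", false),
    toMatchPattern("site2.com", true)
];

console.log(JSON.stringify(permissions, null, 4));
```

The printed array can be pasted into the "permissions" key of the manifest, next to the clipboard permissions if they are still needed.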

popup.js

You can use Ajax to send request.source to another server, as in this example:

chrome.extension.onMessage.addListener(function(request, sender) {
    if (request.action === "getSource") {
        var message, data, xhr;

        message = document.querySelector("#message");
        data = request.source;

        message.innerText = "Sending to the server...";

        xhr  = new XMLHttpRequest();
        xhr.open("POST", "http://site1/webservice.php", true);
        xhr.onreadystatechange = function() {
            if(xhr.readyState === 4) {
                if (xhr.status === 200) {
                    message.innerText = "Server response: " + xhr.responseText;
                } else {
                    message.innerText = "Err: " + xhr.status;
                }
            }
        };

        // Send the data as the RAW request body
        xhr.send(data);
    }
});

function onWindowLoad()
{
    var message = document.querySelector('#message');

    chrome.tabs.executeScript(null, {
        file: "getPagesSource.js"
    }, function() {
        // If you try and inject into an extensions page or the webstore/NTP you'll get an error
        if (chrome.extension.lastError) {
            message.innerHTML = "Error executing the script: <br>" + chrome.extension.lastError.message;
        }
    });
}

window.onload = onWindowLoad;
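
The same request can also be sketched with fetch instead of XMLHttpRequest. This is an alternative, not the code from the answer, and it assumes a Chrome version that supports fetch. sendSource and its fetchImpl parameter are hypothetical names; fetchImpl exists only so the function can be exercised without a network:

```javascript
// Hypothetical helper: POSTs the page source as the raw request body
// and resolves with the server's response text.
function sendSource(url, source, fetchImpl) {
    var doFetch = fetchImpl || fetch;
    return doFetch(url, { method: "POST", body: source }).then(function (response) {
        if (!response.ok) {
            // Any non-2xx status is reported in the same "Err: <status>" form.
            throw new Error("Err: " + response.status);
        }
        return response.text();
    });
}

// Possible usage inside the onMessage listener (message is the #message element):
// sendSource("http://site1/webservice.php", request.source)
//     .then(function (text) { message.innerText = text; })
//     .catch(function (err) { message.innerText = err.message; });
```

Unlike the XMLHttpRequest version, which accepts only status 200, response.ok is true for any status from 200 to 299.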

getPagesSource.js

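To copy as if the user were copying, we use window.getSelection().addRange and a <div contentEditable="true"></div>. The function selects the target, copies it to the clipboard, pastes it into a temporary editable div, and reads the result from there: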

function copyFromDOM(target, rich) {
    var range, dom, source, posX, posY;

    posX = window.pageXOffset;
    posY = window.pageYOffset;

    dom = document.createElement("div");
    dom.contentEditable = true;

    range = document.createRange();
    range.selectNode(target);

    window.getSelection().removeAllRanges();
    window.getSelection().addRange(range);
    document.execCommand("copy");

    document.body.appendChild(dom);

    dom.focus();

    document.execCommand("paste");

    source = rich === true ? dom.innerHTML : dom.textContent;

    window.getSelection().removeAllRanges();
    document.body.removeChild(dom);

    window.setTimeout(function() {
        window.scrollTo(posX, posY);
    }, 1);

    range = dom = null;
    return source;
}

chrome.extension.sendMessage({
    action: "getSource",
    source: copyFromDOM(document.body, false) // Copies text only
});

Note: To copy as rich text instead, use copyFromDOM(document.body, true)

Note: The OP had a problem with copyFromDOM(document.body, false) while using Google Chrome 38; after updating to the latest version, the function started working normally.

Copying source code from page

To copy the source code of the page instead, replace getPagesSource.js with something like this (based on an answer by user Rob W):

getPagesSource.js

// @author Rob W <https://stackoverflow.com/users/938089/rob-w>
// Demo: var serialized_html = DOMtoString(document);

function DOMtoString(document_root) {
    var html = '',
        node = document_root.firstChild;
    while (node) {
        switch (node.nodeType) {
        case Node.ELEMENT_NODE:
            html += node.outerHTML;
            break;
        case Node.TEXT_NODE:
            html += node.nodeValue;
            break;
        case Node.CDATA_SECTION_NODE:
            html += '<![CDATA[' + node.nodeValue + ']]>';
            break;
        case Node.COMMENT_NODE:
            html += '<!--' + node.nodeValue + '-->';
            break;
        case Node.DOCUMENT_TYPE_NODE:
            // (X)HTML documents are identified by public identifiers
            html += "<!DOCTYPE " + node.name + (node.publicId ? ' PUBLIC "' + node.publicId + '"' : '') + (!node.publicId && node.systemId ? ' SYSTEM' : '') + (node.systemId ? ' "' + node.systemId + '"' : '') + '>\n';
            break;
        }
        node = node.nextSibling;
    }
    return html;
}

chrome.extension.sendMessage({
    action: "getSource",
    source: DOMtoString(document)
});
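
The DOCUMENT_TYPE_NODE branch is the densest line of DOMtoString. Pulled out into its own function, it is easier to follow and can be tried on plain objects. This is only a sketch: doctypeToString is a hypothetical name, and the object passed in needs just the name, publicId and systemId properties that a real doctype node has.

```javascript
// Hypothetical helper: same logic as the DOCUMENT_TYPE_NODE case above,
// written as separate steps.
function doctypeToString(node) {
    var html = "<!DOCTYPE " + node.name;
    if (node.publicId) {
        // (X)HTML documents are identified by public identifiers
        html += ' PUBLIC "' + node.publicId + '"';
    } else if (node.systemId) {
        // SYSTEM is only written when there is no public identifier
        html += " SYSTEM";
    }
    if (node.systemId) {
        html += ' "' + node.systemId + '"';
    }
    return html + ">\n";
}

// An HTML5 doctype has empty public and system identifiers.
console.log(doctypeToString({ name: "html", publicId: "", systemId: "" }));
```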

Server receiving the data

Since I don’t know what language your server uses, I will provide an example in PHP. This example only writes to a file (you can switch to a database), and it reads the data as a RAW request body instead of x-www-form-urlencoded (the HTML form type). This is just one example; you can send the data to the server in other ways:

webservice.php

<?php
if (false === ($input = fopen('php://input', 'r'))) {
    echo 'Error reading the received data';
} else if (false === ($output = fopen('meu-arquivo.txt', 'w'))) {
    echo 'Error opening the file for writing';
    fclose($input);
    $input = NULL;
} else {
    $hasData = false;

    while (false === feof($input)) {
        $data = fgets($input, 128);
        if ($data === false) {
            // fgets returns false at the end of the input or on error
            break;
        }
        if ($data !== '') {
            $hasData = true;
        }

        fwrite($output, $data);
    }

    fclose($input);
    fclose($output);

    $input = $output = NULL;

    echo $hasData ? 'Ok' : 'Empty selection, please try again';
}

If you send the data via POST as x-www-form-urlencoded (the HTML form type) instead, you need to use setRequestHeader and window.encodeURIComponent:

        xhr  = new XMLHttpRequest();
        xhr.open("POST", "http://site1/webservice.php", true);
        xhr.setRequestHeader("Content-type","application/x-www-form-urlencoded");

        xhr.onreadystatechange = function() {
            if(xhr.readyState === 4) {
                if (xhr.status === 200) {
                    message.innerText = "Server response: " + xhr.responseText;
                } else {
                    message.innerText = "Err: " + xhr.status;
                }
            }
        };

        // Replace this with the variable name that your SERVER reads
        xhr.send('minha_variavel_do_servidor=' + window.encodeURIComponent(request.source));

Note: window.encodeURIComponent always encodes as UTF-8, so if your server uses windows-1252 or iso-8859-1 you may need to convert the data after decoding it.
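
To make the encoding note concrete, here is a small sketch of how such a form body is built. buildFormBody is a hypothetical helper, and the field name is the same placeholder used in the snippet above:

```javascript
// Hypothetical helper: builds an x-www-form-urlencoded body from an object.
function buildFormBody(fields) {
    return Object.keys(fields).map(function (key) {
        return encodeURIComponent(key) + "=" + encodeURIComponent(fields[key]);
    }).join("&");
}

// "á" is percent-encoded as its two UTF-8 bytes, C3 A1.
console.log(buildFormBody({ minha_variavel_do_servidor: "página" }));
// prints "minha_variavel_do_servidor=p%C3%A1gina"
```

With this helper the last line of the request would become xhr.send(buildFormBody({ minha_variavel_do_servidor: request.source })).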

Showing result in a pop-up or new window

Although it is outside the scope of the question, the OP asked to use a pop-up to display the results. Since many extensions use pop-ups to show updates or similar things, I decided to provide an example. It requires window.open and window.open().document.write, called inside the extension's Ajax callback:

xhr.onreadystatechange = function() {
    var win;

    if(xhr.readyState === 4) {
        if (xhr.status === 200) {
            // Opens a new tab or pop-up
            win = window.open("", "_blank", "width=600, height=600");
            win.document.write(xhr.responseText);
        } else {
            // Shows the result in the extension
            message.innerText = "Err: " + xhr.status;
        }
    }
};

Avoiding erasing the user’s clipboard

If you only want to copy text and avoid the clipboard, you can use textContent. It copies only the text and does not touch the clipboard, so the clipboard permissions are no longer needed in the manifest. Replace the copyFromDOM function with:

function copyFromDOM(target, rich) {
    return rich === true ? target.innerHTML : target.textContent;
}
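
One difference from the clipboard approach is that textContent returns the text of every node, including the line breaks and indentation of the markup (and the contents of script and style elements). If the server expects cleaner input, the text could be normalized before sending. A minimal sketch; normalizeText is a hypothetical helper, not part of the original answer:

```javascript
// Hypothetical helper: collapses runs of whitespace left by textContent
// into single spaces and trims both ends.
function normalizeText(text) {
    return text.replace(/\s+/g, " ").trim();
}

console.log(normalizeText("  Main\n\n   terms \t here  "));
// prints "Main terms here"
```

It would be applied to the value returned by copyFromDOM before calling sendMessage.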

  • Got it! In this case it’s $_POST['input'], as I said (from what I saw in the script). Thanks! But to submit on the page, will just sending to the server be enough? I’ll try it tonight. Thanks for all the help! Note: at the end, the code checks whether the POST input is not null and, if so, runs the post. So I think so! And I should get the answer with a simple xhr.responseText or something, right? Note 2: Yes, I wasn’t aware of that, I’m new to this. Thanks for all the help! ;)

  • Hi! A question: I saw that making this same extension for Firefox requires some adjustments, but there are incompatibilities such as the clipboardRead/clipboardWrite permissions. Could you tell me what needs to be modified to make this extension available in Firefox as well?

  • Firefox differs in many aspects, such as the manifest configuration. Follow some of the examples at https://github.com/mdn/webextensions-examples; you need Firefox 45+ to create an extension, then follow the steps described here: https://developer.mozilla.org/en-US/Add-ons/Webextensions/Your_first_webextension I don’t know whether the clipboard is incompatible. I think you may have done something wrong, because the examples I gave you include both the clipboard copy and the DOM copy. If both failed, it’s not a compatibility problem, it’s a mistake. @Lucianozancan
