Context:
I have a list of about 100 ISBNs (International Standard Book Number), all for Brazilian books, and I want a faster way to get the information about each book. It doesn’t have to be in real time.
To look up a book by ISBN today, I open the search page on the ISBN Brasil site, enter the ISBN, get the data back, and then fill in a spreadsheet by hand. Very basic.
To speed this up a little, I only need to solve the captcha once; after that I can go straight to the URL below, changing only the ISBN:
http://www.isbn.bn.br/website/consulta/cadastro/isbn/9788566250299
Need:
Based on that, my idea is to solve the captcha once and then look up any book by changing the ISBN in the URL. I would like to do this automatically and save the field values to a TXT file, even if they are just separated by spaces.
Using console.log, I can get the information I need. It is very simple, but it works:
// All field containers inside the result block of the page.
var divs = document.getElementsByClassName("conteudo")[0].getElementsByTagName('div');
// Read the trimmed text of one child node of one div.
function campo(div, node) {
  return divs[div].childNodes[node].nodeValue.trim();
}
// Quote each field and separate the fields with spaces.
var livro = '"' + campo(5, 3) +
'" "' + campo(6, 3) +
'" "' + campo(7, 3) +
'" "' + campo(8, 3) +
'" "' + campo(10, 3) +
'" "' + campo(12, 3) +
';' + campo(12, 5) + '"';
console.log(livro);
Output:
"978-85-66250-29-9" "Começando com o linux: comando, serviços e administração" "1" "2013" "135" "Adriano Henrique de Almeida (Organizador);Paulo Eduardo Azevedo Silveira (Organizador);Daniel Romero ( Autor);"
Problem:
I have to do this book by book, and I’m tired of it :/ Is there a simple way to automate this, for example by looping over an array like the one below, and returning the information even as a plain TXT with space-separated values like the output above?
Thank you.
var isbn = ["9788566250299", "9788555191459", "9788555191039"];
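A minimal sketch of how such a loop could look, assuming it is pasted into the browser console on the ISBN Brasil site after the captcha has been solved once, so that the session cookie is sent with each request. The function names (buildUrl, formatRow, extractFields, fetchBooks) are hypothetical, and the div and child node positions are simply copied from the manual snippet above, so they only hold while the page layout stays the same.

```javascript
// Assumption: runs in the browser console on www.isbn.bn.br with an active session.
const BASE_URL = "http://www.isbn.bn.br/website/consulta/cadastro/isbn/";

// Build the lookup URL for one ISBN.
function buildUrl(isbn) {
  return BASE_URL + encodeURIComponent(isbn);
}

// Quote each field and join the fields with spaces.
function formatRow(fields) {
  return fields.map((f) => '"' + String(f).trim() + '"').join(" ");
}

// Pull the fields out of a parsed result page, using the same
// positions as the manual snippet (assumed stable across books).
function extractFields(doc) {
  const divs = doc.getElementsByClassName("conteudo")[0].getElementsByTagName("div");
  const text = (div, node) => divs[div].childNodes[node].nodeValue.trim();
  return [
    text(5, 3), text(6, 3), text(7, 3), text(8, 3), text(10, 3),
    text(12, 3) + ";" + text(12, 5),
  ];
}

// Fetch every ISBN in sequence and return one line of text per book.
async function fetchBooks(isbns) {
  const rows = [];
  for (const isbn of isbns) {
    try {
      const response = await fetch(buildUrl(isbn), { credentials: "include" });
      const html = await response.text();
      const doc = new DOMParser().parseFromString(html, "text/html");
      rows.push(formatRow(extractFields(doc)));
    } catch (err) {
      // Keep going if one page fails or has a different layout.
      rows.push(formatRow([isbn, "ERROR: " + err.message]));
    }
  }
  return rows.join("\n");
}
```

Calling fetchBooks(isbn).then(console.log) would print one line per book, which can then be copied into a TXT file. The requests run one at a time on purpose, to avoid hammering the site.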
Note:
With the Google Books API, some books may return no result, as shown in the question "Search book details with google-Books-api-in-php". That is why I would like to do it the way described above. With Google, the result would come from this URL:
https://www.googleapis.com/books/v1/volumes?q=isbn:9788566250299
But it returns nothing for this book, whereas the ISBN Brasil site does have it.
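To check which ISBNs Google Books does or does not know, something like the sketch below could be used. It assumes the usual shape of the volumes response (a totalItems count and an items array whose entries carry volumeInfo); the helper names firstVolume and lookupGoogleBooks are hypothetical.

```javascript
const GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes?q=isbn:";

// Return basic info for the first match, or null when the API found nothing.
function firstVolume(result) {
  if (!result || !result.totalItems || !Array.isArray(result.items) || result.items.length === 0) {
    return null;
  }
  const info = result.items[0].volumeInfo || {};
  return {
    title: info.title,
    authors: info.authors || [],
    publishedDate: info.publishedDate,
  };
}

// Query the API for one ISBN and reduce the response with firstVolume.
async function lookupGoogleBooks(isbn) {
  const response = await fetch(GOOGLE_BOOKS_URL + encodeURIComponent(isbn));
  return firstVolume(await response.json());
}
```

A null result here would mark an ISBN that still has to be looked up on the ISBN Brasil site.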
To test this site, do I need to register? Do you always have to enter a captcha for each book?
– Sergio
No need to register. The link was wrong and I have fixed it. Just do a single search entering the captcha; after that, the link
http://www.isbn.bn.br/website/consulta/cadastro/isbn/
+ ISBN returns the data for a new book without typing the captcha again. – David
Okay, I managed to do a search, but it asks for a new captcha on every search, right?
– Sergio
That’s the thing: the site does not validate the captcha if you use the URL above and append the ISBN for the new search.
– David
@Sergio, do you have any suggested terms I could search for to solve the problem? Or any hint, anything at all. Thank you.
– David
If they don’t have an API to communicate with, then only by doing it one by one and filling in the captcha...
– Sergio
But you do not need to type a new captcha for a new query, as long as the query goes through the URL and not through the search page. That is, after the first lookup the session seems to stay active, so by using the URL
http://www.isbn.bn.br/website/consulta/cadastro/isbn/9788566250299
and swapping the ISBN I get the information for a new book. – David
David, this is much more interesting... I’ll take a look later, but this way you can have a crawler that fetches all the pages.
– Sergio
Do you have a list of all the ISBNs you want?
– Sergio