Most voted "web-scraping" questions

Web scraping is the process of extracting information from websites. It is typically used by third-party applications to extract data from, or interact with, a website that does not expose an API.

191 questions

1
votes

1
answer

477
views

Web scraping at a specific URL with BeautifulSoup

from bs4 import BeautifulSoup import requests import re url = 'http://www.bhaktiyogapura.com/2017/03/calendario-vaisnava-marco-de-2017/' header = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64;…

python python-3.x python-2.7 web-scraping
asked 8 years, 4 months ago Ed S 2,057
1
votes

1
answer

114
views

Web Crawler with Django’s view.py

I am building a simple web crawler with Django 2.0. I want to capture only the "title" class of the news items and then pass it through "return render" to a simple HTML page. My view.py is below. I am currently using…

django web-scraping web-crawler scrapy scraping
asked 7 years, 2 months ago Bruno Lima 49
1
votes

3
answers

1839
views

Web scraping with Python on the CEF website, which runs JavaScript

The CEF (Caixa Econômica Federal) changed the way it displays lottery results on its website. Before, the results all came in the HTML and I could get them via web scraping relatively easily…

python python-2.7 web-scraping
asked 7 years, 1 month ago MJAGO 149
1
votes

0
answers

58
views

Converting Python Web Scraper 2.7 to 3.5

Good afternoon, everyone! Here’s the deal: I found a script written in Python 2.7, but I have version 3.6. As I am new to this field, I wanted to convert the script by hand. The code follows:…

python python-3.x python-2.7 web-scraping
asked 8 years, 4 months ago Erick Fernandes 11
1
votes

1
answer

300
views

urllib.request or requests?

I am studying web scraping, and in many guides I have seen examples that use urllib.request and requests.get. From what I’ve tested and understood, the two do the same thing. So what’s the…

python web-scraping python-requests
asked 5 years, 9 months ago hidantachi 83
1
votes

0
answers

171
views

Error: read ECONNRESET and Error: connect ETIMEDOUT

Good evening, everyone. I’m scraping a website using Node.js and Axios. When I run the application it works perfectly, bringing me the information I requested, but it…

javascript node.js web-scraping axios scraping
asked 5 years, 3 months ago Nando Patez 11
1
votes

1
answer

2076
views

Select an option from a drop-down menu with Selenium in Python

I have a menu that presents several options, and I want to select only the one that is active. When I call 'variable'.find_element_by_id('key'), Selenium returns ALL options. The active option has a…

python selenium web-scraping
asked 8 years ago Bergo de Almeida 181
1
votes

2
answers

88
views

Organize string data flow by default

Friends, I am working on a scraping project. At some point, I capture a table on the screen in the form of a giant string, more or less like this: list = ('0004434-48.2010 n EU n (30 working days)…

python string list web-scraping
asked 8 years ago Bergo de Almeida 181
1
votes

1
answer

382
views

Python: select checkbox in an orderly way

I have a list containing hundreds of items in the format [ '5008489', 'Órgão: MPF', 'PROCEDIMENTO DO JUIZADO ESPECIAL', 'CPF', <selenium.webdriver.remote.webelement.WebElement…

python checkbox selenium selenium-webdriver web-scraping
asked 7 years, 11 months ago Bergo de Almeida 181
1
votes

1
answer

51
views

Web scraping with R

I have tried several approaches, but I can’t scrape the following table: http://www2.bmf.com.br/pages/portal/bmfbovespa/boletim1/TxRef1.asp. So far I have the following code: library("rvest")…

r web-scraping
asked 4 years, 11 months ago Felipe Silva 13
1
votes

1
answer

1158
views

Fill JS Form - Web Scraping in Python - Selenium and PhantomJS

Friends, I’m developing code to access the Anbima website, fill in the fields, and download the generated txt file. I have been looking for a solution to this problem for a few days. So far, I have found that…

javascript python selenium web-scraping phantomjs
asked 7 years, 11 months ago Thales Marques 111
1
votes

0
answers

198
views

phpQuery Web Scraping Event

I want to get information from a website using phpQuery, but I’m still learning how to use it. The information I want appears in a select element, but only after clicking it. Without clicking it…

php web-scraping
asked 7 years, 9 months ago rhundler 111
1
votes

1
answer

860
views

BeautifulSoup - True href links

I was studying web scraping with Python and started using the bs4 library (BeautifulSoup). When I started extracting the a tags and their href attribute, I realized that I could not access the…

html python web-scraping scraping
asked 7 years, 8 months ago Eron Medeiros 11
1
votes

1
answer

172
views

How to avoid ConnectionError on large scraping jobs?

In Python 3, I have a program that scrapes tables from websites. There are 5,299 pages, and each page has a table. Inspecting the XHR requests, I found the JSON generated for each page. But there is always a…

python request pandas web-scraping
asked 7 years, 3 months ago Reinaldo Chaves 333
1
votes

1
answer

219
views

How do I display variables on a Django page?

I am new to Django and am taking some information from a web page using lxml. I would like to know how to display the values on my website. import requests from lxml import html from…

django web-scraping
asked 7 years, 3 months ago scylo 38
1
votes

1
answer

332
views

Scrape parameters of a POST method with Scrapy in Python

I need to collect information from a website using spiders in Scrapy with Python, but the site uses the POST method and I’m learning the language while developing the project. I found a model of post…

python web-scraping scrapy
asked 7 years, 2 months ago Jonathan Igor Bockorny Pereira 111
1
votes

1
answer

87
views

Simultaneous threading (parallel processing) in R and serialized recording in Sqlite

Hey there, guys. I am trying to develop code that performs parallel processing (parsing) of HTML files using the R language and, subsequently, records the data extracted from…

r sqlite web-scraping
asked 6 years, 9 months ago George Santiago 139
1
votes

1
answer

347
views

Scraping data using Robobrowser

I’m trying to scrape a form, insert an attachment, and submit it using Robobrowser. To open the page I do: browser.open('url') To get the form I do: form = browser.get_form(id='id_form') To enter…

python web-scraping beautifulsoup
asked 6 years, 7 months ago Rafael 477
1
votes

0
answers

42
views

Reading Table in Python with Beautiful Soup

I need to get a table from the transparency portal and then write it to the database. I am using Beautiful Soup. The response to my request does not include the part that has the data, and consequently none of the tags that I look…

python python-3.x web-scraping beautifulsoup
asked 6 years, 4 months ago wmoura12 11
1
votes

1
answer

124
views

I need to compare the value of the last filled cell with the antepenultimate

Hello, and thank you for your attention. I need to compare the value of the last filled cell with the antepenultimate. If the value is different, I want to continue with the current value if seje = place…

vba excel-vba web-scraping range
asked 6 years, 2 months ago Thays Lima 11
1
votes

0
answers

651
views

Curl error 60: SSL certificate problem: self signed certificate in certificate chain (see http://curl.haxx.se/libcurl/c/libcurl-errors.html)

I’m trying to do some web scraping, but I am getting the following error in return: Curl error 60: SSL certificate problem: self signed certificate in certificate chain (see…

php laravel-5 ssl web-scraping
asked 5 years, 11 months ago Betini O. Heleno 457
1
votes

1
answer

40
views

Problem converting JSON to dataframe in R

I want to extract JSON content from a website and convert it into a dataframe. The website is https://schutz-shoes.com/products/amaia-sandal-metallic-leather?color=ouro%20gold. Inside that site,…

json r web-scraping
asked 5 years, 9 months ago Henrique Faria de Oliveira 725
1
votes

1
answer

1363
views

Sites with authentication - Web Scraping - Python

BR: I’m trying to automate a web data acquisition process using Python. In my case, I need to pull the information from the page https://sistema.justwebtelecom.com.br/adm.php. However, before going…

python web-scraping python-requests
asked 5 years, 3 months ago Rafael Garcia 11
1
votes

1
answer

365
views

Automate web scraping in Python

I’m trying to get the speeches of the deputies, which can be found here. The site has several pages (about 1 to 300), and each page has a table with a "summary" of the information, with 50 rows.…

python selenium-webdriver web-scraping
asked 5 years, 3 months ago Edubarth 23
1
votes

0
answers

15
views

Web scraping using python

I would like to print all the divs of a particular website that are contained within a parent div <div class="history-feed__collection"> <div class="history-feed__card h-card h-card_sm…

web-scraping
asked 5 years, 1 month ago Gustavo 19
1
votes

1
answer

769
views

Xpath with Python - Pick up text after tag in a div

I’m trying to get a text after a tag that’s inside a div, in an html. The problem I’m having is that I’m not getting the text, just an empty string. I’ve looked elsewhere and I haven’t seen anyone…

python web-scraping xpath
asked 5 years, 1 month ago Wiliane Souza 21
1
votes

1
answer

31
views

Iterating over a list with BeautifulSoup

I am trying to scrape a list of episodes of a series with BS. I put together the structure below: # Importing all the modules import bs4 from bs4 import BeautifulSoup import…

python web-scraping
asked 5 years ago Leonardo Gouvêa Silva 35
1
votes

1
answer

650
views

Run python script by clicking html button

I need to feed an HTML page that will load and display content dynamically with ajax/fetch. The problem is that I need to take this data from other websites that also load this content through…

javascript html python ajax web-scraping
asked 4 years, 9 months ago Danillo Eder 605
1
votes

1
answer

73
views

JSONDecodeError Expecting value: line 1 column 1 (char 0) - content-type: text/xml

I have a project to capture ATMs near a coordinate using the form on the Mastercard website. I can get the result, but not as JSON. Since the Content-Type is text/xml, is it not possible to get the result as JSON?…

python web-scraping python-requests
asked 4 years, 9 months ago Diogo Ribeiro 41
1
votes

0
answers

57
views

Scraping on instagram

Hello, I would like to ask for some help. I want to scrape Instagram so I can analyze personas and extract some data, such as the tags most used by the people who follow a certain account. I took…

python python-3.x web-scraping scraping
asked 4 years, 6 months ago Gregory Dias 11
1
votes

1
answer

36
views

Change the language of the result of a web-scraping with rvest from the IMDB site

I want to collect information about the IMDB Top 250 using the package rvest. While visiting the page link, the names of the movies appear in their original language, at least in my browser (Firefox…

r web-scraping rvest
asked 4 years, 5 months ago Marcus Nunes 17,915
1
votes

2
answers

185
views

BeautifulSoup: Get text inside a table

I’m trying to get specific values from a table. I have similar code that I already use in the same way on another, single-table structure in the HTML. The problem is that I can’t get the text of…

html python table web-scraping beautifulsoup
asked 4 years, 5 months ago kleytonsolinho 31
1
votes

0
answers

33
views

Web Scraping with R - static table

I would like to consolidate the data from a betting site into a database in R. I’m trying it the way shown below, but my script doesn’t actually recognize the columns and rows of the table, only the layout:…

r web-scraping tidyverse rvest
asked 4 years, 4 months ago Sarah de Paula Pereira 21
1
votes

1
answer

143
views

Run-time error '438': Object doesn't support this property or method

I’m adapting VBA code for scraping, but I get this message when it comes to sending the data to the login form. Run-time error '438': Object doesn't support this property or method Public Sub…

vba web-scraping
asked 4 years, 4 months ago Itamar Conceição 21
0
votes

1
answer

224
views

How to get the Olympics headlines from the CNN website with Python using BeautifulSoup?

I’d like an example of how to get the Olympics headlines from http://edition.cnn.com/sport/olympics using BeautifulSoup.…

python web-scraping
asked 8 years, 11 months ago Ed S 2,057
0
votes

2
answers

239
views

Web Scraping how to insert the result into the <img src=

I’m scraping a website, and I would like the returned images to be placed inside the <img src= but I’m not succeeding // Find all images foreach($html->find('img') as…

php web-scraping
asked 8 years, 5 months ago vncalmeida 41
0
votes

0
answers

87
views

Crawler for WooCommerce

Good afternoon, friends. I’m developing a PHP crawler that will scrape some URLs that I provide. I’m trying to get it to pull the values from a dynamic URL, but I’m not succeeding. Could…

php curl web-crawler web-scraping
asked 9 years, 4 months ago jeann sebold 91
0
votes

1
answer

509
views

What is the best way to scrape the Datasus website in Python?

The link is this: http://tabnet.datasus.gov.br/cgi/tabcgi.exe?sih/cnv/nrbr.def I’m trying to send a POST through requests with a dictionary containing the categories I want, but then the URL remains…

python web-application selenium web-scraping scrapy
asked 7 years, 7 months ago Victor Serra 9
0
votes

1
answer

1001
views

How to capture the td of a web page using Selenium with VBA?

html code looks like this: <table> <tr> <td width="01%" class="tex3b"><img height="14" src="/imagens/tm_bullet.gif" width="6"></td> <td width="20%"…

html vba google-chrome selenium web-scraping
asked 8 years, 7 months ago Daniel Aristofanes 73
0
votes

1
answer

53
views

Parse Xpath from Int

I have a Scrapy spider running a for loop to get the day and a link. Ex: t_day = div.xpath('.//a/text()').extract_first() a_day = div.xpath('.//a/@href').extract_first() day = int(t_day) if day…

python-3.x web-scraping scrapy xpath
asked 7 years, 2 months ago ChaarSales 3
0
votes

1
answer

458
views

Save Excel file to Python via Scrapy

How do I make my spider save all the Excel data from the links I extract into a single XML file? Or, alternatively, save each XLS file separately in the project folder? Part of my spider: def parse(self, response): divs…

python-3.x pandas web-scraping scrapy xls
asked 7 years, 2 months ago ChaarSales 3
0
votes

0
answers

80
views

Extract PDF documents from sites with Scrapy

Is it possible to crawl an entire site with Scrapy, going through all the links in search of PDF files? It would be something like Apache Nutch. I searched, but people only use XPath, and XPath cannot…

web-scraping scrapy
asked 6 years, 3 months ago mell system 3
0
votes

0
answers

59
views

WebBrowser does not load link via Navigate in a C# WinForms app

Good afternoon. I am running a test to build a web scraper in C#, but the page does not load in the web form and a JavaScript error is shown. Can anyone help with this error?…

c# web-scraping
asked 6 years, 2 months ago Michel Diniz 141
0
votes

1
answer

47
views

Select does not update table data after selecting an option

I am trying to select the field with this query. The value of the select changes, but the table values do not reload, and all rows are still shown. Clicking with the mouse, however, works.…

jquery html-select web-scraping web-crawler
asked 6 years ago Gustavobezerra 3
0
votes

1
answer

142
views

How to create an array inside another array

I need to create an array that has an index and values. page_links receives the links of a page: all_links_main = [] for link in page_links: all_links_main.append(link.get('href')) produto = [] for…

python python-3.x web-scraping scrapy
asked 5 years, 11 months ago JB_ 561
0
votes

1
answer

82
views

How to click a checkbox when another element obscures it?

I’m writing code to automate some processes on the SIAFI site. I couldn’t get Python to click on a checkbox, except by importing the package pynput and using the mouse positioning function with…

python web-scraping pynput
asked 7 years, 4 months ago Marcus Junior 1
0
votes

0
answers

76
views

Python, downloading file on a given day and time

I’m looking to structure a Python program that downloads files (manga) from a given site once a week. I’m still practicing; I took a web scraping course, but I am lost on how to perform these requests.…

python web-scraping
asked 7 years, 2 months ago Paulo Henrique 1
0
votes

1
answer

470
views

Iterating web pages using Requests and Python

I am a beginner in web scraping. As a learning exercise, I am building a database from data on used-car sales on some websites. One of the sites is this url =…

python-3.x http-request web-scraping python-requests
asked 7 years, 2 months ago Rafael Ribeiro 67
0
votes

1
answer

407
views

Splitting a specific section of a JSON file with Python

Is it possible to break a line from a specific section of JSON, transform it into an array, and then streamline it? Why do I ask this? I am developing a file mining bot and came across a situation…

python json python-3.x web-scraping
asked 7 years, 2 months ago Jonathan Igor Bockorny Pereira 111
0
votes

0
answers

58
views

How to extract data for Models.py fields from Scrapy?

I intend to extract all "Municipios" from the tag, starting on this page: https://www.anmp.pt/anmp/pro/mun1/mun101w3.php?cod=M2200 And then extract information such as: "name of the council", "mayor",…

python-2.7 web-scraping scrapy
asked 7 years ago Carlos Aboim 11