Most voted "web-scraping" questions

Web scraping is the process of extracting information from websites. It is typically used by third-party applications to extract information from, or interact with, a website that does not expose an API.

191 questions

0
votes

2
answers

468
views

Concatenation of multiple lists with Python

Good afternoon! I have a problem and need help. I am working with 3 distinct lists that should be added to a dictionary, but to capture all values without one overwriting the other, I need to…

python array web-scraping scrapy
asked 6 years, 11 months ago Jonathan Igor Bockorny Pereira 111
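
One possible reading of the question above is that the three lists are parallel (same length, same order) and need to end up in one structure without later values overwriting earlier ones. A minimal sketch under that assumption follows; the list names and sample values are hypothetical and do not come from the original post.

```python
# Hypothetical sample data: three parallel lists scraped from a page.
titles = ["Item A", "Item B", "Item C"]
prices = ["10.00", "12.50", "8.90"]
links = ["/a", "/b", "/c"]


def combine_rows(titles, prices, links):
    """Build one dict per position so no value overwrites another."""
    return [
        {"title": title, "price": price, "link": link}
        for title, price, link in zip(titles, prices, links)
    ]


rows = combine_rows(titles, prices, links)
```

Note that `zip` stops at the shortest list, so lists of unequal length would silently lose trailing values.
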
0
votes

1
answer

62
views

I can’t scrape a comic strip site properly with Python

I was writing code that checks the day of each strip/GIF on the page and, if the day is the same as the current day (in the code I put 14 only because the site does not update weekend…

python python-3.x web-scraping beautifulsoup
asked 6 years, 10 months ago Matheus Andrade 27
0
votes

1
answer

485
views

How can I use Scrapy in Anaconda?

Hi, I’m having trouble creating a project with Scrapy. I’m studying data science in college and I have to use Scrapy. I’m using Anaconda. First through the Spyder IDE (Anaconda Navigator), now I’m…

python web-scraping web-crawler scrapy anaconda
asked 6 years, 10 months ago Rafael Ventura 5
0
votes

1
answer

91
views

Adjusting CSV columns with Scrapy

I’m having a problem: by default, Python separates the columns with commas when it generates the CSV file, but I need the created items to become the respective columns, and I’m not able to do the…

python web-scraping scrapy
asked 6 years, 10 months ago Jonathan Igor Bockorny Pereira 111
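
A common cause of this symptom, assumed here because the excerpt is truncated, is a spreadsheet program that expects semicolons as the column separator and therefore shows a comma-separated file in a single column. The sketch below uses only the standard `csv` module; the field names and items are hypothetical, and wiring this into a Scrapy exporter is not shown.

```python
import csv
import io


def write_rows(rows, fieldnames, delimiter=";"):
    """Serialize dict rows to CSV text using the given column delimiter."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, delimiter=delimiter)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical items, standing in for whatever the spider yields.
items = [
    {"municipio": "Curitiba", "versao": "1.0"},
    {"municipio": "Londrina", "versao": "2.1"},
]
csv_text = write_rows(items, ["municipio", "versao"])
```
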
0
votes

1
answer

161
views

Crawler - how to access several pages

I wrote code in Node to look up the system version and the municipality name on a portal, but I am not able to make it search for the information of another municipality only…

javascript node.js web-scraping web-crawler
asked 6 years, 10 months ago jackbauer 5
0
votes

1
answer

400
views

Fixing an encoding problem when exporting to CSV from a Scrapy file

How can I fix an encoding problem when saving a file as CSV? The problem happens only when saving as CSV. from scrapy import * from projeto_iruan.items import * import csv class…

python web-scraping scrapy
asked 6 years, 10 months ago Jonathan Igor Bockorny Pereira 111
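
The excerpt does not say where the broken characters appear. If they show up when the file is opened in Excel (an assumption), writing the CSV as UTF-8 with a byte order mark is one frequently used workaround. The following sketch uses the standard library only; the output path and the sample row are hypothetical.

```python
import csv
import os
import tempfile


def save_csv_for_excel(path, header, rows):
    """Write a CSV as UTF-8 with a BOM so Excel can detect the encoding."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Hypothetical output path and data containing accented characters.
demo_path = os.path.join(tempfile.gettempdir(), "demo_utf8_sig.csv")
save_csv_for_excel(demo_path, ["municipio"], [["Bocaiúva do Sul"]])
```
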
0
votes

1
answer

59
views

How to name each row of a url list

Can I name each line in the URL list so that it returns the name I gave it? This was the result: Prefeitura Municipal de Bocaiúva do Sul | PRONIM TB 518.01.07-013 | Prefeitura Municipal de…

javascript node.js web-scraping web-crawler
asked 6 years, 10 months ago jackbauer 5
0
votes

2
answers

1265
views

How to convert CSV to XLSX with python?

How do I convert a .csv file generated by Python to .xlsx? I have two problems: the first is that I couldn’t figure out how to do this conversion; the second is that even passing the command crawl…

python web-scraping scrapy
asked 6 years, 10 months ago Jonathan Igor Bockorny Pereira 111
0
votes

1
answer

94
views

Data scraping with jsoup and saving in txt

Hi everyone. I’m trying to learn data scraping on my own, and since my English doesn’t help, I’m getting by as best I can. Basically, this is it: when I run my code, it lists the athletes of the…

java web-scraping scraping
asked 6 years, 9 months ago Joe Reis 7
0
votes

1
answer

342
views

Question about how to scrape <table> data with Python using BeautifulSoup

I’m a beginner and I’m trying to get a table from the transparency portal website, but I can’t: only the tag comes back, with no data. When I open the developer tools I can see the data I…

python python-3.x web-scraping beautifulsoup
asked 6 years, 8 months ago jaderson08 1
0
votes

1
answer

33
views

Accessing a tag via BeautifulSoup

Hello, I’m having difficulty accessing the price that is in the third line of the code via BeautifulSoup. Does anyone have any idea how to access it? <span id="ctl00_Conteudo_ctl01_spanPrecoPor"…

python web-scraping beautifulsoup
asked 6 years, 7 months ago Diogo Gonnelli 11
0
votes

1
answer

3418
views

Request API with Javascript

I am making a web application whose goal is to use an API only to list some information (GET), and for this I would like to use only JavaScript and HTML. The API is this:…

javascript node.js web-service api web-scraping
asked 6 years, 7 months ago Diogo 1
0
votes

1
answer

78
views

Web scraping with BeautifulSoup + Python: export to txt and check with a shell script

Greetings, everyone. I have Python code that reads the ping time, in milliseconds, of the e-invoice submission service from the NFC-e status portal, as below: #!/usr/bin/env python # -*-…

python shell web-scraping beautifulsoup
asked 6 years, 5 months ago Rafael Xavier Suarez 91
0
votes

1
answer

112
views

Data Crawling in Python

Good afternoon, everyone. I decided to start studying crawling with Python. I built the following script using the Selenium library: # Importing selenium to perform the crawling from…

python csv selenium web-scraping web-crawler
asked 6 years, 5 months ago HV Lopes 313
0
votes

1
answer

220
views

Switching pages in an html table with beautifulsoup

I’m collecting data from a website using requests and BeautifulSoup. I was able to collect all the data from page 1, but I cannot change the page. Python code: variaveis = [] df_list = []…

python web-scraping python-requests beautifulsoup
asked 6 years, 4 months ago Pedro 31
0
votes

0
answers

303
views

Set value in form field (or string input)

I have a form field with embedded JavaScript that works dynamically. <input type="text" id="txtPreferencia" title="Tecla de atalho: Alt+R" name="txtPreferencia" class="infraText…

python python-3.x selenium-webdriver web-scraping
asked 6 years ago Bergo de Almeida 181
0
votes

0
answers

36
views

How to update part of a key in a Python dictionary?

How do I update just part of a key in a Python dictionary? I am scraping a page and would like to format the result so that my key is on the same line as my value, for example: Air Conditioners:…

python python-3.x web-scraping dictionary scraping
asked 5 years, 11 months ago Alineat 357
0
votes

0
answers

54
views

Problem collecting website information

I am trying to collect the number of people I have helped on SOPT, i.e. my impact, to put in an API later, but the information is not being extracted. Spider: import scrapy class StackOverflow(scrapy.Spider):…

python web-scraping scrapy
asked 5 years, 11 months ago David 4,330
0
votes

2
answers

823
views

Extracting data with Beautiful Soup in Python

I made a Python script that accesses the TJ-SP website, runs a certain search, and scrapes the search result. This is the HTML: I want to pick up the text that is contained in that tag:…

python selenium web-scraping
asked 5 years, 9 months ago Hero_0 13
0
votes

0
answers

183
views

HTTP Error 429: Too Many Requests in Web scraping in repl

When I run the code below, I get: HTTP Error 429: Too Many Requests. The server must enforce a time limit between requests. # Required bs4 imports import bs4 from urllib.request import…

python web-scraping beautifulsoup
asked 5 years, 8 months ago Gustavo William 11
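
A 429 response usually means the client should slow down and retry later. The sketch below shows one way to do that with `urllib`, which the excerpt imports: it waits for the number of seconds in the `Retry-After` header when the server sends one, and otherwise doubles the wait on each attempt. The function name, its parameters, and the delay values are assumptions for illustration; the limits a particular site enforces are unknown.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, opener=urllib.request.urlopen, retries=4,
                       base_delay=1.0, sleep=time.sleep):
    """Fetch url, waiting and retrying whenever the server answers 429."""
    for attempt in range(retries + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not 429, or the last failed attempt.
            if err.code != 429 or attempt == retries:
                raise
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)
            else:
                delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            sleep(delay)
```

The `opener` and `sleep` parameters exist only so the function can be exercised without a network connection or real waiting; in normal use the defaults apply.
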
0
votes

1
answer

160
views

Python Encoding Problem

I’m trying to pull the hashtags used in some Instagram profiles using the code: import pandas as pd import requests import re req = requests.get("https://www.instagram.com/globorural/") texto =…

python character-encoding web-scraping utf-8 encode
asked 5 years, 8 months ago André Schuck 11
0
votes

1
answer

101
views

Constant click using Selenium does not work

Goal: Click 4 times on the same button "See More" Problem: Click command works the first 2 times, after that, it no longer works and returns TimeoutException Code: ops = webdriver.ChromeOptions()…

python selenium web-scraping
asked 5 years, 6 months ago Daniel Santos 555
0
votes

2
answers

330
views

Web scraping a Microsoft Forms form returns None [python]

Hello, I’m having difficulty scraping a form made with Microsoft Forms. (NOTE: The form was made by me.) I have the following code: from bs4 import BeautifulSoup import requests…

python web-scraping
asked 5 years, 6 months ago Jonathan Cardoso 105
0
votes

1
answer

35
views

Dict with repeated keys in Python

Good afternoon! I’m putting together a formdata for a post, formdata = { 'data': '', 'controle': 'ADMIN', 'g-recaptcha-response': recaptcha_response } for numero in nDams: formdata['nu_dam[]'] =…

python-3.x web-scraping scrapy
asked 5 years, 6 months ago Henrique Lemes Baron 107
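
In the excerpt above, each pass through the loop assigns to the same `'nu_dam[]'` key, so a plain dict keeps only the last value. One way around this, sketched here with the standard library, is to hold the fields as a list of `(key, value)` pairs, which may repeat a key. The function name and the sample values are hypothetical; whether a given HTTP client accepts a list of pairs should be checked in its documentation.

```python
from urllib.parse import urlencode


def build_formdata(recaptcha_response, dam_numbers):
    """Return form fields as (key, value) pairs so 'nu_dam[]' can repeat."""
    fields = [
        ("data", ""),
        ("controle", "ADMIN"),
        ("g-recaptcha-response", recaptcha_response),
    ]
    # One pair per number; a dict would keep only the last assignment.
    fields.extend(("nu_dam[]", str(numero)) for numero in dam_numbers)
    return fields


# Hypothetical values standing in for the real token and nDams list.
pairs = build_formdata("token", [101, 102])
body = urlencode(pairs)
```
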
0
votes

0
answers

20
views

How to scrape an ASPX site using Python

I’m trying to scrape a page that uses ASPX. When I inspect the element, the data I need is there, but when I request the page with requests, all of the HTML comes back except the…

python asp.net post web-scraping python-requests
asked 5 years, 6 months ago Samp 1
0
votes

1
answer

124
views

How to get a person’s friends and followers on Twitter using the tweepy library?

The function getting_friends_follwers() below works if I remove the value 100 from cursor2.items(100). My goal is to take these names (followers and friends) and save them in a file "friends.txt". The…

python python-3.x web-scraping twitter
asked 5 years, 3 months ago Laurinda Souza 291
0
votes

2
answers

345
views

Web scraping with Beautifulsoup - find_next does not return text

I want to extract the text from the element below: <div class="matchDate renderMatchDateContainer" data-kickoff="1583784000000">Mon 9 Mar 2020</div> The text would be "Mon 9 Mar 2020".…

html python web-scraping beautifulsoup
asked 5 years, 3 months ago Otávio Simões Silveira 3
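
For the element quoted above, the sketch below reads both the visible text and the `data-kickoff` attribute. It deliberately uses the standard library `html.parser` instead of BeautifulSoup so that it has no third-party dependency; the class name `MatchDateParser` is hypothetical. The attribute is treated as a Unix timestamp in milliseconds, an interpretation that is consistent with the date shown in the snippet. If the downloaded HTML arrives without the text (for instance because a script fills it in later, which is only a guess here), the attribute may still be available.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser


class MatchDateParser(HTMLParser):
    """Collect the text and data-kickoff of <div class="matchDate ...">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._kickoff = None
        self.dates = []  # list of (text, kickoff_ms) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div" and "matchDate" in classes:
            self._inside = True
            self._kickoff = attrs.get("data-kickoff")

    def handle_data(self, data):
        if self._inside and data.strip():
            self.dates.append((data.strip(), self._kickoff))

    def handle_endtag(self, tag):
        if tag == "div":
            self._inside = False


snippet = ('<div class="matchDate renderMatchDateContainer" '
           'data-kickoff="1583784000000">Mon 9 Mar 2020</div>')
parser = MatchDateParser()
parser.feed(snippet)
text, kickoff_ms = parser.dates[0]
# Convert milliseconds since the epoch to an aware UTC datetime.
kickoff = datetime.fromtimestamp(int(kickoff_ms) / 1000, tz=timezone.utc)
```
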
0
votes

1
answer

711
views

Scroll Down Python and Selenium

I’m running a bot to collect information from Facebook for a political survey. However, when I collect the comments, I can’t get the page to scroll down. I’ve tried several code formats, but…

python selenium web-scraping
asked 5 years, 3 months ago Adnan Jebailey 1
0
votes

1
answer

237
views

Loop with Selenium in Python

Hello! I started studying Python and I’m trying to scrape the OLX site. I can search and filter. But how can I make a loop so that it clicks on all the ads and I can collect the phone numbers?…

python selenium web-scraping
asked 5 years, 3 months ago Fábio Siqueira 11
0
votes

1
answer

166
views

Python: Web Scraping with dynamic values

I am learning about web scraping. I have already managed to do some actions, but I ran into a problem with a dynamic page where values change on every refresh. Unfortunately I cannot pass the…

python selenium-webdriver web-scraping
asked 5 years, 2 months ago Leandro de Matos 61
0
votes

1
answer

262
views

Selecting two options in the drop down menu with Selenium + Python does not work

from selenium import webdriver from time import sleep from urllib.request import Request,urlopen import pandas as pd from selenium.webdriver.support.ui import Select from…

python selenium selenium-webdriver web-scraping
asked 5 years, 1 month ago user162889
0
votes

0
answers

119
views

Web scraping with python, how to print the class of a div

I would like to print the class of a div in Python. The site’s code is: <div class="history-feed__collection"> <div class="history-feed__card h-card h-card_sm h-card_spades"…

web-scraping
asked 5 years, 1 month ago Gustavo 19
0
votes

2
answers

116
views

Why are the commas of the numbers being deleted when importing data with Pandas?

I’m racking my brain to understand why this happens when I take numerical data from a table on the web. The table contains currency exchange rates; the problem is that,…

python pandas web-scraping
asked 5 years ago Matheus 1
0
votes

1
answer

53
views

How to move the title from one column to another? (web scraping, Python)

I’m trying to scrape a site, but if you view it you will notice that certain titles are in certain columns. What my program does is take the table, create two columns full of NaN and assign…

python pandas selenium-webdriver web-scraping beautifulsoup
asked 5 years ago Frybii 43
0
votes

0
answers

115
views

ChunkedEncodingError when making requests in Python

Good afternoon, everyone. I’m having a problem with requests when scraping data. The main function is the following: def raspa_dados(lista_de_links, ministerio): links = [] autores = [] chamadas =…

web-scraping python-requests urllib
asked 5 years ago Marcelo 1
0
votes

1
answer

89
views

Python Selenium capturing only 1 link

Good afternoon, everyone. I’m starting to learn Selenium and I’ve already been handed a project that has me somewhat lost. I need to capture all the URL summaries…

python selenium selenium-webdriver web-scraping
asked 4 years, 11 months ago user201087
0
votes

1
answer

341
views

Problem navigating with Selenium (using Python) in search results presented in dynamic HTML

I am scraping articles from a newspaper in Pernambuco (Diário de PE) based on a search I ran with some keywords on the subject of interest. The newspaper’s search returns 10 results…

python-3.x selenium-webdriver web-scraping scraping
asked 5 years ago Victor Heuer 1
0
votes

1
answer

79
views

How to get the page after authentication with requests?

I am trying to do web scraping in Python. My code is as follows: import requests from bs4 import BeautifulSoup Session = requests.Session() payload = {'username':'[xxxxx]', 'password':'[xxxxx]'…

web-application web-scraping python-requests
asked 4 years, 10 months ago dr798 1
0
votes

0
answers

43
views

Pull a list with multiple search names with RSelenium

Good evening, I’m trying to pull data from Google Scholar with RSelenium, but I’m having a hard time getting the information for the journals I’m looking for. Running the code below: #First…

r selenium-webdriver web-scraping
asked 4 years, 9 months ago Décio Vieira da Rocha 73
0
votes

1
answer

714
views

Writing a csv file on Google Drive using Colab

I’m writing a code in Python to scrape information off Facebook. I would like to save this information in a file on Google Drive, since I am working with other people and we use Colaboratory.…

python google web-scraping ipython-notebook
asked 4 years, 7 months ago Clara Mendes 13
0
votes

0
answers

24
views

Python - Scrapy - Return nested JSON (list of JSON objects)

Hello, I’m having a problem generating a dictionary response within another dictionary. I have a home page that contains 8 bimonthly programs, from which I collected their respective links and properties.…

python json web-scraping scrapy yield
asked 4 years, 5 months ago LeandroSouza 1
0
votes

1
answer

30
views

Transform each site’s Node into an array element

I want to turn every headline on this site into an element of an array. I’ve tried several ways but none of them work, so if you could help me, I’d be grateful. I am using HtmlAgilityPack using…

c# web-scraping html-agility-pack
asked 4 years, 5 months ago Jordan Nunes 11
0
votes

0
answers

34
views

Selenium error "Failed to resolve address" while loading a specific page

I am creating a Python bot and, when it loads a page, the console shows the following errors. It keeps running, but when I try to execute anything involving the page, it hangs…

python selenium selenium-webdriver web-scraping webdriver
asked 4 years, 5 months ago OlopesMaster 1
0
votes

0
answers

44
views

Why does the comma disappear when converting an HTML to str in Python?

I’m scraping some data from a page, following a tutorial I saw on YouTube. Here is the code: import requests import pandas as pd from bs4 import BeautifulSoup from selenium import webdriver…

html python string pandas web-scraping
asked 4 years, 5 months ago Rodrigo Junior 115
0
votes

1
answer

60
views

How to use Selenium to select a specific tag with a class common to other tags?

I want to scrape the most up-to-date data from this page. Every day it publishes the data for the current day and the next day. My main interest is in the next day’s data. When…

html python selenium web-scraping
asked 4 years, 4 months ago Rodrigo Junior 115
0
votes

0
answers

10
views

How to extract a text within a <dd> using Jsoup?

Hey everyone, how’s it going? I’ve been trying for a couple of hours and have searched everywhere for a fix. The question is as follows: I need to know whether a text X is "OK" or…

java web-scraping parse scraping jsoup
asked 4 years, 4 months ago Lucas Gaiotto 11
0
votes

1
answer

180
views

urllib.error.HTTPError: HTTP Error 404: Not Found

I have the following simple code: from urllib.request import urlopen from bs4 import BeautifulSoup word_site =…

python web-scraping urllib
asked 4 years, 3 months ago João Caetano 53
0
votes

0
answers

12
views

Help with IndexError: list index out of range

class Scraping: def pesquisar_nome(self): while True: try: self.browser.find_element_by_id('search-key').send_keys(self.Keys.CONTROL, 'a')…

python selenium web-scraping
asked 4 years, 3 months ago GripeBrabo 1
0
votes

2
answers

68
views

Join cells with python

I am scraping data on the best stocks of the day to combine them into a table in an Excel file. I am trying with this code: from selenium import webdriver from webdriver_manager.microsoft…

python pandas selenium selenium-webdriver web-scraping
asked 4 years, 2 months ago André Leite 15
0
votes

0
answers

38
views

Scraping the ANBIMA site with R

Hey everyone! I’m trying to extract the information on investment bonds from ANBIMA’s website, but I can’t manage it. This is an example of the page I want to get the information from.…

r web-scraping rvest
asked 4 years, 1 month ago Mauricio Medeiros 9