Most voted "web-scraping" questions
Web scraping is the process of extracting information from websites. It is typically used by third-party applications to extract information from, or interact with, a website that does not expose an API.
191 questions
-
0 votes, 2 answers, 468 views
Concatenation of multiple lists with Python
Good afternoon! I have a problem and need help, I am working with 3 distinct lists that should be added to a dictionary, but so I can capture all values without one overwriting the other, I need to…
-
0 votes, 1 answer, 62 views
I can’t do "web scraping" properly on a Python comic strip site
Well, I was writing code that would check the day of each strip/gif on the page and, if the day is the same as the current day (in the code I put 14 only because the site does not update weekend…
-
0 votes, 1 answer, 485 views
How can I use Scrapy in Anaconda?
Hi, I’m having trouble creating a project with Scrapy. I’m studying data science in college and I have to use Scrapy. I’m using Anaconda. First through the Spyder IDE (Anaconda Navigator), now I’m…
-
0 votes, 1 answer, 91 views
Adjusting CSV columns with Scrapy
I’m having a problem: by default, Python separates the columns with commas when it generates the CSV file, but I need the created items to turn into the respective columns, and I’m not able to do the…
-
0 votes, 1 answer, 161 views
Crawler - how to access several pages
I created a code on Ode to search for the version of the system and the name of the municipality of a portal, however I am not able to make it search for the information of another municipality only…
-
0 votes, 1 answer, 400 views
Fix encoding problem while exporting to CSV from a Scrapy file
How can I fix an encoding problem while saving a file as CSV? The problem happens only when saving as CSV. from scrapy import * from projeto_iruan.items import * import csv class…
-
0 votes, 1 answer, 59 views
How to name each row of a URL list
Can I name each line of the URL list so that it returns the name I gave it? Something like this as the result: Prefeitura Municipal de Bocaiúva do Sul | PRONIM TB 518.01.07-013 | Prefeitura Municipal de…
-
0 votes, 2 answers, 1265 views
How to convert CSV to XLSX with Python?
How do I convert a .csv file generated by Python to .xlsx? I have two problems. The first is that I couldn’t figure out how to do this conversion. The second is that even passing the command crawl…
-
0 votes, 1 answer, 94 views
Data scraping with jsoup and saving in txt
Whoa, way to go, guys. I’m trying to learn data scraping on my own, and as my English doesn’t help, I’m turning 30. Basically this is it. When I run my code, it lists the athletes of the…
-
0 votes, 1 answer, 342 views
Question about how to scrape <Table> data with Python using BeautifulSoup
I’m a beginner and I’m trying to get a table from the transparency portal website, but it isn’t working: only the tag comes back, with no data. When I open the developer tools I can see the data I…
-
0 votes, 1 answer, 33 views
Access tag via BeautifulSoup
Hello, I’m having difficulty accessing the price that is in the third line of the code via BeautifulSoup. Does anyone have any idea how to access it? <span id="ctl00_Conteudo_ctl01_spanPrecoPor"…
-
0 votes, 1 answer, 3418 views
Request API with Javascript
I am making a web application in which the goal will be to use an API to only list some information (GET) and for this I would like to use only Java and html. The API is this:…
-
0 votes, 1 answer, 78 views
Web scraping with Soup + Python: export to txt and check with shell script
Greetings people, I’m here with a python code that brings me the milliseconds of the E-tax Note Sending ping from the E-tax portal in the NFC-e status portal as below : #!/usr/bin/env python # -*-…
-
0 votes, 1 answer, 112 views
Data Crawling in Python
Good afternoon, everyone. I decided to start my studies with the Python crawler technique. I built the following script using the Selenium lib: # Importing selenium to perform the crawling from…
-
0 votes, 1 answer, 220 views
Switching pages in an HTML table with BeautifulSoup
I’m collecting data from this website using requests and BeautifulSoup. I was able to collect all the data from page 1, but I cannot change the page. Python code: variaveis = [] df_list = []…
-
0 votes, 0 answers, 303 views
Set value in form field (or string input)
I have a form field that contains a built-in Javascript, which works dynamically. <input type="text" id="txtPreferencia" title="Tecla de atalho: Alt+R" name="txtPreferencia" class="infraText…
-
0 votes, 0 answers, 36 views
How to update a key slice (key) of a Python dictionary?
How do I update just a slice of a key in a Python dictionary? I am scraping a page and would like to format the result so that my key is on the same line as my value, for example: Air Conditioners:…
-
0 votes, 0 answers, 54 views
Problem collecting website information
I am trying to collect the number of people helped on SOPT (i.e., my impact) to put in an API later, but it is not extracting the information. Spider: import scrapy class StackOverflow(scrapy.Spider):…
-
0 votes, 2 answers, 823 views
Extracting data with Python’s Beautiful Soup
I made a Python script to access the TJ-SP website, run a certain search, and scrape the search result. This is the HTML: I want to pick up the text contained in that tag:…
-
0 votes, 0 answers, 183 views
HTTP Error 429: Too Many Requests in web scraping in repl
When I run the code below, I get: HTTP Error 429: Too Many Requests. The server must enforce a time limit between requests. # Necessary bs4 imports import bs4 from urllib.request import…
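A minimal sketch of one common way to cope with a 429, not the asker's code: wait and retry, doubling the delay each time. The helper name `fetch_with_backoff`, its parameters, and the `fetch` callable (assumed here to return a `(status, body)` pair) are all hypothetical, and the limit this particular server enforces is unknown.

```python
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 429 or attempts run out.

    fetch is a hypothetical callable returning (status_code, body).
    The wait doubles after each rate-limited attempt: 1s, 2s, 4s, ...
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            break
        # Do not sleep after the final attempt; there is nothing left to retry.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body


# Usage with a fake fetch that is rate-limited twice before succeeding.
# Passing delays.append as `sleep` records the waits instead of pausing.
responses = iter([(429, ""), (429, ""), (200, "ok")])
delays = []
result = fetch_with_backoff(lambda: next(responses), sleep=delays.append)
```

If the server sends a `Retry-After` header, honoring that value is usually a better choice than a guessed delay.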
-
0 votes, 1 answer, 160 views
Python Encoding Problem
I’m trying to pull the hashtags used in some Instagram profiles using the code: import pandas as pd import requests import re req = requests.get("https://www.instagram.com/globorural/") texto =…
-
0 votes, 1 answer, 101 views
Constant click using Selenium does not work
Goal: Click 4 times on the same button "See More" Problem: Click command works the first 2 times, after that, it no longer works and returns TimeoutException Code: ops = webdriver.ChromeOptions()…
-
0 votes, 2 answers, 330 views
Web scraping of a Microsoft Forms form returns None [python]
Hello, I’m having difficulty scraping a form made with Microsoft Forms. (NOTE: I made the form myself.) I have the following code: from bs4 import BeautifulSoup import requests…
-
0 votes, 1 answer, 35 views
Dict with repeated Python attributes
Good afternoon! I’m putting together a formdata for a post, formdata = { 'data': '', 'controle': 'ADMIN', 'g-recaptcha-response': recaptcha_response } for numero in nDams: formdata['nu_dam[]'] =…
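If the loop assigns to the same key on every pass, the dict ends up holding only the last value, since a dict keeps one value per key. A minimal sketch of one workaround, assuming the server expects the `nu_dam[]` field to be repeated: build the form as a list of `(name, value)` tuples. The helper name `build_formdata` and the sample values are hypothetical; `urlencode` is used only to show what gets sent.

```python
from urllib.parse import urlencode


def build_formdata(numbers, recaptcha_response):
    """Return form fields as (name, value) tuples so a name can repeat."""
    fields = [
        ("data", ""),
        ("controle", "ADMIN"),
        ("g-recaptcha-response", recaptcha_response),
    ]
    # One tuple per number: the key "nu_dam[]" appears once for each value.
    fields.extend(("nu_dam[]", str(number)) for number in numbers)
    return fields


# Sample values are made up for illustration.
fields = build_formdata([101, 102], "token")
encoded = urlencode(fields)
```

The requests library also accepts a list of tuples as the `data` argument of a POST, so the same list can be passed there directly.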
-
0 votes, 0 answers, 20 views
How to scrape an aspx site using Python
I’m trying to scrape a page that uses aspx. The point is that when I inspect the element the data I need is there, but when I request the page with requests, all of the HTML comes back except the…
-
0 votes, 1 answer, 124 views
How to get a person’s friends and followers on Twitter using the tweepy library?
The function getting_friends_follwers() below works if I remove the value 100 from (cursor2.items(100)) . My goal is to take these names (followers and friends) and save in a file "friends.txt". The…
-
0 votes, 2 answers, 345 views
Web scraping with Beautifulsoup - find_next does not return text
I want to extract the text from the section below: <div class="matchDate renderMatchDateContainer" data-kickoff="1583784000000">Mon 9 Mar 2020</div> the text would be "Mon 9 Mar 2020".…
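A minimal sketch of pulling that text out, using the standard library's `html.parser` instead of BeautifulSoup so it runs without extra installs. This is a swapped-in technique, not the asker's approach; the class name `MatchDateParser` is hypothetical, and the only input assumed is the one `div` quoted in the question.

```python
from html.parser import HTMLParser


class MatchDateParser(HTMLParser):
    """Collect the text and data-kickoff value of div.matchDate elements."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.dates = []
        self.kickoffs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "div" and "matchDate" in classes:
            self._inside = True
            self.kickoffs.append(attributes.get("data-kickoff"))

    def handle_endtag(self, tag):
        # Simplification: any closing div ends the capture (no nesting handled).
        if tag == "div":
            self._inside = False

    def handle_data(self, data):
        if self._inside and data.strip():
            self.dates.append(data.strip())


html = (
    '<div class="matchDate renderMatchDateContainer" '
    'data-kickoff="1583784000000">Mon 9 Mar 2020</div>'
)
parser = MatchDateParser()
parser.feed(html)
parser.close()
```

After this runs, `parser.dates` holds the visible text and `parser.kickoffs` holds the raw `data-kickoff` attribute values.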
-
0 votes, 1 answer, 711 views
Scroll Down Python and Selenium
I’m running a bot to collect information from Facebook for a political survey. However, when I collect the comments I can’t get the page to scroll down. I’ve tried several code formats, but…
-
0 votes, 1 answer, 237 views
Loop Selenium Python
Hello! I started studying Python and I’m trying to scrape the OLX site. I can search and filter. But how can I make a loop that clicks on all the ads so I can collect the phone numbers?…
-
0 votes, 1 answer, 166 views
Python: Web Scraping with dynamic values
I am learning about web scraping. I have already managed to do some things, but I came across a problem on a dynamic page where values change on every refresh. Unfortunately I cannot pass the…
-
0 votes, 1 answer, 262 views
Selecting two options in the drop down menu with Selenium + Python does not work
from selenium import webdriver from time import sleep from urllib.request import Request,urlopen import pandas as pd from selenium.webdriver.support.ui import Select from…
-
0 votes, 0 answers, 119 views
Web scraping with Python: how to print the class of a div
I would like to print the class of a div in Python. The site’s code is: <div class="history-feed__collection"> <div class="history-feed__card h-card h-card_sm h-card_spades"…
web-scraping | asked 4 years, 5 months ago | Gustavo 19
-
0 votes, 2 answers, 116 views
Why are the commas of the numbers being deleted when importing data with Pandas?
I’m racking my brain to understand why this is happening when I take numerical data from a table on the web. The table contains currency quotes; the problem is that,…
-
0 votes, 1 answer, 53 views
How to move the title from one column to another? (web scraping, Python)
I’m trying to scrape a site, but if you view it you will notice that certain titles are in certain columns. What my program does is take the table, create two columns full of NaN and assign…
-
0 votes, 0 answers, 115 views
ChunkedEncodingError when making requests in Python
Good afternoon guys. I’m having a problem with requests for data scraping. The main function is the one that follows: def raspa_dados(lista_de_links, ministerio): links = [] autores = [] chamadas =…
-
0 votes, 1 answer, 89 views
Python Selenium capturing only 1 link
Good afternoon, everyone. I’m starting to learn Selenium and have already been handed a project that has me a bit lost. I need to capture all the URL summaries…
-
0 votes, 1 answer, 341 views
Problem navigating with Selenium (using Python) in search results presented in dynamic HTML
I am performing a scraping of articles of a newspaper from Pernambuco (Diário de PE) according to a search I did with some keywords on the subject of interest. The journal search returns 10 results…
-
0 votes, 1 answer, 79 views
How to get the page after authentication with requests?
I am trying to make a web scraping in python. My code is as follows: import requests from bs4 import Beautifulsoup Session = requests. Session() payload = {'username':'[xxxxx]', 'password':'[xxxxx]'…
-
0 votes, 0 answers, 43 views
Pull a list with multiple search names with RSelenium
Good evening, I’m trying to pull data from Google Scholar with RSelenium but I’m having a hard time getting the information from the journals I’m looking for. Running the code below: #Primeiro…
-
0 votes, 1 answer, 714 views
Writing a csv file on Google Drive using Colab
I’m writing a code in Python to scrape information off Facebook. I would like to save this information in a file on Google Drive, since I am working with other people and we use Colaboratory.…
-
0 votes, 0 answers, 24 views
Python - Scrapy - Return nested Json (Json’s List)
Hello, I’m having a problem generating a dictionary answer within another dictionary. I have a home page that contains 8 bimonthly programs, in these picked up their respective links and properties.…
-
0 votes, 1 answer, 30 views
Transform each site’s Node into an array element
I want to turn every headline on this site into an element of an array. I’ve tried several ways but none of them work, so if you could help me, I’d be grateful. I am using HtmlAgilityPack using…
-
0 votes, 0 answers, 34 views
Selenium error "Selenium Failed to resolve address" while loading a specific page
I am creating a Python bot, and when it loads a page the console shows the following errors. It keeps running, but when I try to execute anything involving the page it hangs…
python selenium selenium-webdriver web-scraping webdriver | asked 3 years, 9 months ago | OlopesMaster 1
-
0 votes, 0 answers, 44 views
Why does the comma disappear when converting an HTML to str in Python?
I’m extracting some data from a scraping page following a tutorial I saw on Youtube. Follow the code: import requests import pandas as pd from bs4 import BeautifulSoup from selenium import webdriver…
-
0 votes, 1 answer, 60 views
How to use Selenium to select a specific tag with a class common to other tags?
I want to scrape the most up-to-date data from this page. Every day it publishes the data for the current day and the next day. My main interest is in the next day’s data. When…
-
0 votes, 0 answers, 10 views
How to extract a text within a <dd> using Jsoup?
Hi everyone, how’s it going? I’ve been trying for a couple of hours and have searched everywhere for a fix. The question is as follows: I need to know whether a text X is "OK" or…
-
0 votes, 1 answer, 180 views
urllib.error.HTTPError: HTTP Error 404: Not Found
I have the following code, simple. from urllib.request import urlopen from bs4 import BeautifulSoup word_site =…
-
0 votes, 0 answers, 12 views
Help with IndexError: list index out of range
class Scraping: def pesquisar_nome(self): while True: try: self.browser.find_element_by_id('search-key').send_keys(self.Keys.CONTROL, 'a')…
-
0 votes, 2 answers, 68 views
Join cells with Python
I am scraping data on the day’s best stocks to combine into a table in an Excel file. I am trying with this code: from selenium import webdriver from webdriver_manager.microsoft…
-
0 votes, 0 answers, 38 views
Scraping the ANBIMA site in R
Hi everyone! I’m trying to extract the information on the investment bonds from ANBIMA’s website, but I can’t manage it. This is an example of the page I want to get the information from.…