How to get the text of a span element with XPath


Viewed 677 times

-1

I am using lxml along with Python 3.8, and I need to extract, via XPath, the text inside this piece of HTML:

<span class="text-down">7424.65</span>

The xpath is:

//*[@id="root"]/div/div/div[2]/div/div[3]/div[1]/div/div[2]/p[2]/span

I’m trying to use:

//*[@id="root"]/div/div/div[2]/div/div[3]/div[1]/div/div[2]/p[2]/span/text()

But it just returns: []

Can anyone tell me what I’m doing wrong? My code is this:

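import requests
from lxml import html

# Download the page and parse the HTML exactly as the server sent it
Bitrue_pagina = requests.get('https://www.bitrue.com/trade/btc_usdt')
Bitrue_pagina = html.fromstring(Bitrue_pagina.content)
Pagina_Valor = Bitrue_pagina.xpath('//*[@id="root"]/div/div/div[2]/div/div[3]/div[1]/div/div[2]/p[1]/text()')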

print(Pagina_Valor)

This approach usually works on other elements. I don't know whether it fails here because this element is constantly updated (it is not a fixed value), but I never manage to get it.

I found the question "How to pick up text from a span?", but it uses Selenium. My goal is to optimize the code, and I do not want to spend time opening a browser.

If anyone can help, I'd be very grateful :)

1 answer

0


The first step in scraping is to look for an official API. In this site's case, there is a link at the bottom of the page that documents the request routes.


If there weren't one:

You cannot get the text because it is not in the HTML the server returns. As far as I can see, it is loaded initially by JavaScript and then updated over WebSockets.
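To illustrate why the query comes back empty, here is a minimal offline sketch. Both markup strings are made-up stand-ins (not the site's real HTML): `served` imitates a response whose root container is still empty, and `rendered` imitates the DOM after the scripts have run. It uses the standard library's `xml.etree.ElementTree` rather than lxml so it runs without extra installs, and `span_texts` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for what the server sends: the root container is
# empty because JavaScript fills it in after the page loads.
served = ('<html><body><div id="root"></div>'
          '<script src="app.js"></script></body></html>')

# Hypothetical stand-in for the DOM the browser shows after scripts run.
rendered = ('<html><body><div id="root"><p>'
            '<span class="text-down">7424.65</span></p></div></body></html>')


def span_texts(markup):
    """Return the text of every span.text-down found under the #root div."""
    tree = ET.fromstring(markup)
    root = tree.find(".//div[@id='root']")
    if root is None:
        return []
    return [span.text for span in root.iter('span')
            if span.get('class') == 'text-down']


print(span_texts(served))    # []
print(span_texts(rendered))  # ['7424.65']
```

A plain HTTP client only ever sees something like the first case, which is why the same XPath that works in the browser's inspector finds nothing.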

If you press F12 to open your browser's developer tools and reload the site, you'll see the requests it makes. One of them goes to https://www.bitrue.com/exchange-web/web/coin/coinRateList, which returns JSON:

{"code":"200","msg":"suc","params_num":null,"params":null,"data":{"BTC":{"BTC":"1","USDT":"7513.53","time":"1572000295393"},"XRP":{"BTC":"0.00003746","USDT":"0.28","time":"1572000295412"},"ETH":{"BTC":"0.02182030","USDT":"164.01","time":"1572000295432"},"USDT":{"BTC":"0.00013307","USDT":"1","time":"1572000295400"}}}
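The shape of that response can be explored offline first. The sketch below parses the sample copied from above; the helper `price` is a hypothetical name, and treating `"code": "200"` as the success marker is an assumption based only on this one sample.

```python
import json

# Sample response copied from the JSON shown above.
sample = (
    '{"code":"200","msg":"suc","params_num":null,"params":null,"data":'
    '{"BTC":{"BTC":"1","USDT":"7513.53","time":"1572000295393"},'
    '"XRP":{"BTC":"0.00003746","USDT":"0.28","time":"1572000295412"},'
    '"ETH":{"BTC":"0.02182030","USDT":"164.01","time":"1572000295432"},'
    '"USDT":{"BTC":"0.00013307","USDT":"1","time":"1572000295400"}}}'
)
payload = json.loads(sample)


def price(payload, coin, quote):
    """Return the rate of `coin` in `quote` as a float, or None if missing."""
    # Assumption: "200" signals success, as in the sample above.
    if payload.get("code") != "200":
        return None
    rate = payload.get("data", {}).get(coin, {}).get(quote)
    return float(rate) if rate is not None else None


print(price(payload, "BTC", "USDT"))   # 7513.53
print(price(payload, "DOGE", "USDT"))  # None
```

Note that the rates arrive as strings, so they need converting before any arithmetic.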

Therefore, the relevant code is:

import requests

r = requests.get('https://www.bitrue.com/exchange-web/web/coin/coinRateList')
dados = r.json()

print(dados['data']['BTC']['USDT'])  # e.g. 7535.36 (live value, changes over time)

If you need to update the value very often, it is worth reverse-engineering the WebSocket connection and using it instead, which saves both your bandwidth and theirs.
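For less frequent updates, plain polling may be enough. This is only a sketch of the idea, not a WebSocket client: `poll_changes` is a hypothetical helper, and `fetch` stands for any function that returns the current value (for example, one wrapping the request above). The demo feeds it canned values so that it runs without network access.

```python
import time


def poll_changes(fetch, interval_s, max_polls, sleep=time.sleep):
    """Call fetch() up to max_polls times, yielding a value only when it changes."""
    last = object()  # sentinel that never equals a fetched value
    for i in range(max_polls):
        value = fetch()
        if value != last:
            yield value
            last = value
        if i < max_polls - 1:
            sleep(interval_s)  # wait between polls, but not after the last one


# Demo with canned values instead of real HTTP calls.
canned = iter(["7513.53", "7513.53", "7520.10"])
changes = list(poll_changes(lambda: next(canned), interval_s=0, max_polls=3))
print(changes)  # ['7513.53', '7520.10']
```

Choose an interval that is gentle on the server; every poll is a full HTTP request, which is exactly the cost the WebSocket route avoids.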

  • Excellent, thank you very much for the support

  • A question: where did you see the URL that the page sends the request to? I looked here and could not find it.

  • @user3602803 You have to open the developer tools before loading the page, because requests are not recorded until the tools are open, and this request is made right after the page loads.

  • But in which tab of the F12 tools? I don't see it in "Elements" or in "Console" =\

  • In the "Network" tab.
