How to get data from a specific web page?

I want to write a script that collects data such as bugs and issues, among other things, from the following Spring Framework page. Unfortunately I have no code to show, because I really don't know how to get this data.

My question: how can I get the data shown on the page and generate a JSON file from it? Preferably using Python or JavaScript.

1 answer


It would be better to contact the site and ask for access to an API that provides this data.
Scraping their website can generate many requests, so they may block your IP address. But if you have to, in Python you can use the Beautiful Soup library.

A very simple example:

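import urllib.request

from bs4 import BeautifulSoup

url = "https://jira.spring.io/browse/spr/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel"

# Download the page; the context manager closes the connection for us.
with urllib.request.urlopen(url) as response:
    html = response.read().decode("utf8")

soup = BeautifulSoup(html, 'html.parser')

# The status summary block has the id "fragstatussummary".
table = soup.find(id='fragstatussummary')
print(table.h3.get_text(strip=True))

# Each row holds a status link and a cell with its issue count.
for row in table.find_all('tr'):
    name = row.a
    count = row.find('td', class_='cell-type-collapsed')
    if name and count:  # skip rows missing either the link or the count
        print('{}: {}'.format(name.get_text(strip=True), count.get_text(strip=True)))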


Output:

Status Summary
Open: 1542
In Progress: 12
Reopened: 49
Resolved: 4740
Closed: 9299
Waiting for Feedback: 45
Investigating: 56
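
Since the question asks for a JSON file, here is a minimal sketch of that last step using only the standard `json` module. It assumes you collect `(name, count)` pairs inside the `for` loop instead of printing them. The function name `save_summary` and the file name `status_summary.json` are hypothetical, chosen only for illustration, and the sample rows are simply the values from the output above.

```python
import json


def save_summary(rows, path):
    """Write (status, count) pairs to a JSON file and return the dict written.

    `rows` is an iterable of (name, count) pairs, where count may be a
    string such as "1542" as scraped from the page.
    """
    summary = {}
    for name, count in rows:
        # Strip whitespace and thousands separators before converting.
        summary[name.strip()] = int(str(count).replace(",", "").strip())
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(summary, fh, indent=2, ensure_ascii=False)
    return summary


# Sample rows copied from the output above. In the scraper, you would
# build this list inside the loop instead of calling print().
rows = [
    ("Open", "1542"),
    ("In Progress", "12"),
    ("Reopened", "49"),
    ("Resolved", "4740"),
    ("Closed", "9299"),
    ("Waiting for Feedback", "45"),
    ("Investigating", "56"),
]

result = save_summary(rows, "status_summary.json")
```

Converting the counts to integers keeps the JSON values numeric, which is usually easier to work with later than strings.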
