"Missing Scheme" error using Scrapy


When I run my spider, Scrapy returns the following error:

ValueError: Missing scheme in request url: h

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "Mineracao"

    def start_requests(self):

        link = "http://www.jornalpanorama.com.br/site/data-policia.php?page="
        y=1
        for x in range(240):
            urls=link+str(y)
            y=y+1
            print urls  

        for url in start_urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):

        url = "http://www.jornalpanorama.com.br/site/"
        for x in response.xpath("//*[contains(@class, 'listar-noticias-titulo')]/a/@href").extract():
            print url + x

    Where is the variable start_urls defined? You use it, but it is not defined anywhere in the code. You also build the variable urls in the loop and then never use it. Are they supposed to be the same thing?

1 answer


Assuming you want to visit all pages (from 1 to 240), you probably wanted to do:

def start_requests(self):
    link = "http://www.jornalpanorama.com.br/site/data-policia.php?page="
    # range() excludes its stop value, so 241 is needed to reach page 240
    for x in range(1, 241):
        yield scrapy.Request(url=link + str(x), callback=self.parse)
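As for the error itself, my guess (an assumption, since the posted code never defines start_urls) is that start_urls ended up being a single string rather than a list. Looping over a string yields one character at a time, so the first "URL" handed to scrapy.Request would be just "h", which has no scheme. A minimal sketch in plain Python, without Scrapy, where BASE and build_page_urls are names I made up for illustration:

```python
# Hypothetical names for illustration; only the URL comes from the question.
BASE = "http://www.jornalpanorama.com.br/site/data-policia.php?page="


def build_page_urls(first=1, last=240):
    """Return one URL per page, from `first` to `last` inclusive."""
    # range() excludes its stop value, hence last + 1.
    return [BASE + str(page) for page in range(first, last + 1)]


page_urls = build_page_urls()

# Iterating over a list yields whole URLs...
first_from_list = next(iter(page_urls))

# ...while iterating over a single string yields single characters,
# which would explain a request URL consisting of just "h".
first_from_string = next(iter(BASE))
```

Building the list first and then yielding one request per element should behave the same as yielding directly inside the loop, so which one you use is mostly a matter of taste.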

If you really want to start from page 1 and skip the even pages, pass a step of 2 to range: range(1, 240, 2) yields 1, 3, ..., 239.
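If you are unsure which page numbers a given range call produces, a quick check in plain Python may help. This is only a sketch; the variable names are mine:

```python
# Page numbers produced with and without a step (names are illustrative).
all_pages = list(range(1, 241))     # 1, 2, ..., 240
odd_pages = list(range(1, 240, 2))  # 1, 3, ..., 239

# The stop value is never included, so the last odd page is 239.
last_odd_page = odd_pages[-1]
```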
