Python Scrapy rules: web scraping through Google results

I am trying to use Scrapy (1.0) to go through Google results. I have no problem scraping the first page of results, but I cannot get the scraper to go through the following pages (I think it's called traversing?).

I attempted to use "rules":

from scrapy.linkextractors import LinkExtractor
...
rules = (Rule(LinkExtractor(restrict_xpaths=('//div[@class="pnnext"]')), callback='parse_item', follow=True))

But I keep getting this error:

NameError: name 'Rule' is not defined

I need to follow the "next" pages and crawl the results until there are no more pages.

Thank you.

You should import Rule from scrapy.spiders, so:

from scrapy.spiders import Rule

Check the Scrapy CrawlSpider example in case you are missing other imports.
