Web crawler - import.io - Using XPath to show "more" content

I'm totally stumped on this one, so I'm reaching out for help!

I'm using the import.io crawler to extract reviews from TripAdvisor. When training the crawler, the "more" button is inactive.

Here's an example of a page: [http://www.tripadvisor.co.uk/hotel_review-g295424-d306662-reviews-hilton_dubai_jumeirah_resort-dubai_emirate_of_dubai.html#reviews][1]

Here is the XPath for the review in full: //*[@id="ur288083139"]/div[2]/div/div[3]

And for the "more" button: //*[@id="review_288083139"]/div[1]/div[2]/div/div/div[3]/p/span

Is it possible to have the XPath for the full review included in import.io?

One way you can do this is by using a crawler and an extractor together. Split the process into 2 parts.

Create a crawler that you'd train to capture the links to every review on the page. Make sure you select a link column.

Create an extractor to capture the full review from the links you got from the crawler.
Voila! You've got the reviews!
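
For anyone who wants to see the same two-part idea outside of import.io, here is a minimal Python sketch using only the standard library. It is not import.io's API, just an illustration of "collect the links first, then pull the full text from each linked page". The class names `review-link` and `full-review`, the `example.com` URLs, the sample HTML, and the `fetch` callable are all made up for the example; a real page would need its own selectors and a real HTTP fetch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ReviewLinkParser(HTMLParser):
    """Part 1 (the "crawler"): collect the link to every review on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: review links are marked with class="review-link".
        if tag == "a" and "review-link" in (attrs.get("class") or "").split():
            if attrs.get("href"):
                self.links.append(urljoin(self.base_url, attrs["href"]))


class FullReviewParser(HTMLParser):
    """Part 2 (the "extractor"): collect the text of the full review."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # greater than 0 while inside the full-review element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        # Assumption: the full text lives in an element with class="full-review".
        elif "full-review" in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def collect_review_links(listing_html, base_url):
    parser = ReviewLinkParser(base_url)
    parser.feed(listing_html)
    return parser.links


def extract_full_review(review_html):
    parser = FullReviewParser()
    parser.feed(review_html)
    # Collapse runs of whitespace left over from the markup.
    return " ".join(" ".join(parser.chunks).split())


def scrape_reviews(listing_html, base_url, fetch):
    """Chain the two parts: fetch(url) must return the HTML of one review page."""
    links = collect_review_links(listing_html, base_url)
    return [extract_full_review(fetch(url)) for url in links]


# Hypothetical sample data standing in for real pages.
LISTING = """
<div class="review"><p>Great stay... <span>More</span></p>
  <a class="review-link" href="/review-1.html">Great stay</a></div>
<div class="review"><p>Too noisy... <span>More</span></p>
  <a class="review-link" href="/review-2.html">Too noisy</a></div>
"""
PAGES = {
    "http://example.com/review-1.html":
        '<div class="full-review"><p>Great stay, friendly staff<br>and a lovely pool.</p></div>',
    "http://example.com/review-2.html":
        '<div class="full-review"><p>Too noisy at night.</p></div><p>Footer</p>',
}

reviews = scrape_reviews(LISTING, "http://example.com", PAGES.__getitem__)
for review in reviews:
    print(review)
```

The truncated text next to the "More" span on the listing page is never used; only the link is taken from there, and the full text comes from the page behind it.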

Note: If you already have the links to the pages you need reviews from, it's better to make an extractor instead of a crawler. That way, you can chain its API to the other extractor. You'd only need the crawler if you don't know the links.

Hope this helps!
