How can I scrape data from there?
I am writing a PHP script to scrape data from a website that has a dynamic loader. I am using HTML DOM Parser and scoopy to scrape the following website: https://www.lyoness.com/au/search/partner/ . I am a beginner and I am not able to identify how to parse the infinite scroller.
<input id="btnnextpage" type="button" class="btn btn-primary" style="width: 100%" value="next page">
This link is used to pull the content using AJAX:
https://www.lyoness.com/au/search/loadpage?cp=1&area=2&st=&rz=&rzc=&f=&ft=basic&c=au&r=12&la=en-au&s=default&ispreviouspageclick=false&_=
The cp variable is the number of the page being loaded. This means you can loop through the page numbers for as long as content is still returned.
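The paging idea can be sketched in PHP as follows. This is only a sketch under stated assumptions: the function names buildPageUrl and fetchAllPages are hypothetical, the query parameters are copied from the URL quoted above, and an empty response body is assumed to mean that there are no more pages. The fetcher is passed in as a callable so the loop does not depend on any particular HTTP client.

```php
<?php
// Hypothetical helper: builds the loadpage URL for a given page number (cp).
// All parameter names and values are taken from the URL quoted in the answer.
function buildPageUrl(int $cp): string
{
    $params = [
        'cp' => $cp,
        'area' => 2,
        'st' => '',
        'rz' => '',
        'rzc' => '',
        'f' => '',
        'ft' => 'basic',
        'c' => 'au',
        'r' => 12,
        'la' => 'en-au',
        's' => 'default',
        'ispreviouspageclick' => 'false',
        '_' => '',
    ];
    return 'https://www.lyoness.com/au/search/loadpage?' . http_build_query($params);
}

// Hypothetical helper: requests page 1, 2, 3, ... until a page comes back empty.
// $fetchPage is any callable that takes a URL and returns the response body.
// Assumption: an empty (or non-string) body means there is no more content.
function fetchAllPages(callable $fetchPage, int $maxPages = 100, int $delaySeconds = 0): array
{
    $pages = [];
    for ($cp = 1; $cp <= $maxPages; $cp++) {
        if ($delaySeconds > 0 && $cp > 1) {
            sleep($delaySeconds); // pause between requests
        }
        $html = $fetchPage(buildPageUrl($cp));
        if (!is_string($html) || trim($html) === '') {
            break;
        }
        $pages[$cp] = $html;
    }
    return $pages;
}
```

Any function that takes a URL and returns the body as a string can be passed as $fetchPage. The $maxPages limit is a safety stop in case the server never returns an empty page.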
You can't access this link with PHP directly, because opening it in the browser is not possible either. I tried it with AJAX and it works. Here is the AJAX code; you can type it into the page console and change cp to print the AJAX content. You can also add a loop with a delay.
    $.ajax({
        url: 'https://www.lyoness.com/au/search/loadpage?cp=5&area=2&st=&rz=&rzc=&f=&ft=basic&c=au&r=12&la=en-au&s=default&ispreviouspageclick=false&_=',
        success: function (data) {
            console.log(data);
        }
    });

After you scrape the returned data with jQuery (which is easier than using PHP libraries), you can send it to your server with a POST or GET request and save it to the database through some sort of API, or disable the cross-domain security option in your browser.
Edit:
Here is PHP code that retrieves the first page using cURL:
    if (!function_exists('curl_init')) {
        die('Sorry, cURL is not installed!');
    }

    $url = 'https://www.lyoness.com/au/search/loadpage?cp=1&ft=basic&c=au&r=12&la=en-au&s=default';

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_ENCODING, "gzip");
    // Certificate checks are switched off here.
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, "mozilla-djokage/1.0");
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Send the header that jQuery adds to its AJAX requests.
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'X-Requested-With: XMLHttpRequest'
    ));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $output = curl_exec($ch);
    echo $output;
    // echo 'cURL error: ' . curl_error($ch);
    curl_close($ch);

You need to loop through the cp variable in the URL so that you can parse all the pages, and you need to scrape the $output HTML variable and save the results to a database. I have tried this code and it works fine. I hope you accept the solution.
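For the scraping step, one option is PHP's built-in DOM extension. The sketch below is an assumption-heavy illustration, not the structure of the real response: the function name extractLinks is hypothetical, and since the markup returned by the site is not shown here, it simply collects the text and href of every link in the HTML. The XPath expression would need to be adapted to the elements you actually want.

```php
<?php
// Hypothetical helper: returns the text and href of every <a href="..."> in an HTML string.
// Assumption: links are the data of interest; change the XPath query for other elements.
function extractLinks(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    // Real-world HTML often triggers parser warnings; collect them quietly instead.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $a) {
        $links[] = [
            'text' => trim($a->textContent),
            'href' => $a->getAttribute('href'),
        ];
    }
    return $links;
}
```

Calling extractLinks($output) on each page gives an array of rows that can then be written to the database, for example with PDO prepared statements.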