Hello, I'm trying to scrape a betting coupon website with CasperJS / PhantomJS. The page loads its content via AJAX and then prints the table data. The website is: https://www.pamestoixima.gr/uk/1/print#market-group=12924.1&marketgroup-template=eventsperday&marketgroup-longlist=1
The data I'm interested in is in the table with class 'markets'. I've tried every piece of code I could find on the internet and still can't get any results. The page that gets scraped just prints 'the browser must have javascript enabled'.
My code so far:

    var casper = require('casper').create();

    casper.start('https://www.pamestoixima.gr/uk/1/print#market-group=12924.1&marketgroup-template=eventsperday&marketgroup-longlist=1', function() {
        // give the AJAX request five seconds to finish, then dump the page
        this.wait(5000, function() {
            console.log(this.getHTML());
        });
    });

    casper.run();

And the console output:
c:\users\bampis\desktop\phantom>casperjs test.js
<!doctype html public "-//w3c//dtd xhtml 1.0 strict//en" "http://www.w3.org/tr/xhtml1/dtd/xhtml1-strict.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" class="ua-dom ua-strict ua-secure ua-windows ua-likegecko ua-safari ua-webkit"><head>
<meta http-equiv="content-language" value="en">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title></title>
<!--meta http-equiv="x-ua-compatible" content="ie=10"/-->
<link href="/areas/print/template_1_uk/template.css?ts=201504081530" rel="stylesheet" type="text/css" media="screen, tv, projection" charset="utf-8">
<link href="/debug.css?ts=201504081530" rel="stylesheet" type="text/css" media="screen, tv, projection">
<link href="/areas/.css/jquery-plugins.css?ts=201504081530" rel="stylesheet" type="text/css">
<link href="/areas/print/template_1_uk/print.css?ts=201504081530" rel="stylesheet" type="text/css" media="print">
<script async="" src="//www.google-analytics.com/analytics.js"></script><script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script type="text/javascript">window.jquery || document.write('<script src="/common/js/jquery/jquery.min.js"><\/script>');</script>
<script src="//ajax.googleapis.com/ajax/libs/jqueryui/1.10.1/jquery-ui.min.js"></script>
<script type="text/javascript">window.jquery.ui || document.write('<script src="/common/js/jquery/jquery-ui.min.js"><\/script>');</script>
<script src="/common/js/jquery/jquery-plugins.js"></script>
<script src="/common/js/script.js"></script>
<script type="text/javascript">
/*<![cdata[*/
if(window.location.search != "" && window.location.search.indexof('?debug') == 0 ) {
document.write(unescape("%3cscript type='text/javascript' src='/common/js/runtime-debug-201504021456.js'%3e %3c/script%3e"));
document.write(unescape("%3cscript type='text/javascript' src='/areas/print/template_1_uk/components-debug-201504081530.js'%3e %3c/script%3e"));
} else {
document.write(unescape("%3cscript type='text/javascript' src='/common/js/runtime-201504021456.js'%3e %3c/script%3e"));
document.write(unescape("%3cscript type='text/javascript' src='/areas/print/template_1_uk/components-201504081530.js'%3e %3c/script%3e"));
}
/*]]>*/
</script><script type="text/javascript" src="/common/js/runtime-201504021456.js"> </script><script type="text/javascript" src="/areas/print/template_1_uk/components-201504081530.js"> </script></head>
<body class=" print col1 lang-uk">
<div class="c">
<div class="bg-content clearfix">
<div class="cc wrapper clearfix">
<noscript>
<div class="noscriptdiv">
<div class="top">&nbsp;</div>
<div class="middle">
<p>javascript is not active in your browser. javascript must be enabled for this website.</p>
<p>javascript δεν είναι ενεργή στον browser σας. Η javascript πρέπει να είναι ενεργοποιημένη για αυτή την ιστοσελίδα.</p>
</div>
<div class="bottom">&nbsp;</div>
</div>
</noscript>
<div id="plchcentre" class="centre placeholder">
<div id="plchflash"></div>
<div id="plchcentrebox2"></div>
<div class="hidden" id="bodyclassoverridecomponent1"></div><div class="market-list" id="marketlistcontentcomponent2" style="display: block; "><img src="/indicator.gif" alt="loading"></div></div><!-- .centre -->
<div class="print-buttons">
<a class="button" href="/areas/print/template_1_uk/#" onclick="window.print(); return false;">print</a>
</div>
</div><!-- .c -->
</div>
</div>
<!-- google analytics -->
<script src="/static/common/analytics/analytics.js" type="text/javascript"></script>
<script type="text/javascript">
//built pagebuilder v. 1.0.0.0
var autowiring = new framework.autowiring();
autowiring.init(serviceconfiguration, componentconfiguration, dynamiccomponentconfiguration, componentplacementmap, encodedxsltdocumentsmap);
autowiring.run();
</script><div id="garbagecollector" style="display: none; "></div>
<div class="ui-tooltip ui-widget ui-corner-all ui-widget-content" id="warp-tooltip" style="position: absolute; left: 0px; top: 0px; z-index: 200000; display: none; "></div></body></html>
As far as I can see, the page mentioned above is fed its data from an XML file:
https://www.pamestoixima.gr/cache/evenuemarketgrouplimited/en/12924.1-0.xml?1457949428105 (the timestamp in the query string is probably there for some kind of logging, since removing it gives the same result).
So it's better to grab that file directly rather than mimic the browser's behaviour and scrape the rendered HTML.
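A minimal sketch of grabbing the file directly, written for plain Node.js rather than CasperJS. The helper names (`marketGroupId`, `buildFeedUrl`, `fetchFeed`) are made up for illustration. It also assumes that the feed path is always the `market-group` id from the page URL followed by `-0.xml`, which is only inferred from the single URL quoted above, and that the feed is UTF-8 encoded.

```javascript
// Sketch: derive the XML feed URL from the print-page URL and download it
// with Node's built-in https module, without rendering the page at all.
var https = require('https');

// Assumption: every feed lives under this path (taken from the URL above).
var FEED_BASE = 'https://www.pamestoixima.gr/cache/evenuemarketgrouplimited/en/';

// Hypothetical helper: read "market-group=<id>" from the page URL's hash.
function marketGroupId(pageUrl) {
  var hash = pageUrl.split('#')[1] || '';
  var pairs = hash.split('&');
  for (var i = 0; i < pairs.length; i++) {
    var kv = pairs[i].split('=');
    if (kv[0] === 'market-group') {
      return decodeURIComponent(kv[1] || '');
    }
  }
  return null;
}

// Hypothetical helper: build the feed URL. The timestamp is optional,
// since the same result is reported with and without it.
function buildFeedUrl(pageUrl, timestamp) {
  var id = marketGroupId(pageUrl);
  if (!id) {
    throw new Error('no market-group in URL hash: ' + pageUrl);
  }
  var url = FEED_BASE + encodeURIComponent(id) + '-0.xml';
  return timestamp === undefined ? url : url + '?' + timestamp;
}

// Download the feed and hand the raw XML text to callback(err, xml).
// Assumption: the response body is UTF-8.
function fetchFeed(feedUrl, callback) {
  https.get(feedUrl, function (res) {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body so the socket is released
      callback(new Error('HTTP ' + res.statusCode));
      return;
    }
    var chunks = [];
    res.on('data', function (chunk) { chunks.push(chunk); });
    res.on('end', function () {
      callback(null, Buffer.concat(chunks).toString('utf8'));
    });
  }).on('error', callback);
}

// Usage (not run here, it needs network access):
// fetchFeed(buildFeedUrl(pageUrl), function (err, xml) {
//   if (err) { throw err; }
//   console.log(xml.length + ' characters of XML received');
// });
```

The XML can then go to whatever parser you prefer. The element names inside the feed are not shown in this thread, so the sketch stops at the raw text.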