python - Why is cURL returning "additional stuff not fine"?


I am writing a Python application that queries social media APIs via cURL. For most of the different servers I query (Google+, Reddit, Twitter, Facebook, others), cURL complains:

additional stuff not fine transfer.c:1037: 0 0

The unusual thing is that when the application first starts, each service's response throws this line once or twice. After a few minutes, the line appears several times per response, so cURL is clearly identifying something it doesn't like. After about half an hour, the servers begin to time out and the line is repeated many tens of times, which suggests it is showing a real problem.

How might I diagnose this? I tried using Wireshark to capture the request and response headers and search for anomalies that might cause cURL to complain, but for all of Wireshark's complexity there does not seem to be a way to isolate and display only the headers.

Here is the relevant part of the code:

    output = cStringIO.StringIO()
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0')
    c.setopt(c.WRITEFUNCTION, output.write)
    c.setopt(c.CONNECTTIMEOUT, 10)
    c.setopt(c.TIMEOUT, 15)
    c.setopt(c.FAILONERROR, True)
    c.setopt(c.NOSIGNAL, 1)

    try:
        c.perform()
        toreturn = output.getvalue()
        output.close()
        return toreturn

    except pycurl.error, error:
        errno, errstr = error
        print 'The following cURL error occurred: ', errstr

I'm 99.99% sure this is not actually in any HTTP headers, but is rather being printed to stderr by libcurl. Possibly it happens in the middle of logging the headers, which is why you were confused.
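One way to check this without Wireshark is to have libcurl hand you each response header line and search them yourself. The following is a minimal sketch, assuming Python 3 and a pycurl build that exposes `HEADERFUNCTION`; the names `HeaderCollector` and `fetch_headers` are made up for illustration and are not part of pycurl.

```python
class HeaderCollector:
    """Stores every response header line that libcurl passes to add()."""

    def __init__(self):
        self.lines = []

    def add(self, raw_line):
        # Under Python 3, pycurl delivers each header line as bytes,
        # including the trailing CRLF.
        self.lines.append(raw_line.decode("iso-8859-1").rstrip("\r\n"))

    def contains(self, needle):
        """Return True if any collected header line contains needle."""
        return any(needle in line for line in self.lines)


def fetch_headers(url):
    """Fetch url with pycurl and return the filled HeaderCollector."""
    import pycurl  # imported lazily so HeaderCollector works without pycurl
    from io import BytesIO

    collector = HeaderCollector()
    body = BytesIO()
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.HEADERFUNCTION, collector.add)
    c.setopt(c.WRITEFUNCTION, body.write)
    try:
        c.perform()
    finally:
        c.close()
    return collector
```

If `fetch_headers(url).contains("additional stuff not fine")` is false while the line still shows up on your terminal, the text is coming from libcurl itself rather than from the servers.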

Anyway, a quick search for "additional stuff not fine" curl transfer.c turned up a recent change in the source whose description is:

Curl_readwrite: remove debug output

The text "additional stuff not fine" was added for debug purposes a while ago, but it isn't really helping anyone, and for some reason some Linux distributions provide their libcurls built with debug info still present, and thus (far too many) users get to read this info.

So, this is basically harmless, and the only reason you're seeing it is that you got a build of libcurl (probably from your Linux distro) that had full debug logging enabled (despite the curl author thinking that's a bad idea). So you have three options:

  1. Ignore it.
  2. Upgrade to a later version of libcurl.
  3. Rebuild libcurl without debug info.
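Before choosing between these, it helps to know which libcurl your pycurl is linked against. The sketch below assumes Python 3 and that `pycurl.version` is a string containing a `libcurl/x.y.z` token; `libcurl_version` is a hypothetical helper, and which release first includes the change is not stated here, so compare the result against the libcurl changelog.

```python
import re


def libcurl_version(version_string):
    """Return the libcurl version as a tuple of ints, or None if absent."""
    match = re.search(r"libcurl/(\d+(?:\.\d+)*)", version_string)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    try:
        import pycurl
    except ImportError:
        print("pycurl is not installed")
    else:
        # pycurl.version lists PycURL, libcurl and the libraries it links to.
        print(pycurl.version)
        print(libcurl_version(pycurl.version))
```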

You can look at the libcurl source for transfer.c (as linked above) to try to understand what curl is complaining about, and possibly look for threads on the mailing list from around the same time, or just email the list and ask.

However, I suspect that may not actually be relevant to the real problem at all, given that you're seeing this even right from the start.

There are three obvious things that could be going wrong here:

  1. A bug in curl, or in the way you're using it.
  2. Something wrong with your network setup (e.g., your ISP cuts you off for making too many outgoing connections or using too many bytes in 30 minutes).
  3. Something you're doing is making the servers think you're a spammer/DoS attacker/whatever, and they're blocking you.

The first one seems the least likely. But if you want to rule it out, just capture all of the requests you make, then write a trivial script that uses some other library to replay the exact same requests, and see if you get the same behavior. If so, the problem obviously can't be in the implementation of how you make the requests.
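A replay script along those lines might look like the sketch below. It assumes Python 3, plain GET requests, and the standard library's `urllib.request` as the "other library"; the function name `replay` and the `open_url` parameter are illustrative choices, and the User-Agent is copied from the question so the requests stay as close as possible to the originals.

```python
import urllib.error
import urllib.request

USER_AGENT = ("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) "
              "Gecko/20100101 Firefox/17.0")


def replay(urls, timeout=15, open_url=urllib.request.urlopen):
    """Request each URL once and return a list of (url, outcome) pairs.

    outcome is the HTTP status code, or a string describing the failure.
    """
    results = []
    for url in urls:
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        try:
            with open_url(request, timeout=timeout) as response:
                response.read()
                results.append((url, response.status))
        except urllib.error.HTTPError as error:
            # The server answered, but with an error status.
            results.append((url, error.code))
        except OSError as error:
            # Covers URLError, connection failures and socket timeouts.
            results.append((url, "failed: %s" % error))
    return results
```

Run it over the same URLs on the same schedule as the real application. If the timeouts show up here as well after about half an hour, pycurl is probably not the culprit.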

You may be able to distinguish between cases 2 and 3 based on the timing. If all of the services time out at once, especially if they do so even when you start hitting them at different times (e.g., you start hitting Google+ 15 minutes after Facebook, and yet they both time out 30 minutes after you first hit Facebook), it's definitely case 2. If not, it could be case 3.
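The timing check can be made mechanical once you log, per service, when you first contacted it and when it first timed out. This is only a rough sketch of the reasoning above, written for Python 3: the function name, the 60-second tolerance, and the returned labels are all assumptions chosen for illustration.

```python
def classify_timeouts(start_times, first_timeout_times, tolerance=60.0):
    """Guess between case 2 (network setup) and case 3 (server-side blocking).

    Both arguments map a service name to a time in seconds measured from
    the same reference point.
    """
    services = [name for name in start_times if name in first_timeout_times]
    if len(services) < 2:
        return "inconclusive"

    starts = [start_times[name] for name in services]
    timeouts = [first_timeout_times[name] for name in services]

    # Did the services start failing at roughly the same moment?
    simultaneous = max(timeouts) - min(timeouts) <= tolerance
    # Were they first contacted at clearly different moments?
    staggered = max(starts) - min(starts) > tolerance

    if simultaneous and staggered:
        return "case 2 (strong evidence)"
    if simultaneous:
        return "case 2 (weak evidence)"
    return "possibly case 3"
```

For the example in the text, `classify_timeouts({"facebook": 0, "google+": 900}, {"facebook": 1800, "google+": 1805})` reports strong evidence for case 2, because the failures coincide even though the start times are 15 minutes apart.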

If you rule out all three of these, then you can start looking for other things that could be wrong, but I'd start here.

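Or, if you tell us more about what your app does (e.g., do you try to hit the servers over and over as fast as you can? Do you try to connect on behalf of a slew of different users? Are you using a dev key or an end-user app key? etc.), it might be possible for someone else with more experience with those services to guess.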

