Buzzmarker cannot pick up our link

  • Question
  • Updated 5 years ago
The Buzzmarker cannot pick up our link (psprint.com) on this page: http://joeypinkney.com/african-americ...

tom

  • 13 Posts
  • 0 Reply Likes

Posted 5 years ago


Douglas Ferguson, Official Rep

  • 55 Posts
  • 2 Reply Likes
Official Response
Hi Tom,

I got to the bottom of this. This is a bit technical, but the basic issue is a problem with the HTML on this page. Specifically, the page has an "end of body" tag (</body>) before the document actually ends (in fact, it has two of them). BuzzStream stops reading from the URL connection once it sees the first </body> tag. We do this so that the Buzzmarker returns results to you quickly (we've seen a lot of cases where servers don't properly close their connections).

Basically there's a trade-off between accuracy and performance -- we can always wait for connections to close (which will mean that the Buzzmarker takes a long time if the server isn't closing connections) or we can stop reading when we see the </body> tag (which will mean we might miss links when there are multiple </body> tags). So far, we've only seen this problem twice, so we've chosen performance. We'll keep a very close eye on this to see if we need to rethink that decision.
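For anyone curious, the "stop at </body>" behavior can be sketched roughly like this. It is only an illustration of the trade-off, not BuzzStream's actual code: the function names, the regex-based link extraction, and the example.com link are all made up for the example.

```python
import re

# Matches a closing body tag, ignoring case and stray whitespace.
BODY_CLOSE = re.compile(rb"</body\s*>", re.IGNORECASE)


def read_until_body_close(chunks):
    """Accumulate chunks and stop as soon as a closing body tag appears.

    `chunks` is any iterable of bytes, standing in for data arriving
    from a URL connection (hypothetical stand-in, not a real client).
    """
    buf = b""
    for chunk in chunks:
        buf += chunk
        match = BODY_CLOSE.search(buf)
        if match:
            # Return early: anything the server sends later is ignored.
            return buf[:match.end()]
    return buf


def extract_links(html):
    """Very rough href extraction, for illustration only."""
    return re.findall(rb'href="([^"]+)"', html)


# A page with a premature </body>: the second link comes after it.
page = [
    b'<html><body><a href="http://example.com/a">a</a></body>',
    b'<a href="http://psprint.com">psprint</a></body></html>',
]

found_fast = extract_links(read_until_body_close(page))
found_full = extract_links(b"".join(page))
```

Here `found_fast` contains only the first link, while `found_full` contains both. That is the accuracy cost of returning early on a page with more than one </body> tag.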

Please let us know if you have any questions.

Thanks,

Douglas

Douglas Ferguson, Official Rep

  • 55 Posts
  • 2 Reply Likes
Official Response
Tom,

We have found a better solution for this problem. We can now detect when a page has a premature </body> or </html> tag and keep reading from the page, but we can also quickly time out if a server keeps the connection open indefinitely. So we have the best of both worlds...
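One way to picture this kind of approach is a short grace period that starts once the first </body> is seen. The sketch below is a guess at the general idea, not a description of how BuzzStream actually detects premature tags: `read_with_grace`, `grace_seconds`, and the injectable `clock` are all hypothetical.

```python
import re
import time

BODY_CLOSE = re.compile(rb"</body\s*>", re.IGNORECASE)


def read_with_grace(chunks, grace_seconds=0.5, clock=time.monotonic):
    """Keep reading past </body>, but only for a limited time.

    `chunks` is any iterable of bytes standing in for a URL connection.
    `clock` is injectable so the timing can be tested deterministically.
    """
    buf = b""
    deadline = None
    for chunk in chunks:
        buf += chunk
        if deadline is None and BODY_CLOSE.search(buf):
            # First </body> seen: start the grace period instead of stopping.
            deadline = clock() + grace_seconds
        if deadline is not None and clock() >= deadline:
            # The server is still sending (or stalling) after the grace
            # period, so give up and use what we have.
            break
    return buf
```

Note that this loop only checks the clock when a chunk arrives. A real client would also need a read timeout on the socket itself, so that a server that sends nothing and never closes the connection cannot block the read forever.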

I tested your link and it works better now.

Thanks again for bringing this to my attention.

Douglas