Hi, how can we find out which page a failed URL came from?

In my code below, I need to know which page each URL returned by GetFailedUrl came from, so that I can fix the link if needed:

    ' Collect every failed URL, one per line.
    For i = 0 To spider.NumFailed - 1
        sb_failed_url.Append(spider.GetFailedUrl(i))
        sb_failed_url.Append(vbCrLf)
    Next

Thanks.

asked Jul 29 '13 at 21:47

huislaw

edited Jul 29 '13 at 21:49


1) The LastUrl property provides the last URL spidered (i.e. the URL downloaded in the most recent call to CrawlNext).

2) The ClearFailedUrls method can be called to clear the internal list of failed URLs.

3) Use the above to solve the problem. Call ClearFailedUrls prior to each call to CrawlNext. After the call to CrawlNext, if NumFailed is greater than zero, then the URLs in the failed list came from the last URL crawled, which is available in LastUrl.
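The three steps above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the start site `www.example.com`, the `maxPages` cap, and the output format are assumptions added for the example, while `ClearFailedUrls`, `CrawlNext`, `NumFailed`, `GetFailedUrl`, and `LastUrl` are the members mentioned in this thread.

```vbnet
Imports System.Text

Module FailedUrlSources
    Sub Main()
        Dim spider As New Chilkat.Spider()

        ' Hypothetical start site, used only for illustration.
        spider.Initialize("www.example.com")
        spider.AddUnspidered("http://www.example.com/")

        Dim sb_failed_url As New StringBuilder()

        ' Assumed safety cap so the sketch cannot loop forever.
        Dim maxPages As Integer = 100

        For page As Integer = 1 To maxPages
            If spider.NumUnspidered = 0 Then Exit For

            ' Start each crawl step with an empty failed list, so anything
            ' in the list afterwards belongs to this step only.
            spider.ClearFailedUrls()
            spider.CrawlNext()

            ' Pair each failed URL with the page that was just crawled.
            For i As Integer = 0 To spider.NumFailed - 1
                sb_failed_url.Append(spider.GetFailedUrl(i))
                sb_failed_url.Append(" (from page: ")
                sb_failed_url.Append(spider.LastUrl)
                sb_failed_url.Append(")")
                sb_failed_url.Append(vbCrLf)
            Next
        Next

        Console.Write(sb_failed_url.ToString())
    End Sub
End Module
```

The failed list is checked after every call to CrawlNext regardless of its return value, so that a failure reported during a crawl step is still recorded against that step's LastUrl.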

answered Jul 31 '13 at 11:45

chilkat ♦♦
