Archived Forum Post


Question:

Source of spider.GetFailedUrl

Jul 31 '13 at 11:45

Hi, how do we know which page a failed URL came from?

In my code below, I need to know which page each URL returned by GetFailedUrl came from, so that I can fix the URL if needed:

    For i = 0 To spider.NumFailed - 1
        sb_failed_url.Append(spider.GetFailedUrl(i))
        sb_failed_url.Append(vbCrLf)
    Next

Thanks.


Answer

1) The LastUrl property provides the last URL spidered (i.e. the URL downloaded in the most recent call to CrawlNext).

2) The ClearFailedUrls method can be called to clear the internal list of failed URLs.

3) Combine the two to solve the problem. Call ClearFailedUrls before each call to CrawlNext. After CrawlNext returns, if NumFailed is greater than zero, then every URL in the failed list came from the page just crawled, which is available in LastUrl.
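
The steps above could be put together roughly as follows. This is only a sketch, not tested against any particular Chilkat version: the start URL, the maxPages cap, and the module name are placeholders, and the setup calls (Initialize, AddUnspidered, NumUnspidered) are assumed to behave as in the usual Chilkat spider examples. It also needs a reference to the Chilkat .NET assembly.

```vbnet
Imports System.Text

Module FailedUrlSources
    Sub Main()
        Dim spider As New Chilkat.Spider()
        Dim sb_failed_url As New StringBuilder()

        ' Placeholder start point (assumption); substitute the site being crawled.
        spider.Initialize("www.example.com")
        spider.AddUnspidered("http://www.example.com/")

        ' Hypothetical safety cap so the sketch always terminates.
        Dim maxPages As Integer = 100

        For page As Integer = 1 To maxPages
            If spider.NumUnspidered = 0 Then Exit For

            ' Empty the failed list so it reflects only this one crawl step.
            spider.ClearFailedUrls()

            ' The return value is ignored here; failures are read from the list below.
            spider.CrawlNext()

            ' Anything in the failed list now is attributed to LastUrl.
            For i As Integer = 0 To spider.NumFailed - 1
                sb_failed_url.Append(spider.LastUrl)
                sb_failed_url.Append(" -> ")
                sb_failed_url.AppendLine(spider.GetFailedUrl(i))
            Next
        Next

        Console.Write(sb_failed_url.ToString())
    End Sub
End Module
```

Each output line pairs the page in LastUrl with one failed URL recorded during that crawl step, so the page holding a bad link can be found without a second crawl.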