Hi, I am writing an application in which I need to crawl an HTML website and scrape data from it. I know I can write a spider using the CkSpider class to crawl the site, but how can I scrape data from the pages it finds? Which class can be used for that? Any help would be highly appreciated. Thank you!

asked May 06 '14 at 02:37

jo_himan


It depends. If the site being scraped uses XHTML, then each web page is technically XML and you can use any XML parser to help pick out the pieces of information you want. (Chilkat XML is one such XML API that could be used.)
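As a rough illustration of the XHTML case, here is a minimal sketch that uses Python's standard `xml.etree.ElementTree` parser rather than Chilkat XML. The sample page, the `extract_links` helper, and the choice of pulling out links are all assumptions made for the example; the idea is only that a well-formed XHTML page can be queried like any other XML document.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so queries must be namespace-aware.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page content standing in for a crawled XHTML document.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Product list</title></head>
  <body>
    <ul>
      <li class="item"><a href="/widgets/1">Widget</a></li>
      <li class="item"><a href="/gadgets/2">Gadget</a></li>
    </ul>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return (href, link text) pairs for every <a> element in the page."""
    root = ET.fromstring(xhtml_text)
    ns = {"x": XHTML_NS}
    return [
        (a.get("href"), (a.text or "").strip())
        for a in root.iterfind(".//x:a", ns)
    ]


links = extract_links(SAMPLE_PAGE)
for href, text in links:
    print(href, text)
```

This only works when the page really is well-formed XML; a strict XML parser will raise an error on ordinary tag-soup HTML.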

If the site returns HTML, which is typically not valid XML, then you could use the Chilkat HTML-to-XML API to convert the HTML to well-formed XML for programmatic digestion...
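To give a feel for what an HTML-to-XML conversion step involves, the sketch below builds a well-formed XML tree from loose HTML using only Python's standard `html.parser` and `xml.etree.ElementTree`. It is not the Chilkat HTML-to-XML API and makes no claim about how that API works internally; the `HtmlToTree` class, the `html_to_xml` helper, the wrapping `<root>` element, and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HtmlToTree(HTMLParser):
    """Builds an ElementTree from HTML, tolerating missing end tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("root")  # synthetic wrapper element
        self.stack = [self.root]        # currently open elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") get an empty string.
        elem = ET.SubElement(self.stack[-1], tag,
                             {k: (v or "") for k, v in attrs})
        if tag not in VOID_TAGS:
            self.stack.append(elem)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: add it, but do not leave it open.
        ET.SubElement(self.stack[-1], tag, {k: (v or "") for k, v in attrs})

    def handle_endtag(self, tag):
        # Close back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text following a child element goes in its tail
            parent[-1].tail = (parent[-1].tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html_text):
    """Return a well-formed XML string for the given HTML fragment."""
    parser = HtmlToTree()
    parser.feed(html_text)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


# Missing </p>, unquoted attribute, unclosed <br> and <img>.
xml_text = html_to_xml("<div><p>Price: <b>9.99</b><br>In stock<img src=logo.png></div>")
price = ET.fromstring(xml_text).find(".//b").text
print(xml_text)
```

Once the output is well-formed, the same XML queries used for XHTML apply. A sketch like this does not follow the full HTML parsing rules (for example, implied end tags for nested `<p>` or `<td>` elements), so a dedicated converter is the safer choice for real pages.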


answered May 06 '14 at 13:13

chilkat ♦♦
