Archived Forum Post


Question:

Extract Parameters from Certain HTML Tags?

Apr 16 '13 at 09:05

Do any of the controls have the ability to extract parameters from certain HTML tags? For example, if I wanted to extract all the href parameters from the <a> tags in an HTML document, is there a way to do that with MHT, etc.?


Answer:

One possible solution is to convert the HTML to well-formed XML using the Chilkat HTML-to-XML component/class, and then use an XML API (Chilkat XML if desired) to traverse the XML and collect the href values.
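
To illustrate only the traversal half of that approach, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree rather than Chilkat XML. It assumes the HTML has already been converted to well-formed XML by some earlier step, and that the elements carry no XML namespace. The sample document and the function name extract_hrefs are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: stands in for the output of an HTML-to-XML conversion step.
WELL_FORMED = """<html><body>
<p>See <a href="https://example.com/a">A</a> and <a href="/b">B</a>.</p>
<a name="anchor-only">no href here</a>
</body></html>"""


def extract_hrefs(xml_text):
    """Return the href value of every <a> element, in document order."""
    root = ET.fromstring(xml_text)
    # iter("a") walks the whole tree; <a> elements without an href are skipped.
    return [a.get("href") for a in root.iter("a") if a.get("href") is not None]


hrefs = extract_hrefs(WELL_FORMED)
for url in hrefs:
    print(url)
```

If the converted XML declares the XHTML namespace, the tag name passed to iter() would need to include that namespace, so this sketch would have to be adjusted.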

If using the Chilkat .NET API, there is also an undocumented (freeware) class named Chilkat.HtmlUtil which provides the following method:

Chilkat.StringArray HtmlUtil.GetHyperlinkedUrls(String html);

You pass in the HTML, and it returns a Chilkat.StringArray object containing the URLs found in the href attributes of the <a> tags.
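
For readers who are not on .NET, the same kind of result can be approximated without any Chilkat component. The following is a rough sketch using Python's standard html.parser module. It is not the Chilkat implementation and may differ from it in edge cases. The class name HrefCollector and the function name get_hyperlinked_urls are made up for this example.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects href attribute values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.urls.append(value)


def get_hyperlinked_urls(html):
    """Return the href values of all <a> tags in the given HTML, in order."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


sample = '<p><A HREF="x.html">x</A> <a name="n">n</a> <a href=y.html>y</a><br>'
for url in get_hyperlinked_urls(sample):
    print(url)
```

Because html.parser tolerates unclosed tags and unquoted attribute values, this works directly on ordinary HTML, with no prior conversion to XML.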