
Scraping Using Html Agility Pack

I am trying to scrape data from a news article using HtmlAgilityPack. The link is as follows: http://www.ndtv.com/india-news/vyapam-scam-documents-show-chief-minister-shivraj

Solution 1:

Do a 'View Source' on the page and search for com_user_text. It doesn't appear at all, which means the user comments are not part of the HTML the server sends. They are loaded via JavaScript after the page has loaded. So when you fetch the page content via getHtmlWeb.Load(), you get only the initial markup, without the user comments.
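The 'View Source' check can also be done programmatically. The following is only a rough sketch in Python's standard library, not HtmlAgilityPack code. It assumes com_user_text is a CSS class name, and both HTML strings are made-up stand-ins (not the real NDTV markup) for what the server sends and what a browser would hold after the script has run. The names ClassFinder, count_static_matches and loadComments are hypothetical.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Counts elements whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.matches = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.matches += 1


def count_static_matches(html, class_name):
    """Return how many elements in the given markup carry class_name."""
    parser = ClassFinder(class_name)
    parser.feed(html)
    parser.close()
    return parser.matches


# Hypothetical stand-in for the raw HTML a non-browser client receives.
STATIC_HTML = """
<html><body>
  <div class="article">Story text</div>
  <div id="comments"></div>
  <script>
    // A browser would run this and inject the comment markup.
    loadComments("#comments");
  </script>
</body></html>
"""

# Hypothetical stand-in for the DOM after a browser has run the script.
RENDERED_HTML = STATIC_HTML.replace(
    '<div id="comments"></div>',
    '<div id="comments"><p class="com_user_text">First comment</p></div>',
)

print(count_static_matches(STATIC_HTML, "com_user_text"))
print(count_static_matches(RENDERED_HTML, "com_user_text"))
```

A parser that never executes the script sees only the first document, so it finds no matching elements. The same reasoning applies to any HTML parser that works on the downloaded markup alone.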

As this answer says, Html Agility Pack cannot emulate a browser or run JavaScript. Instead, you need something like WatiN, which "allows programmatic access to web pages through a given browser engine and will load the full document."
