Retrieving Specific Data From A Website
I am currently building a scraper to scrape certain information from a website. For example, I would like to get a restaurant name, address, opening hours & telephone number fr
Solution 1:
Use the Simple HTML DOM parser for PHP: http://simplehtmldom.sourceforge.net/
Download here: http://sourceforge.net/projects/simplehtmldom/files/
Documentation here: http://simplehtmldom.sourceforge.net/manual.htm
In my experience, it is the best tool for parsing HTML with PHP.
Also, you don't need cURL to fetch the content unless you have a specific reason to use it. With Simple HTML DOM, just use:
$remote_html = file_get_html("http://www.somesite.com/");
Solution 2:
Take a look at XPath querying: http://php.net/manual/en/domxpath.query.php
I use the equivalent method for website scraping in C#. The same XPath standard applies here, and it works very well.
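As a rough sketch of the XPath approach, the snippet below runs DOMXPath queries against an inline HTML string. The markup, the class names (name, address, hours, phone) and the firstText helper are all made up for illustration; a real restaurant page will need its own queries, and you would load the fetched page source instead of the hard-coded string.

```php
<?php
// Hypothetical markup standing in for a restaurant page; real sites will differ.
$html = <<<'HTML'
<html><body>
  <div class="restaurant">
    <h1 class="name">Example Bistro</h1>
    <p class="address">1 Sample Street</p>
    <p class="hours">Mon-Fri 9:00-17:00</p>
    <p class="phone">555-0100</p>
  </div>
</body></html>
HTML;

$doc = new DOMDocument();
// Real-world HTML is often malformed; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Illustrative helper (not part of PHP): trimmed text of the first match, or null.
function firstText(DOMXPath $xpath, string $query): ?string
{
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

$restaurant = [
    'name'    => firstText($xpath, '//h1[@class="name"]'),
    'address' => firstText($xpath, '//p[@class="address"]'),
    'hours'   => firstText($xpath, '//p[@class="hours"]'),
    'phone'   => firstText($xpath, '//p[@class="phone"]'),
];

print_r($restaurant);
```

Returning null for a missing node keeps the scraper from failing outright when a page omits a field such as the opening hours.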