The Code
copy/* gets the data from a URL */ function get_data($url) { $ch = curl_init(); $timeout = 5; curl_setopt($ch,CURLOPT_URL,$url); curl_setopt($ch,CURLOPT_RETURNTRANSFER,1); curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,$timeout); $data = curl_exec($ch); curl_close($ch); return $data; }
The Usage
$returned_content = get_data('http://davidwalsh.name');copy$returned_content = get_data('http://davidwalsh.name');Alternatively you can use the PHP DOM:
$keywords = array();
$domain = array(‘http://davidwalsh.name‘);
$doc = new DOMDocument;
$doc->preserveWhiteSpace = FALSE;
foreach ($domain as $key => $value) {
@$doc->loadHTMLFile($value);
$anchor_tags = $doc->getElementsByTagName(‘a’);
foreach ($anchor_tags as $tag) {
$keywords[] = strtolower($tag->nodeValue);
}
}
Keep in mind this is not a tested piece of code, I took parts from a working script I have created and cut out several of the checks I’ve put in to remove whitespace, duplicates, and more.
$keywords = array();
$domain = array(‘http://davidwalsh.name‘);
$doc = new DOMDocument;
$doc->preserveWhiteSpace = FALSE;
foreach ($domain as $key => $value) {
@$doc->loadHTMLFile($value);
$anchor_tags = $doc->getElementsByTagName(‘a’);
foreach ($anchor_tags as $tag) {
$keywords[] = strtolower($tag->nodeValue);
}
}
Keep in mind this is not a tested piece of code, I took parts from a working script I have created and cut out several of the checks I’ve put in to remove whitespace, duplicates, and more.
No comments:
Post a Comment