How to create a link checker

by ArulKumar 2008-04-09 11:50:03

Creating a basic link checker script is not a very complicated task. If we think about it a bit, we can summarize the important sub-tasks as follows:



1. Create an HTML form to get the URL to check
2. Open the main URL and store its content in a string
3. Analyze the string, collect all URLs, and store them in an array
4. Go through all of the array elements (URLs) and check their validity
5. Display the result to the visitor



Step 1.



As a first step we focus on URL processing with PHP. The HTML part (sub-tasks 1 and 5) comes at the end of the tutorial.

You can open a URL with PHP's built-in fopen function and then read its content with fread. (Opening URLs this way requires the allow_url_fopen setting to be enabled.) In this tutorial we create a new function, getPage(), which accepts a URL string as its parameter. Inside this function we first try to open the URL and then read its content in 1 KB chunks, appending each chunk to the $content variable. At the end, $content holds the complete HTML code of the requested URL, and the function returns this string. The PHP code looks like this:



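<?php
function getPage($link){
    $content = ''; // returned unchanged if the URL cannot be opened

    if ($fp = fopen($link, 'r')) {
        // Read in 1 KB chunks until the end of the stream
        while (!feof($fp)) {
            $line = fread($fp, 1024);
            if ($line === false) break;
            $content .= $line;
        }
        fclose($fp);
    }

    return $content;
}
?>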
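
For comparison, PHP's built-in file_get_contents performs the open, read and close steps in a single call. The following is only a minimal sketch, assuming allow_url_fopen is enabled; the name getPageShort is hypothetical and not part of the tutorial's code.

```php
<?php
// Hypothetical alternative to getPage(): same idea, one built-in call.
function getPageShort($link) {
    // @ hides the warning on failure; file_get_contents() then returns false
    $content = @file_get_contents($link);

    // Return an empty string on failure so callers always get a string
    return ($content === false) ? '' : $content;
}
```

Like fopen, file_get_contents also accepts a local file path, which is handy for trying the function without a network connection.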



Step 2.

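Now we have the HTML code, so the next step is to create a function that analyzes this string and collects all URL references inside it. In HTML code the URLs are present inside an <a> tag, as the value of its href attribute.

So we need to find all <a tags, then the href that follows each one, and finally the text between the next pair of double quotes.

The function looks like this: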



<?php
function checkPage($content){
    $links = array();
    $textLen = strlen($content);

    if ($textLen > 10){
        $startPos = 0;

        // Find each "<a", then the href after it, then the quoted value
        while (($spos = strpos($content, '<a', $startPos)) !== false){
            $spos = strpos($content, 'href', $spos);
            if ($spos === false) break;
            $spos = strpos($content, '"', $spos);
            if ($spos === false) break;
            $spos++;
            $epos = strpos($content, '"', $spos);
            if ($epos === false) break;
            $startPos = $epos;
            $link = substr($content, $spos, $epos - $spos);
            // Only absolute http:// links are collected
            if (strpos($link, 'http://') !== false) $links[] = $link;
        }
    }

    return $links;
}
?>
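
The same search can also be written with a regular expression instead of repeated strpos calls. This is a sketch of a different technique, not the tutorial's own method: the name extractLinks is hypothetical, and keeping https:// links in addition to http:// ones is an assumption.

```php
<?php
// Hypothetical alternative to checkPage(), based on a regular expression.
function extractLinks($content) {
    $links = array();

    // Capture the double-quoted href value of every <a ...> tag
    if (preg_match_all('/<a\s[^>]*?href\s*=\s*"([^"]*)"/i', $content, $matches)) {
        foreach ($matches[1] as $link) {
            // Assumption: keep absolute http:// and https:// links only
            if (preg_match('#^https?://#i', $link)) {
                $links[] = $link;
            }
        }
    }

    return $links;
}
```

Like the strpos version, this only sees href values written in double quotes; relative links such as /about are skipped.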



Step 3.

The last PHP function we need checks whether a link is valid. To do this we use the fopen function again. This time we don't need the HTML content of the link: if fopen succeeds, we can say that the link is alive. The implementation is quite simple and similar to our first function:



<?php
function pingLink($domain){
    $file = @fopen($domain,"r");
    $status = -1;

    if (!$file) {
        $status = -1; // Site is down
    }
    else {
        $status = 1; // Site is alive
        fclose($file);
    }
    return $status;
}
?>
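
A link that never answers can keep fopen waiting for a long time. One possible way to limit this is to pass a stream context with a timeout. The sketch below is an assumption on our side: the name pingLinkTimeout and the 5 second default are hypothetical, while stream_context_create and the http timeout option are standard PHP.

```php
<?php
// Hypothetical variant of pingLink() that gives up after $seconds.
function pingLinkTimeout($url, $seconds = 5) {
    // 'timeout' is an option of PHP's http stream context (in seconds)
    $context = stream_context_create(array(
        'http' => array('timeout' => $seconds),
    ));

    $file = @fopen($url, 'r', false, $context);
    if (!$file) {
        return -1; // could not be opened: treat the link as down
    }

    fclose($file);
    return 1; // link is alive
}
```

The return values match pingLink(), so either function can be used in the final page.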



Step 4.

The only missing part is an environment for our new functions. We need to create an HTML page with a form where the visitor can provide the URL to check. After the form is submitted, the code takes the URL and calls our first and second functions to get the list of URLs. With this list we can build a table where each row shows a link and its status. To avoid a long wait with no feedback, we display each link's status as soon as it is known by calling the ob_flush function. This function forces PHP to send the current output buffer to the browser.
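
The page described above could be sketched roughly as follows. This is only one possible layout, not the original page: resultRow, showResults and linkCheckerPage are hypothetical names, the form markup is an assumption, and getPage(), checkPage() and pingLink() are the functions from the previous steps, which must be defined in the same file.

```php
<?php
// Hypothetical helper: builds one table row for a link and its status
// (1 = alive, -1 = down, the values returned by pingLink()).
function resultRow($link, $status) {
    $label = ($status == 1) ? 'OK' : 'DOWN';
    return '<tr><td>' . htmlspecialchars($link) . '</td><td>' . $label . "</td></tr>\n";
}

// Hypothetical helper: prints the table row by row and pushes each row
// to the browser as soon as its link has been checked.
function showResults(array $links, $checker) {
    echo "<table>\n";
    foreach ($links as $link) {
        echo resultRow($link, call_user_func($checker, $link));
        if (ob_get_level() > 0) ob_flush(); // send PHP's output buffer
        flush();                            // ask the server to pass it on
    }
    echo "</table>\n";
}

// Hypothetical page logic: show the form, or the results after submit.
function linkCheckerPage(array $post) {
    if (isset($post['url'])) {
        showResults(checkPage(getPage($post['url'])), 'pingLink');
    } else {
        echo '<form method="post"><input type="text" name="url"> ';
        echo '<input type="submit" value="Check"></form>';
    }
}
```

Calling linkCheckerPage($_POST); at the bottom of the page ties everything together. Whether the rows really appear one by one also depends on buffering in the web server and the browser.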



That's it!