How to Make a Simple Web Crawler Using PHP and MySQL

There are many ways to build a spider or crawler, and many languages to build it in. PHP is one of the most popular and widely supported languages for web programming, and there are numerous libraries and tools for building crawlers with it. This tutorial shows a simple approach: fetch a page, extract its title, meta description and links, and store them in a MySQL table. You may also like "How to get latitude and longitude from google map using PHP" and "How to create zip file after upload file using PHP".
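The script in this post inserts into a MySQL table named webpage with the columns main_url, title, description and link, so that table has to exist before the script runs. Here is a minimal sketch of one possible definition. Only the table and column names come from the script; the column types, the id column and the helper name webpage_table_sql() are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns a CREATE TABLE statement for the `webpage` table.
// The column names match the INSERT in the crawler script; the types and the
// auto-increment `id` column are assumptions, not part of the original post.
function webpage_table_sql(): string
{
    return "CREATE TABLE IF NOT EXISTS webpage (
        id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        main_url VARCHAR(2048) NOT NULL,
        title VARCHAR(512) NOT NULL DEFAULT '',
        description TEXT,
        link MEDIUMTEXT
    )";
}

// Usage, with placeholder credentials (replace with your own):
// $con = mysqli_connect("localhost", "db_user", "db_password", "db_name");
// mysqli_query($con, webpage_table_sql());
```

The link column is a large text type here because the script stores every link on the page as one comma-separated string. A separate table with one row per link would probably be easier to query, but that goes beyond this simple example.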

PHP Code

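<?php
    // Page to crawl (placeholder URL).
    $main_url = "http://sample.com";

    // Database connection. Host, user, password and database name are placeholders.
    $con = mysqli_connect("localhost", "db_user", "db_password", "db_name");
    if (!$con) {
        die("Connection failed: " . mysqli_connect_error());
    }

    // Download the page; file_get_contents() returns false on failure.
    $str = @file_get_contents($main_url);
    if ($str === false || strlen($str) == 0) {
        die("Could not fetch $main_url");
    }

    // Collapse whitespace so a title spread over several lines still matches,
    // and use a non-greedy pattern so the match stops at the first </title>.
    $flat = trim(preg_replace('/\s+/', ' ', $str));
    $title = "";
    if (preg_match('/<title[^>]*>(.*?)<\/title>/i', $flat, $match)) {
        $title = trim($match[1]);
    }

    // Read the meta description from the site root (scheme and host of $main_url).
    $description = "";
    $url = parse_url($main_url);
    if ($url !== false && isset($url['scheme'], $url['host'])) {
        $tags = @get_meta_tags($url['scheme'] . '://' . $url['host']);
        if (is_array($tags) && isset($tags['description'])) {
            $description = $tags['description'];
        }
    }

    // Parse the HTML; @ hides the warnings that real-world markup usually triggers.
    $doc = new DOMDocument;
    @$doc->loadHTML($str);

    // Collect the href of every <a> element that has one.
    $sec_url = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $sec_url[] = $anchor->getAttribute('href');
        }
    }
    $all_links = implode(",", $sec_url);

    // Use a prepared statement so quotes in the title or links cannot break the query.
    $stmt = mysqli_prepare($con, "INSERT INTO webpage (main_url, title, description, link) VALUES (?, ?, ?, ?)");
    mysqli_stmt_bind_param($stmt, "ssss", $main_url, $title, $description, $all_links);
    mysqli_stmt_execute($stmt);
    mysqli_stmt_close($stmt);
    mysqli_close($con);
?>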
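The script stores each href exactly as it appears in the page, so relative links such as /about are saved without a host, and duplicates and mailto: links are saved too. Here is a minimal sketch of cleaning the list before it is stored. The helper names resolve_link() and normalize_links() are hypothetical and not part of the original script. This is a simplified resolver, not a full RFC 3986 implementation: it does not collapse "../" segments or handle query-only links.

```php
<?php
// Hypothetical helper: turn an href into an absolute URL relative to $base.
// Returns null for links that should be skipped.
function resolve_link(string $base, string $href): ?string
{
    $href = trim($href);
    // Skip empty links, in-page anchors and a few non-HTTP schemes.
    if ($href === '' || $href[0] === '#' || preg_match('/^(mailto|javascript|tel):/i', $href)) {
        return null;
    }
    // Already an absolute HTTP(S) URL.
    if (preg_match('/^https?:\/\//i', $href)) {
        return $href;
    }
    $parts = parse_url($base);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }
    $root = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }
    // Protocol-relative link, e.g. //cdn.sample.com/x.js
    if (substr($href, 0, 2) === '//') {
        return $parts['scheme'] . ':' . $href;
    }
    // Root-relative link, e.g. /about
    if ($href[0] === '/') {
        return $root . $href;
    }
    // Otherwise resolve against the directory of the base path.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $href;
}

// Hypothetical helper: resolve every href and drop duplicates, keeping first-seen order.
function normalize_links(string $base, array $hrefs): array
{
    $seen = array();
    foreach ($hrefs as $href) {
        $url = resolve_link($base, $href);
        if ($url !== null) {
            $seen[$url] = true; // array keys are unique, so repeats collapse
        }
    }
    return array_keys($seen);
}
```

In the script, this could be applied just before the links are joined, for example with $all_links = implode(",", normalize_links($main_url, $sec_url));.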
