Google Search in an AJAX’ed way

I have good news and bad news today. The bad news is, I got hit by a f-king drunk driver in a traffic accident several days ago (Saigon’s traffic is pure shit) and had to undergo some surgery. The good news is, thanks to this accident I have some free time to think and tinker - sense and nonsense, just out of habit.

So yesterday, a pal of mine showed (off) a web project he’s taking part in - a site called “HeyGoo”. The site uses Google’s AJAX Search API (and maybe Yahoo’s too, I’m not sure) to find and display results in an AJAX’ed way - that is, no pagination involved; instead, users have to scroll to the bottom for the next results to show up. Rather interesting, eh?

Well, my friend seemed very proud of this project and was pinning all his hopes on its success, so I didn’t want to discourage him. But honestly, I don’t see it as any kind of revolution. Leaving aside the old design with tables everywhere, the annoying target="_blank" anchors and the so-cliché logo, I pointed out to him some problems the site was having:

  • The search results are not so good. I searched for something on Google and it returned some hundreds of results, while HeyGoo told me it couldn’t find any. I don’t know why - hey, at least it is using the Google Search API, no?
  • There are certainly times when a user wants to go to page, say, 3571, to see if his new forum has ever made it into Google. He can do that very easily with any “normal” search engine out there, but not with HeyGoo, where he must scroll 3570 times.
  • Biggest problem: the site currently doesn’t care about the results already loaded. New results are appended, appended, appended, while the old ones are still there at the top. Imagine how much memory the poor browser has to use up for this curious guy who has scrolled past 317 pages and counting.

With all these drawbacks against the only AJAX advantage (personally I don’t see it as an advantage but as another failure), I doubt this site will do anything big. Anyhow, as a friend I still wish it the best.

Now to the main part: despite my skepticism, I still find AJAX dynamic-scroll content interesting, and it might be useful sometimes. So I sat down and wrote my own (very basic) version of the AJAX search, where no pagination is involved and you must scroll for something new ;) It took me 30 minutes for the core, and some 30 more for the layout tweaks. Before getting into the code, you can see the final working result here.

It’s rather simple. The whole “trick” is done in 2 parts: the client-side JavaScript and the server-side PHP.

1. The JavaScript

As usual, jQuery is my choice. First, bind to the “submit” event of the search form to handle searching.

// shared state: is a request in flight, and has the result set run out?
var busy = false, expired = false;

$("#f").submit(function(e){
    e.preventDefault();
    // don't search if there's a search in action, or the form is not valid
    if (busy || !validateForm(this)) return false;
    $("#result").html("");
    $("#loading").fadeIn();
    var q = $("#q").val();
    // indicates that there's a search on the stage
    busy = true;
    // do an ajax call
    $.ajax({
        url: "search.php",
        data: $("#f").serialize(),
        success: function(msg){
            $("#loading").fadeOut("fast", function(){
                $("#result").html(msg);
                $(document).data("start", 0);  // the start position of result is 0, since this is the first page
                $(document).data("q", q);      // save the last used keywords
                expired = false;               // the search has not expired (just started ;) )
                busy = false;                  // not busy anymore, ready for the next search
            });
        },
        error: function(){
            $("#loading").fadeOut("fast", function(){
                // do something here?
                busy = false;
            });
        }
    });
});

With the Dimensions plugin now natively integrated into jQuery as of version 1.2.6, it takes 4 lines of code to detect whether the user has scrolled to the bottom of the page and go get the next results.

$(window).scroll(function(){
    // ">=" rather than "==": the scroll position may not land exactly on the bottom value
    if ($(window).scrollTop() >= $(document).height() - $(window).height()){
        searchMore();
    }
});
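
The bottom-of-page test boils down to comparing three numbers, so it can be pulled out into a plain function and tried without a browser. Here is a minimal sketch of that idea; isNearBottom and its threshold parameter are names I made up for illustration (they are not part of jQuery), and the optional threshold simply lets the next page start loading a little before the very bottom.

```javascript
// Hypothetical helper (not part of jQuery): decides whether the viewport
// has reached the bottom of the document, within `threshold` pixels.
function isNearBottom(scrollTop, viewportHeight, documentHeight, threshold) {
    var slack = threshold || 0; // default: must reach the exact bottom
    return scrollTop + viewportHeight >= documentHeight - slack;
}

// In the page it could be wired up like this (needs jQuery and a browser):
//   $(window).scroll(function(){
//       if (isNearBottom($(window).scrollTop(), $(window).height(),
//                        $(document).height(), 50)) searchMore();
//   });

console.log(isNearBottom(0, 600, 2000));        // false
console.log(isNearBottom(1400, 600, 2000));     // true
console.log(isNearBottom(1360, 600, 2000, 50)); // true
```

The 50-pixel threshold is an arbitrary example value, not something tuned.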

Now, the searchMore() function:

function searchMore()
{
    // if the site is busy with another search, or if the search has expired
    // (no more results can be found), don't search
    if (busy || expired) return;
    // if we don't have a previous search, don't search
    // (data("q") is undefined, not "", before the first search)
    if (!$(document).data("q")) return;
    // i'm working, i'm working...
    busy = true;
    // the start page is increased by 1
    $(document).data("start", $(document).data("start") + 1);
    $("#loading").fadeIn();
    // make another AJAX call
    $.ajax({
        url: "search.php",
        // pass an object so jQuery URL-encodes the keywords for us
        data: {
            q: $(document).data("q"),
            start: $(document).data("start") * 10 // multiplied by 10
        },
        success: function(msg){
            $("#loading").fadeOut("fast", function(){
                // we have the next result, perfect
                // append it into the current result
                $("#result").append(msg);
                busy = false; // not so busy anymore
            });
        },
        error: function(){
            $("#loading").fadeOut("fast", function(){
                // roll back the page counter so this page can be retried
                $(document).data("start", $(document).data("start") - 1);
                // release the lock, otherwise no further search can ever run
                busy = false;
            });
        }
    });
}
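
The request for the next page carries just two values: the saved keywords and a start offset of page × 10. As a small sketch of how that query string can be built by hand, the snippet below escapes the keywords with encodeURIComponent so that searches containing “&” or spaces survive the trip. buildSearchQuery and RESULTS_PER_PAGE are hypothetical names used only for this illustration.

```javascript
// Assumption taken from the post: each page holds 10 results.
var RESULTS_PER_PAGE = 10;

// Hypothetical helper: builds the query string for a given page number.
function buildSearchQuery(keywords, page) {
    return "q=" + encodeURIComponent(keywords) +
           "&start=" + (page * RESULTS_PER_PAGE);
}

console.log(buildSearchQuery("jquery", 1));      // q=jquery&start=10
console.log(buildSearchQuery("tom & jerry", 3)); // q=tom%20%26%20jerry&start=30
```

Without the escaping, the “&” in the second example would be read as a parameter separator and the server would only see “tom ” as the keywords.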

There, the JavaScript is almost done. Of course there is a “validateForm()” function, but I guess it’s not necessary to include it in this post…

2. The server stuff

In order to retrieve the search results from Google, there are at least 2 ways:

  1. Using its AJAX API. Unfortunately, for some mysterious reason, this didn’t work so well for me, as only the first 8 pages could be retrieved. Any attempt to get the remaining results was answered with a 400 Bad Request error. I guess I missed something, maybe API or ATM ;) but to hell with it.
  2. Making use of the traditional and very powerful cURL. My choice :)

Here is the entire content of my search.php page.

<?php
// read the parameters defensively; either may be missing from the request
$q     = isset($_GET['q']) ? $_GET['q'] : '';
$start = isset($_GET['start']) ? intval($_GET['start']) : 0;
$url = 'http://www.google.com/search?hl=en&q=' . urlencode($q) . '&start=' . $start;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string instead of printing it
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);
$body = curl_exec($ch);
curl_close($ch);
// do a simple preg_match to get the results
// let's pray that Google won't change its HTML structure for search results so soon ;)
// "#" is the delimiter so the "/" in </ol> needs no escaping;
// the "s" modifier lets "." match newlines in case the list spans several lines
preg_match('#<ol>(.*?)</ol>#is', $body, $matches);
// no match?
if (empty($matches) || empty($matches[1]))
{
    if ($start)
        die('<script>expired = true;</script><li>No more results were found.</li>'); // set the "expired" JS var to true if this is called from searchMore() function
    else
        die('<li>Your search did not match any documents.</li>');
}
// Google uses relative path for the image sources and video links on its search pages
// so we turn them into absolute links
echo utf8_encode(str_replace('"/', '"http://www.google.com/', $matches[1]));
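
For anyone who wants to poke at the extraction step without a PHP server, the same two operations can be sketched in JavaScript: grab whatever sits between <ol> and </ol>, then prefix root-relative links with the Google host. This is only an illustration that roughly mirrors the PHP above; extractResults is a hypothetical name, and the sample markup is made up, not real Google output.

```javascript
// Hypothetical helper roughly mirroring the PHP logic;
// returns null when no non-empty <ol>...</ol> is found.
function extractResults(html) {
    // [\s\S] matches any character including newlines; *? keeps the match lazy
    var match = /<ol>([\s\S]*?)<\/ol>/i.exec(html);
    if (!match || match[1] === "") return null;
    // turn root-relative URLs ("/...) into absolute ones
    return match[1].split('"/').join('"http://www.google.com/');
}

// Made-up sample markup, not real Google output
var sample = '<div><ol><li><a href="/url?q=x">X</a></li></ol></div>';
console.log(extractResults(sample));
// <li><a href="http://www.google.com/url?q=x">X</a></li>
console.log(extractResults("<p>nothing here</p>")); // null
```

Like the PHP version, this is a blunt string replacement: it rewrites every quoted value that starts with a slash, whether or not it is really a link.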

That’s all of the core part. Of course we must prepare the index HTML with the form, div, ol, img and so on before getting into action, but it’s not that tricky. So you can visit the result here and do some tests. Remember, this search tool is very basic, and I don’t have any intention of turning it into anything advanced ;)

