Resurrecting the Google SOAP API


Original Photo courtesy of Magicfantasy on Flickr: http://www.flickr.com/photos/manicfantasy/889984960/
I know what you’re thinking: Let Sleeping Corpses Lie, right?
Well, let me start off by saying I am not a huge fan of SOAP; in fact, I prefer REST about 90% of the time. However, there are still some cases where a fully controllable SOAP API, which simply lists results as XML wrapped in a SOAP envelope, is a better way of interacting with data, particularly when that data is updated less frequently and is larger in size. Seeing Google results as XML can make them easier to manipulate and exchange. And let’s not forget that XML is what the X in AJAX stands for anyway, even though JSON is used at least as often these days. With the debate between the SOAP and RESTful crowds all but won by REST and its simplicity, let’s go ahead and wake the sleeping dinosaur that is the Google Search SOAP API, for experimentation’s sake. For a quick history lesson: in 2006 the headlines rang to the predictable tune of “Google Drops the SOAP”, which Google eventually responded to by effectively saying “move on”, and former managers of Google’s SOAP APIs came out and explained “Why SOAP sucks”.
In choosing between web scraping and using the new AJAX Search API, I’ve opted for the more advisable Search API approach. The wrapper code for the Google AJAX Search API is written in PHP, using a simple client/server implementation, and refers to an old archive of the SOAP API’s WSDL here:
http://bcmoney-mobiletv.com/widgets/google/GoogleSearch.wsdl (here’s a sample SOAP request)
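For readers who have never seen one, here is a minimal sketch of how a request envelope for the doGoogleSearch operation could be built in PHP. The helper name buildSearchRequest is hypothetical, and the namespace URI and the single q parameter are assumptions borrowed from the wrapper code in this post rather than read out of the archived WSDL, so treat it as an illustration of the envelope shape only.

```php
<?php
/**
 * Build a minimal SOAP 1.1 request envelope for doGoogleSearch.
 * Assumption: the operation takes one <q> element and lives in the
 * namespace used by the wrapper in this post (not verified against the WSDL).
 */
function buildSearchRequest(string $term, string $ns = 'http://google.com/search/'): string
{
    $soapNs = 'http://schemas.xmlsoap.org/soap/envelope/';
    $doc = new DOMDocument('1.0', 'utf-8');
    $doc->formatOutput = true;

    // <soap:Envelope><soap:Body> ... </soap:Body></soap:Envelope>
    $envelope = $doc->createElementNS($soapNs, 'soap:Envelope');
    $doc->appendChild($envelope);
    $body = $doc->createElementNS($soapNs, 'soap:Body');
    $envelope->appendChild($body);

    // The operation element carries the (assumed) service namespace.
    $operation = $doc->createElementNS($ns, 'urn:doGoogleSearch');
    $body->appendChild($operation);

    // A text node is escaped on output, so &, < and > in the query are safe.
    $q = $doc->createElement('q');
    $q->appendChild($doc->createTextNode($term));
    $operation->appendChild($q);

    return $doc->saveXML();
}

echo buildSearchRequest('Mobile TV & radio');
```

In practice PHP's SoapClient generates this envelope for you from the WSDL; building it by hand is only useful for seeing what goes over the wire.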
Using this as the basis, the client code looks like the following:
<?php
// Query term (defaults to 'Mobile TV' if none is passed in)
$term = !empty($_REQUEST['q']) ? $_REQUEST['q'] : 'Mobile TV';

/**
 * CONTRACT-FIRST (WSDL)
 * Google Search API - PHP SOAP Web Service
 */
try {
    $client = new SoapClient('GoogleSearchSOAP.wsdl');
    $result = $client->__soapCall('doGoogleSearch', array(array('q' => $term)));
    // The server returns the raw JSON string, so it is decoded here
    $results_array = json_decode($result->GoogleSearchResults);
    $cursor = $results_array->responseData->cursor;
    $timestamp = date("D, d M Y H:i:s O"); // today's date (metadata and logging)

    header('Content-Type: text/xml; charset=utf-8');
    // Attribute quotes are escaped so the PHP string is not terminated early.
    // responseDetails sits at the top level of the JSON, next to responseData.
    echo "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:urn=\"http://google.com/search/\">
  <soap:Header />
  <soap:Body>
    <urn:GoogleSearchResult>
      <documentFiltering>false</documentFiltering>
      <searchComments>". xmlEscape($results_array->responseDetails) ."</searchComments>
      <estimatedTotalResultsCount>". xmlEscape($cursor->estimatedResultCount) ."</estimatedTotalResultsCount>
      <estimateIsExact>false</estimateIsExact>
      <searchQuery>". xmlEscape($term) ."</searchQuery>
      <startIndex>". xmlEscape($cursor->currentPageIndex) ."</startIndex>
      <endIndex>". xmlEscape($cursor->currentPageIndex) ."</endIndex>
      <searchTips>". urlencode($cursor->moreResultsUrl) ."</searchTips>
      <directoryCategories>
        <DirectoryCategory>
          <fullViewableName>". xmlEscape($results_array->responseData->results[0]->GsearchResultClass) ."</fullViewableName>
          <specialEncoding>UTF-8</specialEncoding>
        </DirectoryCategory>
      </directoryCategories>
      <searchTime>". $timestamp ."</searchTime>
      <resultElements>";

    $i = 0;
    foreach ($results_array->responseData->results as $rez) {
        // title and content contain HTML markup, so they are escaped to keep the XML well-formed
        echo "
        <ResultElement>
          <summary>". xmlEscape($rez->title) ."</summary>
          <URL>". xmlEscape($rez->unescapedUrl) ."</URL>
          <snippet>". xmlEscape($rez->content) ."</snippet>
          <title>". xmlEscape($rez->titleNoFormatting) ."</title>
          <cachedSize>". getCacheSize($rez->unescapedUrl, $i) ."</cachedSize>
          <relatedInformationPresent>false</relatedInformationPresent>
          <hostName>". xmlEscape($rez->visibleUrl) ."</hostName>
          <DirectoryCategory>
            <fullViewableName>". xmlEscape($rez->GsearchResultClass) ."</fullViewableName>
            <specialEncoding>UTF-8</specialEncoding>
          </DirectoryCategory>
          <directoryTitle>". xmlEscape($rez->GsearchResultClass) ."</directoryTitle>
        </ResultElement>";
        $i++;
    }

    echo "
      </resultElements>
    </urn:GoogleSearchResult>
  </soap:Body>
</soap:Envelope>";
} catch (Exception $e) {
    printf("Message = %s\n", $e->__toString());
}

// Escape a value for use as XML element content
function xmlEscape($value) {
    return htmlspecialchars((string) $value, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Download the page, save it as "<name>.html" and return its size in bytes
function getCacheSize($url, $name = "temp") {
    $unique_name = $name . ".html";
    $content = file_get_contents($url);
    $handle = fopen($unique_name, "w+");
    fwrite($handle, $content);
    fclose($handle);
    return filesize($unique_name);
}
?>
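Once the results are back in the old SOAP/XML shape, a consumer can pull them apart with ordinary XML tooling. The following is a rough sketch, assuming the element names echoed by the client script above (resultElements, ResultElement, title, URL); the extractResults helper and the sample envelope are hypothetical and hand-written for illustration.

```php
<?php
/**
 * Extract title/URL pairs from a GoogleSearchResult SOAP envelope.
 * Element names follow the envelope echoed by the client script.
 */
function extractResults(string $soapXml): array
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($soapXml)) {
        return []; // not well-formed XML
    }
    $xpath = new DOMXPath($doc);
    $items = [];
    // ResultElement and its children carry no namespace in this envelope,
    // so plain (unprefixed) XPath names match them.
    foreach ($xpath->query('//resultElements/ResultElement') as $node) {
        $items[] = [
            'title' => $xpath->evaluate('string(title)', $node),
            'url'   => $xpath->evaluate('string(URL)', $node),
        ];
    }
    return $items;
}

// Hand-written sample response, for illustration only.
$sample = <<<XML
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="http://google.com/search/">
  <soap:Body>
    <urn:GoogleSearchResult>
      <resultElements>
        <ResultElement><title>Example</title><URL>http://example.com/</URL></ResultElement>
      </resultElements>
    </urn:GoogleSearchResult>
  </soap:Body>
</soap:Envelope>
XML;

print_r(extractResults($sample));
```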
The server side, which wraps the required results, looks like this:
<?php
/**
 * google_search_api()
 * Query Google AJAX Search API
 *
 * @param array  $args     URL arguments. For most endpoints only "q" (query) is required.
 * @param string $referer  Referer to use in the HTTP header (must be valid).
 * @param string $endpoint API endpoint. Defaults to 'web' (web search).
 * @return string|false raw JSON response body, or false on failure
 */
function google_search_api($args, $referer = 'http://bcmoney-mobiletv.com/widgets/google/', $endpoint = 'web') {
    $url = "http://ajax.googleapis.com/ajax/services/search/" . $endpoint;
    if (!array_key_exists('v', $args)) {
        $args['v'] = '1.0'; // default to version 1.0 if none specified
    }
    $url .= '?' . http_build_query($args, '', '&');

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // note that the referer *must* be set
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    $body = curl_exec($ch);
    curl_close($ch);

    // return the raw JSON; the client decodes it with json_decode()
    return $body;
}

/**
 * doGoogleSearch
 * SOAP operation
 * Performs a search using the new Google AJAX Search API JSON response format
 * and wraps it into the previously expected SOAP/XML "ResultElement"
 *
 * @param object $param request object whose "q" property is a UTF-8 search term
 * @return array 'GoogleSearchResults' => raw JSON response string
 */
function doGoogleSearch($param) {
    $result = google_search_api(array(
        'q' => $param->q,
    ));
    $results = array('GoogleSearchResults' => $result);
    return $results;
}

$server = new SoapServer('GoogleSearchSOAP.wsdl');
$server->addFunction('doGoogleSearch');
$server->handle();
?>
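To see exactly which URL the server code requests, the URL-building step can be pulled out and run on its own without touching the network. This is just a sketch: buildSearchUrl is a hypothetical helper that mirrors the first few lines of google_search_api() and is not part of the download.

```php
<?php
/**
 * Build the request URL the same way google_search_api() does,
 * without performing the HTTP call.
 */
function buildSearchUrl(array $args, string $endpoint = 'web'): string
{
    if (!array_key_exists('v', $args)) {
        $args['v'] = '1.0'; // same default as the server code
    }
    return 'http://ajax.googleapis.com/ajax/services/search/' . $endpoint
        . '?' . http_build_query($args, '', '&');
}

echo buildSearchUrl(['q' => 'Mobile TV']), "\n";
// http://ajax.googleapis.com/ajax/services/search/web?q=Mobile+TV&v=1.0
```

Note that http_build_query() takes care of URL-encoding the query term, which is why the space in "Mobile TV" comes out as a plus sign.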
You can download it below and experiment for yourself, but do not misuse the Google AJAX Search API and be sure to follow the Terms of Service. Just in time for Halloween, it’s back from the dead… booo!!!
DISCLAIMER TO GOOGLE:
If this is a misuse of the AJAX Search API, please don’t ban my account; send me a message on here instead and I will take this down immediately! Based on the discussions here and here, as well as the Terms of Service themselves, I understood that as long as we do not change the ranking of the results, we may present Google results to the end user only via the AJAX Search API. If that is a misunderstanding, or if use of the AJAX Search API requires the specific CSS/styling, in HTML format, with no metadata, then I did misunderstand and will promptly remove these demo scripts. They were merely meant to explore the possibility of resurrecting the Google SOAP API in theory, not to provide a separate endpoint from Google.com for viewing search results or an alternative to the official Google Search API.
