Neil Crookes

Learnings and Teachings on Web Application Development & CakePHP

Jan 30

CakePHP Site Search with Yahoo! BOSS

A complete turnkey solution for integrating Yahoo! BOSS powered site search functionality into your CakePHP application.

Providing the ability for users to search your whole site for information is pretty damn important.

You can build your own search functionality by querying your database, but as the number of data types or tables in your db grows and their relationships get more complex, this gets harder to do, and harder still to do well.

An alternative is something like Lucene, but isn't there a better solution that's easier to implement? Of course there is.

What you need is search functionality with the search and ranking technology of a search engine, but with results restricted to your own site, and ideally without sponsored links, branding or anything like that.

What you need to do is Build your Own Search Service.

Well, actually, you don’t, because Yahoo! has done it for you!

In Yahoo!'s own words: "BOSS (Build your Own Search Service) is Yahoo!’s open search web services platform. The goal of BOSS is simple: to foster innovation in the search industry. Developers, start-ups, and large Internet companies can use BOSS to build and launch web-scale search products that utilize the entire Yahoo! Search index. BOSS gives you access to Yahoo!’s investments in crawling and indexing, ranking and relevancy algorithms, and powerful infrastructure. By combining your unique assets and ideas with our search technology assets, BOSS is a platform for the next generation of search innovation, serving hundreds of millions of users across the Web."

So that’s the hard part covered. The next step is integrating it into your CakePHP application, and the good news is I’ve done that for you as well.

I’ve written a CakePHP datasource for the Yahoo! BOSS service that uses CakePHP’s built-in HttpSocket class to make requests and provides both web search and spelling suggestion functionality.

The web search can be limited to one or more sites, and each result includes key terms related to it.
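
To give a rough idea of what such a request looks like, here is a minimal sketch in plain PHP that builds the web search and spelling suggestion URLs. It is not the datasource's actual code. The endpoint URLs and the appid, format, sites, view, start and count parameters are assumptions based on the BOSS v1 documentation as I remember it, so check them against Yahoo!'s docs. The function names are made up for illustration.

```php
<?php
// Assumed BOSS v1 endpoints; verify against Yahoo!'s documentation.
define('BOSS_WEB_URL', 'http://boss.yahooapis.com/ysearch/web/v1/');
define('BOSS_SPELLING_URL', 'http://boss.yahooapis.com/ysearch/spelling/v1/');

/**
 * Build a web search URL restricted to the configured site(s).
 * $config has the same keys as the yahooBoss array in database.php.
 */
function bossWebSearchUrl($term, $config, $start = 0, $count = 10) {
    $params = array(
        'appid'  => $config['app_id'],
        'format' => 'json',
        // 'sites' may be one string or an array; it is passed through unchanged
        'sites'  => implode(',', (array) $config['sites']),
        'view'   => 'keyterms', // assumed flag asking for key terms per result
        'start'  => $start,
        'count'  => $count,
    );
    return BOSS_WEB_URL . rawurlencode($term) . '?' . http_build_query($params, '', '&');
}

/**
 * Build a spelling suggestion URL for a (possibly misspelt) term.
 */
function bossSpellingUrl($term, $config) {
    $params = array('appid' => $config['app_id'], 'format' => 'json');
    return BOSS_SPELLING_URL . rawurlencode($term) . '?' . http_build_query($params, '', '&');
}

$config = array('sites' => 'http://your.site.here', 'app_id' => 'your_app_id_here');
echo bossWebSearchUrl('CakePHP', $config), "\n";
echo bossSpellingUrl('CakePHO', $config), "\n";
```

Inside a datasource, a URL like this would be fetched with HttpSocket's get() method and the response body decoded before being handed back to the model.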

The other cool thing is that it can easily be used in conjunction with custom paginateCount and paginate model methods to make use of CakePHP’s built-in pagination controller logic and helpers.

The datasource itself, and all the files you need to integrate the functionality into your site, are available on my GitHub account. The files are:
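
As an illustration of how those two methods fit together, here is a minimal sketch of a model with custom paginate and paginateCount methods. The method signatures are the ones CakePHP's controller pagination calls when a model defines them. Everything else is an assumption: in a real app the class would extend AppModel and ask the datasource for results, whereas here search() is a hypothetical stand-in that fakes 25 hits so the paging arithmetic can be run on its own, and the 'term' condition key is made up.

```php
<?php
class Search {
    // Hypothetical stand-in for the datasource call: fakes 25 hits in total
    // and returns the slice starting at $start with at most $count results.
    function search($term, $start, $count) {
        $total = 25;
        $results = array();
        for ($i = $start; $i < min($start + $count, $total); $i++) {
            $results[] = array('title' => $term . ' result ' . ($i + 1));
        }
        return array('total' => $total, 'results' => $results);
    }

    // Called by the controller's paginate() to fetch one page of results.
    function paginate($conditions, $fields, $order, $limit, $page = 1, $recursive = null, $extra = array()) {
        $start = ($page - 1) * $limit; // page 1 starts at offset 0
        $response = $this->search($conditions['term'], $start, $limit);
        return $response['results'];
    }

    // Called by the controller's paginate() to get the total number of
    // results, which the paginator helper needs to build the page links.
    function paginateCount($conditions = null, $recursive = 0, $extra = array()) {
        $response = $this->search($conditions['term'], 0, 1);
        return $response['total'];
    }
}

$Search = new Search();
$conditions = array('term' => 'CakePHP');
$pageThree = $Search->paginate($conditions, null, null, 10, 3);
echo count($pageThree), ' of ', $Search->paginateCount($conditions), "\n";
```

The point of the design is that the controller and the paginator helper never need to know the results come from a web service rather than a database table.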

  • app/config/database.php
    Merge with your existing database.php file.
  • app/config/routes.php
    Merge with your existing routes.php file. Gives you nice urls like http://domain.com/search/<term>
  • app/controllers/searches_controller.php
    Contains the results() action
  • app/models/datasources/yahoo_boss_source.php
    Where the magic (it’s pretty simple actually) happens
  • app/models/search.php
    Calls methods in the datasource
  • app/views/searches/results.ctp
    Search results view
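
For the routes.php merge, a route along the following lines would produce those URLs. This is a sketch rather than the exact line from the repository; it assumes the results() action accepts the search term as its first passed argument.

```php
<?php
// app/config/routes.php (fragment)
// Send /search and /search/<term> to SearchesController::results().
Router::connect('/search/*', array('controller' => 'searches', 'action' => 'results'));
```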

So, to add it to your app:

  1. Copy the files into your app
  2. Register for a Yahoo! developer app ID
  3. Add it to the yahooBoss config array in app/config/database.php
  4. Set the value of the ‘sites’ key in the config array to your own site

As follows:

var $yahooBoss = array(
  'datasource' => 'yahoo_boss', // use the Yahoo! BOSS datasource
  'sites' => 'http://your.site.here', // restrict results to this site
  'app_id' => 'your_app_id_here', // your Yahoo! developer app ID
);

Now point your browser to http://your.site.here/search

See it in action. It’s configured to search http://www.neilcrookes.com. Try searching for CakePHP, and to see the spelling suggestion, search for CakePHO. Note that I’ve hidden the key terms in the search results, but you can view source to see what they look like.

Are there any disadvantages? A couple that I’ve noticed, but you can live with them. First, the Ts and Cs of Yahoo! BOSS say you have to use the click URL they send you in the search results (they send you the real URL too). This is a link to Yahoo!, who then redirect the user to the proper page on your site. I think it’s for link tracking or something. Second, it highlights how crap your page titles are!
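
In a view, that usually means linking to the click URL while displaying the real one. Here is a minimal sketch in plain PHP. The field names clickurl, url and title are assumptions based on the BOSS v1 response format as I remember it, and renderResult() is a made-up helper, not part of the files above.

```php
<?php
// Render one search result as an HTML link.
// The link target is the click URL (as the terms require);
// the real URL is only shown as text next to it.
function renderResult($result) {
    $href  = htmlspecialchars($result['clickurl'], ENT_QUOTES, 'UTF-8');
    $shown = htmlspecialchars($result['url'], ENT_QUOTES, 'UTF-8');
    // Strip any markup from the title before escaping it.
    $title = htmlspecialchars(strip_tags($result['title']), ENT_QUOTES, 'UTF-8');
    return '<a href="' . $href . '">' . $title . '</a> <cite>' . $shown . '</cite>';
}

// Made-up example data, for illustration only.
$result = array(
    'clickurl' => 'http://example.com/click?id=1&to=2',
    'url'      => 'http://your.site.here/some-page',
    'title'    => 'Some <b>CakePHP</b> page',
);
echo renderResult($result), "\n";
```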
