CakePHP Site Search with Yahoo! BOSS
A complete turnkey solution for integrating Yahoo! BOSS powered site search functionality into your CakePHP application.
Providing the ability for users to search your whole site for information is pretty damn important.
You can build your own search functionality by querying your database, but as the number of data types or tables in your db grows and their relationships get more complex, this gets harder to do, and harder still to do well.
An alternative is something like Lucene, but isn’t there a better solution that’s easier to implement? Of course there is.
What you need is search functionality with the search and ranking technology of a real search engine, but with results restricted to your own site, and ideally without having to include sponsored links or any branding.
What you need to do is Build your Own Search Service.
Well, actually, you don’t, because Yahoo! has done it for you!
BOSS (Build your Own Search Service) is Yahoo!’s open search web services platform. The goal of BOSS is simple: to foster innovation in the search industry. Developers, start-ups, and large Internet companies can use BOSS to build and launch web-scale search products that utilize the entire Yahoo! Search index. BOSS gives you access to Yahoo!’s investments in crawling and indexing, ranking and relevancy algorithms, and powerful infrastructure. By combining your unique assets and ideas with our search technology assets, BOSS is a platform for the next generation of search innovation, serving hundreds of millions of users across the Web.
So that’s the hard part covered. The next step is integrating it into your CakePHP application, and the good news is I’ve done that for you as well.
I’ve written a CakePHP datasource for the Yahoo! BOSS service that uses CakePHP’s built-in HttpSocket class to make requests and provides both web search and spelling suggestion functionality.
The web search can be limited to one or more sites, and the results contain key terms related to the result.
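To make the request side concrete, here is a minimal sketch of how URLs for the web search and spelling services might be built. It is not the datasource's actual code: the function names are hypothetical, and the base URL and the parameter names (appid, format, start, count, view, sites) are assumptions based on the BOSS v1 URL pattern, so check them against Yahoo!'s documentation.

```php
<?php
// Assumed BOSS v1 base URL; verify against Yahoo!'s documentation.
define('BOSS_BASE_URL', 'http://boss.yahooapis.com/ysearch');

// Hypothetical helper: builds a URL for one BOSS service ('web' or 'spelling').
function bossUrl($service, $query, $appId, $params = array())
{
    $params = array_merge(array('appid' => $appId, 'format' => 'xml'), $params);
    return BOSS_BASE_URL . '/' . $service . '/v1/' . rawurlencode($query)
        . '?' . http_build_query($params, '', '&');
}

// Web search, optionally restricted to one or more sites (string or array).
function bossWebSearchUrl($query, $appId, $sites = array(), $start = 0, $count = 10)
{
    // 'view' => 'keyterms' is assumed to be what asks for key terms per result.
    $params = array('start' => $start, 'count' => $count, 'view' => 'keyterms');
    $sites = array_filter((array) $sites);
    if ($sites) {
        // Assumed format: a comma-separated list of sites.
        $params['sites'] = implode(',', $sites);
    }
    return bossUrl('web', $query, $appId, $params);
}

// Spelling suggestion for a (possibly misspelled) term.
function bossSpellingUrl($query, $appId)
{
    return bossUrl('spelling', $query, $appId);
}

echo bossWebSearchUrl('CakePHP', 'your_app_id_here', 'www.neilcrookes.com'), "\n";
echo bossSpellingUrl('CakePHO', 'your_app_id_here'), "\n";
```

Inside a datasource, a URL like this would typically be handed to HttpSocket's get() method and the response parsed into an array.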
The other cool thing is that it can easily be used with custom paginate and paginateCount model methods, so you can make use of CakePHP’s built-in pagination controller logic and helpers.
The datasource itself, and all the files you need to integrate the functionality into your site, are available on my GitHub account. The files are:
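As a rough illustration of the pagination hook, the sketch below uses the paginate() and paginateCount() method signatures that CakePHP's controller pagination looks for on a model. Everything else is an assumption made so the example runs on its own: StubBossSource, its search() method, the 'q' condition key and the shape of the response are all hypothetical, and the real Search model would extend AppModel and talk to the Yahoo! BOSS datasource instead.

```php
<?php
// Stand-in for the real datasource so this sketch runs outside CakePHP.
// The search() method and its return shape are hypothetical.
class StubBossSource
{
    var $hits = array();

    function search($query, $start, $count)
    {
        return array(
            'total'   => count($this->hits),
            'results' => array_slice($this->hits, $start, $count),
        );
    }
}

// In a real app this would extend AppModel and use the yahooBoss config.
class Search
{
    var $source;

    function __construct($source)
    {
        $this->source = $source;
    }

    // Called by the controller's pagination logic instead of a normal find.
    function paginate($conditions, $fields, $order, $limit, $page = 1, $recursive = null, $extra = array())
    {
        // Pages are 1-based, the search offset is 0-based.
        $start = ($page - 1) * $limit;
        $response = $this->source->search($conditions['q'], $start, $limit);
        return $response['results'];
    }

    // Called to work out how many pages there are in total.
    function paginateCount($conditions = null, $recursive = 0, $extra = array())
    {
        $response = $this->source->search($conditions['q'], 0, 1);
        return $response['total'];
    }
}

$source = new StubBossSource();
$source->hits = range(1, 25);
$search = new Search($source);
print_r($search->paginate(array('q' => 'CakePHP'), null, null, 10, 3));
echo $search->paginateCount(array('q' => 'CakePHP')), "\n";
```

The point of the design is that the controller and the paginator helper need no special handling: they ask the model for a page and a total, and the model translates that into a start offset and a result count for the search service.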
- app/config/database.php: Merge with your existing database.php file.
- app/config/routes.php: Merge with your existing routes.php file. Gives you nice URLs like http://domain.com/search/<term>
- app/controllers/searches_controller.php: Contains the results() action
- app/models/datasources/yahoo_boss_source.php: Where the magic (it’s pretty simple actually) happens
- app/models/search.php: Calls methods in the datasource
- app/views/searches/results.ctp: Search results view
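For orientation, a route producing URLs of that shape could look something like the fragment below. This is a sketch using CakePHP's standard Router::connect() call, not necessarily the exact route shipped in the repository.

```php
<?php
// app/config/routes.php (sketch; the route in the repository may differ).
// Sends /search and /search/<term> to SearchesController::results(),
// with <term> passed to the action as its first argument.
Router::connect('/search/*', array('controller' => 'searches', 'action' => 'results'));
```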
So, to add it to your app:
- Copy the files into your app
- Register for a Yahoo! developer app ID
- Add it to the yahooBoss config array in app/config/database.php
- Set the value of the ‘sites’ key in the config array to your own site
As follows:
var $yahooBoss = array(
    'datasource' => 'yahoo_boss',
    'sites' => 'http://your.site.here',
    'app_id' => 'your_app_id_here',
);
Now point your browser to http://your.site.here/search
See it in action. It’s configured to search http://www.neilcrookes.com. Try a search for CakePHP, and to try the spelling suggestion, search for CakePHO. Note that I’ve hidden the key terms in the search results, but you can view source to see what they look like.
Are there any disadvantages? A couple I’ve noticed, but you can live with them. First, the Ts and Cs of Yahoo! BOSS say you have to use the click URL they send you in the search results (they send you the real URL too). The click URL is a link to Yahoo!, which then redirects the user to the proper page on your site; I think it’s for link tracking or something. Second, it highlights how crap your page titles are!
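A small sketch of what honouring the click URL requirement could look like when rendering a result: link through the click URL, but show the real URL as text. The function is hypothetical, and the field names (clickurl, url, title) and the sample values are assumptions about the result format rather than something taken from the datasource.

```php
<?php
// Hypothetical view helper; field names are assumed, not confirmed.
function renderResultLink($result)
{
    // Link target: the tracking URL the terms require you to use.
    $href = htmlspecialchars($result['clickurl'], ENT_QUOTES, 'UTF-8');
    // Display text: the real URL of the page on your site.
    $shown = htmlspecialchars($result['url'], ENT_QUOTES, 'UTF-8');
    // Titles may contain highlighting markup; strip it before escaping.
    $title = htmlspecialchars(strip_tags($result['title']), ENT_QUOTES, 'UTF-8');

    return '<a href="' . $href . '">' . $title . '</a> <span class="url">' . $shown . '</span>';
}

// Made-up sample result for illustration only.
$result = array(
    'clickurl' => 'http://example.com/redirect?to=abc&id=1',
    'url'      => 'http://www.neilcrookes.com/',
    'title'    => 'Neil Crookes on <b>CakePHP</b>',
);
echo renderResultLink($result), "\n";
```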


16 Responses so far
January 30th, 2009
2:43 pm
That’s very nifty! Thanks for sharing :-)
January 31st, 2009
6:01 am
[...] Neil Crookes » CakePHP Site Search with Yahoo! BOSS A CakePHP datasource and associated controllers to create a simple (to implement), paginated Yahoo powered site search. (tags: cakephp yahoo search datasource) [...]
February 10th, 2009
3:01 am
[...] through the site and can be used to redirect them back to previous pages. Next is Neil’s CakePHP implementation of Yahoo!’s BOSS, which right now is implemented as a set of files that you drop into your [...]
February 10th, 2009
8:18 am
Nice work, Neil. I hacked together an implementation of this as well last summer, and well, yours is much better and more Cake-ish. I can learn from this, thanks!
February 10th, 2009
9:26 pm
Hey Marc, cheers mate, glad I can return the favour.
February 11th, 2009
5:57 am
[...] the CakePHP digest I posted the other day I linked to Neil Crookes’ CakePHP datasource for using Yahoo! Search BOSS. BOSS stands for Build your Own Search Service and is a cool way to [...]
May 7th, 2009
4:55 am
Really great work Neil! I was looking to implement BOSS on CakePHP as I previously used a searchable behavior, but that gets quite cumbersome when trying to index multiple models. Then I saw that you’d already done the hard yards. Thanks!
May 7th, 2009
6:38 am
Multiple sites did not work for me as an array. I fixed this. If you want the updated yahoo_boss_source.php file email me and I’ll send to you as I couldn’t submit to github.
July 1st, 2009
9:10 am
Thanks for this! I am looking forward to start using it!
I have a question. For this to work, does my site need to be in Yahoo!’s index already? I guess so, or does it start crawling my site as soon as I start using this?
Thanks!
July 13th, 2009
10:35 pm
Thanks, this is excellent! Rarely does anything work so easily and with so little setup. The only issue I had was with a $startQuote / $endQuote being undefined but I just added those variables to the dataSource and all behaved just fine.
Thanks again, this saved a lot of time!
July 22nd, 2009
10:52 pm
If I wanted to use BOSS to search the web instead of my site, should I just leave the ‘sites’ value blank?
August 9th, 2009
12:14 am
Does yahoo need to index your site for this to work? I have it working on all sites but my newest that isn’t being read on yahoo search either.
Thanks
August 9th, 2009
11:12 pm
Thanks for your comments everyone.
Yes, Yahoo needs to index your site first for this to work.
To use it to search the web, I think you can probably just leave the sites value blank. Check the Yahoo BOSS documentation for more information.
October 1st, 2009
10:31 am
Thank you very much, this will help us a lot and will save us a lot of time, effort, and resources in terms of implementation!
Thanks again neil!
November 21st, 2009
10:51 am
[...] and provided a single results set from multiple models/sources. Normally I’d use the CakePHP Yahoo BOSS site search I wrote and blogged about previously, but this particular app requires users to login to access [...]