- 20% of the input creates 80% of the result
- 20% of the workers produce 80% of the result
- 20% of the customers create 80% of the revenue
- 20% of the bugs cause 80% of the crashes
- 20% of the features cause 80% of the usage
Friday, December 25, 2009
The Pareto principle - 80-20 rule
The Pareto principle is the observation (not a law) that most things in life are not distributed evenly.
Monday, December 14, 2009
Webdriver with Lift
A briefing on WebDriver is here:
Applying LiFT makes tests more readable:
LiFT allows writing automated tests in a style that makes them very readable, even for non-programmers. Using the LiFT API, we can write tests that read almost like natural language, allowing business requirements to be expressed very clearly. This aids communication amongst developers and customers, helping give all stakeholders confidence that the right things are being tested.
See https://lift.dev.java.net/

Wednesday, December 2, 2009
Troubleshooting Oracle connections related problem
/* Display the sessions created for this db connection (one row per session) */
select username, schemaname, osuser, machine, terminal, program, type, module, event, service_name, sid, command, status from v$session;
/* Check whether resource limits are enforced */
SELECT name, value FROM gv$parameter WHERE name = 'resource_limit';

maven-release-plugin
I think one of the nicer things about Maven is its pluggability (apart from the ability to have multiple profiles).
The Maven release plugin abstracts a bunch of tasks that have to be performed before a release. This plugin comes by default with Maven 2.x.
This simple command does all the hard work for you:
mvn release:clean release:prepare release:perform
This cleans any previous release config, tags the release, increments the snapshot version, commits the changes and finally sends the artefacts to the proxy Maven repository (like Archiva).
Gotchas:
1. This happened when I tried using the release feature for the first time.
It would fail saying it cannot tag the release, in spite of being able to commit the code with the version changes.
Solution:
<plugin>
<artifactId>maven-release-plugin</artifactId>
<version>2.0-beta-9</version>
</plugin>
Also check the scm config in your pom
<scm>
<connection>scm:svn:svn://svn.company.com/my_project/trunk</connection>
<developerConnection>scm:svn:svn://svn.company.com/my_project/trunk</developerConnection>
<url>http://svn.company.com/my_project/trunk</url>
</scm>

Tuesday, November 24, 2009
Spring interceptor ordering
SimpleUrlHandlerMapping uses a HashMap to hold the interceptors. Ordering can only be guaranteed by setting the order property. By default the interceptors come out in whatever order the HashMap returns them, which is effectively random.
http://kickjava.com/src/org/springframework/web/servlet/handler/SimpleUrlHandlerMapping.java.htm
extends
http://kickjava.com/src/org/springframework/web/servlet/handler/AbstractHandlerMapping.java.htm
I was wondering how such things go live: if the order happens to come out right during testing, how would the problem ever get caught, given that the order is random unless it is set explicitly?
Maybe Spring should warn us upfront (or maybe it does?).
Useful link : http://forum.springsource.org/showthread.php?t=51281
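To make the point about HashMap ordering concrete, here is a minimal plain-Java sketch. It does not use Spring at all: the interceptor names and order values are made up for illustration, and namesInOrder is a hypothetical helper. It only shows that a HashMap makes no promise about iteration order, while sorting on an explicit order value always gives the same result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderedInterceptors {

    /** Returns the registered names sorted by their explicit order value, lowest first. */
    static List<String> namesInOrder(Map<String, Integer> orderByName) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(orderByName.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            names.add(entry.getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical interceptor names mapped to an explicit order value.
        Map<String, Integer> orderByName = new HashMap<>();
        orderByName.put("logging", 30);
        orderByName.put("security", 10);
        orderByName.put("locale", 20);

        // HashMap iteration order is unspecified and may differ between JVMs.
        System.out.println("HashMap iteration: " + orderByName.keySet());
        // Sorting on the explicit order value is deterministic.
        System.out.println("Explicit order:    " + namesInOrder(orderByName));
    }
}
```

The design choice is simply to never rely on the container for ordering and to carry the order as data.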

Wednesday, November 4, 2009
Subversion patch release
Releasing a patch to an already released (tagged) version takes a few steps.
Though it takes time, it seems like a best practice to keep the code in sync across tags, branches and trunk.
The details:
a. Copy the latest released tag into a branch B.
b. Fix the bug in branch B.
c. Copy the fixed branch B to a new tag (tag+1).
d. Merge the branch into trunk.
e. Remove the branch.

Thursday, September 10, 2009
Ubuntu - linux
Set up Ubuntu today.
First impressions:
- Very simple to install (very simple partition manager).
- GUI looks similar to Fedora.
- Music started playing without any messy configuration (I had to struggle to get it working on Fedora).
The bundled radio stations started singing the melody...
- Was looking around for a cool Ubuntu equivalent of the "easylife" installer in Fedora.
sudo aptitude install [application name] looks reasonable but is not as cool as easylife.
- Windows XP gets detected automatically and its drives get auto-mounted.
- More to come...
Useful links:
Linux problem determination
Show all running processes

OSGi - Open Services Gateway Initiative
Basics:
Distributed in the form of a bundle.
Ref: http://www.javaworld.com/javaworld/jw-03-2008/jw-03-osgi1.html?page=1
Take it to the server side
An interesting read to start off web app development using OSGi with Tomcat:
http://www.javaworld.com/javaworld/jw-06-2008/jw-06-osgi3.html?page=1
Spring DM (dynamic module) + osgi

Cleanup dirty code using the power of regular expressions
In Eclipse, search for all the irrelevant comments and clean them up:
/\*\s(\r\n.*\r\n)+(\s)+\*.*(@see)*.*(\r\n)+(\s)+\*/
or, to find all the blank comments, use:
(?m:)/\*\*(\r\n)+(\s)+\* @[^/]*/
Note: m tells Eclipse to search across multiple lines.
E.g.:
/*
*
* @see ...
*/
A comment like the one above can be found and cleaned up very easily with the patterns above.
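The same idea can be tried outside Eclipse with java.util.regex. This is only a sketch: it uses the second pattern from above without the (?m:) prefix, assumes Windows line endings (\r\n) just like the pattern does, and the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class CommentCleaner {

    // Matches a Javadoc block that holds nothing but tag lines, e.g. a lone "@see".
    static final Pattern TAG_ONLY_JAVADOC =
            Pattern.compile("/\\*\\*(\\r\\n)+(\\s)+\\* @[^/]*/");

    /** Removes every comment block that matches the pattern. */
    static String strip(String source) {
        return TAG_ONLY_JAVADOC.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String tagOnly = "/**\r\n * @see Foo#bar()\r\n */\r\nvoid bar() {}";
        String described = "/**\r\n * Does something useful.\r\n */\r\nvoid baz() {}";

        System.out.println(strip(tagOnly));    // the comment block is removed
        System.out.println(strip(described));  // a comment with a real description is kept
    }
}
```

Note that [^/]* stops at the first slash, so a comment containing a URL or a path would not match; that is a limitation of the pattern itself.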

Saturday, August 29, 2009
Fun with Typography!!
Want to try out some fonts? Use this
A funny video is here [direct link]
Comic Sans is fine for non-professional use, but it seems to be used almost everywhere (is it because of Microsoft?): http://bancomicsans.com/home.html
Some sample fonts:
Open Google Serif
Open Google Sans-serif
Open Google Monospace
Open Google Arial
Open Google Comic Sans MS
Open Google Courier New
Open Google Georgia
Open Google Monotype Corsiva
Open Google Tahoma
Open Google Times New Roman
Open Google Trebuchet
Open Google Verdana

Thursday, August 27, 2009
Behaviour-driven development in Java - JUnit evolved?
TDD is mostly about running JUnit tests at a very fine-grained unit level, so we focus very little on writing tests that check the code as a functional unit.
BDD (behaviour-driven development) fills this gap, I think, and is more closely aligned with a developer's way of thinking.
I see it more as organized functional tests in code!
A system works in unison and not just as a set of small units. Hence a boundary has to be defined: which parts of the system should be tested, and how, based on time/money trade-offs?
Implementations of BDD for java: http://jbehave.org/
Implementations of BDD for groovy: Easyb
Reference: http://www.jroller.com/DhavalDalal/entry/preferring_bdd_over_tdd

Wednesday, August 26, 2009
Fast javascript testing
A nice JUnit-style JavaScript test driver that can be run from Eclipse.
http://code.google.com/p/js-test-driver/
http://css.dzone.com/news/fast-javascript-testing

Friday, August 21, 2009
Speed up your maven build
Imagine a company having 10 modules, each depending on the other. Let's say building them takes 15 minutes using Maven. So in a day the team would spend roughly (noOfBuildTimesPerDay x 15 minutes) x developerCount. This is bad.
Some options:
"mvn clean install" is good for small/medium sized projects but not for bigger ones. There is always a trade-off between time and space. When space is not a major constraint, we can gain time with "mvn clean install -Dmaven.compile.fork=true -Dmaven.junit.jvmargs=-Xmx512m -Dmaven.junit.fork=true".
With mvn clean, delete is slow on Windows; renaming the directory instead could cut the build time. http://bosy.dailydev.org/2009/02/speed-up-your-maven-build-four-times.html
This cut the build time by around 30% on my PC. That's an improvement, isn't it?
I think it would be a big improvement if Maven could reuse the classes generated by Eclipse. Having an exploded war deployment is also an option to reduce pack and unpack times.
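To put numbers on the formula above, here is a tiny sketch. The 15 minute build and the roughly 30% saving come from this post; the 8 builds per day and 10 developers are made-up values, so plug in your own.

```java
public class BuildCost {

    /** Team-wide minutes spent waiting on builds per day. */
    static double minutesPerDay(int buildsPerDay, double minutesPerBuild, int developers) {
        return buildsPerDay * minutesPerBuild * developers;
    }

    public static void main(String[] args) {
        int buildsPerDay = 8;   // assumption, not from the post
        int developers = 10;    // assumption, not from the post

        double before = minutesPerDay(buildsPerDay, 15.0, developers);
        // 10.5 minutes is the 15 minute build after a 30% cut.
        double after = minutesPerDay(buildsPerDay, 10.5, developers);

        System.out.println("Before: " + before / 60 + " hours per day");
        System.out.println("After:  " + after / 60 + " hours per day");
    }
}
```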
Friday, August 14, 2009
Real-Time Tracking and Tuning for Busy Tomcat Servers
A very nice article detailing the options for monitoring a Tomcat server in order to tweak its performance.
http://www.devx.com/Java/Article/32730/1954

Tuesday, August 11, 2009
Free your mind with mindmaps
Mindmaps help you quickly organize knowledge and load it into your brain. Noise words are minimal in a mindmap, and hence our thinking pattern becomes clearer.
Some useful links:
For presentations
http://www.mindmeister.com/26036305/presentation-zen-powerful-presentations
For vocabulary
http://www.mindmeister.com/1479825/new-words-oh-a-little-tidy
For free software, go to freemind.org
Online: http://www.mindmeister.com

Friday, June 19, 2009
Eclipse and Idea hot key bindings
Ok our developers use IDEA and Eclipse. And we do pair programming.

Hmm, I don't want to learn another set of key bindings!
Here is the solution:
Using Eclipse keys from IDEA:
- IDEA supports Eclipse key bindings built in;
- you just have to switch to the Eclipse key bindings using the preferences dialog.
Using IDEA keys from Eclipse:
- There is a plugin from http://www.jroller.com/santhosh/entry/intellij_idea_key_scheme_for
- I haven't tried it as I am an Eclipse guy :)
Tuesday, June 16, 2009
Tomcat Exploded war - cut deployment time
The time taken to package a webapp into a war, deploy it and then have it unpacked in the Tomcat container can be reduced. Simply deploy the exploded war. This could save developers time!
Steps to make exploded war deployment:
1. Add this under pom.xml >> plugins:
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>tomcat-maven-plugin</artifactId>
<version>1.0-beta-1</version>
<configuration>
<username>admin</username>
<password>admin</password>
<url>http://localhost:8080/manager</url>
</configuration>
</plugin>
2. mvn war:exploded tomcat:undeploy tomcat:exploded
-- This keeps the war exploded, undeploys the webapp context and copies the exploded build over to Tomcat using the Tomcat manager.
A better option, which also allows deploying to different servers and generating the war (instead of the configuration above):
mvn clean install -Dmaven.test.skip=true war:exploded tomcat:undeploy tomcat:exploded -Dmaven.tomcat.url=http://localhost:8080/manager -Dtomcat.password=admin -Dtomcat.username=admin
Refer to http://mojo.codehaus.org/tomcat-maven-plugin/exploded-mojo.html for more.

Thursday, June 4, 2009
Monday, June 1, 2009
ETags - HTTP header tags to speed up your site
An ETag (entity tag) is an HTTP response header returned by an HTTP/1.1 compliant web server.
There are 2 possible implementations:
1. Shallow ETag
- An MD5 hash is computed from the response content on the first request and set in the response header by the web filter. For subsequent requests, the filter compares the ETag sent back by the client with the new ETag computed from the response content; if they match, the content does not need to be sent again.
Minuses:
- The ETag is computed after the page is rendered on the server.
- Doesn't work for pages whose content changes on every request (like date/time).
Spring 3 has support for this. It is called ShallowEtagHeaderFilter.
2. Deep ETag
- This avoids page computation at a much more granular level. It uses Hibernate interceptors and is not so straightforward to implement. Spring 3 is yet to support this approach.
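The shallow approach described above can be sketched in a few lines of plain Java. This is not Spring's ShallowEtagHeaderFilter, just an illustration of the idea: hash the rendered body with MD5 and compare it with the value the client sends in If-None-Match. The class and method names are made up.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ShallowEtagSketch {

    /** Computes a quoted ETag value from the MD5 hash of the rendered body. */
    static String etagFor(String renderedBody) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(renderedBody.getBytes(StandardCharsets.UTF_8));
            StringBuilder etag = new StringBuilder("\"");
            for (byte b : digest) {
                etag.append(String.format("%02x", b & 0xff));
            }
            return etag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Returns 304 when the client's If-None-Match value matches the current ETag, else 200. */
    static int statusFor(String renderedBody, String ifNoneMatch) {
        return etagFor(renderedBody).equals(ifNoneMatch) ? 304 : 200;
    }

    public static void main(String[] args) {
        String page = "<html>hello</html>";
        String etag = etagFor(page);                              // sent with the first response
        System.out.println(statusFor(page, null));                // first request, no ETag yet
        System.out.println(statusFor(page, etag));                // unchanged page
        System.out.println(statusFor(page + "<!-- new -->", etag)); // page changed
    }
}
```

Note that the page still has to be rendered before the hash can be computed, which is exactly the first minus listed above: only bandwidth is saved, not server work.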
References:
infoq article
Saturday, May 23, 2009
Texter - An auto text expander
I just can't imagine how many times I have typed the same sentence again and again... Texter to the rescue, thanks to Adam Pash! As the source is open for viewing, I suppose there is no nasty spyware in it!
It is built using AutoHotkey for Windows. Its scripting mode is excellent. Check this for syntax.
Thursday, May 21, 2009
XSLT caching Transformers
The usage of cached transformer objects is recommended here
A sample implementation of CachingTransformerFactory is here
The above code abstracts the caching of Transformer objects using HashMaps.
In brief, it overrides TransformerFactoryImpl and caches the transformer objects. XSL updates are handled by checking the timestamp on the XSL file.
Hence there is no need for a web service to update the XSL transformer cache!
As long as the code uses TransformerFactory.newInstance(), there will be absolutely no code change. The Services API will look for a class name in the file META-INF/services/javax.xml.transform.TransformerFactory in the jars available to the runtime. More here...
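Here is a rough sketch of the timestamp-checked cache idea using only the JDK. It is not the CachingTransformerFactory linked above: instead of overriding the factory it caches compiled javax.xml.transform.Templates objects (which can be shared between threads) keyed by file path, and recompiles when the file's last-modified time changes. The class name, the compile counter and the demo stylesheet are all made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class TemplatesCache {

    /** A compiled stylesheet plus the file timestamp it was compiled from. */
    private static final class Entry {
        final Templates templates;
        final long lastModified;

        Entry(Templates templates, long lastModified) {
            this.templates = templates;
            this.lastModified = lastModified;
        }
    }

    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Entry> cache = new HashMap<>();
    private int compileCount = 0;

    /** Returns a Transformer for the stylesheet, recompiling only if the file changed. */
    public synchronized Transformer transformerFor(File xsl) {
        try {
            long modified = xsl.lastModified();
            Entry entry = cache.get(xsl.getAbsolutePath());
            if (entry == null || entry.lastModified != modified) {
                entry = new Entry(factory.newTemplates(new StreamSource(xsl)), modified);
                cache.put(xsl.getAbsolutePath(), entry);
                compileCount++;
            }
            return entry.templates.newTransformer();
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("Cannot compile " + xsl, e);
        }
    }

    /** Number of times a stylesheet was actually compiled (for the demo only). */
    public synchronized int getCompileCount() {
        return compileCount;
    }

    /** Demo helper: writes a trivial stylesheet to a temporary file. */
    static File writeDemoStylesheet() {
        String xsl = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:template match=\"/\"><out/></xsl:template></xsl:stylesheet>";
        try {
            File file = File.createTempFile("demo", ".xsl");
            file.deleteOnExit();
            Files.write(file.toPath(), xsl.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TemplatesCache cache = new TemplatesCache();
        File xsl = writeDemoStylesheet();
        cache.transformerFor(xsl);
        cache.transformerFor(xsl); // second call is served from the cache
        System.out.println("compilations: " + cache.getCompileCount());
    }
}
```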
Thursday, May 14, 2009
Wednesday, May 13, 2009
JSP implicit objects printer
Copy and paste the code below into jspPrinter.jsp and include it using <%@ include file="jspPrinter.jsp" %>
<%@ page
errorPage="ErrorPage.jsp"
import="java.io.*, java.util.*"
%>
<%
Enumeration enames;
Map map;
String title;
// Print the request headers
map = new TreeMap();
enames = request.getHeaderNames();
while (enames.hasMoreElements()) {
String name = (String) enames.nextElement();
String value = request.getHeader(name);
map.put(name, value);
}
out.println(createTable(map, "Request Headers"));
// Print the session attributes
map = new TreeMap();
enames = session.getAttributeNames();
while (enames.hasMoreElements()) {
String name = (String) enames.nextElement();
String value = "" + session.getAttribute(name);
map.put(name, value);
}
out.println(createTable(map, "Session Attributes"));
// Print the request attributes
map = new TreeMap();
enames = request.getAttributeNames();
while (enames.hasMoreElements()) {
String name = (String) enames.nextElement();
String value = "" + request.getAttribute(name);
map.put(name, value);
}
out.println(createTable(map, "Request Attributes"));
%>
<%-- Define a method to create an HTML table --%>
<%!
private static String createTable(Map map, String title)
{
StringBuffer sb = new StringBuffer();
// Generate the header lines
sb.append("<table border=\"1\">");
sb.append("<tr><th colspan=\"2\">");
sb.append(title);
sb.append("</th></tr>");
// Generate the table rows
Iterator imap = map.entrySet().iterator();
while (imap.hasNext()) {
Map.Entry entry = (Map.Entry) imap.next();
String key = (String) entry.getKey();
String value = (String) entry.getValue();
sb.append("<tr><td>");
sb.append(key);
sb.append("</td><td>");
sb.append(value);
sb.append("</td></tr>");
}
// Generate the footer lines
sb.append("</table>");
// Return the generated HTML
return sb.toString();
}
%>
Tuesday, April 28, 2009
Wednesday, April 22, 2009
Tuesday, April 21, 2009
Rewrite rules in apache and IIS
Well, we can control how the server serves stuff to clients by defining rewrite rules.
As servers are dumb, it is important to spell the rewrite rules out well. For example, you should explicitly say that once a rule matches a URL pattern there is no need to look at further rules (the 'L' flag).
The most interesting part is the regular expressions used in the rewrite rules. I will discuss these sometime later... But to do some trial and error, use me.
Okay, first with Apache:
The rewrite rules are handled by the module mod_rewrite. This reads the rewrite conf file configured in httpd.conf.
Include the rewrite rule path in httpd.conf >> VirtualHost >>
Include /path/to/rewrite.conf
Define a rewrite rule file /path/to/rewrite.conf
All set. Now the details on rewrite rules:
#This says that rewrite engine is on. This can be configured in httpd.conf as well.
RewriteEngine On
#for debugging only
RewriteLog logs/rewrite.log
RewriteLogLevel 9
#check if a condition is true:
RewriteCond %{HTTP_HOST} ^local\.xyz\.com$
#only if the condition above is true, apply the rule that follows.
RewriteRule ^/?$ /journal/index.html [PT]
#RewriteRule has the syntax: RewriteRule Pattern Substitution [flags]
The important flags are:
L - Last. Stop the rewriting process here and don't apply any more rewriting rules.
R - Redirect. Usually needs the 'L' flag with it (to stop further processing).
N - Next. Re-run the rewriting process from the first rule.
P - Proxy. Forces the substitution to be handled internally as a proxy request; rule processing stops immediately.
NC - No case. Makes the Pattern case-insensitive.
PT - Pass through to the next handler. Works with Alias. Used rarely?
Lots more on the configuration directives for Apache rewrite rules here:
Gotcha: To apply a rewrite rule based on a request parameter you *have* to use
RewriteCond %{QUERY_STRING} ^param=value
By default the RewriteRule pattern is not matched against the request parameters (the query string).
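To see why the query string needs its own condition, here is a small Java simulation. It is not mod_rewrite, only java.util.regex applied the same way: the rule pattern ^/?$ is tested against the URL path alone, and the ^param=value condition against the query string alone. The class name, method names and URLs are made up for the example.

```java
import java.net.URI;
import java.util.regex.Pattern;

public class RewriteMatchSketch {

    static final Pattern RULE_PATTERN = Pattern.compile("^/?$");            // like RewriteRule ^/?$
    static final Pattern QUERY_CONDITION = Pattern.compile("^param=value"); // like RewriteCond %{QUERY_STRING}

    /** The rule pattern only ever sees the path, never the query string. */
    static boolean pathMatches(String url) {
        String path = URI.create(url).getRawPath();
        return RULE_PATTERN.matcher(path == null ? "" : path).find();
    }

    /** The rule applies only when the query condition and the path pattern both match. */
    static boolean ruleApplies(String url) {
        String query = URI.create(url).getRawQuery();
        boolean conditionHolds = QUERY_CONDITION.matcher(query == null ? "" : query).find();
        return conditionHolds && pathMatches(url);
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://local.xyz.com/?param=value",
            "http://local.xyz.com/?other=1",
            "http://local.xyz.com/journal?param=value"
        };
        for (String url : urls) {
            System.out.println(url + " path matches: " + pathMatches(url)
                    + ", rule applies: " + ruleApplies(url));
        }
    }
}
```

The first two URLs have the same path, so the rule pattern alone cannot tell them apart; only the condition on the query string can.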
On ISAPI rewrite rules for IIS:
ISAPI_Rewrite is supposed to be a clone of Apache mod_rewrite that helps IIS do redirections, but there are some differences. There are additional ISAPI_Rewrite directives for IIS, like RewriteHeader, etc.
The syntax more or less matches (thank god) and should be easy if you know Apache rewrites.
More on IIS rewrite rules here
Monday, March 30, 2009
SMTP mail from windows using telnet
Reference here
To summarize:
telnet host port
helo myMachineName
mail from: testFrom@test.com
rcpt to: testTo@test.com
data
This is a sample mail. [Type . on a line by itself and press Enter to complete the message.]
quit
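The same dialogue can be driven from Java. The sketch below is deliberately simplified: it assumes the server answers every command with a single reply line, it does not check the reply codes, and it works on any Reader/Writer pair so it can be tried with canned replies instead of a real mail server. The class and method names are made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class SmtpDialogue {

    /** Sends one CRLF-terminated line and returns the server's one-line reply. */
    private static String exchange(BufferedReader in, Writer out, String line) throws IOException {
        out.write(line + "\r\n");
        out.flush();
        return in.readLine();
    }

    /** Replays the telnet steps above over the given streams. */
    static void send(Reader serverReplies, Writer toServer,
                     String clientName, String from, String to, String body) {
        try {
            BufferedReader in = new BufferedReader(serverReplies);
            in.readLine(); // server greeting
            exchange(in, toServer, "helo " + clientName);
            exchange(in, toServer, "mail from: " + from);
            exchange(in, toServer, "rcpt to: " + to);
            exchange(in, toServer, "data");
            exchange(in, toServer, body + "\r\n."); // a lone dot ends the message
            exchange(in, toServer, "quit");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Canned replies standing in for a real server.
        Reader replies = new StringReader(
                "220 ready\r\n250 ok\r\n250 ok\r\n250 ok\r\n354 go ahead\r\n250 queued\r\n221 bye\r\n");
        StringWriter sent = new StringWriter();
        send(replies, sent, "myMachineName", "testFrom@test.com", "testTo@test.com",
                "This is a sample mail.");
        System.out.print(sent);
    }
}
```

Against a real server, the Reader and Writer would wrap the input and output streams of a java.net.Socket opened to the same host and port used with telnet.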
Friday, March 13, 2009
Liferay - web 2.0 made simple!
Like the Drupal or Joomla CMS/portal solutions in the PHP world, we have a few good open source Java solutions. Liferay is one of them.
This product was chosen by a client of my previous employer (a large online media publishing house in the USA). That is how I got a chance to play around with Liferay for about a year. The client wanted to upgrade their media sites to the latest technology and the so-called Web 2.0, but at minimum cost. Liferay was evaluated and put forward as an option by our onsite/offshore architects.
"Brian Chan, Chief Software Architect and founder, began development on Liferay Portal in 2000 to provide nonprofit organizations with an open source solution to facilitate collaboration on the Internet. He has since steered Liferay to become a leader in innovative open source enterprise solutions. With a strong foundation in software architecture and economics, Brian has solidified open source as a low-cost, high performance solution for the enterprise. His expertise in portal architecture and design has garnered him a seat on the JSR-286 portlet specification committee "
What Liferay offers is a well-organized fusion of the best Java/open source technologies.
Most organizations end up reinventing the wheel: they build a portal after spending time evaluating, building and testing different open source technologies. For example, the web/service/persistence tiers, RSS feeds, blogs, theme design, etc. all need considerable effort. A complete lifecycle is required to get things going. If these pieces are not well organized, they end up with maintenance nightmares or issues when trying to extend the portal's features.
The technicalities: You name a buzzword in Java/J2EE (no EJB please) and they have it: http://www.liferay.com/web/guest/products/portal/techspecs
To list a few of the Java buzzwords that Liferay Portal covers: Web Services, REST, WebDAV
Architecture: SSO (CAS), ESB support, modular, pluggable
Performance and scalability: Clustering, caching (Ehcache), page/portlet caching.
Supported standards: AJAX, JSR-168/286 (for portlets)
Uses best-of-breed open source technologies. To name a few: jQuery, Lucene, Spring, Hibernate, Struts, Velocity, Ehcache. SOA, SSO and Web 2.0 features such as wiki and blog.
Most databases are supported (the persistence tier uses Hibernate).
Content Management: Document Library - JSR-170, versioning/workflow/WebDAV/image gallery, etc. SEO/site map, rich text editors, friendly URLs
Themes and layout: jQuery, standardized, hot-deployable, Velocity-based.
More here
A video showing how easily a journal article can be created on a site is here
A few numbers to show its popularity:
Google returns 363,000 results for "Liferay portal" (exact search including quotes) and 1,710,000 for liferay
Started in 2000; now in its 9th year of development.
Liferay Community forums:
# of Categories: 41
# of Posts: 77,263
# of Participants: 8,524
Wiki: 300+ articles
Pluses:
- This open source product is being used by many users. Hence it is well tested and has best practices incorporated. There is also a large, active community for support.
- Most importantly, embracing an open source product like Liferay keeps the organisation up to date with the latest technologies. It is just a matter of upgrading to a newer version.
- Also, most common problems like single sign-on, RSS, indexing/searching, etc. are already solved by applying best practices. With such a readily available, integrated technology stack, migrating to newer technologies becomes much easier.
The worries:
- As it is bundled with lots of technologies, it becomes heavyweight. But with Liferay's modular architecture, this can be customized and hence minimized.
- Like any new technology, a steep learning curve is involved. Once the transition happens, it should be worth the investment of time and money.
- Customizing Liferay to the organization's needs would take time, as we need to match the company's infrastructure to Liferay's architecture. E.g.: the database schema is designed for Liferay. How do we customize it? This is where you have to burn the midnight oil...
- Liferay has virtual hosting support. But integrating it with the existing infrastructure would take considerable time and research.
- "It's a black box with a lot of complex things going on in it. It may go out of control as we don't know what is happening inside." As it is open source, it is transparent and you are free to fix it yourself.
Some good links:
Videos: http://www.liferay.com/web/guest/community/documentation/5_1
An issue tracking system: issues.liferay.com
Saturday, February 28, 2009
Jooming with Joomla
The amount of work needed to get a site up and running, especially the front-end XHTML design part, is A LOT.
I came across one excellent CMS solution which can very well be used to develop plain vanilla websites.
The results are really amazing.
I feel it is much better than Drupal in terms of ease of use and its rich admin features.
It has a huge range of plugins and themes to choose from. Once you get the knack of placing the plugins and using themes in your site, you can achieve amazing results with very little coding effort.
Here are a few notes for Joomla 1.5:
Positioning in CSS is the key for each module. You need to read the CSS to understand the positioning.
The module name is the key in the CSS that needs to be given in the new module's position attribute. E.g.: left, top, right, user3, user4, etc.
To add a search box in the home page
Extensions --> Module Manager --> new --> search
To add a menu
Extensions --> Module Manager --> new --> menu.
Be careful to give the position of the menu as the one in the HTML theme... e.g.:
News in the home page
Define a section for news, then a category for news. Add articles to this category.
Extensions --> Module Manager --> new --> news flash
Choose the category which holds the news articles... that's it.
Contact us
Menus --> new --> Internal link --> contacts
Gallery
Download a plugin extension like MorfeoShow. Add this module and configure it.
Watch this space. More to come... joom jooom joooooomla
Wednesday, February 25, 2009
Spring MVC
Spring has many controllers.
SimpleFormController is an interesting one to work with.
I try to dissect it in this blog posting.
TODO: More to come here.
Good sources for understanding it are available at
http://raibledesigns.com/wiki/Wiki.jsp?page=SpringControllers
http://www.springify.com/
Wednesday, February 4, 2009
Lucene
Lucene is an excellent text search engine.
Solr, a sub-project of Lucene, provides a web-based interface that handles XML requests and executes XML commands.
"Solr is more of a general-purpose search server, and it assumes you already have structured data (like catalog data, music collections,etc)."
Nutch, again a sub-project of Lucene, is an excellent web crawler.
"Nutch is more like an open-source google... it's for crawling, converting, indexing, and searching websites."
Assume you have an existing J2EE application with Struts and Hibernate. The following major components/classes would be required to use Lucene or Solr:
1. Search index writer: This will use the existing Hibernate methods/APIs to read the records (which need to be searchable) and create the Lucene indexes. The index can be stored on disk or in a database via JDBC. Lucene calls these index records documents, and they hold the relevant information to display in a search result.
2. Search index reader: This will read the index stored (on disk or in a database) in step 1 and return the search results. There are no Hibernate method calls involved, as the Lucene index is separate from the database. To display the search results, the Struts action classes would need to be altered/added. However, when a search result is clicked for the details of a record, the existing Struts/Hibernate functionality (if it is available) will be used to display them.
Basically you can think of it as a search engine implementation (like Google) where you index the records (like websites) and the search results contain only the basic information. The detailed information is delivered when a link in the search results is clicked.
Coming to Solr, it is basically a web service wrapper on top of Lucene. Under the hood it also builds a Lucene index. The advantage of Solr is that it is web service based, so it is easy to sync the index in a distributed environment. It also provides caching, index syncing, etc. out of the box.
Gotchas and Tips:
Sorting:
Need to maintain a duplicate field which is NOT_ANALYZED.
**** sorting is case sensitive *****
The sort field value can have a fixed maximum length, say 20; this improves performance.
Indexing:
The analyzer used for indexing and searching should be the same.
StandardAnalyzer can be extended to use HTMLStripReader and ISOLatin1AccentFilter.
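The sorting gotchas above can be illustrated without touching the Lucene API. In this sketch, sortKey is a made-up helper that produces the value one might store in the duplicate, non-analyzed sort field: lower-cased so that sorting stops being case sensitive, and cut to a fixed maximum length of 20.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class SortKeySketch {

    static final int MAX_SORT_LENGTH = 20; // the "say 20" from the tip above

    /** Value for the duplicate sort field: lower-cased and capped in length. */
    static String sortKey(String value) {
        String lower = value.toLowerCase(Locale.ROOT);
        return lower.length() <= MAX_SORT_LENGTH ? lower : lower.substring(0, MAX_SORT_LENGTH);
    }

    /** Returns a copy of the values ordered by their sort key. */
    static List<String> sortedBySortKey(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparing(SortKeySketch::sortKey));
        return copy;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("cherry", "Banana", "apple");

        List<String> caseSensitive = new ArrayList<>(titles);
        Collections.sort(caseSensitive); // upper-case letters sort before lower-case ones

        System.out.println("Case sensitive: " + caseSensitive);
        System.out.println("By sort key:    " + sortedBySortKey(titles));
    }
}
```

The original field stays untouched for display and searching; only the duplicate field carries the normalized value.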
Links:
Syntax supported:
http://lucene.apache.org/java/1_4_3/queryparsersyntax.html
Solr + Jquery sample:
http://solrjs.solrstuff.org/test/reuters/
http://www.theserverside.com/news/thread.tss?thread_id=43617
http://www.xml.com/pub/a/2006/08/09/solr-indexing-xml-with-lucene-andrest.html
http://www.ibm.com/developerworks/java/library/j-solr-update/?S_TACT=105AGX01&S_CMP=HP
