I just got out of Mike Cannon-Brookes' talk on using Lucene, and it was rather interesting. He never quite said why you should use Lucene instead of some of the other products out there (Java Content Repository implementations, for example), but in the end he justified Lucene for his own use fairly well.
The problem is that he talked about storing stuff and how Lucene works - good for a tech audience, I guess, but really not so good from a technology investigation standpoint. If you already knew you wanted to use Lucene, the talk was great! But if you were considering Jackrabbit or any other content system... he only talked about what Lucene provides, without contrasting it with anything from any other project.
That said, I have to admit that Lucene looks like the right tool for Atlassian. He went into a lot of detail about what they needed and how they used it, and I can't say JCR would have met those needs at all. He left out why a Lucene index should only hold derived data (except in pointing out how Lucene sucks rocks at updating stuff in place), but overall, good talk.
Did he happen to mention how poorly Lucene scaled? Or how to handle it
over clusters? Or how it really sucks at updating very large indexes?
There is a whole layer of "shell script" management that has to go alongside
any index of any reasonable size.
Perhaps Atlassian thinks Lucene is good for Atlassian, but their customers
have other ideas:
http://jira.atlassian.com/browse/JRA-5567