Nice news from Nutch’s developer mailing list: a Nutch 1.6 release candidate will be available in the next few days. More than 40 bugs and feature requests have been closed!
Nutch is now being developed in two branches concurrently: 1.x and 2.x. For now, 1.x seems to be more stable and has more plugins implemented, but the 2.x branch uses Apache Gora, so it’s possible to write crawled data to a range of SQL/NoSQL datastores, not just to Solr (as with 1.x). The latest 2.x version, 2.1, was released on October 5th.