3.5
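This release includes many optimizations, improvements, and bug fixes, including:
- Substantially reduced the RAM needed to hold the terms index on an open IndexReader (3 to 5 times less than before).
- Added IndexSearcher.searchAfter, which returns results after a specified ScoreDoc (for example, the last document on the previous page) to support deep paging use cases.
- Added SearcherManager to manage sharing and reopening of IndexSearchers across multiple search threads. Underlying IndexReader instances are safely closed once they are no longer referenced.
- Added SearcherLifetimeManager, which safely provides a consistent view of the index across multiple requests (for example, paging/drilldown).
- Renamed IndexWriter.optimize to forceMerge to discourage use of this method, since it is costly and rarely needed.
- Added NGramPhraseQuery, which speeds up phrase queries by 30%-50% when n-gram analysis is used.
- Added a new reopen API (IndexReader.openIfChanged) that returns null, rather than the old reader, if the index has not changed.
- Vector highlighting improvements: support for more queries, such as wildcards, and boundary analysis for generated snippets.
- Various bug fixes.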
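The deep paging idea behind IndexSearcher.searchAfter can be pictured with a small library-free sketch. It does not call Lucene: the Hit class, the searchAfter helper, and the tie-break rule (higher score first, then lower doc id) are assumptions made for illustration only. The point is that the last hit of the previous page acts as a cursor, so the next page never has to re-collect the hits ranked before it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Library-free sketch of "search after" paging. Hit and searchAfter below are
// hypothetical stand-ins, not Lucene's ScoreDoc or IndexSearcher.searchAfter.
class SearchAfterSketch {
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Rank order assumed here: higher score first, ties broken by lower doc id.
    static final Comparator<Hit> RANK = (a, b) -> {
        int byScore = Float.compare(b.score, a.score);
        return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
    };

    // Returns up to n hits ranked strictly after `after`; null means "first page".
    static List<Hit> searchAfter(List<Hit> allHits, Hit after, int n) {
        List<Hit> sorted = new ArrayList<>(allHits);
        sorted.sort(RANK);
        List<Hit> page = new ArrayList<>();
        for (Hit h : sorted) {
            if (page.size() == n) break;
            if (after == null || RANK.compare(h, after) > 0) page.add(h);
        }
        return page;
    }

    public static void main(String[] args) {
        List<Hit> hits = List.of(new Hit(0, 1.0f), new Hit(1, 3.0f), new Hit(2, 2.0f),
                                 new Hit(3, 3.0f), new Hit(4, 0.5f));
        Hit cursor = null;
        List<Hit> page;
        while (!(page = searchAfter(hits, cursor, 2)).isEmpty()) {
            for (Hit h : page) System.out.println("doc=" + h.doc + " score=" + h.score);
            cursor = page.get(page.size() - 1); // the last hit becomes the next cursor
        }
    }
}
```

The sketch sorts the full hit list on every call for simplicity; a real searcher would instead skip hits that do not rank after the cursor while collecting.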
3.4
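This release includes a large number of bug fixes, optimizations, and improvements. The main changes are:
- Fixed a major bug (LUCENE-3418) whereby a Lucene index could easily become corrupted if the OS or computer crashed or lost power.
- Added a new faceting module (contrib/facet) for computing facet counts (both hierarchical and non-hierarchical) at search time (LUCENE-3079).
- Added a new join module (contrib/join), which enables indexing and searching of nested (parent/child) documents using BlockJoinQuery/Collector (LUCENE-3171).
- Documents can now be indexed with term frequencies but without positions (LUCENE-2048); previously, omitTermFreqAndPositions always omitted both.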
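- The QueryParser module (contrib/queryparser) can now create NumericRangeQuery.
- Added SynonymFilter to contrib/analyzers, to apply multi-word synonyms during indexing or querying, including parsers that read the wordnet and solr synonym formats (LUCENE-3233).
- You can now control how documents without a value in the sort field are sorted, using SortField.setMissingValue (LUCENE-3390).
- Fixed a case where term vectors could be silently deleted from the index after calling addIndexes (LUCENE-3402).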
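The missing-value control listed above (SortField.setMissingValue) can be illustrated without Lucene. In the sketch below, the sortDocs helper and the array-of-nullable-values layout are hypothetical; it only shows the general idea that a document lacking the sort field is ordered as if it carried a substitute value chosen by the caller.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Library-free sketch of sorting with a substitute for missing values.
// sortDocs is a hypothetical helper, not part of Lucene's SortField API.
class MissingValueSortSketch {
    // fieldValues[doc] holds the sort field value of that doc, or null if absent.
    // Returns doc ids in ascending order of value, treating null as missingValue.
    static List<Integer> sortDocs(Long[] fieldValues, long missingValue) {
        List<Integer> docs = new ArrayList<>();
        for (int doc = 0; doc < fieldValues.length; doc++) docs.add(doc);
        // List.sort is stable, so docs with equal values stay in doc id order.
        docs.sort(Comparator.comparingLong(
            (Integer doc) -> fieldValues[doc] != null ? fieldValues[doc] : missingValue));
        return docs;
    }

    public static void main(String[] args) {
        Long[] values = {5L, null, 2L}; // doc 1 has no value for the sort field
        System.out.println("missing last:  " + sortDocs(values, Long.MAX_VALUE));
        System.out.println("missing first: " + sortDocs(values, Long.MIN_VALUE));
    }
}
```

Choosing Long.MAX_VALUE as the substitute pushes documents without a value to the end of an ascending sort, while Long.MIN_VALUE pulls them to the front.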
3.3-2011.7
Highlights of the Lucene release include:
- The spellchecker module now includes suggest/auto-complete functionality,
with three implementations: Jaspell, Ternary Trie, and Finite State.
- Support for merging results from multiple shards, for both "normal"
search results (TopDocs.merge) as well as grouped results using the
grouping module (SearchGroup.merge, TopGroups.merge).
- An optimized implementation of KStem, a less aggressive stemmer
for English.
- Single-pass grouping implementation based on block document indexing.
- Improvements to MMapDirectory (now also the default implementation
returned by FSDirectory.open on 64-bit Linux).
- NRTManager simplifies handling near-real-time search with multiple
search threads, allowing the application to control which indexing
changes must be visible to which search requests.
- TwoPhaseCommitTool facilitates performing a multi-resource
two-phased commit, including IndexWriter.
- The default merge policy, TieredMergePolicy, has a new method
(set/getReclaimDeletesWeight) to control how aggressively it
targets segments with deletions, and is now more aggressive than
before by default.
- PKIndexSplitter tool splits an index by a mid-point term.
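Merging results from multiple shards, as in the TopDocs.merge item above, is essentially a k-way merge of lists that are each already sorted by score. The following is a minimal library-free sketch of that idea, not Lucene's implementation: the Hit class, the merge helper, and the tie-break by shard index are assumptions for illustration, and each shard list is assumed to arrive sorted by descending score.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Library-free sketch of merging per-shard top hits into a global top N.
// Hit and merge are hypothetical stand-ins, not Lucene's TopDocs.merge.
class ShardMergeSketch {
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Each inner list must already be sorted by descending score.
    static List<Hit> merge(int topN, List<List<Hit>> shardHits) {
        // Queue entries are {shardIndex, positionInShard}; the best current head wins.
        PriorityQueue<int[]> queue = new PriorityQueue<>((a, b) -> {
            Hit x = shardHits.get(a[0]).get(a[1]);
            Hit y = shardHits.get(b[0]).get(b[1]);
            int byScore = Float.compare(y.score, x.score);
            return byScore != 0 ? byScore : Integer.compare(a[0], b[0]); // assumed tie-break
        });
        for (int s = 0; s < shardHits.size(); s++) {
            if (!shardHits.get(s).isEmpty()) queue.add(new int[] {s, 0});
        }
        List<Hit> merged = new ArrayList<>();
        while (merged.size() < topN && !queue.isEmpty()) {
            int[] head = queue.poll();
            List<Hit> hits = shardHits.get(head[0]);
            merged.add(hits.get(head[1]));
            // Advance within the shard that just supplied a hit.
            if (head[1] + 1 < hits.size()) queue.add(new int[] {head[0], head[1] + 1});
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> shard0 = List.of(new Hit(5, 0.9f), new Hit(2, 0.4f));
        List<Hit> shard1 = List.of(new Hit(7, 0.8f), new Hit(1, 0.7f), new Hit(3, 0.1f));
        for (Hit h : merge(3, List.of(shard0, shard1))) {
            System.out.println("doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

Because only the head of each shard list sits in the queue, the merge touches at most topN hits plus one queue entry per shard, regardless of how many hits each shard returned.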
3.2-2011-6
- A new grouping module, under lucene/contrib/grouping, enables
search results to be grouped by a single-valued indexed field
(so grouping only arrived in this release).
- A new IndexUpgrader tool fully converts an old index to the
current format.
- A new Directory implementation, NRTCachingDirectory, caches small
segments in RAM, to reduce the I/O load for applications with fast
NRT reopen rates.
- A new Collector implementation, CachingCollector, is able to
gather search hits (document IDs and optionally also scores) and
then replay them. This is useful for Collectors that require two
or more passes to produce results.
- Index a document block using IndexWriter's new addDocuments or
updateDocuments methods. These experimental APIs ensure that the
block of documents will forever remain contiguous in the index,
enabling interesting future features like grouping and joins.
- A new default merge policy, TieredMergePolicy, which is more
efficient due to being able to merge non-contiguous (that is,
non-adjacent) segments.
- NumericField is now returned correctly when you load a stored
document (previously you received a normal Field back, with the
numeric value converted to a string).
- Deleted terms are now applied during flushing to the newly flushed
segment, which is more efficient than having to later initialize a
reader for that segment.
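The collect-then-replay behaviour described for CachingCollector above can be sketched without Lucene. The HitConsumer interface and Cache class below are hypothetical names, not Lucene's Collector API; the sketch only shows how hits gathered during one pass can be fed to a second consumer without running the search again.

```java
import java.util.ArrayList;
import java.util.List;

// Library-free sketch of a caching collector. HitConsumer and Cache are
// hypothetical stand-ins, not Lucene's Collector or CachingCollector.
class CachingCollectorSketch {
    interface HitConsumer {
        void collect(int doc, float score);
    }

    // Forwards every hit to a first-pass consumer while remembering it for replay.
    static final class Cache implements HitConsumer {
        private final HitConsumer firstPass;
        private final List<Integer> docs = new ArrayList<>();
        private final List<Float> scores = new ArrayList<>();

        Cache(HitConsumer firstPass) { this.firstPass = firstPass; }

        @Override
        public void collect(int doc, float score) {
            docs.add(doc);
            scores.add(score);
            firstPass.collect(doc, score);
        }

        // Feeds the cached hits, in their original order, to another consumer.
        void replay(HitConsumer secondPass) {
            for (int i = 0; i < docs.size(); i++) {
                secondPass.collect(docs.get(i), scores.get(i));
            }
        }
    }

    public static void main(String[] args) {
        int[] count = {0};
        Cache cache = new Cache((doc, score) -> count[0]++); // pass 1: count hits
        cache.collect(3, 1.5f);
        cache.collect(8, 0.5f);
        System.out.println("hits seen in first pass: " + count[0]);
        cache.replay((doc, score) -> System.out.println("replayed doc=" + doc)); // pass 2
    }
}
```

Caching scores as well as document IDs costs extra memory per hit, which is presumably why the release note describes score caching as optional.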
3.1-2011.3
- ConcurrentMergeScheduler is more careful about setting priority of
merge threads.
- ReusableAnalyzerBase makes it easier to reuse TokenStreams
correctly.
- ConstantScoreQuery now allows directly wrapping a Query.
- IndexWriter is now configured with a new separate builder API,
IndexWriterConfig. You can now control IndexWriter's previously
fixed internal thread limit by calling setMaxThreadStates.
- IndexWriter.getReader is replaced by IndexReader.open(IndexWriter).
- MultiSearcher is deprecated; ParallelMultiSearcher has been
absorbed directly into IndexSearcher.
- New TotalHitCountCollector just counts total number of hits.
- ReaderFinishedListener API enables external caches to evict entries
once a segment is finished.
Grouping is said to have been implemented already at this point, but these notes do not mention it.
3.0.3-2010-12
- Fixed a memory leak in IndexWriter exacerbated by frequent commits.
- Fixed NumericRangeQuery / NumericRangeFilter sometimes returning
incorrect results with bounds near Long.MIN_VALUE and Long.MAX_VALUE.
- Fixed various thread safety issues.
These fixes also suggest that the 3.0 line was not yet very stable.
3.0.2-2010-6
- Fixed memory leaks in IndexWriter when large documents are indexed.
It also now uses shared memory pools for term vectors and stored fields.
- IndexWriter now releases Fieldables and Readers on close.
- Performance improvements in ParallelMultiSearcher (3.0.2 only).