| Commit message | Author | Age |
|
|
|
|
|
| |
This increases the indexer version to force a rebuild of the search
index. The rebuild "repairs" a search index that might contain
uppercase words.
|
|
|
|
|
|
| |
On certain PHP installations (reproduced with PHP version
5.2.0-8+etch11) the indexer failed to lowercase words, so the
fulltext search was partially broken.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new rendering limit, currently 5 pages, to the
p_get_metadata function. This means that no more than 5 pages will be
parsed/rendered in one request. Pages for which the cache can be used
aren't counted. This should make the new cache modes safe to use and
should provide backwards compatibility while keeping the advantage of
rendering metadata on demand (i.e. imagine one included page out of 10
is updated, then the metadata for that page can be rendered, but when
you request a purge of the cache not 10 pages are rendered).
In this commit most of the changes to the p_get_first_heading function
are reverted and the title index is no longer used. As a result, the first
heading functionality no longer depends on the search index of DokuWiki.
Maybe it can be added again later when the indexer provides a proper API
for getting metadata values for all or selected pages. The performance
of the p_get_first_heading function should be almost back to the
performance in Anteater as the simple cache of p_get_metadata is used
and also the limit of p_get_metadata is of course applied.
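
The per-request limit described above can be sketched as follows. This is a minimal illustration in Python, not DokuWiki's PHP code: the class, its method names and the choice to return None once the limit is reached are all assumptions made for the example. Only the limit of 5 and the rule that cached pages are not counted come from the message.

```python
# Hypothetical names throughout; this is not DokuWiki's actual implementation.
RENDER_LIMIT = 5  # the commit message gives the limit as "currently 5 pages"


class MetadataStore:
    def __init__(self, render_page, limit=RENDER_LIMIT):
        self._render_page = render_page  # callable: page id -> metadata dict
        self._limit = limit
        self._cache = {}     # page id -> metadata whose cache is usable
        self._rendered = 0   # pages parsed/rendered during this request

    def get_metadata(self, page_id):
        # Cache hits are free: they do not count against the limit.
        if page_id in self._cache:
            return self._cache[page_id]
        # Limit reached: refuse to render yet another page in this request.
        # (Returning None here is an assumption of this sketch.)
        if self._rendered >= self._limit:
            return None
        self._rendered += 1
        meta = self._render_page(page_id)
        self._cache[page_id] = meta
        return meta
```

With a limit of 2, a third uncached page is not rendered, while repeated requests for an already rendered page keep working.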
|
|
|
|
|
|
| |
With this test it should be possible to detect if the search index has
been corrupted by using Rincewind RC or a git version from the weeks
before the RC release.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug that is fixed here may have corrupted your search index in a way
that produces wrong or missing results, and this damage won't be fixed
automatically. This occurs when you have deleted the last occurrence of
a word that has been on the last line of one of the word indexes.
Functionality for checking for a broken search index will be added.
The index can be fixed by deleting it completely (remove all .idx files
in data/index/) and recreating it using bin/indexer.php -c. The
searchindex plugin will be updated to be able to do the same, soon.
|
|
|
|
| |
FS#2242
|
|
|
|
| |
Metadata is now rendered in the indexer when its cache is invalid.
|
|\ |
|
| | |
|
| | |
|
|/
|
|
| |
ugly underscores
|
|
|
|
|
| |
as discussed at
http://www.freelists.org/post/dokuwiki/tokenizer-cmd-in-indexer,1
|
| |
|
|
|
|
|
|
|
| |
This merges the INDEXER_PAGE_ADD and INDEXER_METADATA_INDEX events and
introduces the new string keys 'page', 'body' and 'metadata' in the
event data. All plugins that use INDEXER_PAGE_ADD need to be adjusted to
use the key 'page' instead of 0 and 'body' instead of 1.
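
The key change can be pictured with a small handler. The sketch below is Python rather than PHP and is only an illustration: the handler name, the appended keyword, the key `example_key` and the assumption that 'metadata' maps keys to lists of values are all made up for the example. Only the key names 'page', 'body' and 'metadata' come from the message.

```python
def handle_page_add(data):
    """Hypothetical plugin handler for the merged indexer event."""
    page = data["page"]   # formerly data[0]
    body = data["body"]   # formerly data[1]
    # Add text to the indexed body and register one metadata value.
    data["body"] = body + " extra-keyword"
    data["metadata"]["example_key"] = [page]  # example_key is invented
    return data
```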
|
| |
|
| |
|
| |
|
| |
|
|\
| |
| |
| |
| |
| |
| | |
Conflicts:
inc/fulltext.php
inc/indexer.php
lib/exe/indexer.php
|
| |
| |
| |
| |
| | |
This makes it possible to find words that include soft-hyphens. However,
search highlighting will not work and I have no idea how to make it work.
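
A minimal sketch of the idea, assuming the soft hyphens (U+00AD) are removed before words are indexed. The function name is hypothetical and the message does not say where in the indexer this happens.

```python
SOFT_HYPHEN = "\u00ad"  # invisible hyphenation hint, U+00AD


def strip_soft_hyphens(text):
    # Remove soft hyphens so the word is indexed in its plain spelling.
    return text.replace(SOFT_HYPHEN, "")
```

One plausible reason for the highlighting problem is that the plain spelling stored in the index no longer occurs literally in the page text, which still contains the soft hyphens.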
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
p_get_metadata has a $render parameter that has been disabled by the
restructuring of metadata rendering. This change reactivates it so
rendering metadata can be prevented. This is e.g. used in the search and
in some plugins like indexmenu that use p_get_first_heading. The default
of the parameter has been changed to true as otherwise the new caching
structure won't work as almost all calls to p_get_metadata don't set the
$render parameter.
The indexer call to p_get_first_heading has been changed to set $render
to true as in the indexer only one page will be rendered and the title
in the index should really be the current one.
This does not fix the problem that rendering pages with lots of links or
displaying the index can cause the parsing/rendering of a lot of pages.
|
| |
| |
| |
| |
| |
| | |
As of VIM 7.3 it is no longer possible to specify the encoding in the
modeline. This gives an error message whenever such a file is opened,
thus this commit removes the enc setting from the modeline.
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| | |
index.
|
| | |
|
| | |
|
|\ \ |
|
| | |
| | |
| | |
| | | |
result
|
| | | |
|
|/ /
| |
| |
| |
| |
| | |
This allows plugins to add their own version strings like
plugin_tag=1 so pages can be reindexed when plugins update their index
content.
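
How such version strings might trigger a reindex can be sketched as below. This is an illustration in Python under stated assumptions: both function names and the "+" separator are choices made for the example, and only the `plugin_tag=1` form is taken from the message.

```python
def build_index_version(core_version, plugin_versions):
    """Join the core indexer version with name=version strings from plugins."""
    parts = [str(core_version)]
    for name in sorted(plugin_versions):  # sorted for a stable result
        parts.append(f"{name}={plugin_versions[name]}")
    return "+".join(parts)  # separator is an assumption of this sketch


def needs_reindex(stored_version, current_version):
    # A page is reindexed when the version stored with it differs.
    return stored_version != current_version
```

For example, `build_index_version(4, {"plugin_tag": 1})` yields `4+plugin_tag=1`, and bumping the plugin's value to 2 makes `needs_reindex` return True for pages indexed under the old string.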
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
INDEXER_METADATA_INDEX event
This new event allows plugins to add or modify the metadata that will be
indexed. Collecting this metadata in an event allows plugins to see if
other plugins have already added the metadata they need. It also leads to
just one single indexer call, so fewer files are read and written.
Plugins could also replace/prevent the metadata indexer call using this
event.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This fixes addMetaKeys so it actually removes values. This also changes
the functionality of the function: It now updates the key for the page
with the current value instead of adding new values as this will be the
default use case. A new parameter could be added to restore the "old"
behavior when needed.
addMetaKeys now only saves the index when the content has really been
changed.
Furthermore no empty number is added anymore to the reverse index when
it has been empty previously.
addMetaKeys now releases the lock again and really fails when the lock
can't be acquired.
|
| |
| |
| |
| |
| | |
Saving and looking up metadata key/value pairs seems to work now at
least with some basic tests.
|
| |
| |
| |
| |
| |
| |
| | |
Now _saveIndexKey inserts empty lines when the index isn't long enough.
This is necessary because the page ids are taken from the global page
index, but not every page has an entry in the metadata key specific
index, so e.g. line 10 might be the first entry in the index.
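
The padding behaviour can be sketched like this. It is a Python illustration working on a list of lines in memory, with a hypothetical function name and zero-based line numbers; the real _saveIndexKey works on index files.

```python
def save_index_key(lines, line_no, value):
    """Return a copy of lines with value stored at zero-based line_no."""
    result = list(lines)
    # Pad with empty lines until the target line exists.
    while len(result) <= line_no:
        result.append("")
    result[line_no] = value
    return result
```

Writing to line 3 of an empty index produces three empty lines followed by the value, so the line number still matches the page id from the global page index.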
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The indexer functions have been converted to a class interface.
Use the Doku_Indexer class to access the indexer with these public methods:
addPageWords
addMetaKeys
deletePage
tokenizer
lookup
lookupKey
getPages
histogram
These functions are provided for general use:
idx_get_version
idx_get_indexer
idx_get_stopwords
idx_addPage
idx_lookup
idx_tokenizer
These functions are still available, but are deprecated:
idx_getIndex
idx_indexLengths
All other old idx_ functions are unsupported and have been removed.
|
|\ \ |
|
| | | |
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
An external tokenizer inserts extra spaces to mark words in the input text.
The text is sent through STDIN and STDOUT file handles.
A good choice for Chinese and Japanese is MeCab
(http://sourceforge.net/projects/mecab/), used with the command line
'mecab -O wakati'.
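
The STDIN/STDOUT exchange with an external tokenizer can be sketched as follows. This is a Python illustration, not the DokuWiki code; the function name is hypothetical, and UTF-8 input and output are assumed.

```python
import subprocess


def external_tokenize(text, command):
    """Pipe text through an external tokenizer and split on the spaces it adds."""
    proc = subprocess.run(
        command,                     # e.g. ["mecab", "-O", "wakati"]
        input=text.encode("utf-8"),  # text goes in through STDIN
        stdout=subprocess.PIPE,      # marked-up text comes back on STDOUT
        check=True,                  # raise if the tokenizer exits with an error
    )
    return proc.stdout.decode("utf-8").split()
```

Calling `external_tokenize(text, ["mecab", "-O", "wakati"])` requires MeCab to be installed; any program that reads STDIN and writes space-separated words to STDOUT can stand in for it.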
|
| | | |
|
| | | |
|