path: root/inc/fulltext.php
Commit message | Author | Age
...
* enhanced full-text search function | Kazutaka Miyasaka | 2009-09-20
  Ignore-this: cb05f50ca4de12e1cdf3a6cfb0e1b8bc
  - better search experience in Asian languages
  - sophisticated search query syntax (OR, grouping, etc.)
  darcs-hash:20090920121116-9b77a-2718be7a043374669037b10d94101fc70efb95e3.gz
* better search snippets FS#1669 | Chuck Kollars | 2009-05-04
  Ignore-this: fdf33ea5a6c50a597bd432c0da98e927
  Snippets containing more of the searched words are preferred over ones containing fewer search keywords
  darcs-hash:20090504183835-a07b1-7b0da249fcb5680019fc3032dfd6fb063e94576a.gz
* Search function update | daniel.lindgren | 2009-03-11
  Ignore-this: 4cd6bddacb795ef15f133559c223ac1f
  * Adds the possibility to exclude namespace(s) from search, by preceding them with "^".
  * Changed handling of search parameters to allow any order of words and namespaces.
  darcs-hash:20090311160255-13810-c2e00cc7764d180967b4c6f22e17b1c0dafe36f4.gz
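A minimal sketch of the query handling this entry describes, written in Python for illustration rather than taken from DokuWiki's PHP source. It assumes whitespace-separated tokens, with "@" marking an included namespace (the prefix used by the 2006-05-18 namespace patches) and "^" marking an excluded one. The function names `split_query` and `in_scope` are hypothetical, and quoted phrases are not handled here.

```python
def split_query(query):
    """Split a query into plain words, included and excluded namespaces.

    Tokens may appear in any order. "@ns" includes a namespace,
    "^ns" excludes one; everything else is treated as a search word.
    """
    words, include_ns, exclude_ns = [], [], []
    for token in query.split():
        if token.startswith("@") and len(token) > 1:
            include_ns.append(token[1:])
        elif token.startswith("^") and len(token) > 1:
            exclude_ns.append(token[1:])
        else:
            words.append(token)
    return words, include_ns, exclude_ns


def in_scope(page_id, include_ns, exclude_ns):
    """Return True if page_id survives the namespace filters."""
    def inside(ns):
        # A page is inside a namespace if its ID starts with "ns:".
        return page_id.startswith(ns.rstrip(":") + ":")

    if any(inside(ns) for ns in exclude_ns):
        return False
    # With no "@" filters, every non-excluded page is in scope.
    return not include_ns or any(inside(ns) for ns in include_ns)
```

Exclusion is checked first in this sketch, so a page in an excluded namespace is dropped even if an included namespace also matches; whether DokuWiki resolves such conflicts the same way is not stated in the log.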
* FS#1505, correct an issue where page name search results could include links to pages to which the user had no access | Chris Smith | 2009-01-19
  darcs-hash:20090119062124-f07c6-5d761a76a50c6c9bcc124fa89feb2fb7b0a9a9b5.gz
* removed some illogical path setups | Andreas Gohr | 2008-12-13
  darcs-hash:20081213090400-7ad00-4e21cd75978bb07513f32f5d750658e8d777c59e.gz
* Better search for pagename quick searches | Andreas Gohr | 2008-08-12
  The pagename matching search (AJAX and "real" search) now sorts results based on the namespace hierarchy levels before sorting them alphabetically. This means pages with fewer namespaces (i.e. higher up in the hierarchy) will be shown first.
  darcs-hash:20080812200649-7ad00-b58f152923864c3440e6412be58fb6fb25373583.gz
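The ordering described in this entry can be sketched as a two-part sort key. This is an illustrative Python sketch, not DokuWiki's code: it assumes page IDs use ":" as the namespace separator, and the name `sort_page_ids` is hypothetical.

```python
def sort_page_ids(page_ids):
    """Sort page IDs by namespace depth first, then alphabetically.

    Depth is approximated by the number of ":" separators, so pages
    higher up in the hierarchy come first.
    """
    return sorted(page_ids, key=lambda pid: (pid.count(":"), pid))
```

For example, `sort_page_ids(["wiki:syntax", "start", "a:b:c", "sidebar"])` puts the two top-level pages first, then `wiki:syntax`, then `a:b:c`.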
* Add SEARCH_QUERY_FULLPAGE & SEARCH_QUERY_PAGELOOKUP events | Chris Smith | 2008-08-11
  SEARCH_QUERY_FULLPAGE event wraps around ft_pageSearch() call, the function which handles the search action and feed searching. The event data is the parameters of this function: data['query']
  darcs-hash:20080811110656-f07c6-a149de6cd0ebc997541fa6e3f4bc6788d663dbd3.gz
* another change in highlight handling | Andreas Gohr | 2008-03-10
  Now highlighting phrases are passed as an array which is then quoted correctly when used in a regexp. This should make phrase highlighting work completely correctly. Please everyone test it.
  darcs-hash:20080310214939-7ad00-1abefb02dde40edeead50b4fa5c866c46b95ca3a.gz
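To illustrate why quoting matters here, the following Python sketch escapes each phrase before joining them into one pattern. It is an assumption-laden illustration, not the DokuWiki implementation: the function names and the `<em>` wrapper are hypothetical, and Python's `re.escape` stands in for whatever quoting the PHP code uses.

```python
import re


def build_highlight_regex(phrases):
    """Build one case-insensitive regex from a list of literal phrases.

    Each phrase is escaped so characters such as "." or "(" match
    literally. Longer phrases go first so they win over their prefixes.
    """
    ordered = sorted((p for p in phrases if p), key=len, reverse=True)
    return re.compile("|".join(re.escape(p) for p in ordered), re.IGNORECASE)


def highlight(text, phrases):
    """Wrap every occurrence of any phrase in <em>...</em>."""
    if not any(phrases):
        return text  # nothing to highlight; avoid an empty pattern
    regex = build_highlight_regex(phrases)
    return regex.sub(lambda m: "<em>" + m.group(0) + "</em>", text)
```

Without the escaping step, a phrase like `a.b` would also match `axb`, and a phrase containing an unbalanced parenthesis would fail to compile.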
* use fulltext index to search for used media files FS#1336 FS#1275 | Andreas Gohr | 2008-02-23
  This changes how DokuWiki looks for references to a media file which is about to be deleted. Instead of doing a full grep through all pages it now uses the fulltext index first, then does an exact match on the found pages. This speeds up the search significantly on larger wikis. However the fulltext search limits now apply: images with names shorter than 3 characters may not be found. This needs extensive testing!
  darcs-hash:20080223205254-7ad00-486de0a4125d51b4e7999827f710d1d9de8bc60d.gz
* better highlighting for phrase searches FS#1193 | Andreas Gohr | 2008-02-15
  This patch makes the highlighting of phrases in search snippets and on the pages themselves much better. Now a regexp gets passed to the ?s
  darcs-hash:20080215174653-7ad00-cd2d6f7d408db7b7dd3cb9974c3eb27f3a9baeac.gz
* add page_exists function (inc/pageutils.php) | Chris Smith | 2007-09-30
  bool page_exists($id, $rev
  darcs-hash:20070930021040-d26fc-e3847bfdd20a36154685262eca94211cfd461e83.gz
* don't use realpath() anymore (FS#1261 and others) | Andreas Gohr | 2007-09-30
  The use of realpath() to clean up relative file names caused some trouble in certain setups relying on symlinks or having restrictive file structure setups. This patch replaces all realpath() calls with a PHP-only replacement which should solve those problems.
  darcs-hash:20070930184250-7ad00-512ff04c95f57fc9eaf104f80372237a3c94286f.gz
* fulltext search fixes FS#1191 FS#1192 | Andreas Gohr | 2007-08-04
  darcs-hash:20070804081226-7ad00-a8e7127c7122a96f9817158d87e1a364d8cdbc9f.gz
* fix for phrase search FS#1189 | Andreas Gohr | 2007-07-18
  darcs-hash:20070718104839-7ad00-50348c1834c78e891f049023d2e8894d6bb0a00b.gz
* FS#744 (template developers, heed the changes) | Anika Henke | 2007-05-15
  darcs-hash:20070514222527-d5083-53ed619daf07d0a84c52161465d163abf1400529.gz
* Fix backlinks - See FS#1040 | Guy Brand | 2007-03-30
  darcs-hash:20070330215042-19e2d-3528f2412ff044eb45158f349db5bbb5e32d907b.gz
* fixed warning with no search results FS#1088 | Andreas Gohr | 2007-03-03
  darcs-hash:20070303220143-7ad00-5d592dbebaae371c03102b20ae7e0d9e433b378b.gz
* fix for slashes in phrase search #1066 | Andreas Gohr | 2007-02-05
  darcs-hash:20070205191848-7ad00-77ad5a398534a7a64884e155c4607350e0f25a7c.gz
* trim pagename returned by ft_pageLookup | Andreas Gohr | 2006-11-24
  darcs-hash:20061124215413-7ad00-f2bd46b7edf70660cc3e0274bd222eafba1edbc6.gz
* Word-Length Indexer | TNHarris | 2006-11-12
  A modification to the indexer that sorts words based on length. This should make searching a little bit more efficient. After the patch is applied, your old index will be automatically converted to the new format (when you visit a page).
  The new index format is:
  1. Index files are stored in savedir/index
  2. Word lists are stored as wlen.idx. This used to be word.idx.
  3. Word indexes are stored as ilen.idx. This used to be index.idx.
  4. The page list, page.idx, is simply copied to the new location.
  Any plugins you have, such as the blog plugin, that read the index files need to be updated.
  darcs-hash:20061112194900-2b9f0-a975498ccf0a1d39c6df73b79bcd028d5e81c389.gz
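The idea of grouping index words by length can be sketched in memory as follows. This is only a Python illustration of the bucketing principle, under the assumption that each word length gets its own word list; it does not reproduce the on-disk format of the .idx files, and the function names are hypothetical.

```python
from collections import defaultdict


def build_length_buckets(words):
    """Group unique words into sorted lists keyed by word length."""
    buckets = defaultdict(list)
    for word in sorted(set(words)):
        buckets[len(word)].append(word)
    return dict(buckets)


def lookup(buckets, word):
    """Check for an exact word by scanning only the same-length bucket."""
    return word in buckets.get(len(word), [])
```

An exact-word lookup then only has to consider words of the same length, which is one plausible reading of why the entry expects searching to become more efficient.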
* backlinks fixes (bugs #795 & #937) | chris | 2006-11-05
  - add deaccented and romanised page names to index word list
  - remove stop words from tokens used in backlink search
  darcs-hash:20061105195453-9b6ab-6c4989eb75782af60a3de3bddbc99a83de2b4c80.gz
* remove unused code | Andreas Gohr | 2006-10-08
  This patch removes some commented code fragments and alternative snippet generators
  darcs-hash:20061008090624-7ad00-14bfee2ded6c6c8ef43ad02a4c02a5d95ee9daf7.gz
* more utf8_substr improvements (re FS#891 and yesterday's patch) | chris | 2006-09-28
  - rework utf8_substr() NOMBSTRING code to always use pcre
  - remove work around for utf8_substr() and large strings from ft_snippet()
  darcs-hash:20060928165122-9b6ab-0eefc216f07f9d7e7d8eb62ce26605c28ee340fa.gz
* parser caching update | chris | 2006-09-11
  This patch primarily updates p_cached_xhtml() and p_cached_instructions() to allow their caching logic to be surrounded by an event trigger.
  p_cached_xhtml() has been rewritten as the more general p_cached_output() to support other render output formats besides 'xhtml'. All calls to p_cached_xhtml() have been changed to refer to the new function.
  New event:
    name: PARSER_CACHE_USE
    data: cache object (see below)
    action: determine if cache file can be used
    preventable: yes
    result: bool, true to use cache file, false otherwise
  Cache operations have been generalised in a new class, cache, extended to cache_parser, cache_renderer & cache_instructions. Details can be found in inc/cache.php
  For handling of above event, key properties are:
  - page, if present the wiki page id, may not always be present, e.g. when called for locale xhtml files
  - file, source file
  - mode, renderer mode (e.g. 'xhtml') or 'i' for instructions
  Other changes:
  - cache class counts cache hits against attempts, results are stored in {cache_dir}/cache_stats.txt
  - adds metadata dependency to renderer page cache
  - replaces purgefile dependency for renderer cache with metadata 'relation references' (internal link) dependency for wiki pages only
  darcs-hash:20060911021418-9b6ab-19601ed194b8c8e45236ab72c3e23d78bf777e6c.gz
* update backlink search to use metadata | chris | 2006-09-01
  darcs-hash:20060901002016-9b6ab-716518138edf541a869510d7c2934b9474547fc3.gz
* add unittests for bug#891 | chris | 2006-08-31
  darcs-hash:20060831092146-9b6ab-b00aa29c982ab18117f476b3d01d5111915c9d4b.gz
* search improvements | chris | 2006-08-31
  ft_snippet()
  - make utf8 algorithm default
  - add workaround for utf8_substr() limitations, bug #891
  - fix some indexes which missed out on conversion to utf8 character counts
  - minor improvements
  idx_lookup()
  - minor changes to wildcard matching code to improve performance (changes based on profiling results)
  utf8
  - specifically set mb_internal_encoding to utf-8 when mb_string functions will be used.
  darcs-hash:20060831003413-9b6ab-712021eda3c959ffe79d8d3fe91d2c9a8acf2b58.gz
* ft_snippet() update | chris | 2006-08-27
  - correct "opt1" algorithm for multibyte utf8
  - minor improvement to "opt2" for short pages
  - add "utf8" algorithm, this algorithm endeavours to work with whole utf8 characters as much as possible. The resulting snippet will tend to 100 characters, rather than the 100 bytes of "opt1" and "opt2".
  darcs-hash:20060826234333-9b6ab-ae4c60c8855a92b133cb8d5a230098203f610e7b.gz
* ft_snippet() update, fix utf8 problems | chris | 2006-08-26
  darcs-hash:20060826095311-9b6ab-9a6f272cc7c7532eb2bad8f7b4404c5a16b71109.gz
* code to remove bad UTF-8 bytes added | Andreas Gohr | 2006-08-26
  This adds code to remove or replace invalid UTF-8 bytes and uses it in the ft_snippets function.
  darcs-hash:20060826082919-7ad00-a94004de159ae93ff5b7270fd3e631ff467233cd.gz
* update to previous ft_snippet() patch, improve snippet text selection | chris | 2006-08-25
  darcs-hash:20060825134730-9b6ab-086ee0647af39c4398cf1726324d8215722a39db.gz
* ft_snippet optimisations | chris | 2006-08-25
  This patch includes two alternative algorithms for ft_snippet(), the code which prepares the snippets seen on the search page and the most time-consuming part of the production of that page. If you have $conf['allowdebug'] on, you can specify the search algorithm to use by adding &_search
  darcs-hash:20060825104046-9b6ab-942d81a43cf0f85bfd235cabf6c35dd4b20e0b71.gz
* namespace-restricted fulltext-search part2 | Michael Klier chi@chimeric.de | 2006-05-18
  - it is now possible to restrict the fulltext-search to multiple namespaces
  Examples:
    searchword @ns1 @ns2 @ns3
    "exact phrase" @ns1 @ns2 @ns3
  darcs-hash:20060518204647-484ab-061521a81f13360e33496e5163e3cd263a9c1ad6.gz
* namespace restricted fulltext-search | Michael Klier chi@chimeric.de | 2006-05-18
  - The fulltext-search can now be restricted to a given namespace separated by an "@"
  darcs-hash:20060518161855-484ab-1617b6d2c3593525f4d29a789b0a32ebf414b9ae.gz
* file cleanups | Andreas Gohr | 2006-02-17
  This patch cleans up the source code to satisfy the coding guidelines (see http://wiki.splitbrain.org/wiki:development#coding_style). It converts files to UNIX line endings and removes tabs and trailing whitespace. Not all files were cleaned yet.
  darcs-hash:20060217222040-7ad00-bba3d2bee3b5aa7cbb5184258abd50805cd071bf.gz
* Wildcardsearch added #552 #632 | Andreas Gohr | 2005-11-27
  Now searching for word parts is possible by appending or prepending a * character to the searchword:
    'foo*' searches for words beginning with 'foo' e.g. 'foobar'
    '*foo' looks for words ending in 'foo' e.g. 'barfoo'
    '*foo*' gets anything with 'foo' in it e.g. 'barfoobaz'
  darcs-hash:20051127180723-7ad00-1eb29e812ddaf38d9812697bb1cffffe9a5fb330.gz
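The three wildcard forms listed in this entry can be sketched as plain string tests. This Python sketch illustrates the matching rules only; it is not how the indexer implements them, and `wildcard_match` is a hypothetical name.

```python
def wildcard_match(pattern, word):
    """Match a word against a search term with optional * at either end.

    'foo*'  -> word starts with 'foo'
    '*foo'  -> word ends with 'foo'
    '*foo*' -> word contains 'foo'
    'foo'   -> exact match
    """
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    core = pattern.strip("*")
    if not core:
        return False  # a bare "*" carries no search term
    if leading and trailing:
        return core in word
    if leading:
        return word.endswith(core)
    if trailing:
        return word.startswith(core)
    return word == core
```

Rejecting a bare `*` is a choice made for this sketch; the log does not say how the real search treats it.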
* hidepages configoption | Andreas Gohr | 2005-11-03
  This new option accepts a RegExp to filter certain pages from all automatic listings (RSS, recent changes, search results, index). This is useful to exclude certain pages like the ones used in the sidebar templates. The regexp is matched against the full page ID with a leading colon. If it matches, the page is assumed to be a hidden one.
  IMPORTANT: this is not related to ACL. A hidden page is still visible to all users (if not restricted by ACL) when linked or called directly.
  darcs-hash:20051103101726-6e07b-8d45912a1b4f6cfc9e3fce147c15f84a58ea7ca2.gz
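The matching rule in this entry (regexp applied to the full page ID with a leading colon) can be sketched as below. This is a Python illustration under the assumption that an empty setting hides nothing; `is_hidden_page` and the example patterns are hypothetical, not taken from DokuWiki.

```python
import re


def is_hidden_page(page_id, hidepages):
    """Return True if the hidepages regexp matches ':' + page_id.

    This only affects automatic listings; it is not an access check.
    """
    if not hidepages:
        return False  # assumption: an empty option hides nothing
    return re.search(hidepages, ":" + page_id) is not None
```

Because of the leading colon, a pattern such as `:sidebar$` matches a page named `sidebar` in any namespace, including the root.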
* ignore regexp failures when handling asian chars | Andreas Gohr | 2005-10-09
  The new handling of asian chars as single words needs a recent PCRE library (PHP 4.3.10 is known to work). If this support isn't available the regexp compilation will fail. This patch adds a workaround. This means the search will not work as expected with asian words on older PHP versions.
  darcs-hash:20051009124833-7ad00-1319829be5cb73246e13eb65e4c950d43c6ce5bf.gz
* asian language support for the indexer #563 | Andreas Gohr | 2005-09-25
  Asian languages do not use spaces to separate words. The indexer however does a word based lookup. Splitting for example Japanese texts into real words is only possible with complicated natural language processing, something completely out of scope for DokuWiki. This patch solves the problem by treating all asian characters as single words. When an asian word (consisting of multiple characters) is searched it is treated as a phrase search, looking up each character by itself first, then checking for the phrase in found documents.
  darcs-hash:20050925175451-7ad00-933b33b51b5f2fa05e736c18b8db58a5fdbf41ce.gz
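A rough Python sketch of the two-step approach this entry describes: every Asian character becomes its own token, and a multi-character query is then verified as a phrase. The character ranges are an assumed subset chosen for illustration (kana, CJK ideographs, Hangul), not the set DokuWiki uses, and the function names and the in-memory `pages` dict are hypothetical.

```python
import re

# Assumption: a small illustrative subset of Asian code point ranges.
_ASIAN = r"\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af"
# One token per Asian character, or a run of other non-space characters.
_TOKEN = re.compile(rf"[{_ASIAN}]|[^\s{_ASIAN}]+")


def tokenize(text):
    """Split text into tokens, treating each Asian character as a word."""
    return _TOKEN.findall(text.lower())


def search(query, pages):
    """Find pages containing every query token and the query as a phrase.

    pages maps page IDs to their text. Step 1 mimics the per-character
    index lookup; step 2 checks the phrase in the candidate pages.
    """
    tokens = tokenize(query)
    hits = []
    for page_id, text in pages.items():
        page_tokens = set(tokenize(text))
        if all(t in page_tokens for t in tokens) and query.lower() in text.lower():
            hits.append(page_id)
    return hits
```

The phrase check is what keeps a page containing the same characters in a different order out of the results.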
* fix for backlinks | Andreas Gohr | 2005-09-25
  darcs-hash:20050925102211-7ad00-200edd676ba3956f03ec5bcc5149d4aa4bd15e24.gz
* backlinkfix for pages with special characters #548 | Andreas Gohr | 2005-09-21
  darcs-hash:20050921195118-7ad00-9070166cbaa26e3f27f7b92382346a70f5c479a1.gz
* added missing ACL checks for new index based searches | Andreas Gohr | 2005-09-12
  darcs-hash:20050912143027-7ad00-b2f3165d8db7122a453ecc63ad031af4467f691f.gz
* backlinks now use the new index based search | Andreas Gohr | 2005-09-12
  darcs-hash:20050912141042-7ad00-5ef43525c9fd7ba44206720c54bb566450f93250.gz
* the search now uses the index | Andreas Gohr | 2005-09-04
  darcs-hash:20050903220229-7ad00-5d95f905eaeb3f6b867aa3ee43c2a8bccc533c00.gz
* new fulltext search function using the index | Andreas Gohr | 2005-08-28
  The new search function was added but is not yet integrated into DokuWiki's interface.
  darcs-hash:20050828152821-7ad00-a6e79a9dc5aaf41c547cf42dccdbc3b5bc8d303e.gz