Sitecore reindexing fails with no error/log after SC7 upgrade

lucene,sitecore
I've spent a few days working on our SC7 upgrade (from Sitecore 6.6) and I'm running into an issue rebuilding the indexes (Sitecore desktop > control panel > indexing > indexing manager > web > rebuild). I've stopped our various scheduled tasks and am using a private instance of SC,...

Scripted Fields for if/else condition in Kibana 4

elasticsearch,lucene,kibana,kibana-4
I have some numeric fields in elasticsearch, I have to implement some logic for which I need to create some scripted fields. I am new to kibana 4's scripted fields feature, so I need some help regarding a basic format that could be used for writing a basic if else...

Lucene acting up on OrientDB when confronted with fuzzy queries

indexing,lucene,orient-db,fuzzy-search
I have indexed a property on OrientDB using Lucene's keyword analyzer: CREATE INDEX Snippet.ssdeep ON Snippet (ssdeep) FULLTEXT ENGINE LUCENE METADATA {"analyzer":"org.apache.lucene.analysis.core.KeywordAnalyzer"} The field contains simhashes that I have indexed for testing. Now when I search using Lucene, I get a response for the exact queries, but not for the...

Use * (asterisk) as a term query in Elasticsearch

php,elasticsearch,lucene
I have a document with a tag '*'. Yet when I construct a term query it returns no results. How can I query documents with the tag '*'? My guess is it's a special character that needs to be escaped. Update with answer: I needed to set the property to...

Solr custom UpdateRequestProcessorFactory fails with “Error Instantiating UpdateRequestProcessorFactory”

java,solr,lucene,config,solrcloud
I have a custom class extending UpdateRequestProcessorFactory doing some work on a document when it gets added to the index. This was working fine in v4.10.3 in standalone Solr. I moved to SolrCloud v5.2 and it throws this error when adding the Collection (node): ERROR - 2015-06-14 12:25:11.071; [ docs_shard1_replica1]...

How to insert multiple index at a time on a solr update using json

java,solr,lucene
I have looked through several related web pages to find out how I can post multiple index entries to Solr in a single request. I have gone through the Solr link http://wiki.apache.org/solr/UpdateJSON#Example but it does not explain the feature very clearly. I have also found that you can create a JSON like this: { "add":...
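
A minimal sketch of one way to batch documents, assuming Solr's JSON update handler accepts a top-level JSON array of documents in a single request. The core name `mycore`, the URL, and the field names are hypothetical; the snippet only builds the request body and does not contact a server.

```python
import json

# Hypothetical documents; the field names are placeholders, not from the question.
docs = [
    {"id": "1", "title": "first document"},
    {"id": "2", "title": "second document"},
]


def build_update_payload(documents):
    """Serialize a list of documents as one JSON array for a single update request."""
    return json.dumps(documents)


payload = build_update_payload(docs)
# Assumed endpoint (the core name is made up): POST the payload with
# Content-Type: application/json to
# http://localhost:8983/solr/mycore/update?commit=true
print(payload)
```

Sending the array in one request avoids one HTTP round trip per document; whether to commit on every request is a separate choice.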

Elasticsearch: hyphen in PrefixQuery on Keyword-analyzed field

elasticsearch,lucene,elastic
I have a situation where I'm putting metadata for invoices into an Elasticsearch 1.5.2 index, running on Ubuntu Linux 15.04 with Oracle JDK 8u45. One of the fields is poNumber, which often has values that look like "123-R45678" or "123-4Q5678". I'm trying to use a PrefixQuery (via the query parser)...

How to add analyzer settings in ElasticSearch?

java,indexing,elasticsearch,lucene,analyzer
I am using ElasticSearch 1.5.2 and I wish to have the following settings : "settings": { "analysis": { "filter": { "filter_shingle": { "type": "shingle", "max_shingle_size": 2, "min_shingle_size": 2, "output_unigrams": false }, "filter_stemmer": { "type": "porter_stem", "language": "English" } }, "tokenizer": { "my_ngram_tokenizer": { "type": "nGram", "min_gram": 1, "max_gram": 1 }...

Sorting on multivalued field in Solr

solr,lucene,solr-multy-valued-fields
I know multivalued field sorting is not supported in Solr. But is there any way we can sort on a multivalued field in Solr? I have two documents with field custom_code and values are as below, Doc 1 : 11, 78, 45, 22 Doc 2 : 56, 74, 62, 10 When...

How to use all the cores of Solr in solrj

java,indexing,solr,lucene,solrj
I have downloaded solr 5.2.0 and have started using $solr_home/bin/solr start The Logs stated: Waiting to see Solr listening on port 8983 [/] Started Solr server on port 8983 (pid=17330). Happy searching! Then I visited http://localhost:8983/solr and created a new core using Core Admin / new Core as Core1 (...

How do I get rid of “.” at the end of a token when using solr whitespacetokenizer and worddelimiterfilterfactory

solr,lucene
I have this text to tokenize: "let's buy a PowerShot-100 camera." I am using the whitespace tokenizer and then the word delimiter filter factory. The WordDelimiterFilterFactory is creating tokens like "lets", "let's", "buy", "a", "Power", "PowerShot", "Shot", "100", "PowerShot100", "camera." and also "camera". When I try to run a...

ElasticSearch Analyzer and Tokenizer for Emails

email,elasticsearch,lucene,tokenize,analyzer
I could not find a perfect solution either in Google or ES for the following situation, hope someone could help here. Suppose there are five email addresses stored under field "email": 1. {"email": "[email protected]"} 2. {"email": "[email protected], [email protected]"} 3. {"email": "[email protected]"} 4. {"email": "[email protected]} 5. {"email": "[email protected]"} I want to...

Lucene: Search for documents that don't have a specific field

java,lucene,booleanquery
I need to select all documents that don't have a specific field, and have the right value for one field. I am trying to avoid using the "null" string as a value for fields that are null, so in Lucene, those fields are not saved for those documents. Document structure looks like this class...

Matching elasticsearch data indexed by Titan

solr,elasticsearch,lucene,graph-databases,titan
I have indexed Titan data in Elasticsearch. It worked fine and the data was indexed, but when I look at the data in Elasticsearch using the REST API, the column/property name looks different from the one in Titan. For example, I indexed age while inserting data into Titan: final PropertyKey age = mgmt.makePropertyKey("age").dataType(Integer.class).make(); mgmt.buildIndex("vertices",Vertex.class).addKey(age).buildMixedIndex(INDEX_NAME); and if...

Lucene Analyzer tokenizer for substring search

java,lucene,tokenize,analyzer
I need a Lucene Tokenizer that can do the following. Given the string "wines bottle caps", the following queries should succeed wine bott cap ottl aps wine bottl Here is what I have so far. How might I modify it to work? No query less than three characters should work....
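
The requirement can be illustrated with plain n-grams. The sketch below is not Lucene's tokenizer; it is a standalone illustration, with hypothetical helper names, of indexing every substring of at least three characters per word and requiring each query term to be one of them.

```python
def ngrams(word, min_len=3):
    """Return every substring of `word` that is at least `min_len` characters long."""
    return {
        word[i:j]
        for i in range(len(word))
        for j in range(i + min_len, len(word) + 1)
    }


def index_terms(text, min_len=3):
    """Collect the n-grams of every whitespace-separated word in `text`."""
    terms = set()
    for word in text.lower().split():
        terms |= ngrams(word, min_len)
    return terms


def matches(query, terms):
    """A query succeeds only if every one of its words is an indexed n-gram."""
    return all(q in terms for q in query.lower().split())


terms = index_terms("wines bottle caps")
for q in ["wine", "bott", "cap", "ottl", "aps", "wine bottl", "wi"]:
    print(q, matches(q, terms))
```

Because no n-gram shorter than three characters is indexed, a query such as "wi" fails, which matches the stated constraint. The cost is a larger index, since each word contributes many terms.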

Special Characters that can't be indexed using lucene

apache,indexing,lucene
Can someone tell me which special characters cannot be indexed using Apache Lucene?

Understanding Apache Lucene's scoring algorithm

search,solr,lucene,full-text-search,hibernate-search
I'm working with Hibernate Search for months now, but still I'm not able to digest the relevance it brings. I'm overall satisfied with the results it returns, but even simplest test does not satisfy my expectation. First test was using the term frequency(tf). Data: word word word word word word...

Using FrenchAnalyzer with Neo4J

neo4j,lucene,analyzer
I am trying to use the Lucene FrenchAnalyzer with Neo4J: final GraphDatabaseService graphDatabaseService = new GraphDatabaseFactory().newEmbeddedDatabase("..."); final IndexManager index = graphDatabaseService.index(); final Index<Node> frenchIndex = index.forNodes("Entry", stringMap(IndexManager.PROVIDER, "lucene", "type", "fulltext", "to_lower_case", "true", "analyzer","org.apache.lucene.analysis.fr.FrenchAnalyzer" )); but this throws java.lang.NoSuchMethodException:...

How to query this data with elasticsearch

elasticsearch,lucene
I am trying to find the oldest male person in each family. Each person in the results must be at least 18. Here is the data: Data as csv id FamilyId LastName FirstName Age Gender 1 1 Smith John 20 M 2 1 Smith Joan 20 F 3. 1 Smith...

Get results with exact match

lucene,hibernate-search
I want to do a query like that : "banana apple cherry" on a "fruit" field. All the fruits in the desserts needs to be in the query, but not all the fruits in the query needs to be in the desserts.. Here's an example.. NAME        ...

How to filter Inner Objects in Elasticsearch?

json,search,filter,elasticsearch,lucene
I have a contacts field in my documents in Elasticsearch. each element inside the contacts field is an Object itself. I want to use term or terms filter on contacts field so that it matches documents where contacts.province_id is X. I have tried contacts.province_id as search field but it doesn't...

How to create new core in Solr 5?

solr,lucene,core
Currently we are using Apache Solr 4.10.3 OR Heliosearch Distribution for Solr [HDS] as a search engine to index our data. I then heard about the Apache Solr 5.0.0 release last month. I have successfully installed Apache Solr 5.0.0 and now it's running properly on 8983...

Elasticsearch postfilter cancel filter

java,filter,elasticsearch,lucene,elastic
In the following query I want to filter the query results to size medium and color blue but I want aggregations to ignore that the color blue is applied. { "query":{ "bool" { "must": { "query_string": { "query": "foo" } }, "should": { // deferred } } }, "filter": {...

Force Solr not to use _version_ field during search

java,solr,lucene,full-text-search
I have a default declaration for _version_ field in my schema.xml: <field name="_version_" type="long" indexed="true" stored="true" multiValued="false"/> I've read that this field is only internally managed by Solr for concurrency management, however when I search for "002219" in return I get: "docs": [ { "person_street_t": [ "<streetName>", ], "id": "123",...

WildcardQuery Lucene does not work properly

java,lucene
I am trying to use WildCardQuery: IndexSearcher indexSearcher = new IndexSearcher(ireader); Term term = new Term("phrase", QueryParser.escape(partOfPhrase) + "*"); WildcardQuery wildcardQuery = new WildcardQuery(term); LOG.debug(partOfPhrase); Sort sort = new Sort(new SortField("freq", SortField.Type.LONG,true)); ScoreDoc[] hits = indexSearcher.search(wildcardQuery, null, 10, sort).scoreDocs; But when I insert "san " (without quotes), I want to...

Solr subquery merging issue

apache,solr,lucene,subquery
I have an issue searching with Solr in the following scenario: I'd like to get all products within my favorite tag, categories and user. I want all products which were created by my favorite user without any filter, but products from a favorite tag or categories must be filtered with in a...

Error : Could not find the main class: org.apache.solr.util.SolrCLI

java,php,solr,lucene
When I start solr-5.1.0 in Ubuntu with /bin/var/www/solr-5.0.0/bin ./solr start I get the error below: Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/solr/util/SolrCLI : Unsupported major.minor version 51.0 at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:643) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:277) at...

Sitecore 6.6 - Setting up a Lucene index

lucene,sitecore,sitecore6
I'm working on learning how to setup and configure a Lucene search index for Sitecore 6.6. I've pieced together a base config file that indexes all items that are of type "Article" template starting at my desired location in the tree and am able to pull all the items out...

Elasticsearch boost per field with function score

elasticsearch,lucene,solr-boost
I have a query with different query data for different fields and ORed results. I also want to favor hits with certain fields. Ideally this would only increase ranking but would not cause results that did not contain some of the terms in the other fields. This would skew results...

Search for nodes in Neo4j with schema index

indexing,neo4j,lucene
I have a graph that has only Schema indexes and not legacy indexes as Neo4j documentation recommends. I want to search for nodes like in this example described under the legacy indexing section (exact match, start queries etc). I am wondering if this is possible with schema indexes and if...

How to wisely combine shingles and edgeNgram to provide flexible full text search?

regex,elasticsearch,lucene,odata,analyzer
We have an OData-compliant API that delegates some of its full text search needs to an Elasticsearch cluster. Since OData expressions can get quite complex, we decided to simply translate them into their equivalent Lucene query syntax and feed it into a query_string query. We do support some text-related OData...

How can I store an array of values into one index document in Zend Lucene?

php,indexing,lucene,zend-search-lucene
I've got a database with a list of registrations. Each of the registrations has 0..* ticket codes. How can I store them in an index document to be able to find a registration by one of the codes? ...

difference between index and meta in search and lucene's support

search,indexing,solr,lucene,metadata
Are there any differences between metadata and index in terms of search? My understanding is that metadata for a document can be something such as author, keyword, etc. Index operation can be performed against the content body itself, against the metadata itself, against the metadata body+ metadata. Is this...

Matching && and || special character

java,regex,lucene
I am writing a Lucene application to match && (double ampersand) and || (OR or double pipe). I want to write a regex to match any presence of && and || in the input text. If I write something like below, it only matches for the presence or absence of...

Array included in array search with elasticsearch

arrays,elasticsearch,lucene
I have users indexed with categories as follows { id: 1 name: John categories: [ { id: 1 name: Category 1 }, { id: 2 name: Category 2 } ] }, { id: 2 name: Mark categories: [ { id: 1 name: Category 1 } ] } And I'm trying...

Combining regex of characters and Strings in JAVA

java,regex,lucene
I have a set of special characters, for ex. like ?. ^,! etc and also special string like && and | |, using these special characters and strings I have to write a regex that will escape all these special characters and string. Output must be something like this: \^\!\?\&\&\|\|....

Lucene 5.0.0 - search string with special characters

lucene,special-characters
I am using Lucene version 5.0.0. In my search string, there is a minus character like “test-”. I read that the minus sign is a special character in Lucene. So I have to escape that sign, as in the queryparser documentation: Escaping Special Characters: Lucene supports escaping special characters that...
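
The escaping rule can be sketched outside Lucene. The character set below is, to my understanding, the one that the classic query parser's `QueryParser.escape` handles; the function name `escape_query` is hypothetical, and this is an illustration rather than a replacement for the library call.

```python
# Characters that carry meaning in the classic Lucene query syntax.
# Assumption: this mirrors the set escaped by QueryParser.escape.
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&/')


def escape_query(text):
    """Prefix each special character with a backslash; leave other characters alone."""
    return "".join("\\" + c if c in LUCENE_SPECIAL else c for c in text)


print(escape_query("test-"))
print(escape_query("a && b || c"))
```

Escaping only gets the character past the query parser. Whether the minus sign survives into the searched term still depends on the analyzer applied to the field, which may drop punctuation.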

How to search with multiple parameters in Hibernate Search 3.0.0.ga

java,lucene,hibernate-search
Using: Hibernate 3.2.7.ga Hibernate-Search 3.0.0.ga Hibernate-Annotations 3.3.0.ga Hibernate-Commons-Annotations 3.0.0.ga Lucene-Core 2.9.4 Lucene-Analyzers 2.9.4 Lucene-Queryparser 2.9.4 How can I search with multiple parameters like: SELECT * FROM example WHERE column1 = "text1" AND (column2 = "text2" OR column2 ="text3") In the Hibernate-Search documentation I only found this example of searching: Session session =...

elasticsearch update operation on not-indexed fields

elasticsearch,lucene
If I update a field in my document that is mapped as NOT indexed, will ES still re-index the whole document? If so is it because _source need to be re-indexed? Is it possible not to index _source?

Attempting to overcome Android Java jdk limitations

java,android,lucene
So I'm trying to come up with an Android app that can read a text file of Japanese text, and provide insightful information to the reader about the vocab and grammar being utilized. To do this I need a Japanese morphological analyzer to parse the non-spaced text into individual words....

Hibernate Search sorting

hibernate,lucene,hibernate-search
Hibernate search is sorting results depending on relevance, it is normal. In addition to that, if two documents are having the same score, they are ordered by their primary keys. For example, book1 : id=1, bookTitle = "hibernate search by example". book2 : id=2, bookTitle = "hibernate search in action"...

Solr 5.0.0 is not starting properly in CentOS

solr,lucene,centos,solrcloud
When I run the command bin/solr start -e cloud it does not ask me for the collection name and other information like the number of replicas and configuration settings. I got the following output: Welcome to the SolrCloud example! This interactive session will help you launch a SolrCloud cluster on your local workstation. To...

ElasticSearch/Lucene query string — select “field X exists”

elasticsearch,lucene,kibana
How do I query ElasticSearch through Kibana to select items that have field X? For example, I have a mapping with fields {"a": {"type": "string"}, "b": {"type": "string"}}, and two documents {"a": "lalala"} {"a": "enoheo", "b": "nthtnhnt"} I want to find the second document without knowing what its b actually...
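
A small sketch of one possible approach, assuming the query string syntax supports the `_exists_:<field>` form. The helper name is hypothetical; in Kibana the same string would be typed into the search bar, and the dictionary below is only the equivalent request body.

```python
def exists_query(field):
    """Build a query_string request body that matches documents having `field`."""
    return {"query": {"query_string": {"query": "_exists_:" + field}}}


# For the example mapping in the question, this targets documents with a "b" field.
print(exists_query("b"))
```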

Is it possible to use multiple index data directory in Apache Solr?

solr,elasticsearch,lucene
I'm new in Apache Lucene/Solr. I try to move from Elasticsearch to Apache Solr. So, I have a question about following index data location configuration. in Elasticsearch # Can optionally include more than one lo # the locations (a la RAID 0) on a file l # space on creation....

Field names with the same name across types having different index/type in Elasticsearch

elasticsearch,lucene,elasticsearch-mapping
I have been reading a lot on mappings in Elasticsearch and here's something interesting that I found Field names with the same name across types are highly recommended to have the same type and same mapping characteristics (analysis settings for example). There is an effort to allow to explicitly "choose"...

Sort in lucene by the start

sorting,lucene
I am trying to sort a Lucene search, but I cannot find the best way to sort the results. I want the results which start with my expression first, and the secondary order is alphabetical. Is there any way to sort in Lucene by the start? I tried with a...

Recycling app pool each time something is published

.net,iis,lucene,umbraco,application-pool
I'm working on an Umbraco site where I have custom sections, and therefore use the application.config and trees.config files. I have a problem where every time I publish something, the app pool recycles with the following message: w3wp.exe Information: 0 : _shutDownMessage=CONFIG change HostingEnvironment initiated shutdown CONFIG change HostingEnvironment caused...

lucene.net - how to update an index very frequently?

lucene,lucene.net,windows-azure-queues
I have an Azure WebJob with a Queue that receives items to process. There can be many items to process every second. The Queue processes around 20 items simultaneously. I want to index the items with Lucene.net. Starting an IndexWriter, calling Optimize() and Disposing it on every item that...

Solr 5.0: Unable to start Solr with Zookeeper Ensemble

solr,lucene,solrcloud
I have 3 Zookeeper servers running at server1:2181, server2:2181 and server3:2181. I want to start 4 Solr servers at server1:8983,server2:8983,server3:8983 and server4:8983 to point to Zookeeper Ensemble above. So at server1, I run a command: bin>solr -c -z server1:2181,server2:2181,server3:2181 -m 2g and I received an error message: Missing operand. Invalid...

Lucene get all non deleted document from index file

java,indexing,lucene
I am trying to get all documents from a Lucene index that have not been deleted. I heard that if I delete something from a Lucene index, Lucene will not delete it immediately from the file. So I want to get the documents from the index file which are not deleted....

Analyzer for searching phrases in ElasticSearch

java,elasticsearch,lucene,analyzer,query-analyzer
I am using ElasticSearch 1.5.2. I want to allow searching of phrases in my search engine. Suppose the text is read with section 114 of the Indian Penal Code Using the default analyzer I am not able to get any results on the search query section 114 penal code So,...

Is this known PerFieldPostingsFormat bug in Lucene 4.1 or is this user error

java,lucene
Is this a Lucene (4.1.0) bug or user error? I'm assuming a bug because the user code is just passing a search to Lucene, but I can't find anything in JIRA: java.lang.NullPointerException at java.util.TreeMap.getEntry(TreeMap.java:342) at java.util.TreeMap.get(TreeMap.java:273) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms(PerFieldPostingsFormat.java:215) at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:58) at...

Solrcloud multicore configuration

solr,lucene,multicore,sharding,solrcloud
I have a standalone Solr instance with 4 different cores working fine using the embedded Jetty server. I configured the cores for v4.10.3, but I have since moved to v5.1 and all seems to work fine without any changes. Before going into production, I need to set it up as a...

Questions about Lucene

java,lucene
I am working on Lucene, and had some questions about some queries which have been giving me different results. The three queries are: Q1 = "Java 8 is verified to be compatible" Q2 = "Java 8 not verified as a compatible" Q3 = "Java 8 not verified as compatible" I...

Cannot get results for Solr queries starting with stop words

json,search,data,solr,lucene
I am a newbie to Solr and I am trying to configure the platform allowing queries that start with a stop word. I have the following document { "responseHeader":{ "status":0, "QTime":1, "params":{ "indent":"true", "q":"*:*", "wt":"json"}}, "response":{"numFound":1,"start":0,"docs":[ { "weight_metric":0.3, "maximumPowerDraw":9, "beamAngle":50, "name_de":"German", "type":["product"], "id":"5dac69a9-7d54-43f9-b815-0a54e519a1f0", "name":"Aloa something" }] }} With a field...

Elasticsearch filter only if no matches to first filter

elasticsearch,lucene
My use case is for searching UK addresses where there is a well defined postal code system however my users may still make mistakes in the postcode. I want to use a filter as in most cases the user will get the postcode right and I do not want to...

Hibernate Search not indexing items from database

java,spring,hibernate,lucene,hibernate-search
I'm trying to integrate Hibernate Search in my application. Rough summary of what needs to be done: Spring Batch reads out an XML file and persists the objects to database. This is done with the JDBCBatchItemWriter, not the HibernateItemWriter, because of slow performance. After all items are inserted I would...

Liferay 6.2 clustering issue with multicast

lucene,cluster-computing,liferay-6,ehcache,jgroups
I am trying to cluster ehcache and lucene with Liferay 6.2 EE sp2 bundle on 2 servers with multicast enabled. We have Apache HTTPD servers fronting Tomcat servers using reverse proxy. A valid 6.2 license is deployed on both the nodes. We use the following properties in the portal-ext.properties: cluster.link.enabled=true...

Fast indexing using multiple ES nodes?

java,indexing,elasticsearch,lucene,scalability
All I read and understand about running multiple ES nodes is to enable index replication and scaling. I was wondering if it could help us to make indexing faster for large number of files. I have two questions and they are as follows: Question 1: Would it be accurate to...

Searching a TextField and IntField together separated by an AND condition in Lucene

java,search,indexing,lucene
I have indexed my documents as: doc.add(new IntField("ID", id, Field.Store.YES)); doc.add(new TextField("First_Name", First_Name, Field.Store.YES)); doc.add(new TextField("Last_Name", Last_Name, Field.Store.YES)); doc.add(new TextField("Address", add, Field.Store.YES)); doc.add(new TextField("City", city, Field.Store.YES)); doc.add(new TextField("State", state, Field.Store.YES)); doc.add(new IntField("Zip_Code", zip, Field.Store.YES)); Where id, FirstName, city, add, state, zip are variables that store the values to be indexed....

How to convert simple groovy script to lucene expression (or use different method)?

elasticsearch,lucene
I am using following options for search: scriptSort = _script: script: "if(doc['user.roles'].value=='contributor') return 1; else return 2;", type: "number", order: "asc" options = query: ... size: ... from: ... aggs: ... sort: [scriptSort] As you can see I am using _script option for sorting results. The problem is that search...

LUCENE_40 cannot be resolved or is not a field

java,lucene
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40); I am running Lucene search code and I am getting an error in the above line saying LUCENE_40 cannot be resolved or is not a field. I am using Lucene version 5.1.0. I have removed the Version.LUCENE_40 from the standard analyzer parameters, so now there's no...

Pylucene 4.9.0 Ubuntu 14.04 Installation ImportError

python,ubuntu,lucene,pylucene
I've been trying to install Pylucene on my Mac for a little over a week, and have given up on that in favor of trying to install it with Ubuntu through a virtual machine. I thought the installation process had gone well, so I fired up Python in the terminal...

EdgeNGram: Error instantiating class: 'org.apache.lucene.analysis.ngram.EdgeNGramFilterFactory'

apache,solr,lucene
I've set up Solr, so far everything's working just dandy, but now I wanted to add the EdgeNGram functionality to my searches. However, as soon as I throw it into my schema.xml, it starts throwing the error: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core collection1: Plugin init failure for [schema.xml]...

How to combine neo4j and elasticsearch

maven,elasticsearch,neo4j,lucene
I am developing a question answering application and for that I need to use neo4j and elasticsearch in the same maven project. I am using elasticsearch to make my application more robust. As we know, neo4j and elasticsearch work on different versions of lucene, so whichever version I include...

How to check if document exists in lucene index?

lucene
I have an index of news articles, where I save the title, link and description of each news item. Sometimes the same news from the same link is published with different titles by different news sources. I don't want articles with exactly the same description to be added twice. How can I find out if a document already exists?

Lucene TFIDF does not return 1 for exactly same query with certain document

lucene,tf-idf
I implemented a program to rank documents based on its TFIDF similarity score given a user input. Following is the program: public class Ranking{ private static int maxHits = 10; private static Connection connect = null; private static PreparedStatement preparedStatement = null; private static ResultSet resultSet = null; public static...

Cannot Find Proper solrconfig.xml file for configuration in solr 5.1.0

java,php,solr,lucene,solrcloud
I have setup Solr 4.7 before and I had configured solrconfig.xml file in my core for dataimport requestHandler and it was working fine. But when I setup Solr 5.1.0, what is the location of solrconfig.xml file for particular core? Where is it located?...

How to handle Multiple MySQL Tables by DataImportHandler in Solr?

php,mysql,solr,lucene
I have 33 tables in MySQL. Around 20 tables will be used in search. What should I do to handle and search in all these tables? I have already implemented this by importing 1 table and searching it without problems. But now I want to search in all tables. Do I create all...

Lucene search scoring issue

java,lucene,luke
I have two indexes created from directories "test1" and "test2". "test1" directory has "file1.java" whereas "test2" has two files "file1.java" and "file2.java" in it. "file1.java" is identical in both the directories. Let the indexes be index1 and index2 respectively. Now when I analyze these two indexes using luke, I find...

Finding number of unique terms over multiple fields

java,lucene
I need to find number (or list) of unique terms over a combination of two or more fields in Lucene-Java. I am using Java libraries for Lucene 4.1.0. I checked questions such as this and this, but they discuss finding list of unique terms from a single (specific) field, or...

Lucene vs Solr, indexing speed for same data

java,indexing,solr,lucene,full-text-search
I have worked upon Lucene before and now moving towards Solr. The problem is that I am not able to do Indexing on Solr as fast as Lucene can do. My Lucene Code: public class LuceneIndexer { public static void main(String[] args) { String indexDir = "/home/demo/indexes/index1/"; IndexWriterConfig indexWriterConfig =...

How can I query lucene based on a lucene search result?

java,lucene
Here's the problem I'm trying to solve: I have multiple lucene indices, each containing a subset of the same data structure (they have the same fields, but the fields may or may not be present in a document in a certain index) There is a global identifier that is shared...

JVM crashes frequently

jboss,elasticsearch,lucene,jvm,jvm-hotspot
The JVM crashes surprisingly and frequently on our prod environment, which results in JBoss (EAP6.3) going down. We have Java 7 U72 installed. The crash logs have the same output, where the current thread is: Current thread (0x00000000d1d99000): JavaThread "Lucene Merge Thread #0" daemon [_thread_in_Java, id=1144, stack(0x00000000f6a00000,0x00000000f6b00000)] and all the log is full of :...

Lucene: How to search for at least m out of n words

solr,elasticsearch,lucene
Suppose I have 5 words that I'm searching for. Is there a way to specify that the matching documents should have at least 4 of those words?
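
One possible direction, sketched for Elasticsearch since it is among the question's tags: a `bool` query with one `should` clause per word and `minimum_should_match` set to the required count. The field name `body`, the sample words, and the helper name are hypothetical. Lucene's `BooleanQuery` and Solr's `mm` parameter for the dismax parsers expose the same idea, as far as I know.

```python
def at_least(words, m, field="body"):
    """Build a bool query requiring at least `m` of the given words to match."""
    if not 0 < m <= len(words):
        raise ValueError("m must be between 1 and the number of words")
    return {
        "query": {
            "bool": {
                # One optional clause per word; the field name is a placeholder.
                "should": [{"match": {field: w}} for w in words],
                "minimum_should_match": m,
            }
        }
    }


query = at_least(["alpha", "beta", "gamma", "delta", "epsilon"], 4)
print(query["query"]["bool"]["minimum_should_match"])
```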

How to combine a search phrase with a wildcard using Lucene.Net?

c#,lucene,lucene.net
I am passing a search query to the Lucene QueryParser.Parse(string query) method, and then passing the result to Searcher.Search(Query query, int n). A string of: "system cleaner" returns 1 hit. A string of: "system clean*" or: "system clean\*" returns 0 hits. How can I provide a search query that uses...

solrException. XML parser doesn't support XInclude option

xml,tomcat,solr,lucene,xinclude
After configuring solr4.7.2 with tomcat 7, got the error in solrAdmin page stating SolrCore Initialization Failures fran92:org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: XML parser doesn't support XInclude option My solr.xml file contains one core <?xml version="1.0" encoding="UTF-8" ?> <solr persistent="true"> <cores host="${host:}" adminPath="/admin/cores" hostContext="${hostContext:solr}"> <core config="solrconfig.xml" name="fran92" instanceDir="generic" schema="schema.xml"...

Lucene Query not working

java,lucene
I am using Apache Lucene 5.0.0 and ran into problems using QueryParser. I tried to create a Query but I get a ParseException. The following is my code: import org.apache.lucene.analysis.standard.StandardAnalyzer; import org.apache.lucene.queryparser.classic.ParseException; import org.apache.lucene.queryparser.classic.QueryParser; public class QueryTest { public static void main(String[] args) { QueryParser parser = new QueryParser("field", new...

How to add multiple suggesters definition in solr search components

java,apache,solr,lucene,autosuggest
I am using solr 5.1. I am trying to configure multiple suggester definition in Solr search component according to Apache solr wiki. I have configured single suggester perfectly and it works perfect but whenever I try to configure multiple suggester it gives me following errors java.lang.NullPointerException at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:190) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)...

Fuzzy search not working with dismax query parser

solr,lucene
There is a field in my schema 'fullText' which is of the 'text_en' type, and multivalued. The term 'tests' is in the fullText field in one document. In Solr, when I try to search using the word 'test', with the standard lucene parser with minimal distance 1, it's returning the...

Solr - How to search in all fields without passing query field?

php,solr,lucene
I have tried as below, <field name="collector" type="text_general" indexed="true" stored="false" multiValued="true" /> and copy all my fields to copyField as below, <copyField source="fullname" dest="collector"/> <copyField source="email" dest="collector"/> <copyField source="city" dest="collector"/> and also I have put all copyField tags below <fields> </fields> tags. But I can't search in all fields. I...

Lucene doesn't search number fields

lucene
I'm trying to index and then search an integer field with Lucene, but it doesn't find anything (text fields search fine). Document doc = new Document(); //UserType = 1 doc.add(new IntField("userType", user.getType().getId(), Field.Store.YES)); FSDirectory dir = FSDirectory.open(FileSystems.getDefault().getPath(indexDir)); IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer()); writer = new IndexWriter(dir, config); writer.addDocument(doc); For search...

How to resolve NoSuchFieldError exception when testing Lucene 4.0

java,lucene
I want to test my own Analyzer. Following is test code from Lucene in Action 2nd Edition, Code List 4.2, page 121. public class AnalyzerUtils { public static void displayTokens(Analyzer analyzer, String text) throws IOException { TokenStream tokenStream = analyzer.tokenStream("contents", new StringReader(text)); displayTokens(tokenStream); } public static void displayTokens(TokenStream stream) throws...

Linkage Error in Tomcat when 2 webapp instances load lucene classes

java,tomcat,lucene,jvm,java.lang.linkageerror
I'm running a Tomcat 8 container with 2 different webapps, 1 prod and 1 sandbox. All the classes/libs and compilation are the same, with just some minor differences in the config parameters. I'm using Lucene core 4.10.4 (via Hibernate Search). Both apps start up just fine; now after startup if I...

Can Solr run without Lucene?

oracle,apache,search,solr,lucene
I have an application for Solr that would work great--I'm using it to query an Oracle database and having success with what I'm seeing. However, the way I have it set up today, it imports the data from Oracle into a local database (I gather this is called Lucene) at...

How to create a directory on my Cloud Foundry account?

lucene,cloud,command-line-interface,cloudfoundry
I deployed to Cloud Foundry an application which needs a directory for Lucene. The application failed to start because the configured directory does not exist. I searched the Cloud Foundry forums but did not find how to create a directory on my server on Cloud Foundry. If someone has documentation or...

What is causing the 'SolrException: Schema Parsing Failed: unknown field' error?

xml,solr,lucene,solr-schema
I am attempting to configure a SOLR 4.7.1 single instance, single core setup and on startup SOLR throws an error: Schema Parsing Failed: unknown field 'INVENTORY_ITEM_ID'. Schema file is /var/solr/cores/intota-inventory/schema.xml I believe SOLR is complaining that the <uniqueKey> has not been defined in schema.xml. I say this because whatever field...

Is it possible to select dynamically a field to use in hibernate search

java,hibernate,lucene,hibernate-search
I have something like this : class A : @Entity @Indexed public class A { @Fields({ @Field(name="a"....) @Field(name="b"....) )} private String someField; .... } } And class B: @Entity @Indexed public class B { @IndexedEmbedded @ManyToOne private A a; ...... } I would like to use @Field 'a' when indexing...

How do you install PyLucene on OpenSuSE or another rpm based distro?

linux,lucene,rpm,opensuse,pylucene
I'm trying to install PyLucene on openSUSE; is there an rpm package in the repositories, or a repository I could add? On Ubuntu this would be: sudo apt-get install pylucene I don't have any experience with rpm-based distros, so a basic-level explanation would be great. Thanks!...

Lucene query parser to use filters for wildcard queries

java,lucene
My problem is how to parse wildcard queries with Lucene so that the query term is passed through a TokenFilter. I'm using a custom Analyzer with several filters (e.g. ASCIIFoldingFilter, but that's only an example). My problem is that whenever Lucene's QueryParser detects that one of the sub-queries is a...

Greek words stemming Lucene

lucene,stemming
Is there any way to stem single Greek words with Lucene? Do I need to index the String, or is there a simpler way? I did some research and I found this link, but I don't really know how to use the Greek Stemming Filter.

Neo4j automatic index doesn't work first time

lucene,neo4j,full-text-search,cypher,spring-data-neo4j
I have the following test: @Test public void testAutoIndexingAndFuzzySearch() { GraphDatabaseService graphDb = template.getGraphDatabaseService(); Index<Node> autoIndex = graphDb.index().forNodes("node_auto_index"); graphDb.index().setConfiguration(autoIndex, "type", "fulltext"); graphDb.index().setConfiguration(autoIndex, "to_lower_case", "true"); graphDb.index().setConfiguration(autoIndex, "analyzer", StandardAnalyzerV36.class.getName()); sampleDataGenerator.generateSampleDataJava(); List<Product> products = //...

Matching data with Lucene.net

c#,lucene,lucene.net
I'm attempting to match a term to a list of products in my database. Let's start Lucene with some simple data: //Table Products Glue Glue Sticks Crayons Markers Here's the tricky part: I'm attempting to match the best result but there may be junk data involved (Later in...

Customize score for certain condition in Lucene TFIDF

java,sorting,lucene,ranking,tf-idf
I have a program that takes an input query and ranks the similar documents based on its TFIDF score. The thing is, I want to add some keywords and treat them as the "input" as well. These keywords will be different for each query. For example if the query is...

Azure Cloud Service Disappearing From Azure

azure,lucene,azure-worker-roles,azure-cloud-services
I have an Azure Cloud Service Worker Role that I'm using to maintain a Lucene index. The service has been completely removed from Azure twice. The first time I thought someone may have inadvertently deleted it, but I don't believe this to be the case. Has anyone else run into this...

Is the default CQ5 Search Configuration incorrect?

lucene,cq5,jackrabbit
I need to optimize the CQ5 Lucene indexing configuration for my application. I want to provide a custom search configuration but I struggle to really understand the default configuration. (Source: https://helpx.adobe.com/experience-manager/kb/SearchIndexingConfig.html) First question: Are the "include" tags used in the default configuration correct? For example: The default configuration uses the tag...

Lucene how to index in Database (Cassandra)

java,indexing,lucene,cassandra
I am just experimenting with Lucene and want to index objects in a database (Cassandra) as a table. But I haven't figured out how the indexing works on Cassandra. Especially searching... When I take a simple example of indexing in Lucene: Document doc = new Document(); doc.add(new TextField("id", "Hotel-1345", Field.Store.YES)); doc.add(new TextField("description",...

Zend Search Lucene case insensitive search doesn't work

php,lucene,zend-search-lucene
I've got a Search class, which has public function __construct($isNewIndex = false) { setlocale(LC_CTYPE, 'ru_RU.UTF-8'); $analyzer = new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive(); $morphy = new Isi_Search_Lucene_Analysis_TokenFilter_Morphy('ru_RU'); $analyzer->addFilter($morphy); Zend_Search_Lucene_Analysis_Analyzer::setDefault($analyzer); Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8'); //if it's true, then it creates new folder to the path in $_indexFieles; if ($isNewIndex) {...

Do 'reduce' with results from Cloudant search?

lucene,couchdb,cloudant
In Cloudant is it possible to do something like a reduce on a set of results from a search index (as opposed to a view)? In my case, I'd like to find all documents that have a title value that includes 'foo', then for each of these sum the total...

How to optimize indexation on elasticsearch?

indexing,elasticsearch,lucene
I am trying to understand how indexing can be optimized on Elasticsearch. Let me clarify my needs: I have two indices right now. Let's say indexA and indexB (the two indices are approximately the same size). I have 6 machines dedicated to Elasticsearch (we can say exactly the same...