DSE Cassandra Solr doesn't return _uniqueKey in response

Question:

Tag: solr,cassandra,datastax,datastax-enterprise

I'm using DataStax Enterprise (DSE) 4.6. My Solr client queries data by _uniqueKey. As of version 4.6, the limitation on using a simple primary key has been removed. How can I configure Solr, or create the table in Cassandra, so that the Solr response includes the synthetic key _uniqueKey? There is no problem when I use compound keys, only with simple ones.

DROP TABLE IF EXISTS unitable;
CREATE TABLE IF NOT EXISTS unitable (
    "depId" INT PRIMARY KEY,
    "parentId" INT,
    "name" text
);

INSERT INTO unitable ( "depId", "parentId", "name" ) VALUES ( 689, 2, 'test' );

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema name="autoSolrSchema" version="1.5">
<types>
<fieldType class="org.apache.solr.schema.TrieIntField" name="TrieIntField"/>
<fieldType class="org.apache.solr.schema.TextField" name="TextField">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
</types>
<fields>
<field indexed="true" multiValued="false" name="depId" stored="true" type="TrieIntField"/>
<field indexed="true" multiValued="false" name="name" stored="true" type="TextField"/>
<field indexed="true" multiValued="false" name="parentId" stored="true" type="TrieIntField"/>
</fields>
<uniqueKey>depId</uniqueKey>
</schema>

The response contains no _uniqueKey field:

<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">5</int>
</lst>
<result name="response" numFound="1" start="0" maxScore="1.0">
  <doc>
    <int name="depId">689</int>
    <str name="name">test</str>
    <int name="parentId">2</int>
  </doc>
</result>
</response>

Answer:

The synthetic _uniqueKey field is generated and returned only when the table uses a compound key. With a simple key, the key field itself (depId here) is used directly as the unique key, so no _uniqueKey field appears in the response.
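
On the client side, one way to cope with both cases is to read _uniqueKey when it is present and fall back to the schema's uniqueKey field otherwise. The following is only a minimal sketch under that assumption: the helper name document_key and its unique_key_field parameter are hypothetical, and the XML is a trimmed copy of the response from the question.

```python
import xml.etree.ElementTree as ET


def document_key(doc, unique_key_field="depId"):
    """Return the identifying value of one Solr <doc> element.

    Hypothetical helper: unique_key_field must match the <uniqueKey>
    declared in the Solr schema (depId in the question).
    """
    values = {child.get("name"): child.text for child in doc}
    # Compound primary key: the synthetic _uniqueKey field is returned.
    if "_uniqueKey" in values:
        return values["_uniqueKey"]
    # Simple primary key: the key column itself identifies the document.
    return values[unique_key_field]


# Trimmed copy of the response shown in the question.
response_xml = """<response>
<result name="response" numFound="1" start="0">
  <doc>
    <int name="depId">689</int>
    <str name="name">test</str>
    <int name="parentId">2</int>
  </doc>
</result>
</response>"""

root = ET.fromstring(response_xml)
keys = [document_key(doc) for doc in root.iter("doc")]
print(keys)  # ['689']
```

With this approach the client code does not need to know in advance whether a given table was created with a simple or a compound primary key.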

