Solr is an open source search platform that can index and search content from multiple sites and return suggestions for related content based on the search query. It makes it easier for programmers to develop sophisticated, high-performance search applications with advanced features such as faceting (grouping search results into categories, with a count of matching documents for each). Solr builds on another open source search technology, Lucene, a search library written in Java. It provides indexing and search, as well as spell checking, advanced analysis/tokenization and hit highlighting.
A Solr index can accept data from many different sources, including comma-separated value (CSV) files, XML files, files in common file formats such as Microsoft Word or PDF and data extracted from tables in a database.
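To make the idea of faceting concrete, here is a minimal sketch in plain JavaScript. It does not call Solr; the `docs` array, the `category` field and the `facetCounts` helper are all hypothetical names chosen for illustration, and Solr's own faceting works on its index rather than on an in-memory array.

```javascript
// Hypothetical in-memory documents; a real Solr index would hold these.
const docs = [
  { id: 1, title: 'Hello', category: 'news' },
  { id: 2, title: 'Hello again', category: 'blog' },
  { id: 3, title: 'Goodbye', category: 'news' },
];

// Count how many documents carry each distinct value of `field`.
function facetCounts(documents, field) {
  const counts = {};
  for (const doc of documents) {
    const value = doc[field];
    if (value === undefined) continue; // skip documents without the field
    counts[value] = (counts[value] || 0) + 1;
  }
  return counts;
}

console.log(facetCounts(docs, 'category')); // { news: 2, blog: 1 }
```

A search page would typically show these counts next to each category so the user can narrow the results.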
How does Solr search work?
Solr is a wrapper around the Apache Lucene library. It uses Lucene classes to build an index called an inverted index. For each term, the inverted index keeps a postings list, which maps the term to the documents (and the positions within them) where it occurs. As a search engine, Apache Solr lets you index a set of documents (say, news articles) and then query it to return the documents that match the user's query.
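The inverted index described above can be illustrated with a small sketch. This is a toy model, not Lucene's implementation: the `tokenize`, `buildInvertedIndex` and `search` functions and the sample `articles` are assumptions made for this example, and real postings lists also store positions and term frequencies.

```javascript
// Split text into lowercase word tokens (a much-simplified analyzer).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build a map from each term to the ids of the documents containing it.
function buildInvertedIndex(documents) {
  const index = new Map();
  for (const doc of documents) {
    // A Set ensures each document id is recorded once per term.
    for (const term of new Set(tokenize(doc.text))) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(doc.id);
    }
  }
  return index;
}

// Return the ids of documents that contain every term of the query.
function search(index, query) {
  const lists = tokenize(query).map((term) => index.get(term) || []);
  if (lists.length === 0) return [];
  return lists.reduce((acc, list) => acc.filter((id) => list.includes(id)));
}

const articles = [
  { id: 1, text: 'Solr builds on Lucene' },
  { id: 2, text: 'Lucene is a Java library' },
  { id: 3, text: 'Solr exposes an HTTP API' },
];
const index = buildInvertedIndex(articles);
console.log(search(index, 'lucene'));      // [ 1, 2 ]
console.log(search(index, 'solr lucene')); // [ 1 ]
```

Because the lookup starts from the term rather than from each document, a query only touches the postings lists of the words it contains instead of scanning every document.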
Solr client for Node.js
solr-client is a Solr client for Node.js that supports indexing, adding, deleting and searching documents.
Installation:-
npm install solr-client
Add documents into the Solr index:-
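var solr = require('solr-client');
// Create a client with the default connection settings
var client = solr.createClient();
// Ask Solr to commit changes with each update request
client.autoCommit = true;
var docs = [];
var doc = {
  id : 1,
  title : "Hello",
  description : "World"
};
docs.push(doc);
// Add the documents to the index
client.add(docs, function(err, obj) {
  if (err) {
    console.log(err);
  } else {
    console.log(obj);
  }
});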
Delete set of documents:-
var solr = require('solr-client');
var client = solr.createClient();
// Delete the documents whose `id` field matches the query
var field = 'id';
var query = '*';
client.delete(field, query, function(err, obj) {
  if (err) {
    console.log(err);
  } else {
    console.log(obj);
  }
});
Search documents:-
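var solr = require('solr-client');
var client = solr.createClient();
// Build a query for documents titled "Hello" and return the first 10 matches
var query = client.createQuery()
  .q({title : 'Hello'})
  .start(0)
  .rows(10);
client.search(query, function(err, obj) {
  if (err) {
    console.log(err);
  } else {
    console.log(obj);
  }
});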
Advantages of using Solr
- NoSQL database − Solr can also be used as a big-data-scale NoSQL database, where search tasks are distributed across a cluster.
- Full-text search − Solr provides the capabilities needed for full-text search, such as phrase queries, tokenization, wildcards, spell checking and auto-complete.
- Highly scalable − When using Solr with Hadoop, we can scale its capacity by adding replicas.
- Enterprise ready − Depending on the needs of the organization, Solr can be deployed on systems of any size (big or small) and in standalone, distributed or cloud configurations.
- RESTful APIs − Java development skills are not required to communicate with Solr; you can use its RESTful services instead. Documents are submitted to Solr in formats such as JSON, CSV and XML, and results are returned in the same formats.
- Text-centric and sorted by relevance − Solr is mostly used to search text documents, and results are returned in order of their relevance to the user's query.
- Extensible and flexible − We can customize Solr's components by extending its Java classes and configuring them accordingly.
- Admin interface − Solr provides an easy-to-use, feature-rich user interface for common tasks such as adding, updating, deleting and searching documents and managing logs.
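As an illustration of the RESTful access mentioned in the list above, the following sketch builds the URL for a search request against Solr's `/select` handler without going through a client library. The base URL (Solr's default port 8983 on localhost), the core name `mycore` and the `buildSelectUrl` helper are assumptions for this example; adjust them to your own installation.

```javascript
// Build a URL for Solr's /select request handler.
// `baseUrl` and `core` are assumed values for a local install.
function buildSelectUrl(baseUrl, core, params) {
  // URLSearchParams takes care of percent-encoding the parameter values.
  const query = new URLSearchParams(params);
  return `${baseUrl}/${encodeURIComponent(core)}/select?${query.toString()}`;
}

const url = buildSelectUrl('http://localhost:8983/solr', 'mycore', {
  q: 'title:Hello', // field:value query, same search as the client example
  start: 0,         // offset of the first result
  rows: 10,         // number of results to return
  wt: 'json',       // ask for a JSON response
});
console.log(url);
```

The resulting URL can be requested with any HTTP client, which is why no Java code is needed on the calling side.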
Limitations of using Solr
- Latency increases (Solr replication latency plus tracking overhead).
- Replicating large merges occasionally causes a heavy I/O load.
- Load balancing and management are complicated.
- Reconfiguration is needed if the master is lost.