Elasticsearch — Search engine or more?
As you probably know, search and analytics are key features of any modern software application. Mobile apps, web apps, and data analytics applications all need to scale and handle massive volumes of real-time data effectively. Today, even a fairly basic application has usability features such as autocomplete, search suggestions, and location search. And what often powers these experiences? You guessed it: Elasticsearch!
Simply put, Elasticsearch is a full-text search engine built on the Apache Lucene library. It stores data as JSON documents and does not require you to define a schema up front, which is why it is often described as schema-free. If you need extensive search functionality or have a website that performs heavy searching, Elasticsearch is a strong choice. Not only that, it is also very versatile.
I've previously covered the basics of Elasticsearch: what it is, and why and when to choose it. If you are just starting out, please go over the earlier blog and YouTube video first:
https://youtu.be/BT1Ea67b8S0
But today we will dive directly into queries and usage of Elasticsearch, so buckle up.
Creating an Index
Generally, Elasticsearch comes into the picture whenever there are large volumes of data. To use this data effectively, we first create an index for it. An index in Elasticsearch is comparable to a database in a relational database system. It has a mapping, which defines the fields of its documents and their data types. We can also say that an index is a logical namespace that maps to one or more primary shards and can have zero or more replica shards.
From these statements we can infer two things. First, an index is a data organization mechanism that lets the user partition data in a certain way. Second, an index is tied to the concepts of shards and replicas.
Now, let's understand how Elasticsearch concepts map to a relational DB. When we come from a relational database background, the terminology can be difficult to understand.
But, don’t worry. I will take you through some sample queries that will help you get a grasp on it.
Before I do that, let's look at how to create an index. While creating an index, we need to keep mappings in mind. With good mappings, querying indices becomes much easier.
How to create an Elasticsearch index?
Creating an index is the first step towards leveraging the power of Elasticsearch. Elasticsearch has a built-in feature that creates indices automatically. When you put data into a non-existent index, it creates that index with mappings inferred from the data you pushed.
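To give a feel for what "mappings inferred from the data" means, here is a minimal Python sketch that imitates the default type guesses for a few JSON value types. It is a simplified illustration and not Elasticsearch's actual implementation: the function names and the sample document are hypothetical, and the sketch ignores date detection, arrays, and null values.

```python
import json


def infer_field_mapping(value):
    """Guess a mapping for a single JSON value (simplified illustration)."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        # By default a string becomes a text field with a keyword sub-field.
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    if isinstance(value, dict):
        # Nested JSON objects get their own nested properties.
        return {"properties": {k: infer_field_mapping(v) for k, v in value.items()}}
    raise TypeError(f"value not handled by this sketch: {value!r}")


def infer_mapping(document):
    """Build a mappings section for a whole document."""
    return {"properties": {k: infer_field_mapping(v) for k, v in document.items()}}


# Hypothetical employee document, used only for illustration.
employee = {"name": "Mark", "age": 30, "rating": 4.5, "active": True}
mapping = infer_mapping(employee)
print(json.dumps(mapping, indent=2))
```

Inferred mappings are convenient for experiments, but defining mappings explicitly, as shown further down, gives you control over how each field is indexed.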
Elasticsearch provides an HTTP endpoint that we can use to create an index. We can also create indices from Kibana. Here is an example that creates an employee index:
- curl -X PUT "localhost:9200/employee?pretty"
Each index created can have specific settings associated with it, as below:
PUT /employee
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "name": { "type": "text" }
    }
  }
}
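The same request can be prepared from code. The sketch below uses only Python's standard library to build the PUT request without sending it. The base URL assumes a local cluster on port 9200, as in the curl example, and the helper name build_create_index_request is hypothetical.

```python
import json
import urllib.request

# Assumption: a local single-node cluster, as in the curl example above.
ES_URL = "http://localhost:9200"


def build_create_index_request(index, number_of_shards, properties):
    """Prepare (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {"number_of_shards": number_of_shards},
        "mappings": {"properties": properties},
    }
    return urllib.request.Request(
        url=f"{ES_URL}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("employee", 1, {"name": {"type": "text"}})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running cluster (not executed here):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the body as a dictionary and serializing it with json.dumps keeps the JSON well-formed, which is easy to get wrong when typing it by hand.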
Searching an index is quite simple, much like making any other API call: provide the URL and a request body, and we are done.
URL: GET employee/_search
Request body
{
  "query": {
    "match": {
      "name": "Mark"
    }
  }
}
Let’s break it down.
- employee is the name of the index we want to search.
- match is the operation we are performing.
- name is the field we are searching, and Mark is the text we are looking for in it.
Now, what if I want to search against multiple fields? Our request body changes as follows:
Request body
{
  "query": {
    "multi_match": {
      "query": "Mark",
      "fields": ["first_name", "last_name"]
    }
  }
}
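Both request bodies can also be generated from code rather than typed by hand. This is a minimal Python sketch; the helper names match_query and multi_match_query are hypothetical, and the field names are the ones used in the examples above.

```python
import json


def match_query(field, text):
    """Request body for a match query on a single field."""
    return {"query": {"match": {field: text}}}


def multi_match_query(text, fields):
    """Request body for a multi_match query across several fields."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


single = match_query("name", "Mark")
multi = multi_match_query("Mark", ["first_name", "last_name"])

# json.dumps always emits valid JSON, so separators cannot be forgotten.
print(json.dumps(single))
print(json.dumps(multi, indent=2))
```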
As I said, it is pretty simple to understand and frame a query.
Now let's look at the different query types we can use in Elasticsearch. Searches are expressed in a JSON-based query language, the Query DSL.
A query in Elasticsearch is built from two kinds of clauses: leaf query clauses and compound query clauses.
Leaf query clauses: these look for a specific value in a specific field, using queries such as match, term, or range.
Compound query clauses: these combine leaf queries and other compound queries to produce the desired output.
The different types of queries are as follows –
1. Match All Query –
This is the most basic query. It returns all documents and gives each of them a score of 1.0.
{
  "query": {
    "match_all": {}
  }
}
2. Match None Query –
This is the inverse of the match all query: it matches no documents.
{
  "query": {
    "match_none": {}
  }
}
3. Full Text Queries –
These queries are used to search a full body of text like a chapter or a news article.
The queries in this group are:
A. match query
This query matches a text or phrase against the values of a field.
{ "query": { "match": { "rating":"4.5" } }}
B. match_phrase query
It's similar to the match query, but it is used for matching exact phrases or word proximity matches.
{"query": {"match_phrase":{"message":"this is a test"}}}
C. match_phrase_prefix query
It works like search-as-you-type: it behaves like the match_phrase query but treats the final word as a prefix.
{"query": {"match_phrase_prefix":{"message":"quick brown f"}}}
D. multi_match query
This query matches a text or phrase against more than one field.
{"query": {"multi_match":{"query":"query String","fields":["field1","field2"]}}}
E. common terms query
A more specialized query that gives more preference to uncommon words. Note that it was deprecated in Elasticsearch 7.3 and removed in 8.0, so check your version before using it.
{"query": {"common": {"body": {"query": "this is string ","cutoff_frequency":0.001}}}}
F. query_string query
This query parses the given query string with a query parser, using the query_string keyword.
{ "query": { "query_string":{ "query":"string" } }}
G. simple_query_string query
A simpler, more robust version of the query string syntax suitable for exposing directly to users.
{"query": {"simple_query_string":{"query":"\"fried eggs\" +(eggplant | potato) -frittata","fields":["title^5","body"],"default_operator":"and"}}}
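To make the difference between the match, match_phrase, and match_phrase_prefix queries more concrete, here is a toy Python sketch that imitates their matching rules on a single string. It is only an approximation under simple assumptions (lowercasing and whitespace tokenization, no scoring, no slop, no expansion limits), and the function names are hypothetical; Elasticsearch's real behavior depends on the analyzer and the query options.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(doc, query):
    """match: with the default OR operator, any query term may appear anywhere."""
    doc_terms = set(tokenize(doc))
    return any(term in doc_terms for term in tokenize(query))


def match_phrase(doc, query):
    """match_phrase: all terms must appear next to each other, in order."""
    d, q = tokenize(doc), tokenize(query)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def match_phrase_prefix(doc, query):
    """match_phrase_prefix: like match_phrase, but the last term is a prefix."""
    d, q = tokenize(doc), tokenize(query)
    if not q:
        return False
    *head, last = q
    n = len(q)
    for i in range(len(d) - n + 1):
        if d[i:i + n - 1] == head and d[i + n - 1].startswith(last):
            return True
    return False


message = "The quick brown fox jumps"
print(match(message, "brown dog"))                   # True: "brown" is present
print(match_phrase(message, "quick brown"))          # True: adjacent and in order
print(match_phrase(message, "brown quick"))          # False: wrong order
print(match_phrase_prefix(message, "quick brown f")) # True: "fox" starts with "f"
```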
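4. Term Level Queries –
These queries mainly deal with structured data like numbers, dates, and enums. Unlike full text queries, they do not analyze the search term and look for the exact value stored in the field.
{
  "query": {
    "term": {
      "zip": "176115"
    }
  }
}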
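One practical consequence is that a term query on an analyzed text field can miss documents that a match query would find. The toy Python sketch below imitates this with a lowercasing, whitespace-splitting analyzer. It is an illustration under those assumptions, with hypothetical function names, and not how Elasticsearch is implemented internally.

```python
def analyze(text):
    """Rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def index_text_field(value):
    """A text field stores the analyzed terms."""
    return set(analyze(value))


def index_keyword_field(value):
    """A keyword field stores the value exactly as given."""
    return {value}


def term_matches(indexed_terms, value):
    """term: the search value is compared as-is, without analysis."""
    return value in indexed_terms


def match_matches(indexed_terms, text):
    """match: the search text is analyzed before it is compared."""
    return any(term in indexed_terms for term in analyze(text))


zip_terms = index_keyword_field("176115")
name_terms = index_text_field("Mark Smith")  # hypothetical employee name

print(term_matches(zip_terms, "176115"))  # True: exact value
print(term_matches(name_terms, "Mark"))   # False: the indexed term is "mark"
print(match_matches(name_terms, "Mark"))  # True: "Mark" is analyzed to "mark"
```

This is why term-level queries fit identifiers, codes, and other exact values, while full text queries fit free-form text.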
5. Range Query –
This query finds documents whose values fall within a given range. For this, we use operators such as −
- gte − greater than or equal to
- gt − greater than
- lte − less than or equal to
- lt − less than
Ex.
{
  "query": {
    "range": {
      "rating": {
        "gte": 3.5
      }
    }
  }
}
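The four operators can be read as ordinary comparisons. The Python sketch below spells that out with a local in_range check and a small range_query builder; both function names are hypothetical, and the ratings list is made-up sample data.

```python
import operator

# The four range operators and the comparison each one stands for.
RANGE_OPERATORS = {
    "gte": operator.ge,  # greater than or equal to
    "gt": operator.gt,   # greater than
    "lte": operator.le,  # less than or equal to
    "lt": operator.lt,   # less than
}


def in_range(value, **bounds):
    """Return True if value satisfies every given bound, e.g. gte=3.5."""
    for name, bound in bounds.items():
        if name not in RANGE_OPERATORS:
            raise ValueError(f"unknown range operator: {name}")
        if not RANGE_OPERATORS[name](value, bound):
            return False
    return True


def range_query(field, **bounds):
    """Request body for a range query on one field."""
    unknown = set(bounds) - set(RANGE_OPERATORS)
    if unknown:
        raise ValueError(f"unknown range operators: {sorted(unknown)}")
    return {"query": {"range": {field: bounds}}}


ratings = [2.0, 3.5, 4.5, 5.0]
print([r for r in ratings if in_range(r, gte=3.5)])          # [3.5, 4.5, 5.0]
print([r for r in ratings if in_range(r, gt=3.5, lt=5.0)])   # [4.5]
print(range_query("rating", gte=3.5))
```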
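6. Compound Queries –
These queries combine other queries, either by merging them with Boolean logic (and, or, not), by searching across different indices, or by applying function calls. In the bool query below, must clauses have to match and contribute to the score, while filter clauses have to match but do not affect the score. minimum_should_match sets how many should clauses must match; this example has no should clauses, so a real query would normally add a should section or drop that parameter.
Ex.
{
  "query": {
    "bool": {
      "must": {
        "term": { "state": "UP" }
      },
      "filter": {
        "term": { "fees": "2200" }
      },
      "minimum_should_match": 1,
      "boost": 1.0
    }
  }
}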
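Because a compound query is just leaf clauses nested inside a bool object, it is easy to assemble from code. Here is a minimal Python sketch; the helpers term and bool_query are hypothetical names, and the sketch leaves out minimum_should_match and boost to keep the structure easy to see.

```python
import json

# The clause types a bool query accepts.
BOOL_CLAUSES = ("must", "filter", "should", "must_not")


def term(field, value):
    """Leaf clause: exact-value lookup on one field."""
    return {"term": {field: value}}


def bool_query(**clauses):
    """Compound clause: combine leaf (or other compound) clauses in a bool query."""
    unknown = set(clauses) - set(BOOL_CLAUSES)
    if unknown:
        raise ValueError(f"unknown bool clauses: {sorted(unknown)}")
    # Keep only the clause types that were actually given.
    body = {name: list(queries) for name, queries in clauses.items() if queries}
    return {"query": {"bool": body}}


query = bool_query(
    must=[term("state", "UP")],     # must match, contributes to the score
    filter=[term("fees", "2200")],  # must match, does not affect the score
)
print(json.dumps(query, indent=2))
```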
In conclusion, if we have a website that needs to perform a lot of searches or search a lot of logs and prepare analytics, Elasticsearch is the way to go!
However, we cannot finish discussing Elasticsearch without mentioning Logstash and Kibana. I've already covered the ELK tech stack and its installation steps; do have a look here:
Elasticsearch & Kibana installation
https://youtu.be/NLEldEJ3cs0
Elasticsearch is much more than just a database; we just need to utilize the full power behind it.