Jan 12, 2018


Full Text Query in Elasticsearch : match, match_phrase, match_phrase_prefix

In the previous posts, Search using query param and request body and Query term and Source filtering, we gained a fair understanding of how to query documents and filter their fields to retrieve only the fields of interest. In this post we will go through the full-text search constructs match, match_phrase and match_phrase_prefix provided by Elasticsearch.

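Match keyword

The "match" keyword is used inside query and tells the search request to look for the given value in the named field. It is not an exact term match (as discussed in Query term and Source filtering): the search text is analyzed before matching, so it is typically lowercased and split into terms. When the text contains several terms, match combines them with a logical operator, OR or AND.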

Retrieve all documents whose name (first or last) contains <X>: The search query below finds all documents whose name contains "gibson".
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match" : {
            "name" : "gibson"
        }
    }
}
' -H 'Content-Type: application/json'
{
  "took" : 21,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 4.9511213,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "xn_g6mABB3_D7Pc85hFv",
        "_score" : 4.9511213,
        "_source" : {
          "name" : "Sue Gibson",
          "age" : 60,
          "gender" : "female",
          "email" : "suegibson@comvex.com",
          "phone" : "+1 (919) 450-2888",
          "street" : "166 Newel Street",
          "city" : "Jacksonwald",
          "state" : "Northern Mariana Islands, 9865"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "2X_g6mABB3_D7Pc85hNx",
        "_score" : 4.804021,
        "_source" : {
          "name" : "Gibson Velasquez",
          "age" : 57,
          "gender" : "male",
          "email" : "gibsonvelasquez@comvex.com",
          "phone" : "+1 (906) 436-3683",
          "street" : "362 Beverly Road",
          "city" : "Dalton",
          "state" : "Texas, 8682"
        }
      }
    ]
  }
}
Note: the match query retrieves all documents whose name contains the term "gibson". Elasticsearch returns two documents: the top-scoring one has "Gibson" as the last name and the second has it as the first name. The position of the term within the field does not affect the score, and since both names are two terms long, field length does not explain the difference either. The small gap (4.95 vs 4.80) most likely arises because each of the 5 shards computes relevance from its own term statistics, so equivalent matches on different shards can score slightly differently.
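To make the matching behaviour concrete, here is a minimal Python sketch of what match does with a single term. It is not how Elasticsearch is implemented: the analyze function is a rough stand-in (an assumption) for the default standard analyzer, and scoring is ignored completely. The names are taken from the response above.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase the text
    and split it on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def match(field_value, query):
    """True if any analyzed query term is one of the field's terms."""
    field_terms = set(analyze(field_value))
    return any(term in field_terms for term in analyze(query))

names = ["Sue Gibson", "Gibson Velasquez", "Tyler Flores"]
hits = [name for name in names if match(name, "gibson")]
print(hits)  # ['Sue Gibson', 'Gibson Velasquez']
```

Because both the field and the search text go through the same analysis, "gibson", "Gibson" and "GIBSON" all find the same documents, whether the term is the first or the last name.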

Match with OR operator: Retrieve all documents where name contains either "Tyler" or "Macdonald". The "operator" setting tells the match query to apply OR to the given terms <tyler macdonald>, so a document is returned if either term appears in its name.
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'{
    "query": {
        "match" : {
              "name" : {
                  "query" : "tyler macdonald",
                  "operator" : "or"
               }
        }
    }
}
' -H 'Content-Type: application/json'

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 4.9416423,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "IX_g6mABB3_D7Pc85hRy",
        "_score" : 4.9416423,
        "_source" : {
          "name" : "Macdonald Perkins",
          "age" : 49,
          "gender" : "male",
          "email" : "macdonaldperkins@comvex.com",
          "phone" : "+1 (863) 559-2182",
          "street" : "687 Bayview Avenue",
          "city" : "Ola",
          "state" : "South Carolina, 2216"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "pH_g6mABB3_D7Pc85hRy",
        "_score" : 4.9416423,
        "_source" : {
          "name" : "Tyler Flores",
          "age" : 25,
          "gender" : "male",
          "email" : "tylerflores@comvex.com",
          "phone" : "+1 (977) 433-3222",
          "street" : "974 Sedgwick Place",
          "city" : "Vallonia",
          "state" : "Kansas, 1804"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "eH_g6mABB3_D7Pc85hNx",
        "_score" : 4.804021,
        "_source" : {
          "name" : "Kimberly Tyler",
          "age" : 50,
          "gender" : "female",
          "email" : "kimberlytyler@comvex.com",
          "phone" : "+1 (867) 568-3457",
          "street" : "679 Rugby Road",
          "city" : "Walton",
          "state" : "Alabama, 3785"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "mX_g6mABB3_D7Pc85hRy",
        "_score" : 4.804021,
        "_source" : {
          "name" : "Dotson Macdonald",
          "age" : 24,
          "gender" : "male",
          "email" : "dotsonmacdonald@comvex.com",
          "phone" : "+1 (874) 525-3190",
          "street" : "525 Boardwalk ",
          "city" : "Sylvanite",
          "state" : "Oklahoma, 2414"
        }
      }
    ]
  }
}
Note: match uses the OR operator by default if none is specified.

Query with "and" operator: The query below retrieves all documents where name contains both "arnold" and "knowles". It retrieves just one document.
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'{
    "query": {
        "match" : {
              "name" : {
                  "query" : "arnold knowles",
                  "operator" : "and"
               }
        }
    },"_source":false
}
' -H 'Content-Type: application/json'

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 9.902243,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "ln_g6mABB3_D7Pc85hRy",
        "_score" : 9.902243
      }
    ]
  }
}
Note: "_source": false is added to the request to shorten the response, so each hit shows only its metadata (such as _id and _score). Here we are only interested in the number of documents retrieved.
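The difference between the two operators can be sketched in a few lines of Python. This is only an illustration under assumptions: analyze approximates the standard analyzer, no scoring is done, and the name "Macdonald Tyler" is made up for the example (the other names come from the responses above).

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer (lowercase + split)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def match(field_value, query, operator="or"):
    """OR: at least one query term must be present.
    AND: every query term must be present."""
    field_terms = set(analyze(field_value))
    found = [term in field_terms for term in analyze(query)]
    if operator == "and":
        return bool(found) and all(found)
    return any(found)

names = [
    "Macdonald Perkins",
    "Tyler Flores",
    "Kimberly Tyler",
    "Dotson Macdonald",
    "Sue Gibson",
    "Macdonald Tyler",  # hypothetical name, not in the sample data
]

or_hits = [n for n in names if match(n, "tyler macdonald")]
and_hits = [n for n in names if match(n, "tyler macdonald", operator="and")]
print(len(or_hits))  # 5
print(and_hits)      # ['Macdonald Tyler']
```

With AND the order of the terms still does not matter; only their presence does. Requiring the terms to be adjacent and in order is the job of match_phrase.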

Match keyword default operator OR: A match query for "south carolina" with no operator specified applies the default OR operator and retrieves a total of 53 documents, i.e. every document where either south or carolina is found.

Match_phrase keyword

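Retrieve all documents which match a given phrase (whole text): The query below retrieves all documents where "south carolina" is found as a whole phrase, with the terms adjacent and in the same order. It retrieves a total of 17 documents (fewer than the 53 retrieved above by the default OR query). "size": 1 limits the response to the first hit.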
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match_phrase" : {
            "state" : "south carolina"
        }
    },
    "size":1
}
' -H 'Content-Type: application/json'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 17,
    "max_score" : 6.3648453,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "F3_g6mABB3_D7Pc85hJw",
        "_score" : 6.3648453,
        "_source" : {
          "name" : "Horton Mcclure",
          "age" : 59,
          "gender" : "male",
          "email" : "hortonmcclure@comvex.com",
          "phone" : "+1 (860) 507-2823",
          "street" : "991 Oakland Place",
          "city" : "Northchase",
          "state" : "South Carolina, 8608"
        }
      }
    ]
  }
}
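Why match_phrase returns fewer documents than the OR query can be seen in a small Python sketch. It assumes a simplified analyzer and ignores scoring and the slop setting. "South Carolina, 8608" comes from the response above; the other two state values are hypothetical.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer (lowercase + split)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def match_phrase(field_value, phrase):
    """True if the phrase terms occur next to each other, in order."""
    field_terms = analyze(field_value)
    phrase_terms = analyze(phrase)
    n = len(phrase_terms)
    if n == 0:
        return False
    return any(
        field_terms[i:i + n] == phrase_terms
        for i in range(len(field_terms) - n + 1)
    )

states = [
    "South Carolina, 8608",
    "South Dakota, 1111",    # hypothetical value
    "North Carolina, 2222",  # hypothetical value
]

phrase_hits = [s for s in states if match_phrase(s, "south carolina")]
print(phrase_hits)  # ['South Carolina, 8608']
```

A plain match with the default OR operator would return all three values here, since each contains either south or carolina.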

Match_phrase_prefix keyword

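Retrieve all documents where state contains a word starting with "mi", such as Michigan, Minnesota, Mississippi, Missouri etc. "size": 3 limits the response to three hits, and "_source": "st*" returns only the fields whose names start with st (street and state).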
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match_phrase_prefix" : {
            "state" : "mi"
        }
    },
    "size": 3,
    "_source" :"st*"
}
' -H 'Content-Type: application/json'
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 86,
    "max_score" : 4.7598696,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "nn_g6mABB3_D7Pc85hNx",
        "_score" : 4.7598696,
        "_source" : {
          "street" : "457 Thomas Street",
          "state" : "Michigan, 2490"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "NH_g6mABB3_D7Pc85hVy",
        "_score" : 4.7598696,
        "_source" : {
          "street" : "106 Tompkins Place",
          "state" : "Michigan, 3205"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "P3_g6mABB3_D7Pc85hNx",
        "_score" : 4.6157947,
        "_source" : {
          "street" : "921 Judge Street",
          "state" : "Minnesota, 6838"
        }
      }
    ]
  }
}
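Retrieve all documents with a whole word as the prefix:
The query below retrieves all documents whose street contains a word starting with <Sunnyside>, and it returns two documents, with streets <Sunnyside Avenue> and <Sunnyside Court>. Assuming the default standard analyzer, the trailing space in the search text is dropped during analysis, so the query behaves the same as "Sunnyside" without the space.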
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match_phrase_prefix" : {
            "street" : "Sunnyside "
        }
    },
    "_source" :"st*"

}
' -H 'Content-Type: application/json'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 4.9511213,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "G3_g6mABB3_D7Pc85hJw",
        "_score" : 4.9511213,
        "_source" : {
          "street" : "733 Sunnyside Avenue",
          "state" : "Oklahoma, 9311"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "CH_g6mABB3_D7Pc85hNx",
        "_score" : 4.800417,
        "_source" : {
          "street" : "864 Sunnyside Court",
          "state" : "New York, 1549"
        }
      }
    ]
  }
}
Note: match_phrase_prefix is widely used for search suggestions (search-as-you-type), as it returns only documents that match the typed text in context.
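The prefix behaviour can be sketched in Python as follows: all terms except the last must match as a phrase, and the last term only has to be the beginning of a word. This is a simplified illustration under assumptions. The analyzer is approximated, scoring is ignored, and the real query also limits how many terms the prefix may expand to (the max_expansions setting), which is left out here. The field values are taken from the responses above.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer (lowercase + split)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def match_phrase_prefix(field_value, query):
    """Phrase match in which the last query term is treated as a prefix."""
    field_terms = analyze(field_value)
    query_terms = analyze(query)
    if not query_terms:
        return False
    *head, last = query_terms
    n = len(query_terms)
    for i in range(len(field_terms) - n + 1):
        window = field_terms[i:i + n]
        if window[:-1] == head and window[-1].startswith(last):
            return True
    return False

states = ["Michigan, 2490", "Minnesota, 6838", "Oklahoma, 9311", "New York, 1549"]
streets = ["733 Sunnyside Avenue", "864 Sunnyside Court", "457 Thomas Street"]

print([s for s in states if match_phrase_prefix(s, "mi")])
# ['Michigan, 2490', 'Minnesota, 6838']
print([s for s in streets if match_phrase_prefix(s, "Sunnyside ")])
# ['733 Sunnyside Avenue', '864 Sunnyside Court']
print([s for s in streets if match_phrase_prefix(s, "sunnyside a")])
# ['733 Sunnyside Avenue']
```

The last call shows the suggestion use case: as the user types more of the next word, the result list narrows down.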

======  **** ======
Location: Bengaluru, Karnataka, India
