Full Text Query in Elasticsearch : match, match_phrase, match_phrase_prefix

In the previous posts, Search using query param and request body and Query term and Source filtering, we gained a fair understanding of how to query documents and filter their fields to retrieve only the fields of interest. In this post we will go through more advanced searching techniques using the match, match_phrase and match_phrase_prefix constructs provided by Elasticsearch.

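Match keyword

The "match" keyword is used inside query and tells the search request to look for the given value in the given field. It is a full-text query rather than an exact term match (as discussed in Query term and Source filtering): the search text is analyzed before matching, so with the default analyzer of a text field, "gibson" also matches "Gibson". When the search text contains several words, match combines them with a logical OR or AND operator.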

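The difference between an exact term lookup and an analyzed match can be pictured with a small Python sketch. It is only a toy model, not Elasticsearch code: analyze is a hypothetical stand-in which assumes that name is a text field whose analyzer splits on non-alphanumeric characters and lowercases, roughly like the standard analyzer.

```python
import re


def analyze(text):
    """Toy stand-in for an analyzer: split on non-alphanumerics and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def term_matches(indexed_tokens, value):
    """Term-style lookup: the value is NOT analyzed and must equal an indexed token."""
    return value in indexed_tokens


def match_matches(indexed_tokens, text):
    """Match-style lookup: the text IS analyzed, then any resulting token may match."""
    return any(token in indexed_tokens for token in analyze(text))


indexed = analyze("Sue Gibson")
print(indexed)                           # ['sue', 'gibson']
print(term_matches(indexed, "Gibson"))   # False: "Gibson" is not equal to "gibson"
print(match_matches(indexed, "Gibson"))  # True: the search text is lowercased first
```

In this model the capitalised search text only finds the document when it goes through the same analysis as the indexed value.
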
Display/retrieve all documents whose name (first or last) contains <X> : The search query below finds all documents whose name contains "gibson".
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match" : {
            "name" : "gibson"
        }
    }
}
' -H 'Content-Type: application/json'
{
  "took" : 21,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 4.9511213,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "xn_g6mABB3_D7Pc85hFv",
        "_score" : 4.9511213,
        "_source" : {
          "name" : "Sue Gibson",
          "age" : 60,
          "gender" : "female",
          "email" : "suegibson@comvex.com",
          "phone" : "+1 (919) 450-2888",
          "street" : "166 Newel Street",
          "city" : "Jacksonwald",
          "state" : "Northern Mariana Islands, 9865"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "2X_g6mABB3_D7Pc85hNx",
        "_score" : 4.804021,
        "_source" : {
          "name" : "Gibson Velasquez",
          "age" : 57,
          "gender" : "male",
          "email" : "gibsonvelasquez@comvex.com",
          "phone" : "+1 (906) 436-3683",
          "street" : "362 Beverly Road",
          "city" : "Dalton",
          "state" : "Texas, 8682"
        }
      }
    ]
  }
}
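Note:- The match query asks Elasticsearch to retrieve all documents whose name contains the term "gibson". Elasticsearch returns two documents: the higher-scoring one has "Gibson" as the last name (Sue Gibson) and the other has it as the first name (Gibson Velasquez). The position of the term within the name does not affect the score, and both names analyze to two terms, so field length does not explain the gap either. The difference most likely comes from term statistics being computed per shard (this index has 5 shards), so the same kind of match can score slightly differently depending on which shard holds the document.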
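
One way to see where such score gaps can come from is to compute a BM25 score by hand. The sketch below is an illustration under stated assumptions, not the exact scoring code: it uses a common BM25 form with k1 = 1.2 and b = 0.75, assumes every name analyzes to two terms and that exactly one document on each shard contains gibson. The shard document counts 211 and 182 are not taken from the post; they are values picked because they reproduce the two reported scores.

```python
import math


def bm25(freq, doc_len, avg_len, doc_count, docs_with_term, k1=1.2, b=0.75):
    """BM25 score of one term in one document, using per-shard statistics."""
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    tf = freq * (k1 + 1) / (freq + k1 * (1 - b + b * doc_len / avg_len))
    return idf * tf


# Assumed (hypothetical) shard sizes; only doc_count differs between the two calls.
score_on_larger_shard = bm25(freq=1, doc_len=2, avg_len=2.0, doc_count=211, docs_with_term=1)
score_on_smaller_shard = bm25(freq=1, doc_len=2, avg_len=2.0, doc_count=182, docs_with_term=1)

print(round(score_on_larger_shard, 4))   # 4.9511
print(round(score_on_smaller_shard, 4))  # 4.804
```

Under these assumptions the two documents differ only in how many documents share their shard, which is enough to change the score.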

Match with OR operator :- Retrieve all documents where name contains either "Tyler" or "Macdonald". The operator "or" tells the match query to treat the search text <tyler macdonald> as two separate terms and to retrieve every document in which at least one of them appears in name.
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'{
    "query": {
        "match" : {
              "name" : {
                  "query" : "tyler macdonald",
                  "operator" : "or"
               }
        }
    }
}
' -H 'Content-Type: application/json'

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 4.9416423,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "IX_g6mABB3_D7Pc85hRy",
        "_score" : 4.9416423,
        "_source" : {
          "name" : "Macdonald Perkins",
          "age" : 49,
          "gender" : "male",
          "email" : "macdonaldperkins@comvex.com",
          "phone" : "+1 (863) 559-2182",
          "street" : "687 Bayview Avenue",
          "city" : "Ola",
          "state" : "South Carolina, 2216"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "pH_g6mABB3_D7Pc85hRy",
        "_score" : 4.9416423,
        "_source" : {
          "name" : "Tyler Flores",
          "age" : 25,
          "gender" : "male",
          "email" : "tylerflores@comvex.com",
          "phone" : "+1 (977) 433-3222",
          "street" : "974 Sedgwick Place",
          "city" : "Vallonia",
          "state" : "Kansas, 1804"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "eH_g6mABB3_D7Pc85hNx",
        "_score" : 4.804021,
        "_source" : {
          "name" : "Kimberly Tyler",
          "age" : 50,
          "gender" : "female",
          "email" : "kimberlytyler@comvex.com",
          "phone" : "+1 (867) 568-3457",
          "street" : "679 Rugby Road",
          "city" : "Walton",
          "state" : "Alabama, 3785"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "mX_g6mABB3_D7Pc85hRy",
        "_score" : 4.804021,
        "_source" : {
          "name" : "Dotson Macdonald",
          "age" : 24,
          "gender" : "male",
          "email" : "dotsonmacdonald@comvex.com",
          "phone" : "+1 (874) 525-3190",
          "street" : "525 Boardwalk ",
          "city" : "Sylvanite",
          "state" : "Oklahoma, 2414"
        }
      }
    ]
  }
}
Note:- By default the match query uses the OR operator when no operator is specified.
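
The effect of the operator can be sketched in a few lines of Python. This is a toy model of the matching rule only (no scoring), with a hypothetical analyze helper that lowercases and splits on non-alphanumeric characters; the names are the ones that appear in the responses of this post.

```python
import re


def analyze(text):
    """Toy stand-in for an analyzer: split on non-alphanumerics and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def match(field_value, text, operator="or"):
    """True if any ("or") or all ("and") analyzed query terms occur in the field."""
    doc_terms = set(analyze(field_value))
    query_terms = analyze(text)
    combine = all if operator == "and" else any
    return bool(query_terms) and combine(term in doc_terms for term in query_terms)


names = ["Macdonald Perkins", "Tyler Flores", "Kimberly Tyler", "Dotson Macdonald", "Sue Gibson"]

or_hits = [name for name in names if match(name, "tyler macdonald")]
and_hits = [name for name in names if match(name, "tyler macdonald", operator="and")]

print(or_hits)   # ['Macdonald Perkins', 'Tyler Flores', 'Kimberly Tyler', 'Dotson Macdonald']
print(and_hits)  # []
```

With "or" one matching term is enough, while "and" requires every term of the search text to be present in the same field value.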

Query with "and" operator : The query below retrieves all documents where name contains both "arnold" and "knowles". It retrieves just one document.
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'{
    "query": {
        "match" : {
              "name" : {
                  "query" : "arnold knowles",
                  "operator" : "and"
               }
        }
    },"_source":false
}
' -H 'Content-Type: application/json'

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 9.902243,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "ln_g6mABB3_D7Pc85hRy",
        "_score" : 9.902243
      }
    ]
  }
}
Note:- "_source": false is added to the request to shorten the response, so each hit shows only metadata such as the document id and score. Here we are interested only in the number of documents retrieved.
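
When such requests are built from a program rather than typed by hand, the body can be assembled as a plain dictionary and serialised to JSON. The helper below is a minimal sketch; match_request and its parameters are hypothetical names chosen for this example and are not part of any Elasticsearch client.

```python
import json


def match_request(field, text, operator="or", include_source=True):
    """Build the JSON body of a match query as a Python dict."""
    body = {"query": {"match": {field: {"query": text, "operator": operator}}}}
    if not include_source:
        body["_source"] = False  # serialised as "_source": false
    return body


body = match_request("name", "arnold knowles", operator="and", include_source=False)
print(json.dumps(body, indent=4))
```

The printed JSON has the same structure as the body passed to curl with -d above.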

Match keyword default operator OR: When a match query for "south carolina" is run without an operator, the default OR operator is applied and a total of 53 documents are retrieved. The result contains every document where either south or carolina is found.

Match_phrase keyword

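Retrieve all documents which match a given phrase (whole text): The query below retrieves all documents where south carolina is found as a whole phrase, that is, with both terms next to each other and in the same order. It retrieves a total of 17 documents (fewer than the 53 retrieved above by the default OR query). "size": 1 limits the response to a single hit, while hits.total still reports all 17 matches.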
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match_phrase" : {
            "state" : "south carolina"
        }
    },
    "size":1
}
' -H 'Content-Type: application/json'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 17,
    "max_score" : 6.3648453,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "F3_g6mABB3_D7Pc85hJw",
        "_score" : 6.3648453,
        "_source" : {
          "name" : "Horton Mcclure",
          "age" : 59,
          "gender" : "male",
          "email" : "hortonmcclure@comvex.com",
          "phone" : "+1 (860) 507-2823",
          "street" : "991 Oakland Place",
          "city" : "Northchase",
          "state" : "South Carolina, 8608"
        }
      }
    ]
  }
}
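
The gap between 53 and 17 hits follows from the stricter phrase rule, which can be sketched as a toy Python model. The analyze helper is a hypothetical stand-in for the analyzer, and the values "North Carolina, 1234" and "South Dakota, 5678" are made up for illustration; only "South Carolina, 8608" comes from the response above.

```python
import re


def analyze(text):
    """Toy stand-in for an analyzer: split on non-alphanumerics and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def any_term_matches(field_value, text):
    """match with the default OR operator: one shared term is enough."""
    doc_terms = set(analyze(field_value))
    return any(term in doc_terms for term in analyze(text))


def phrase_matches(field_value, phrase):
    """match_phrase: all query terms must appear consecutively and in order."""
    doc = analyze(field_value)
    query = analyze(phrase)
    size = len(query)
    return size > 0 and any(doc[i:i + size] == query for i in range(len(doc) - size + 1))


states = ["South Carolina, 8608", "North Carolina, 1234", "South Dakota, 5678"]

print([s for s in states if any_term_matches(s, "south carolina")])  # all three values
print([s for s in states if phrase_matches(s, "south carolina")])    # ['South Carolina, 8608']
```

Every phrase hit is also an OR hit, which is why the phrase query can only return the same number of documents or fewer.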

Match_phrase_prefix keyword

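Retrieve all documents where state contains a word starting with "mi" - Michigan, Minnesota, Mississippi, Missouri etc. The prefix is matched against the individual terms of the field, so the word does not have to be the first one in state. "_source": "st*" restricts the returned fields to those whose names start with "st" (street and state).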
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match_phrase_prefix" : {
            "state" : "mi"
        }
    },
    "size": 3,
    "_source" :"st*"
}
' -H 'Content-Type: application/json'
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 86,
    "max_score" : 4.7598696,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "nn_g6mABB3_D7Pc85hNx",
        "_score" : 4.7598696,
        "_source" : {
          "street" : "457 Thomas Street",
          "state" : "Michigan, 2490"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "NH_g6mABB3_D7Pc85hVy",
        "_score" : 4.7598696,
        "_source" : {
          "street" : "106 Tompkins Place",
          "state" : "Michigan, 3205"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "P3_g6mABB3_D7Pc85hNx",
        "_score" : 4.6157947,
        "_source" : {
          "street" : "921 Judge Street",
          "state" : "Minnesota, 6838"
        }
      }
    ]
  }
}
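Retrieve all documents with a word followed by a space as prefix :
The query below retrieves all documents whose street contains < Sunnyside >, and it displays two documents with street name <Sunnyside Avenue> and <Sunnyside Court>. With the default analyzer the trailing space is most likely dropped when the search text is analyzed, so "sunnyside" itself is the term that is treated as a prefix.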
➜  Desktop curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query": {
        "match_phrase_prefix" : {
            "street" : "Sunnyside "
        }
    },
    "_source" :"st*"

}
' -H 'Content-Type: application/json'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 4.9511213,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "G3_g6mABB3_D7Pc85hJw",
        "_score" : 4.9511213,
        "_source" : {
          "street" : "733 Sunnyside Avenue",
          "state" : "Oklahoma, 9311"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "CH_g6mABB3_D7Pc85hNx",
        "_score" : 4.800417,
        "_source" : {
          "street" : "864 Sunnyside Court",
          "state" : "New York, 1549"
        }
      }
    ]
  }
}
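Note: match_phrase_prefix is widely used for search suggestions (autocomplete), as it returns only the documents that match the typed text in context: the words already typed must match as a phrase, and the last one is treated as a prefix.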
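
The phrase-prefix rule can be sketched as a toy Python model as well. The analyze helper is again a hypothetical stand-in for the analyzer. The sketch ignores scoring and the max_expansions setting, which in Elasticsearch caps how many index terms the last prefix expands to (50 by default).

```python
import re


def analyze(text):
    """Toy stand-in for an analyzer: split on non-alphanumerics and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def phrase_prefix_matches(field_value, text):
    """All terms but the last must match as a phrase; the last one only as a prefix."""
    doc = analyze(field_value)
    query = analyze(text)
    if not query:
        return False
    *head, last = query
    size = len(query)
    for i in range(len(doc) - size + 1):
        window = doc[i:i + size]
        if window[:-1] == head and window[-1].startswith(last):
            return True
    return False


print(phrase_prefix_matches("Michigan, 2490", "mi"))                  # True
print(phrase_prefix_matches("Texas, 8682", "mi"))                     # False
print(phrase_prefix_matches("733 Sunnyside Avenue", "Sunnyside "))    # True
print(phrase_prefix_matches("864 Sunnyside Court", "sunnyside co"))   # True
print(phrase_prefix_matches("733 Sunnyside Avenue", "sunnyside co"))  # False
```

The last two calls mimic a user who keeps typing: once "co" is entered after "sunnyside", only the Sunnyside Court address still matches.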
