What are term queries, and how does the boost parameter affect scores and the documents retrieved?

Term queries: In Elasticsearch, a term query is used to retrieve documents that contain an exact term. In other words, the exact term must be present in the inverted index for the queried field; the query string itself is not analyzed. Below is a sample term query that retrieves all documents from the customers index whose name field (first or last name) contains the term "gates".
curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
    "query" : {
        "term" : { "name" : "gates" }
    }
}
' -H 'Content-Type: application/json'
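
To make the exact-match behaviour concrete, here is a minimal Python sketch of a toy inverted index. It is not Elasticsearch code: the helpers build_index and term_lookup and the sample documents are made up for illustration, and the tokenizer (split on whitespace, then lowercase) only roughly approximates what the default analyzer does at index time. What it shows is that a term lookup compares the query string with the indexed tokens as-is, so "gates" matches while "Gates" does not.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def term_lookup(index, term):
    """Exact lookup: the term is NOT lowercased or otherwise analyzed."""
    return sorted(index.get(term, set()))


# Hypothetical sample data, not taken from the customers index.
docs = {1: "Bill Gates", 2: "Melinda Gates", 3: "Steve Jobs"}
index = build_index(docs)

print(term_lookup(index, "gates"))  # [1, 2]
print(term_lookup(index, "Gates"))  # []
```

This is why the query above searches for the lowercase "gates": the term has to be written the way it was stored in the index.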

Consider another example: term queries inside a bool compound query. The query below retrieves all documents whose street field contains either "avenue" or "court" (recall that should clauses combine like a logical OR). The response shows that documents matching "court" score higher than those matching "avenue", so the "court" documents appear first.
curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "street": {
              "value": "avenue"
            }
          }
        },
        {
          "term": {
            "street": {
              "value": "court"
            }
          }
        }
      ]
    }
  },
  "size": 2,
  "_source" :"st*"
}
' -H 'Content-Type: application/json'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 333,
    "max_score" : 2.2525747,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "BH_g6mABB3_D7Pc85hJw",
        "_score" : 2.2525747,
        "_source" : {
          "street" : "980 Seigel Court",
          "state" : "Hawaii, 9974"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "Bn_g6mABB3_D7Pc85hJw",
        "_score" : 2.2525747,
        "_source" : {
          "street" : "323 Aurelia Court",
          "state" : "Maine, 8213"
        }
      }
    ]
  }
}
How to make "avenue" take priority over "court" in a term query, so that documents with "avenue" score higher and appear first: using the boost parameter

We now add the boost parameter to the query above so that "avenue" takes priority over "court".
curl -XGET 'localhost:9200/customers/_search?pretty' -d'
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "street": {
              "value": "avenue",
              "boost": 2.0
            }
          }
        },
        {
          "term": {
            "street": {
              "value": "court"
            }
          }
        }
      ]
    }
  },
  "size": 2,
  "_source" :"st*"
}
' -H 'Content-Type: application/json'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 333,
    "max_score" : 3.3973382,
    "hits" : [
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "tX_g6mABB3_D7Pc85hFv",
        "_score" : 3.3973382,
        "_source" : {
          "street" : "438 Gotham Avenue",
          "state" : "Indiana, 7577"
        }
      },
      {
        "_index" : "customers",
        "_type" : "vendors",
        "_id" : "vn_g6mABB3_D7Pc85hFv",
        "_score" : 3.3973382,
        "_source" : {
          "street" : "100 Tapscott Avenue",
          "state" : "Palau, 2140"
        }
      }
    ]
  }
}
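
If the request body is generated from code rather than typed by hand, the same query can be assembled programmatically. The sketch below is one possible way to do that in Python; should_terms is a hypothetical helper written for this post, not part of any Elasticsearch client. It only builds and prints the JSON body and does not contact a cluster.

```python
import json


def should_terms(field, terms):
    """Build a bool/should query body from (value, boost) pairs.

    A boost of None omits the parameter, leaving that clause at its default.
    """
    clauses = []
    for value, boost in terms:
        clause = {"value": value}
        if boost is not None:
            clause["boost"] = boost
        clauses.append({"term": {field: clause}})
    return {"query": {"bool": {"should": clauses}}}


# Same query as above: "avenue" boosted by 2.0, "court" left unboosted.
body = should_terms("street", [("avenue", 2.0), ("court", None)])
body["size"] = 2
body["_source"] = "st*"

print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl as the -d payload, exactly like the hand-written body above.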

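Note:
- The total number of results (333) is the same as for the query above without the boost parameter; boost changes scores, not which documents match.
- Documents containing "avenue" received a boost of 2.0, so they now score higher than the "court" documents and appear first.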
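
The two responses also allow a rough sanity check of what the boost did. The sketch below assumes that the boost acts as a plain multiplier on the clause score and that each top hit matches only one of the two clauses. The two scores are copied from the responses above; the unboosted "avenue" score is inferred from them and is not shown in either response.

```python
court_score = 2.2525747           # top score in the first response ("court")
boosted_avenue_score = 3.3973382  # top score in the second response ("avenue")
boost = 2.0

# Assumption: the boost multiplies the clause score linearly,
# so dividing by it recovers the score "avenue" had without the boost.
unboosted_avenue_score = boosted_avenue_score / boost

print(round(unboosted_avenue_score, 7))
print(unboosted_avenue_score < court_score)  # without boost, "court" ranks first
print(boosted_avenue_score > court_score)    # with boost, "avenue" ranks first
```

Under that assumption, the unboosted "avenue" score is lower than the "court" score, which is consistent with the "court" documents appearing first in the first response.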

=====*****======
