What is the best way to perform a partial match query in Elasticsearch?

How can I match the word "google" inside a link such as http://drive.google.com?

Currently, I am using the following search query:

query: {
    bool : {
        must: {
            match: { text: 'google'} 
        }
    }
}

However, this query only matches when the entire text is 'google' (case-insensitively). Is there a way to match an occurrence of 'google' inside a longer string?

Answer №1

The key point about the Elasticsearch regex you are using is that it requires a full string match:

Lucene’s patterns are always anchored. The pattern provided must match the entire string.

Therefore, to allow any characters (other than a newline) before and after the word, wrap it in .* on both sides:

match: { text: '.*google.*'}
                ^^      ^^

In ES6 and later, use a regexp query instead of match:

"query": {
   "regexp": { "text": ".*google.*"} 
}

If the strings may contain newlines, use this pattern instead:

match: { text: '(.|\n)*google(.|\n)*'}
The seemingly convoluted (.|\n)* construct is necessary in Elasticsearch because shortcuts such as [\s\S] are not supported and there are no DOTALL/single-line flags. As the documentation puts it: "The Lucene regular expression engine is not Perl-compatible but supports a smaller range of operators."
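The anchoring rule can be illustrated outside Elasticsearch. This is a minimal JavaScript sketch, not Lucene itself: matchesAnchored is a hypothetical helper, and JavaScript's RegExp syntax only approximates Lucene's. Wrapping the pattern in ^(?:...)$ reproduces the "must match the entire string" behaviour.

```javascript
// Hypothetical helper: emulate Lucene's implicit anchoring with a JavaScript RegExp.
// This is an illustration only; Lucene's regex dialect differs from JavaScript's.
function matchesAnchored(pattern, input) {
  return new RegExp(`^(?:${pattern})$`).test(input);
}

const url = 'http://drive.google.com';
const multiline = 'first line\ngoogle\nlast line';

// The bare word fails because it does not cover the whole string.
console.log(matchesAnchored('google', url));

// Surrounding the word with .* lets the pattern cover the whole string.
console.log(matchesAnchored('.*google.*', url));

// . does not match a newline, so .* cannot span lines, but (.|\n)* can.
console.log(matchesAnchored('.*google.*', multiline));
console.log(matchesAnchored('(.|\\n)*google(.|\\n)*', multiline));
```

Keep in mind that in Elasticsearch the pattern is applied to the indexed terms of the field, so the result also depends on how the field was analyzed.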

However, if you do not need complex pattern matching or word boundary checks, a simple wildcard search is enough:

{
    "query": {
        "wildcard": {
            "text": {
                "value": "*google*",
                "boost": 1.0,
                "rewrite": "constant_score"
            }
        }
    }
} 

Refer to Wildcard search for further information.

NOTE: The wildcard pattern must match the entire input string, so:

  • google* matches all strings beginning with google
  • *google* matches all strings containing google
  • *google matches all strings ending with google

Also note that wildcard expressions have only two special characters:

  • ?, which matches any single character
  • *, which matches zero or more characters
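The rules above can be checked with a small JavaScript sketch that translates a wildcard pattern into an anchored regular expression. wildcardToRegExp is a hypothetical helper written for illustration; it mimics the semantics described here and is not how Elasticsearch implements wildcard queries.

```javascript
// Hypothetical helper: translate a wildcard pattern into an anchored JavaScript RegExp.
function wildcardToRegExp(pattern) {
  const body = pattern
    .split('')
    .map((ch) => {
      if (ch === '*') return '[\\s\\S]*'; // zero or more characters
      if (ch === '?') return '[\\s\\S]'; // exactly one character
      return ch.replace(/[.+^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
    })
    .join('');
  // Anchors make the pattern cover the entire input string.
  return new RegExp(`^${body}$`);
}

const samples = ['google.com', 'drive.google.com', 'mygoogle'];
for (const pattern of ['google*', '*google*', '*google']) {
  const hits = samples.filter((s) => wildcardToRegExp(pattern).test(s));
  console.log(pattern, '->', hits);
}
```

With these samples, google* selects only google.com, *google* selects all three strings, and *google selects only mygoogle.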

Answer №2

Try a wildcard query:

'{"query":{ "wildcard": { "text.keyword" : "*google*" }}}'

Answer №3

Both partial and full-text matching can be achieved with a query_string query:

"query" : {
    "query_string" : {
      "query" : "*searchText*",
      "fields" : [
        "fieldName"
      ]
    }
}

Answer №4

I could not find a documented breaking change that disables regular expressions in match, but match: { text: '.*google.*'} does not work on any of my Elasticsearch 6.2 clusters. Is there a setting that controls this?

The regexp query, however, works well:

"query": {
   "regexp": { "text": ".*google.*"} 
}

Answer №6

For a more versatile solution, consider a different analyzer or build your own. It looks like you are using the standard analyzer, which breaks http://drive.google.com into "http" and "drive.google.com". As a result, searching for just "google" fails, because the search term is compared against the whole token "drive.google.com".

Switching to the simple analyzer at index time would split the text into "http", "drive", "google", and "com", so you can search for any of these terms individually.
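To see why the simple analyzer helps, here is a rough JavaScript approximation of what it does: lowercase the text and split it wherever a non-letter character appears. simpleAnalyze is a hypothetical helper for illustration only; the authoritative way to inspect tokens is the _analyze API of your cluster.

```javascript
// Rough approximation of the simple analyzer: lowercase, then split on non-letters.
// Hypothetical helper; it does not call Elasticsearch.
function simpleAnalyze(text) {
  return text
    .toLowerCase()
    .split(/[^\p{L}]+/u) // any run of non-letter characters is a separator
    .filter(Boolean); // drop empty strings produced by leading or trailing separators
}

const tokens = simpleAnalyze('http://drive.google.com');
console.log(tokens); // [ 'http', 'drive', 'google', 'com' ]

// A plain match query on 'google' can now find the document,
// because 'google' is an indexed token of its own.
console.log(tokens.includes('google')); // true
```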

Answer №7

With the Node.js client, the wildcard query looks like this. Here tag_name is the field name and value is the incoming search value.

const { body } = await elasticWrapper.client.search({
  index: ElasticIndexs.Tags,
  body: {
    query: {
      wildcard: {
        tag_name: {
          value: `*${value}*`,
          boost: 1.0,
          rewrite: 'constant_score',
        },
      },
    },
  },
});
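Because the search value is interpolated directly into the pattern, any * or ? typed by the user acts as a wildcard. Below is a minimal sketch of one way to guard against that. escapeWildcard and buildContainsQuery are hypothetical helpers, and the sketch assumes that the wildcard query treats a backslash as an escape character, as Lucene's WildcardQuery does. Check this against your Elasticsearch version before relying on it.

```javascript
// Hypothetical helper: escape the characters that are special in a wildcard pattern.
// Assumption: a backslash escapes *, ? and \ in the wildcard query.
function escapeWildcard(value) {
  return value.replace(/[\\*?]/g, '\\$&');
}

// Hypothetical helper: build a "contains" wildcard query body for one field.
function buildContainsQuery(field, value) {
  return {
    query: {
      wildcard: {
        [field]: {
          value: `*${escapeWildcard(value)}*`,
          boost: 1.0,
          rewrite: 'constant_score',
        },
      },
    },
  };
}

// The * inside the user input is escaped; only the outer wildcards stay active.
console.log(JSON.stringify(buildContainsQuery('tag_name', 'goo*gle'), null, 2));
```

The returned object can be passed as the body of the search call shown above.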

Answer №8

If you need to perform a wildcard search, here is how it can be accomplished based on the official documentation:

query_string: {
  query: `*${keyword}*`,
  fields: ["fieldOne", "fieldTwo"],
},

Wildcard searches can be run on individual terms, using ? to replace a single character and * to replace zero or more characters: qu?ck bro*

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-wildcard

Use wildcards with caution:

Wildcard queries can consume a significant amount of memory and may result in poor performance — consider the number of terms that must be matched for a query string like "a* b* c*".

Using a wildcard at the beginning of a word (e.g., "*ing") is resource-intensive as all terms in the index are checked for potential matches. To prevent leading wildcards, disable them by setting allow_leading_wildcard to false.
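As a sketch of that last point, the request body below uses a trailing wildcard only and sets allow_leading_wildcard to false. buildPrefixQuery is a hypothetical helper, and the keyword is assumed to contain no query_string special characters.

```javascript
// Hypothetical helper: build a query_string body that only allows trailing wildcards.
// Assumption: keyword contains no characters that are special in query_string syntax.
function buildPrefixQuery(keyword, fields) {
  return {
    query: {
      query_string: {
        query: `${keyword}*`, // trailing wildcard: terms starting with keyword
        fields,
        allow_leading_wildcard: false, // reject patterns such as "*ing"
      },
    },
  };
}

console.log(JSON.stringify(buildPrefixQuery('goog', ['fieldOne', 'fieldTwo']), null, 2));
```

A prefix search like this finds "google" only if the analyzer produced it as a separate term, so it works best together with an analyzer that splits URLs into their parts.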
