Searching for an exact string delimited by spaces in Python

Example:

import re

words_to_check = ['dog', 'cat', 'bird chirp']
full_word_list = ['doggy dog cat', 'meow cat purr', 'bird chirp tweet tweet', 'chirp bird']

# For each search term, list the matches found in every string
for word in words_to_check:
    print(list(map(lambda x: re.findall(word, x), full_word_list)))

Running this snippet shows that 'dog' is matched twice in the first string: once as a standalone word and once inside 'doggy'. 'cat' is matched only where it stands alone, and 'bird chirp' is matched only where the two words appear in that order, separated by a space.

I would like to refine the matching so that a word such as 'dog' or 'cat' matches only when it appears as a whole word, that is, delimited by spaces or by the start or end of the string. Likewise, 'bird chirp' should match only as the complete, space-delimited phrase. Any suggestions on how I can achieve this?

Current output:

[['dog', 'dog'], [], [], []]
[['cat'], ['cat'], [], []]
[[], [], ['bird chirp'], []]

Answer №1

Use word boundaries (\b) in your regular expression so that each term matches only as a whole word.

import re

strings_to_search = ['abc', 'def', 'fgh hello']
complete_list = ['abc abc dsss abc', 'defgj', 'abc fgh hello xabd', 'fgh helloijj']

for col_key in strings_to_search:
    # Wrap the term in \b so it cannot match inside a longer word
    word = r'\b{}\b'.format(col_key)
    print(list(map(lambda x: re.findall(word, x), complete_list)))

Here is the resulting output:

[['abc', 'abc', 'abc'], [], ['abc'], []]
[[], [], [], []]
[[], [], ['fgh hello'], []]
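
One caveat: \b also treats punctuation as a boundary, so r'\bdog\b' would still match in 'dog, cat'. If the terms must be delimited strictly by whitespace (or the start/end of the string), a lookaround-based pattern may be closer to what the question asks for. The sketch below is only one possible variant, not part of the answer above; the helper name find_whole_terms is made up for illustration, and re.escape is added on the assumption that search terms might contain regex metacharacters.

```python
import re

def find_whole_terms(term, text):
    """Return every occurrence of `term` in `text` that is delimited by
    whitespace or by the start/end of the string (hypothetical helper)."""
    # re.escape keeps characters such as '.' or '+' in the term literal.
    # (?<!\S) : the match is not preceded by a non-whitespace character
    # (?!\S)  : the match is not followed by a non-whitespace character
    pattern = r'(?<!\S){}(?!\S)'.format(re.escape(term))
    return re.findall(pattern, text)

words_to_check = ['dog', 'cat', 'bird chirp']
full_word_list = ['doggy dog cat', 'meow cat purr', 'bird chirp tweet tweet', 'chirp bird']

for term in words_to_check:
    print([find_whole_terms(term, text) for text in full_word_list])
```

With the data from the question this prints [['dog'], [], [], []], then [['cat'], ['cat'], [], []], then [[], [], ['bird chirp'], []]. If punctuation next to a word should still count as a delimiter, the plain \b version is the simpler choice.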
