Find and match a group in a string multiple times

I am trying to use regular expressions to extract data from the following string.

 influences = 
 {{hlist |[[Plato]] |[[Aristotle]] |[[Socrates]] |[[David Hume]] |[[Adam Smith]] |[[Cicero]] |[[John Locke]]}}
 {{hlist |[[Saint Augustine]] |[[Saint Thomas Aquinas]] |[[Saint Thomas More]] |[[Richard Hooker]] |[[Edward Coke]]}}
 {{hlist |[[Thomas Hobbes]] |[[Rene Descartes]] |[[Montesquieu]] |[[Joshua Reynolds]] |[[Sir William Blackstone|William Blackstone]]}}
 {{hlist |[[Niccolo Machiavelli]] |[[Dante Alighieri]] |[[Samuel Johnson]] |[[Voltaire]] |[[Jean Jacques Rousseau]] |[[Jeremy Bentham]]}}

I need to extract the following templates from the text:

{{hlist .... }}

The text below should not be matched:

main_interests = 
 {{hlist |[[Music]] |[[Art]] |[[Theatre]] |[[Literature]]}}

I have crafted this regex but it is not functioning correctly.

(?:^\|\s*)?(?:influences)\s*?=\s*?(?:(?:\s*\{\{hlist)\s*\|([\d\w\s\-()*—&;\[\]|#%.<>·:/",\'!{}=•?’
á~ü°œéö$àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ]*?)(?=\n))+

I am using Python.

Answer №1

Here is sample code that extracts the link names using a list comprehension and regular expressions:

import re
string = """
influences = 
 {{hlist |[[Plato]] |[[Aristotle]] |[[Socrates]] |[[David Hume]] |[[Adam Smith]] |[[Cicero]] |[[John Locke]]}}
 {{hlist |[[Saint Augustine]] |[[Saint Thomas Aquinas]] |[[Saint Thomas More]] |[[Richard Hooker]] |[[Edward Coke]]}}
 {{hlist |[[Thomas Hobbes]] |[[Rene Descartes]] |[[Montesquieu]] |[[Joshua Reynolds]] |[[Sir William Blackstone|William Blackstone]]}}
 {{hlist |[[Niccolo Machiavelli]] |[[Dante Alighieri]] |[[Samuel Johnson]] |[[Voltaire]] |[[Jean Jacques Rousseau]] |[[Jeremy Bentham]]}}
"""

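# The outer pattern finds each {{hlist ...}} template;
# the inner pattern captures the text of every [[...]] link inside it.
matches = [template.group(1)
           for match in re.findall(r'\{\{hlist.+?\}}', string)
           for template in re.finditer(r'\[\[([^]]+)\]\]', match)]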
print(matches)
# ['Plato', 'Aristotle', 'Socrates', 'David Hume', 'Adam Smith', 'Cicero', 'John Locke', 'Saint Augustine', 'Saint Thomas Aquinas', 'Saint Thomas More', 'Richard Hooker', 'Edward Coke', 'Thomas Hobbes', 'Rene Descartes', 'Montesquieu', 'Joshua Reynolds', 'Sir William Blackstone|William Blackstone', 'Niccolo Machiavelli', 'Dante Alighieri', 'Samuel Johnson', 'Voltaire', 'Jean Jacques Rousseau', 'Jeremy Bentham']

This method uses two regex patterns: one for the outer section ({{hlist...}}) and another for the inner portion ([[...]]).
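Note that the code above matches every {{hlist ...}} template in the input, so a main_interests block in the same string would be picked up as well. Also, in Python's re module a repeated capturing group keeps only its last match, which is one reason a single pattern with (...)+ cannot return all the names. Below is a minimal sketch of one way to limit the search to a single field first and then collect the links. The helper name extract_links, the shortened sample text, and the choice to keep only the part of a piped link before the | are assumptions made for illustration, not part of the original answer.

```python
import re

# Captures the link target; for a piped link [[Target|Label]] only "Target" is kept.
LINK = re.compile(r'\[\[([^\]|]+)(?:\|[^\]]*)?\]\]')

def extract_links(text, field="influences"):
    """Return the [[...]] link targets from the hlist templates that follow `field =`."""
    # Step 1: grab the run of consecutive {{hlist ...}} templates right after the field name.
    # Assumes each template sits on one line, as in the question.
    block = re.search(
        r'\b' + re.escape(field) + r'\s*=\s*((?:\{\{hlist.*?\}\}\s*)+)',
        text,
    )
    if block is None:
        return []
    # Step 2: findall returns every link inside that block, not just the last one.
    return LINK.findall(block.group(1))

# Shortened sample in the same shape as the data in the question.
sample = """
influences = 
 {{hlist |[[Plato]] |[[Aristotle]]}}
 {{hlist |[[Sir William Blackstone|William Blackstone]]}}
main_interests = 
 {{hlist |[[Music]] |[[Art]]}}
"""

print(extract_links(sample))
# ['Plato', 'Aristotle', 'Sir William Blackstone']
```

Calling extract_links(sample, "main_interests") returns ['Music', 'Art'] instead, so the same helper can be reused for other fields.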


