Newest 'beautifulsoup+regex' Questions

1 vote

1 answer

54 views

How to handle regex in BeautifulSoup / CSS selector?

I'm looking for a way to use regex in BeautifulSoup to find elements that may contain the text HO #, allowing for optional spaces and ignoring case. check_ho_number3 = soup.select_one('td:-...

Tonin thomas

37

asked Mar 28 at 4:23

0 votes

2 answers

511 views

Extract Text from Unstructured HTML using Python and Beautiful Soup

For the HTML code below, how do I extract the content below aaa, bbb after the tag using regular expressions and Beautiful Soup with the Python Requests library? <html> <head></head> ...

ZASE

66

asked Dec 17, 2023 at 21:28

0 votes

2 answers

114 views

How to find main price and discounted price in a webpage using selenium and python?

I am trying to find both the main price and the discounted price on a webpage, but I can get only one of them. I need a good pattern or method to extract all prices and discounted prices from ...

Alireza Mirhabibi - IRAN

70

asked Nov 26, 2023 at 11:15

0 votes

1 answer

51 views

regex code to find email address within HTML script webscraping

I am trying to extract the phone, address and email from a couple of corporate websites through web scraping. My code for that is as follows: l = 'https://www.zimmermanfinancialgroup.com/about' address_t = [] ...

anonymous13

601

asked Apr 19, 2023 at 20:15

-3 votes

1 answer

43 views

Manipulate string in python

I am scraping web content with BeautifulSoup and Python, and I would like to manipulate the following strings: 'Induktora 28" 36V/14 Ah | 16.5" Bordo' 'Induktora 28" 36V/14 Ah | 18" ...

Bohumír Mäsiar

139

asked Jan 17, 2023 at 23:09

-1 votes

2 answers

464 views

BeautifulSoup: Search and replace in the text parts of HTML

I want to do a search and replace on the textual part of the content of the HTML elements. E.g., replacing foo with <b>bar</b> in <div id="foo">foo <i>foo</i> ...

HappyFace

4,007

asked Oct 31, 2022 at 12:24

0 votes

1 answer

63 views

Python - Beautifulsoup - parse multiple span elements

I am trying to extract title from 'span'. Using the below code as an example, the output I am looking for is 6536 and 9319, which are part of 'title'. Seen below: span aria-label="6536 users ...

JJH

9

asked Oct 6, 2022 at 23:01

0 votes

1 answer

70 views

What's the proper way to exclude uppercase word/s in regex python

Let's say I've scraped this from a website. PARIS - Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua (2015). Ut enim ad minim ...

pixsay

27

asked Sep 30, 2022 at 4:53

0 votes

1 answer

411 views

Regex pattern with accented characters

I am trying to get the words that start with a capital letter, regardless of whether the word contains a special character. Currently, my pattern only matches capital letters without accents. I ...

user12096189

asked Aug 8, 2022 at 21:27
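
A minimal sketch of one way to approach the accented-capitals question above, assuming Python 3 `str` input with precomposed accented characters. The function name `capitalized_words` and the sample sentence are hypothetical and do not come from the original post. Instead of listing accented ranges in the pattern, it extracts letter-only words and checks the first character with `str.isupper()`.

```python
import re

def capitalized_words(text):
    # For str patterns in Python 3, \w is Unicode-aware, so [^\W\d_]
    # matches letters (including accented ones) but not digits or "_".
    words = re.findall(r"[^\W\d_]+", text)
    # str.isupper() is also Unicode-aware, so "É" counts as a capital.
    return [w for w in words if w[0].isupper()]

# Hypothetical sample text, not taken from the question.
sample = "Élodie visited Ávila and paris with Óscar"
print(capitalized_words(sample))
```

Text that stores accents as separate combining marks would likely need `unicodedata.normalize("NFC", text)` first, since a combining mark is not a letter and would split the word.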

1 vote

2 answers

58 views

How to extract key info from <script> tag

I'm trying to extract the user id from this link https://www.instagram.com/design.kaf/ using bs4 and regex. I found a JSON key inside a script tag called "profile_id", but I can't even search that ...

Hossam Hassan

15

asked Aug 1, 2022 at 6:07

1 vote

1 answer

35 views

Find string between two sets of characters or 3rd and 4th quotation marks

I have been playing with Beautifulsoup and re to collect only the links I need from a webpage. I was able to cut the page content down to a <class 'bs4.element.ResultSet'>. This dataset contains the ...

Blackwidow

146

asked Jun 11, 2022 at 23:01

0 votes

1 answer

478 views

api request parameters are ignored

This code works as expected and shows 3 recent Wikipedia editors. My question is: if I uncomment the second URL line, I should get Urmi27 three times, or None if the user is not listed. But I get ...

shantanuo

32.2k

asked Jun 2, 2022 at 4:06

1 vote

2 answers

51 views

Extracting RegEx pattern across list excluding other html code

I've written a script to pull a list of available report url extensions page available for text extraction. I've used parsing and BeautifulSoup to extract the reference area for the latest report ...

Pryore

520

asked May 16, 2022 at 11:16

-2 votes

2 answers

51 views

How to extract text from html in Python with BeautifulSoup4

I am trying to extract the text, i.e. only the filename, from the HTML tags below. In the end I would like to have the following output: BeforeStructure.PNG AfterStructure.PNG. Can you please guide how to I ...

Deepali

97

asked Apr 22, 2022 at 4:05

1 vote

2 answers

687 views

Adding line breaks after times in parentheses

I'm trying to clean up some data from web scraping. This is an example of the information I'm working with: Best Time Adam Jones (w/ help) (6:34)Best Time Kenny Gobbin (a) (2:38)Personal Best Matt ...

SpingoTakagi

71

asked Mar 28, 2022 at 17:39
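
A minimal sketch for the line-break question above, using only the complete part of the sample shown in the excerpt. It assumes every record ends with a time in the form `(m:ss)`. The function name `break_after_times` is hypothetical.

```python
import re

def break_after_times(text):
    # Match a parenthesised time such as "(6:34)". Groups like "(w/ help)"
    # or "(a)" contain no "digits:two digits", so they are left alone.
    # The lookahead (?=\S) adds a newline only when more text follows
    # directly, so the final record gets no trailing newline.
    return re.sub(r"(\(\d+:\d{2}\))(?=\S)", r"\1\n", text)

sample = "Best Time Adam Jones (w/ help) (6:34)Best Time Kenny Gobbin (a) (2:38)"
print(break_after_times(sample))
```

If some times include hours, for example `(1:02:38)`, the pattern would need an optional extra `\d+:` group.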
