All Questions
Tagged with beautifulsoup python
28,250
questions
-2
votes
0
answers
20
views
BeautifulSoup not finding "li" class when a URL with a search word is given [closed]
I am trying to use BeautifulSoup to extract the "li" class variable from ULTA.com for cosmetic products.
Code:
page = requests.get(url)
soup = BeautifulSoup(page.content,'html.parser')
...
0
votes
0
answers
24
views
Where do I look to find a domain's policy on web scraping? [closed]
I often use bs4 and requests in Python to gather data from a website.
Where can I find the policy on data collection? (Keywords, Sections, shortcuts, etc.)
I'm hesitant to post any of my code on the ...
2
votes
1
answer
43
views
Get an <a> tag content using BeautifulSoup
I'd like to get the content of an <a> tag using BeautifulSoup (version 4.12.3) in Python.
I have this code and HTML example:
h = """
<a id="0">
<table>
...
-4
votes
0
answers
32
views
Efficient Strategies for Scraping Multiple Pages from a Website [closed]
I am developing a web scraper to extract data (such as titles, authors, and prices) from a website that has potentially up to 2100 pages. I’m seeking advice on the most efficient strategy for handling ...
0
votes
1
answer
36
views
Can't get all span tags inside div element using beautifulsoup
I am scraping the product details page text on Amazon, but I get the data back as a bullet list. I want to have the data added as a column next to other scraped data.
csv export
Amazon Product Details
A ...
0
votes
0
answers
26
views
Sending Chunks to Mistral API and Handling Streaming Responses
I need to send HTML content in chunks to the Mistral API and receive the responses for all chunks in one go. My code processes the content in chunks and sends them to the Mistral API using streaming. ...
0
votes
2
answers
84
views
Scrape a website that has the same URL for multiple pages, with the page jump being an AJAX request
I've been at this for days. I'm trying to scrape this website: "https://careers.ispor.org/jobseeker/search/results/".
I've got everything covered, from the script that will extract the ...
0
votes
0
answers
43
views
Why is sys.stdout adjustment needed to print Unicode in Python?
I scraped some data from the web using:
import requests
from bs4 import BeautifulSoup
def get_lines_from_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        soup =...
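For context on the question title, here is a minimal sketch of why printing scraped text can fail on some consoles. It assumes the console uses a legacy code page such as cp1252. The sample string is made up for illustration and is not from the question. `sys.stdout.reconfigure` is available on Python 3.7+ text streams, so the call is guarded in case stdout has been replaced by another object.

```python
import sys

# Hypothetical sample text containing non-ASCII characters.
text = "caf\u00e9 \u2192 na\u00efve"

# cp1252 has no mapping for U+2192 (rightwards arrow), so encoding fails.
# print() hits the same error when stdout uses such an encoding.
try:
    text.encode("cp1252")
    legacy_ok = True
except UnicodeEncodeError:
    legacy_ok = False

# UTF-8 can represent every Unicode code point, so this always succeeds.
utf8_bytes = text.encode("utf-8")

# Switch stdout to UTF-8 where the stream supports it (Python 3.7+).
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")

print(text)
```

Whether the adjustment is needed at all depends on the encoding the interpreter picked for stdout, which varies by platform and terminal.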
0
votes
0
answers
33
views
How can I extract comments from Word documents *and* match them up with the highlighted text and the full sentence in which the comments appear? [duplicate]
Note: This question was originally marked as a duplicate of this one. It's related, but I don't think it's an exact duplicate because my question asks how to extract the comment and the highlighted ...
0
votes
1
answer
28
views
HTTP Error 404 when scraping first table using BeautifulSoup, but second table works fine
I’m working on a Python script to scrape historical CDS data from Investing.com using BeautifulSoup. The goal is to extract data from a specific table on the page and compile it into a DataFrame.
Here’...
0
votes
1
answer
38
views
Selenium's driver gets the wrong page in Python
I am trying to scrape certain odds for a football tournament. To this end I wrote a piece of code which first generates the exact link I want and then loads the corresponding page. The problem is, the ...
-1
votes
0
answers
50
views
Using firefox selenium to scrape a page with infinite scroll resulting in error, possibly due to too much data
I'm trying to scrape this page with infinite scroll on meetup for a list of past events. I want to get a list of events including name, date, and URL (mostly just the name, the other 2 are optional).
...
-1
votes
1
answer
76
views
Issues with Web Scraping Automation in Python Using Selenium
I’m having trouble with my ETL process. Let me explain my problem. I have this code:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
import pandas as pd
import ...
1
vote
1
answer
25
views
How to get text in beautifulsoup as .innerText and not as .textContent in JS
I have an HTML file that contains text inside a p tag, something like this:
<body>
<p>Lorem ipsum dolor sit amet,
consectetur adipiscing elit.
Maecenas sed mi lacus.
...
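As a rough illustration of the difference the title refers to: `.textContent` returns the text exactly as written in the source, line breaks included, while `.innerText` reflects rendered text, where runs of whitespace in an ordinary paragraph collapse to single spaces. The sketch below only approximates that for one plain paragraph. The sample string and the helper name `collapse_whitespace` are assumptions for illustration. With bs4, the raw string would typically come from `get_text()`.

```python
import re

# Hypothetical paragraph text as it sits in the HTML source,
# including the line breaks and indentation between sentences.
raw_text = (
    "Lorem ipsum dolor sit amet,\n"
    "    consectetur adipiscing elit.\n"
    "    Maecenas sed mi lacus."
)

def collapse_whitespace(text: str) -> str:
    """Replace each run of whitespace with one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()

rendered_like = collapse_whitespace(raw_text)
print(rendered_like)
```

This does not handle `<br>` tags, block-level children, hidden elements, or `white-space: pre` styling, all of which affect what `.innerText` returns in a browser.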
0
votes
1
answer
79
views
Need an explanation for a web scraping lambda function in Python
I'm doing web scraping in Python and I've found this:
products = soup.find_all('li')
products_list = []
for product in products:
    name = product.h2.string
    price = product.find('p', string=...