
All Questions

1508 votes
35 answers
2.3m views

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)

I'm having problems dealing with unicode characters from text fetched from different web pages (on different sites). I am using BeautifulSoup. The problem is that the error is not always ...
Homunculus Reticulli
684 votes
20 answers
1.2m views

How to find elements by class

I'm having trouble parsing HTML elements with the "class" attribute using BeautifulSoup. The code looks like this: soup = BeautifulSoup(sdata) mydivs = soup.findAll('div') for div in mydivs: if (div["...
Neo • 13.8k
560 votes
13 answers
1.2m views

UnicodeEncodeError: 'charmap' codec can't encode characters

I'm trying to scrape a website, but it gives me an error. I'm using the following code: import urllib.request from bs4 import BeautifulSoup get = urllib.request.urlopen("https://www.website.com/&...
SstrykerR • 8,642
443 votes
22 answers
788k views

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

... soup = BeautifulSoup(html, "lxml") File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__ % ",".join(features)) bs4.FeatureNotFound: Couldn't ...
user3773048 • 6,159
379 votes
16 answers
522k views

How to remove \xa0 from string in Python?

I am currently using Beautiful Soup to parse an HTML file and calling get_text(), but it seems like I'm being left with a lot of \xa0 Unicode representing spaces. Is there an efficient way to remove ...
zhuyxn • 7,031
358 votes
1 answer
715k views

BeautifulSoup getting href [duplicate]

I have the following soup: <a href="some_url">next</a> <span class="class">...</span> From this I want to extract the href, "some_url" I can do ...
dkgirl • 4,729
308 votes
27 answers
535k views

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org [duplicate]

I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem: from urllib.request import urlopen from bs4 import BeautifulSoup import re pages = set() def ...
Catherine4j • 3,120
229 votes
5 answers
273k views

TypeError: a bytes-like object is required, not 'str' in python and CSV

TypeError: a bytes-like object is required, not 'str' I'm getting the above error while executing the below python code to save the HTML table data in a CSV file. How do I get rid of that error? ...
ShivaGuntuku • 5,434
221 votes
11 answers
464k views

Extracting an attribute value with beautifulsoup

I am trying to extract the content of a single "value" attribute in a specific "input" tag on a webpage. I use the following code: import urllib f = urllib.urlopen("http://58....
Barnabe • 2,335
219 votes
13 answers
560k views

Beautiful Soup and extracting a div and its contents by ID

soup.find("tagName", { "id" : "articlebody" }) Why does this NOT return the <div id="articlebody"> ... </div> tags and stuff in between? It returns nothing. And I know for a fact it ...
Tony Stark • 25.3k
193 votes
26 answers
565k views

ImportError: No Module Named bs4 (BeautifulSoup) [duplicate]

I'm working in Python and using Flask. When I run my main Python file on my computer, it works perfectly, but when I activate venv and run the Flask Python file in the terminal, it says that my main ...
harryt • 2,093
191 votes
16 answers
354k views

retrieve links from web page using python and BeautifulSoup [closed]

How can I retrieve the links of a webpage and copy the url address of the links using Python?
NepUS • 1,979
185 votes
7 answers
356k views

How to find children of nodes using BeautifulSoup

I want to get all the <a> tags which are children of <li>: <div> <li class="test"> <a>link1</a> <ul> <li> <a>link2<...
tej.tan • 4,137
167 votes
9 answers
96k views

Difference between BeautifulSoup and Scrapy crawler?

I want to make a website that compares Amazon and eBay product prices. Which of these will work better and why? I am somewhat familiar with BeautifulSoup but not so much with ...
Nishant Bhakta
166 votes
10 answers
332k views

can we use XPath with BeautifulSoup?

I am using BeautifulSoup to scrape a URL and I had the following code to find the td tag whose class is 'empformbody': import urllib import urllib2 from BeautifulSoup import BeautifulSoup url = &...
Shiva Krishna Bavandla
