Beautiful Soup (HTML parser)

Beautiful Soup
Original author	Leonard Richardson
Initial release	2004
Stable release	4.12.3 / January 17, 2024
Repository	code.launchpad.net/beautifulsoup/
Written in	Python
Platform	Python
Type	HTML parser utility, web scraping
License	Python Software Foundation License (Beautiful Soup 3, the older version); MIT License (version 4 and later)
Website	www.crummy.com/software/BeautifulSoup/

Beautiful Soup is a Python package for parsing HTML and XML documents. It builds a parse tree of the parsed page from which data can be extracted, which makes it useful for web scraping.
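As a minimal sketch of the parse tree described above, the following parses a small inline HTML string with Python's built-in `html.parser` backend and extracts a few values from it. The HTML snippet and the variable names are invented for illustration and do not come from the Beautiful Soup project.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used only for illustration.
html = """
<html><head><title>Sample page</title></head>
<body>
<p class="intro">Hello, <b>world</b>!</p>
<ul>
  <li><a href="/first">First</a></li>
  <li><a href="/second">Second</a></li>
</ul>
</body></html>
"""

# Build the parse tree using Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Text of the <title> element.
title = soup.title.string
# All text inside the first <p class="intro">, including nested tags.
intro = soup.find("p", class_="intro").get_text()
# The href attribute of every <a> element, in document order.
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(intro)
print(links)
```

Elements can be reached either by attribute-style navigation (`soup.title`) or by searching (`find`, `find_all`); both operate on the same tree.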

Beautiful Soup was started by Leonard Richardson, who continues to contribute to the project. It is additionally supported by Tidelift, a paid subscription service for open-source maintenance.

It is available for Python 3; the last release to support Python 2.7 was 4.9.3.

Example code

#!/usr/bin/env python3
# Anchor extraction from HTML document
from bs4 import BeautifulSoup
from urllib.request import urlopen
with urlopen('https://en.wikipedia.org/wiki/Main_Page') as response:
    soup = BeautifulSoup(response, 'html.parser')
    for anchor in soup.find_all('a'):
        print(anchor.get('href', '/'))
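The same anchor extraction can also be tried without network access. The sketch below works on an inline HTML string and resolves relative links against a base URL with the standard library's `urljoin`. The helper name `extract_links` and the sample markup are hypothetical and serve only as an illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> element that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True matches only anchors that actually carry an href attribute.
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


# Hypothetical markup: one relative link, one anchor without href, one absolute link.
sample = (
    '<a href="/wiki/Python">Python</a> '
    '<a name="top">no link</a> '
    '<a href="https://example.org/">external</a>'
)
links = extract_links(sample, "https://en.wikipedia.org/wiki/Main_Page")
print(links)
```

Filtering with `href=True` skips anchors that have no `href`, whereas `anchor.get('href', '/')` in the example above substitutes a default value for them.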

Footnotes

↑ https://git.launchpad.net/beautifulsoup/tree/CHANGELOG; retrieved January 18, 2024.
↑ "Beautiful Soup website". Retrieved April 18, 2012. Beautiful Soup is licensed under the same terms as Python itself
