Scrapy (web framework)

Scrapy
Developer	Scrapinghub, Ltd.
Initial release	June 26, 2008
Stable release	1.8.0 / May 14, 2024
Repository	github.com/scrapy/scrapy
Written in	Python
Operating system	Windows, macOS, Linux
Type	Web crawler
License	BSD License
Website	scrapy.org

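Scrapy (/ˈskreɪpi/ SKRAY-pee)^[1] is an open-source web crawling framework written in Python. It was designed for web scraping, that is, collecting data from websites. It can also be used to extract data through APIs or as a general-purpose web crawler.^[2] Scrapy is maintained by Scrapinghub Ltd., a web scraping development and services company.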

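A Scrapy project is built around "spiders", which are crawlers with a number of built-in features. Like Django, the framework follows the "don't repeat yourself" (DRY) philosophy.^[3] By helping developers reuse code, Scrapy makes large crawling projects easier to build. It also provides a web crawling shell that developers can use to test the behavior of the site they want to crawl.^[4]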

Scrapy is used by companies and organizations such as Lyst,^[5]^[6] Parse.ly,^[7] Sayone Technologies,^[8] Sciences Po Medialab,^[9] and Data.gov.uk's World Government Data site (archived August 16, 2018, at the Wayback Machine).^[10]

References

[1] "How do you pronounce 'Scrapy'?"
[2] "Scrapy at a glance".
[3] "Frequently Asked Questions". Retrieved July 28, 2015.
[4] "Scrapy shell". Retrieved July 28, 2015.
[5] Bell, Eddie; Heusser, Jonathan. "Scalable Scraping Using Machine Learning". Archived from the original on October 9, 2016. Retrieved July 28, 2015.
[6] "Scrapy | Companies using Scrapy".
[7] Montalenti, Andrew. "Web Crawling & Metadata Extraction in Python".
[8] "Scrapy Companies". Scrapy website.
[9] "Hyphe v0.0.0: the first release of our new webcrawler is out!"
[10] Ben Firshman [bfirsh] (January 21, 2010). "World Govt Data site uses Django, Solr, Haystack, Scrapy and other exciting buzzwords bit.ly/5jU3La #opendata #datastore" (Tweet).