EuroPython 2015

Dive into Scrapy

Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

This talk shows some advanced techniques, based on how Scrapy is used at Scrapinghub.

Goals:

  • Understand why it's necessary to Scrapy-ify early on.
  • Anatomy of a Scrapy Spider.
  • Using the interactive shell.
  • What are items and how to use item loaders.
  • Examples of pipelines and middlewares.
  • Techniques to avoid getting banned.
  • How to deploy Scrapy projects.
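
As a rough illustration of the pipelines goal above, here is a minimal sketch of an item pipeline. It is not taken from the talk: the class name PriceCleanupPipeline and the 'price' field are hypothetical. It relies only on the fact that a Scrapy item pipeline is a plain Python class with a process_item(item, spider) method, so the sketch runs without Scrapy installed.

```python
class PriceCleanupPipeline:
    """Hypothetical pipeline: normalizes a 'price' field on scraped items."""

    def process_item(self, item, spider):
        # Scrapy calls process_item() for each item a spider yields;
        # returning the item hands it to the next pipeline in the chain.
        raw = item.get("price")
        if raw is not None:
            # Strip an assumed "$" prefix and thousands separators.
            text = str(raw).replace("$", "").replace(",", "").strip()
            item["price"] = float(text)
        return item


if __name__ == "__main__":
    # Stand-alone check with a plain dict as the item and no spider.
    pipeline = PriceCleanupPipeline()
    print(pipeline.process_item({"name": "Widget", "price": " $1,299.50 "}, None))
```

In a real project a class like this would typically be enabled through the ITEM_PIPELINES setting; the currency format handled here is an assumption made only for the example.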

Tuesday 21 July at 11:45
