@ -6,9 +6,22 @@
*[Quick start](#下载安装)
*[Download and install](#下载安装)
*[Create a distributed crawler](#爬虫开始)
*[Edit the configuration files](#修改配置文件)
*[Start the crawler](#启动爬虫)

###Download and install
pip install -r requirements.txt

###Getting started with the crawler
scrapy startproject XXX (project name)
cd XXX
scrapy genspider xxx (spider name) www.baidu.com (example domain)

###Edit the configuration files
1. Configure the settings.py file
2. Adapt items.py, middlewares.py, and pipelines.py to the requirements of the task
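
As a rough illustration of step 1, a settings.py for the project might look like the sketch below. It uses only core Scrapy settings. Every value is an assumption chosen for illustration, `XXX` is the placeholder project name from the commands above, and `XxxPipeline` is a hypothetical pipeline class in pipelines.py. Settings specific to the distributed setup depend on the packages in requirements.txt and are not shown.

```python
# settings.py (sketch): illustrative values, not the project's real configuration

# Project name passed to `scrapy startproject` (placeholder).
BOT_NAME = "XXX"

# Where Scrapy looks for spiders and where `scrapy genspider` writes new ones.
SPIDER_MODULES = ["XXX.spiders"]
NEWSPIDER_MODULE = "XXX.spiders"

# Respect robots.txt; adjust only if the task permits it.
ROBOTSTXT_OBEY = True

# Throttling: maximum concurrent requests and delay (seconds) between
# requests to the same site. Both numbers are assumptions.
CONCURRENT_REQUESTS = 16
DOWNLOAD_DELAY = 0.5

# Enable the item pipeline defined in pipelines.py.
# "XxxPipeline" is a hypothetical class name; lower numbers run first.
ITEM_PIPELINES = {
    "XXX.pipelines.XxxPipeline": 300,
}
```

Changes to items.py, middlewares.py, and pipelines.py in step 2 usually need a matching entry here, for example a new pipeline class must be listed in `ITEM_PIPELINES` before Scrapy will call it.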

###Start the crawler
1. Run
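
A minimal sketch of the run step, assuming the spider is started with Scrapy's standard `scrapy crawl` command from inside the project directory. The helper names `build_crawl_command` and `run_spider` are hypothetical, and `xxx` stands for the spider name chosen with `scrapy genspider`. The project's actual launch procedure may differ.

```python
import subprocess


def build_crawl_command(spider_name):
    """Return the argument list for Scrapy's `scrapy crawl <spider>` command."""
    return ["scrapy", "crawl", spider_name]


def run_spider(spider_name, project_dir="."):
    """Run the spider from the project directory and return the exit code."""
    # Scrapy locates the project through scrapy.cfg, so the command
    # has to run inside the project directory (XXX in the steps above).
    completed = subprocess.run(build_crawl_command(spider_name), cwd=project_dir)
    return completed.returncode


# Example (assumes Scrapy is installed and ./XXX is the project directory):
# run_spider("xxx", project_dir="XXX")
```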