Python 3.4 crawler demo

2021-05-20 00:23 chenqiangdage Python


This is only a demo. It uses the Baidu Images home page as its example and downloads the images found on that page.

Written in Eclipse with PyDev:

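    # The module name must match the file name HtmlHelper.py exactly (imports are case-sensitive).
    from SpiderSimple.HtmlHelper import getHtml, getImage

    # Python 3 strings are already Unicode, so the Python 2 idiom
    # reload(sys) + sys.setdefaultencoding('utf-8') is not needed here.
    html = getHtml('http://image.baidu.com/')
    try:
        getImage(html)
    except Exception as e:
        print(e)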

The HtmlHelper.py file (SpiderSimple in the import above is the name of a custom package):

    from urllib.request import urlopen, urlretrieve
    # Regular-expression library
    import re

    # Open the page and return its content as raw bytes
    def getHtml(url):
        page = urlopen(url)
        html = page.read()
        return html

    # Use a regular expression to pull the image URLs out of the page
    def getImage(Html):
        try:
            #reg = r'src="(.+?\.jpg)" class'
            #image = re.compile(reg)
            image = re.compile(r'<img[^>]*src[=\"\']+([^\"\']*)[\"\'][^>]*>', re.I)
            # The page arrives as bytes; decode it before matching
            Html = Html.decode('utf-8')
            imaglist = re.findall(image, Html)
            x = 0
            for imagurl in imaglist:
                # Download the images one by one into the project folder
                urlretrieve(imagurl, '%s.jpg' % x)
                x += 1
        except Exception as e:
            print(e)
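The listing above passes each src value straight to urlretrieve, which only works when the value is an absolute URL, and one failed download ends the whole loop. The following is a minimal sketch of one way to handle both cases, not part of the original HtmlHelper.py. The helper names extract_image_urls and download_all, the fetch parameter, and the example.com addresses are assumptions made for illustration.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlretrieve

# Same pattern as in HtmlHelper.py: capture the value of the src attribute.
IMG_SRC = re.compile(r'<img[^>]*src[=\"\']+([^\"\']*)[\"\'][^>]*>', re.I)


def extract_image_urls(html_text, base_url):
    """Return absolute image URLs found in html_text (hypothetical helper)."""
    urls = []
    for src in IMG_SRC.findall(html_text):
        # Skip empty values and inline data URIs, which are not separate files.
        if not src or src.startswith('data:'):
            continue
        # urljoin resolves relative and protocol-relative values against the page URL.
        urls.append(urljoin(base_url, src))
    return urls


def download_all(urls, fetch=urlretrieve):
    """Save each URL as 0.jpg, 1.jpg, ... and keep going when one download fails."""
    saved = []
    for index, url in enumerate(urls):
        filename = '%s.jpg' % index
        try:
            fetch(url, filename)
            saved.append(filename)
        except Exception as e:
            print('skipped %s: %s' % (url, e))
    return saved


# Made-up markup and base URL, used only to show the URL resolution step.
sample = '<img src="/a.jpg" class="x"><IMG alt="" src=\'//img.example.com/b.png\'>'
print(extract_image_urls(sample, 'http://image.example.com/index.html'))
```

With the real page, the decoded HTML and the page address would be passed to extract_image_urls, and the resulting list to download_all. The file naming keeps the original %s.jpg scheme, so non-JPEG images still receive a .jpg extension.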

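One major issue to watch out for is Python's default encoding.

You may get an error such as UnicodeDecodeError: 'ascii' codec can't decode byte 0x?? in position 1: ordinal not in range(128). To fix it, set Python's default encoding to UTF-8.

The best way to set it is with a bat file. The PYTHONIOENCODING variable used below sets the encoding of Python's standard input, output, and error streams: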

    echo off
    set PYTHONIOENCODING=utf8
    python -u %1

Then restart the computer.
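Decoding errors can also come from the page itself, because getImage assumes every page is UTF-8. The sketch below shows one possible fallback strategy, under the assumption that the server may report a different charset. The helper name decode_body is hypothetical and not part of the original code.

```python
import sys


def decode_body(raw, charset=None):
    """Decode response bytes to str (hypothetical helper, not in HtmlHelper.py)."""
    # Try the charset reported by the server first, then UTF-8.
    for enc in (charset, 'utf-8'):
        if not enc:
            continue
        try:
            return raw.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding name: try the next candidate.
            continue
    # Last resort: decode as UTF-8 and replace undecodable bytes with U+FFFD.
    return raw.decode('utf-8', errors='replace')


# The encoding print() uses for the console; PYTHONIOENCODING overrides it.
print(sys.stdout.encoding)

text = decode_body('百度图片'.encode('gbk'), 'gbk')
# ascii() keeps the demo printable even on a console that cannot show Chinese.
print(ascii(text))
```

In getHtml, the charset argument could be taken from page.headers.get_content_charset(), which returns None when the server does not declare one.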


Original link: https://blog.csdn.net/chenqiangdage/article/details/51168231
