2019-12-18 Catching Exceptions / Web Scraping

Catching Exceptions
>>> ### Catching exceptions
>>> a = 10
>>> b = a + 'hello'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>>
>>> try:
...     a = 10
...     b = a + 'hello'
... except Exception as e:
  File "<stdin>", line 4
    except Exception as e:
                         ^
SyntaxError: invalid syntax
>>>
>>> try:
...     a = 10
...     b = a + 'hello'
... except Exception as e:
...     print(e)
...
unsupported operand type(s) for +: 'int' and 'str'
>>>
>>> ## Open question: when an error occurs, how do we roll the database back automatically?
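The open question above (automatic rollback when an error occurs) can be sketched with the standard-library sqlite3 module: run the statements inside try, commit only if everything succeeds, and call rollback() in the except handler. The table and names here are invented for illustration; they are not from the notes.

```python
import sqlite3

# Hypothetical demo table in an in-memory database.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE accounts (name TEXT, balance INTEGER)')
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()

def withdraw(conn, name, amount):
    """Deduct `amount`; if any statement fails, undo the whole transaction."""
    try:
        conn.execute(
            'UPDATE accounts SET balance = balance - ? WHERE name = ?',
            (amount, name),
        )
        # Deliberate failure: this table does not exist, so an exception
        # is raised before commit() is ever reached.
        conn.execute('UPDATE no_such_table SET x = 1')
        conn.commit()
    except Exception as e:
        conn.rollback()  # discards the uncommitted UPDATE above
        print('rolled back:', e)

withdraw(conn, 'alice', 30)
balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'"
).fetchone()[0]
print(balance)  # 100 — the deduction was rolled back
```

The same try/commit/rollback shape applies to other database drivers (e.g. pymysql), since they follow the same DB-API transaction model.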
Web Scraping
>>> ### Modules used
... #|- requests: fetches the page content
... #|- BeautifulSoup: extracts elements from the page; both have Chinese-translated docs, just search for them with "python"
...
>>> url = 'https://bj.lianjia.com/zufang/'
KeyboardInterrupt
>>>
>>> ### Install the modules
... # pip install requests
... # pip install bs4
...
>>> import requests
>>> from bs4 import BeautifulSoup
>>> url = 'https://bj.lianjia.com/zufang/'
>>> ## Goal: starting from the listing page links, get the details of each rental (price, size, location, etc.)
...
>>> responce = requests.get(url)               ## fetch one page
>>> soup = BeautifulSoup(responce.text, 'lxml')  ## .text is the fetched HTML; BeautifulSoup parses it into an element tree using the lxml parser
>>> links_div = soup.find_all('div', class="content__list--item")  ## find the divs that contain the links
  File "<stdin>", line 1
    links_div = soup.find_all('div', class="content__list--item")  ## find the divs that contain the links
                                    ^
SyntaxError: invalid syntax
>>> links_div = soup.find_all('div', class_="content__list--item")  ## find the divs that contain the links
>>> ## The error happens because class is a reserved keyword in Python, so bs4 uses class_ to avoid the clash
>>> links_div[0]
>>> links = [div.a.get('href') for div in links_div]
>>> ## Wrap this in a function that collects all the rental-page links under a listing page and returns them as a list
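The last step above can be sketched as a small function. To keep the example self-contained it parses a tiny hand-written HTML snippet instead of fetching https://bj.lianjia.com/zufang/ over the network, and uses the built-in html.parser backend rather than lxml (the session used lxml; either works here). The sample markup and hrefs are made up for illustration.

```python
from bs4 import BeautifulSoup

def get_links(html):
    """Return the href of the first <a> inside each listing div."""
    soup = BeautifulSoup(html, 'html.parser')  # html.parser needs no extra install
    links_div = soup.find_all('div', class_='content__list--item')
    return [div.a.get('href') for div in links_div]

# Tiny static snippet standing in for the real listing page; in the real
# scraper you would pass requests.get(url).text instead.
sample = '''
<div class="content__list--item"><a href="/zufang/a.html">A</a></div>
<div class="content__list--item"><a href="/zufang/b.html">B</a></div>
'''
print(get_links(sample))  # ['/zufang/a.html', '/zufang/b.html']
```

Taking the HTML as a parameter (rather than fetching inside the function) also makes the parsing logic easy to test without network access.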
Other parameters: https://blog.csdn.net/weixin_43930694/article/details/90142678