2019-12-18 Catching Exceptions / Web Scraping

Catching Exceptions
>>> ### Catching exceptions
>>> a = 10
>>> b = a + 'hello'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>>
>>> try:
...     a = 10
...     b = a + 'hello'
... except Exception as e:
  File "<stdin>", line 4
    except Exception as e:
                         ^
SyntaxError: invalid syntax
>>>
>>> try:
...     a = 10
...     b = a + 'hello'
... except Exception as e:
...     print(e)
...
unsupported operand type(s) for +: 'int' and 'str'
>>>
>>> ## Open question: when an error occurs, how do we roll the database back automatically?
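The open question above (automatic rollback when an error occurs) can be sketched with the standard-library sqlite3 module: run the statements inside try, commit only if everything succeeds, and call rollback() in the except handler. The table and names here are invented for illustration; they are not from the notes.

```python
import sqlite3

# Hypothetical demo table in an in-memory database.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE accounts (name TEXT, balance INTEGER)')
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()

def withdraw(conn, name, amount):
    """Deduct `amount`; if any statement fails, undo the whole transaction."""
    try:
        conn.execute(
            'UPDATE accounts SET balance = balance - ? WHERE name = ?',
            (amount, name),
        )
        # Deliberate failure: this table does not exist, so an exception
        # is raised before commit() is ever reached.
        conn.execute('UPDATE no_such_table SET x = 1')
        conn.commit()
    except Exception as e:
        conn.rollback()  # discards the uncommitted UPDATE above
        print('rolled back:', e)

withdraw(conn, 'alice', 30)
balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'"
).fetchone()[0]
print(balance)  # 100 — the deduction was rolled back
```

The same try/commit/rollback shape applies to other database drivers (e.g. pymysql), since they follow the same DB-API transaction model.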
Web Scraping
>>> ### Modules used
... #|- requests: fetches the page content
... #|- BeautifulSoup: extracts elements from the page; both have Chinese-translated docs, just search for them with "python"
...
>>> url = 'https://bj.lianjia.com/zufang/'
KeyboardInterrupt
>>>
>>> ### Install the modules
... # pip install requests
... # pip install bs4
...
>>> import requests
>>> from bs4 import BeautifulSoup
>>> url = 'https://bj.lianjia.com/zufang/'
>>> ## Goal: starting from the listing page links, get the details of each rental (price, size, location, etc.)
...
>>> responce = requests.get(url)               ## fetch one page
>>> soup = BeautifulSoup(responce.text, 'lxml')  ## .text is the fetched HTML; BeautifulSoup parses it into an element tree using the lxml parser
>>> links_div = soup.find_all('div', class="content__list--item")  ## find the divs that contain the links
  File "<stdin>", line 1
    links_div = soup.find_all('div', class="content__list--item")  ## find the divs that contain the links
                                    ^
SyntaxError: invalid syntax
>>> links_div = soup.find_all('div', class_="content__list--item")  ## find the divs that contain the links
>>> ## The error happens because class is a reserved keyword in Python, so bs4 uses class_ to avoid the clash
>>> links_div[0]
>>> links = [div.a.get('href') for div in links_div]
>>> ## Wrap this in a function that collects all the rental-page links under a listing page and returns them as a list
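The last step above can be sketched as a small function. To keep the example self-contained it parses a tiny hand-written HTML snippet instead of fetching https://bj.lianjia.com/zufang/ over the network, and uses the built-in html.parser backend rather than lxml (the session used lxml; either works here). The sample markup and hrefs are made up for illustration.

```python
from bs4 import BeautifulSoup

def get_links(html):
    """Return the href of the first <a> inside each listing div."""
    soup = BeautifulSoup(html, 'html.parser')  # html.parser needs no extra install
    links_div = soup.find_all('div', class_='content__list--item')
    return [div.a.get('href') for div in links_div]

# Tiny static snippet standing in for the real listing page; in the real
# scraper you would pass requests.get(url).text instead.
sample = '''
<div class="content__list--item"><a href="/zufang/a.html">A</a></div>
<div class="content__list--item"><a href="/zufang/b.html">B</a></div>
'''
print(get_links(sample))  # ['/zufang/a.html', '/zufang/b.html']
```

Taking the HTML as a parameter (rather than fetching inside the function) also makes the parsing logic easy to test without network access.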
Other parameters: https://blog.csdn.net/weixin_43930694/article/details/90142678