First Crawler Test
Published: 2019-06-09


Today I mainly cover web crawling, plus a test of the match-prediction program from last time.

1. First, testing the simulation program from last time

```python
# -*- coding: utf-8 -*-
"""
Created on Thu May 23 00:41:48 2019
@author: h2446
"""
from random import random

def printIntro():
    print("This program simulates volleyball games between two teams, A and B")
    print("It needs ability values for A and B (decimals between 0 and 1)")

def getInputs():
    a = eval(input("Enter team A's ability value (0~1): "))
    b = eval(input("Enter team B's ability value (0~1): "))
    n = eval(input("Number of games to simulate: "))
    return a, b, n

def simNGames(n, probA, probB):
    winsA, winsB = 0, 0
    for i in range(n):
        scoreA, scoreB = simOneGame(probA, probB)
        if scoreA > scoreB:
            winsA += 1
        else:
            winsB += 1
    return winsA, winsB  # bug fix: the original returned inside the loop after one game

def gameOver(a, b):
    # a regular set ends past 25 points with a lead of at least 2
    return (a > 25 and abs(a - b) >= 2) or (b > 25 and abs(a - b) >= 2)

def simOneGame(probA, probB):
    scoreA, scoreB = 0, 0
    serving = "A"
    while not gameOver(scoreA, scoreB):
        if serving == "A":
            if random() < probA:
                scoreA += 1
            else:
                serving = "B"
        else:
            if random() < probB:
                scoreB += 1
            else:
                serving = "A"
    return scoreA, scoreB

def simOneGame2(probA, probB):
    # variant set, ended by GG()
    scoreA, scoreB = 0, 0
    serving = "A"
    while not GG(scoreA, scoreB):
        if serving == "A":
            if random() < probA:
                scoreA += 1
            else:
                serving = "B"
        else:
            if random() < probB:
                scoreB += 1
            else:
                serving = "A"
    return scoreA, scoreB

def simOneGame1(probA, probB):
    # the deciding set, ended by finalGameOver()
    scoreA, scoreB = 0, 0
    serving = "A"
    while not finalGameOver(scoreA, scoreB):
        if serving == "A":
            if random() < probA:
                scoreA += 1
            else:
                serving = "B"
        else:
            if random() < probB:
                scoreB += 1
            else:
                serving = "A"
    return scoreA, scoreB

def GG(a, b):
    return a == 3 or b == 3

def finalGameOver(a, b):
    # the deciding set is played to 15, switching sides when a team reaches 8
    if a == 8 or b == 8:
        if a > b:
            print("Team A reached 8 points; the teams switch sides")
        else:
            print("Team B reached 8 points; the teams switch sides")
    return (a > 15 and abs(a - b) >= 2) or (b > 15 and abs(a - b) >= 2)

# NOTE: a stretch of the original listing was lost when the post was scraped;
# final() below is reconstructed from the surviving fragment, not the author's
# exact code: it plays deciding sets until GG() is satisfied, then reports.
def final(probA, probB):
    winsA, winsB = 0, 0
    while not GG(winsA, winsB):
        scoreA, scoreB = simOneGame1(probA, probB)
        if scoreA > scoreB:
            winsA += 1
        else:
            winsB += 1
    finalprintSummary(winsA, winsB)

def finalprintSummary(winsA, winsB):
    n = winsA + winsB
    if n >= 4:
        print("Playing the final deciding match")
    if winsA > winsB:
        print("Team A wins the final")
    else:
        print("Team B wins the final")

def printSummary(winsA, winsB):
    n = winsA + winsB
    print("Analysis: {} games simulated".format(n))
    print("Team A won {} games ({:0.1%})".format(winsA, winsA / n))
    print("Team B won {} games ({:0.1%})".format(winsB, winsB / n))

def main():
    printIntro()
    probA, probB, n = getInputs()
    winsA, winsB = simNGames(n, probA, probB)
    printSummary(winsA, winsB)
    final(probA, probB)

try:
    main()
except:
    print("Error")
```

The test run and its result are shown below.
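Because the program reads its parameters with `input()`, a quick way to sanity-check the core loop is a non-interactive driver. This is a minimal sketch with hypothetical snake_case names, a fixed seed, and hard-coded ability values, mirroring the `gameOver`/`simOneGame`/`simNGames` logic above:

```python
from random import random, seed

def game_over(a, b):
    # same rule as gameOver above: set ends past 25 with a 2-point lead
    return (a > 25 and abs(a - b) >= 2) or (b > 25 and abs(a - b) >= 2)

def sim_one_game(prob_a, prob_b):
    # rally scoring: the server keeps serving while it wins points
    score_a, score_b = 0, 0
    serving = "A"
    while not game_over(score_a, score_b):
        if serving == "A":
            if random() < prob_a:
                score_a += 1
            else:
                serving = "B"
        else:
            if random() < prob_b:
                score_b += 1
            else:
                serving = "A"
    return score_a, score_b

def sim_n_games(n, prob_a, prob_b):
    wins_a, wins_b = 0, 0
    for _ in range(n):
        score_a, score_b = sim_one_game(prob_a, prob_b)
        if score_a > score_b:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a, wins_b   # outside the loop, so all n games are counted

seed(2019)                  # fixed seed so the run is repeatable
wins_a, wins_b = sim_n_games(100, 0.55, 0.45)
print(wins_a, wins_b)       # the two counts always sum to 100
```

With the `return` correctly outside the loop, every simulated game contributes to exactly one of the two counters, which is easy to assert on.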

2. Next, applying the web crawler

First, crawling the China university rankings. The code:

```python
# -*- coding: utf-8 -*-
"""
Created on Wed May 22 16:08:16 2019
@author: h2446
"""
import requests
from bs4 import BeautifulSoup

allUniv = []

def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        r.encoding = 'utf-8'
        return r.text
    except:
        return ""

def fillUnivList(soup):
    data = soup.find_all('tr')
    for tr in data:
        ltd = tr.find_all('td')
        if len(ltd) == 0:
            continue
        singleUniv = []
        for td in ltd:
            singleUniv.append(td.string)
        allUniv.append(singleUniv)

def printUnivList(num):
    # the fill argument {0} was lost in the scrape; chr(12288), the fullwidth
    # space, is the usual choice for aligning fullwidth (Chinese) columns.
    # Headers: rank, university name, province, total score, enrollment
    print("{1:^2}{2:{0}^10}{3:{0}^6}{4:{0}^4}{5:{0}^10}"
          .format(chr(12288), "排名", "学校名称", "省市", "总分", "培养规模"))
    for i in range(num):
        u = allUniv[i]
        print("{:^4}{:^10}{:^5}{:^8}{:^10}".format(u[0], u[1], u[2], u[3], u[6]))

def main(num):
    url = "http://www.zuihaodaxue.cn/zuihaodaxuepaiming2017.html"
    html = getHTMLText(url)
    soup = BeautifulSoup(html, "html.parser")
    fillUnivList(soup)
    printUnivList(num)

main(10)
```

The run result is shown below.
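The `tr`/`td` extraction in `fillUnivList` can be checked without touching the network. This sketch applies the same `find_all` pattern to a small hypothetical inline table standing in for the ranking page:

```python
from bs4 import BeautifulSoup

# a tiny stand-in for the ranking page: one header row, two data rows
html = """
<table>
  <tr><th>rank</th><th>name</th></tr>
  <tr><td>1</td><td>Tsinghua University</td></tr>
  <tr><td>2</td><td>Peking University</td></tr>
</table>
"""

all_univ = []
soup = BeautifulSoup(html, "html.parser")
for tr in soup.find_all('tr'):
    ltd = tr.find_all('td')
    if len(ltd) == 0:        # the header row has <th> cells, not <td>, so it is skipped
        continue
    # str() so plain strings are stored rather than bs4 NavigableString objects
    all_univ.append([str(td.string) for td in ltd])

print(all_univ)  # [['1', 'Tsinghua University'], ['2', 'Peking University']]
```

The `len(ltd) == 0` check is what keeps header rows (and any other non-data `<tr>`) out of the list, which is why the real crawler can index `u[0]`…`u[6]` safely.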

To my surprise, the result was completely different from the 2016 rankings we extracted in class.

A pity, since my number is 16.

Later I pasted that URL into a browser and confirmed it really is the 2017 China university ranking.

Then I tried the URL we used in class, and it actually worked.

3. Then, crawling Google

```python
# -*- coding: utf-8 -*-
"""
Created on Wed May 22 16:08:16 2019
@author: h2446
"""
import requests
from bs4 import BeautifulSoup

alluniv = []

def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        r.encoding = 'utf-8'
        return r.text
    except:
        return "error"

def xunhuang(url):
    # "xunhuang" (循环) means "loop": fetch the page 20 times in a row
    for i in range(20):
        getHTMLText(url)

def fillunivlist(soup):
    data = soup.find_all('tr')
    for tr in data:
        ltd = tr.find_all('td')
        if len(ltd) == 0:
            continue
        singleuniv = []
        for td in ltd:
            singleuniv.append(td.string)
        alluniv.append(singleuniv)

def printf():
    # blank-line separator between the printed sections
    print("\n")
    print("\n")
    print("\n")

def main():
    url = "http://www.google.com"
    html = getHTMLText(url)
    xunhuang(url)
    print(html)
    soup = BeautifulSoup(html, "html.parser")
    fillunivlist(soup)
    print(html)
    printf()
    print(soup.title)
    printf()
    print(soup.head)
    printf()
    print(soup.body)

main()
```

But because Google has long been blocked in mainland China, the result I got was this:
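The failure mode can be reproduced without `requests` or a blocked site. This standard-library sketch (hypothetical helper name) follows the same try/except-and-sentinel shape as `getHTMLText`, using a `.invalid` hostname that is guaranteed never to resolve:

```python
from urllib.request import urlopen

def get_html_text(url, timeout=5):
    # same shape as getHTMLText above: return the page text on success,
    # the sentinel string "error" on any failure
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read().decode('utf-8', errors='replace')
    except Exception:   # DNS failure, timeout, HTTP error, malformed URL...
        return "error"

# RFC 2606 reserves .invalid, so this lookup always fails -- even offline
print(get_html_text("http://unreachable.invalid/"))  # error
```

Returning a sentinel instead of raising is what lets the crawler's `main()` keep going and print whatever it got, which is exactly the behavior seen with the blocked Google request.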

Reposted from: https://www.cnblogs.com/Daisylin/p/10944304.html
