I. Simulating login to a site that requires an account and password
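I have already tried working with pages that need no login, so this time I use Python on a site that does require one, relying on cookies to simulate the login.
Our school's academic affairs system uses a CAPTCHA, which makes it somewhat harder, so I picked an easier target: 賽氪 (Saikr), https://www.saikr.com
I used the F12 developer tools built into Firefox: open the site, enter the account and password, and log in, as shown in the figure.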
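The network panel captures many POST and GET requests; the first POST request is the one that submits the account and password.
Opening the Params tab of that POST request shows that the submitted parameters are in the form data: name is the account name, pass is the encrypted password, and remember says whether to remember the password (0 means do not remember).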
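As a rough sketch of what the browser sends, the three form fields can be URL-encoded into a request body as below. The account and password values are placeholders, not real credentials, and `remember` is set to `0` as in the captured request.

```python
from urllib.parse import urlencode, parse_qs

# Placeholder values; the real "pass" value is whatever encrypted string
# the browser showed in the captured request.
form = {
    "name": "your account",
    "pass": "your encrypted password",
    "remember": "0",  # 0 = do not remember the password
}

# urlencode() builds "key=value&key=value"; encode() turns it into bytes,
# which is what urllib expects for a POST body.
body = urlencode(form).encode("utf-8")
print(body)

# Decoding the body again recovers the original fields.
decoded = parse_qs(body.decode("utf-8"))
print(decoded["remember"])
```

Spaces become `+` in the encoded body, which is the standard form encoding that matches the `application/x-www-form-urlencoded` content type.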
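Next, look at the headers, i.e. the request headers.
We add these request headers to the headers of our own POST request to simulate the login.
The Cookie header is required; without it the server returns an error:
{"code":403,"message":"訪問超時,請重試,多次出現此提示請聯系QQ:1409765583","data":[]}
The message reads "Access timed out, please retry; if this prompt keeps appearing, contact QQ:1409765583".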
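Since the error comes back as JSON, a script can detect the rejection instead of printing the raw text. The sketch below parses the body quoted above; the helper name `is_forbidden` is made up for illustration, and only the 403 value is taken from the observed response.

```python
import json

# Response body observed when the Cookie header is missing.
raw = '{"code":403,"message":"訪問超時,請重試,多次出現此提示請聯系QQ:1409765583","data":[]}'


def is_forbidden(body: str) -> bool:
    """Return True when the JSON body carries the 403 code seen above."""
    payload = json.loads(body)
    return payload.get("code") == 403


if is_forbidden(raw):
    print("login rejected, check the Cookie header")
```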
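With that in place, we can create an opener that carries a cookie jar. The first request to the login URL stores the cookies set after login, and the same opener is then used to visit other sections of the site and view information that is only visible after logging in.
For example, after logging in at https://www.saikr.com/login I accessed the "My Competitions" section at https://www.saikr.com/u/5598522
The code is as follows: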
- import urllib.parse
- from urllib import request
- from http import cookiejar
- login_url = "https://www.saikr.com/login"
- postdata = {
-     "name": "your account", "pass": "your password (encrypted)"
- }
- header = {
-     "Accept": "application/json, text/javascript, */*; q=0.01",
-     "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
-     "Connection": "keep-alive",
-     "Host": "www.saikr.com",
-     "Referer": "https://www.saikr.com/login",
-     "Cookie": "your cookie",
-     "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
-     "TE": "Trailers", "X-Requested-With": "XMLHttpRequest"
- }
- # URL-encode the form fields and convert them to bytes for the POST body
- postdata = urllib.parse.urlencode(postdata).encode('utf8')
- # requests equivalent: requests.post(login_url, data=postdata, headers=header)
- # Create a CookieJar instance to hold the cookies
- cookie = cookiejar.CookieJar()
- # Wrap it in urllib.request's HTTPCookieProcessor, i.e. the cookie handler
- cookie_support = request.HTTPCookieProcessor(cookie)
- # Build an opener from the cookie handler
- opener = request.build_opener(cookie_support)
- # Create the Request objects
- my_url = "https://www.saikr.com/u/5598522"
- req1 = request.Request(url=login_url, data=postdata, headers=header)  # POST request
- req2 = request.Request(url=my_url)  # GET request; the opener supplies the saved cookie, so no Cookie header is needed
- response1 = opener.open(req1)
- response2 = opener.open(req2)
- print(response1.read().decode('utf8'))
- print(response2.read().decode('utf8'))
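To see the cookie hand-off without touching the real site, the same opener pattern can be tried against a throwaway local server. Everything about the server below is an assumption for illustration: the `/login` and `/me` paths, the `session=abc123` cookie, and the response texts are invented and do not describe how Saikr works.

```python
import threading
from http import cookiejar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class DemoHandler(BaseHTTPRequestHandler):
    """Toy server: a POST sets a cookie, a GET checks for it."""

    def do_POST(self):
        # Consume the request body, then hand out a session cookie.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def do_GET(self):
        # Answer "welcome" only when the session cookie comes back.
        logged_in = "session=abc123" in self.headers.get("Cookie", "")
        body = b"welcome" if logged_in else b"please log in"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

no_proxy = request.ProxyHandler({})  # bypass any system proxy for the loopback demo
jar = cookiejar.CookieJar()
opener = request.build_opener(no_proxy, request.HTTPCookieProcessor(jar))

# An opener without the cookie jar is turned away.
anonymous = request.build_opener(no_proxy).open(base + "/me").read()

# The POST stores the cookie in the jar; the following GET sends it back.
opener.open(request.Request(base + "/login", data=b"name=demo")).read()
member = opener.open(base + "/me").read()

server.shutdown()
server.server_close()
print(anonymous, member, [c.name for c in jar])
```

In the Saikr code above, `login_url` and `my_url` play the roles of `/login` and `/me` here.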
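That wraps up this part.
PS: there was one small hiccup. When the header
Accept-Encoding: gzip, deflate, br
is added to headers, the final print(response1.read().decode('utf8')) raises
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
Cause: 'Accept-Encoding': 'gzip, deflate' was set in the request headers, so the server may send a compressed body. urllib does not decompress it automatically, and the raw compressed bytes are not valid UTF-8.
Reference: https://www.cnblogs.com/chyu/p/4558782.html
Fix: remove Accept-Encoding and the response decodes normally.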
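An alternative to dropping the header is to decompress the body before decoding it. The sketch below handles gzip only and uses a made-up helper name, `decode_body`. It relies on the fact that gzip data starts with the bytes 0x1f 0x8b, which is consistent with the 0x8b at position 1 in the error above. The standard library has no decoder for `br` (Brotli), so this assumes the request advertises at most `gzip`.

```python
import gzip


def decode_body(raw: bytes, content_encoding: str = "") -> str:
    """Decode a response body, unzipping it first when it is gzip-compressed."""
    # Trust the Content-Encoding header if given, else sniff the gzip magic bytes.
    if content_encoding.lower() == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


# Local stand-in for a compressed response body.
compressed = gzip.compress("hello".encode("utf-8"))
print(decode_body(compressed))
print(decode_body(b"hello"))
```

With the opener from the first listing, the call would look like `decode_body(response1.read(), response1.headers.get("Content-Encoding", ""))`.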
II. Summary of common ways to simulate a login
1. Requests through the request module of the urllib library
- from urllib import request, parse
- from http import cookiejar
- # url, headers and data below are placeholders that the caller defines
- # GET request
- # ------------------------------------------------------
- # without headers
- response = request.urlopen(url)
- page_source = response.read().decode('utf-8')
- # with headers: urllib.request.urlopen() takes no headers argument, so build a urllib.request.Request object to set the request headers
- req = request.Request(url=url, headers=headers)
- response = request.urlopen(req)
- page_source = response.read().decode('utf-8')
- # POST request
- # -------------------------------------------------------
- postdata = parse.urlencode(data).encode('utf-8')  # the form data must be URL-encoded and converted to bytes
- req = request.Request(url=url, data=postdata, headers=headers)
- response = request.urlopen(req)
- page_source = response.read().decode('utf-8')
- # Use cookies to visit other sections
- # Create a CookieJar instance to hold the cookies
- cookie = cookiejar.CookieJar()
- # Wrap it in urllib.request's HTTPCookieProcessor, i.e. the cookie handler
- cookie_support = request.HTTPCookieProcessor(cookie)
- # Build an opener from the cookie handler
- opener = request.build_opener(cookie_support)
- # Optionally install the opener globally so that urlopen() uses it; otherwise call opener.open() directly
- #request.install_opener(opener)
- # Create the Request objects
- req1 = request.Request(url=url, data=postdata, headers=headers)  # login request (POST); url is the login URL here
- my_url = "https://www.saikr.com/u/5598522"
- req2 = request.Request(url=my_url)
- response1 = opener.open(req1)
- response2 = opener.open(req2)
- # or simply: response2 = opener.open(my_url)
- print(response1.read().decode('utf8'))
- print(response2.read().decode('utf8'))
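The GET and POST cases above differ only in whether `data` is passed to `Request`. A small sketch that shows this without sending anything over the network; the form field value is a placeholder:

```python
from urllib import parse, request

url = "https://www.saikr.com/login"
headers = {"Referer": "https://www.saikr.com/login"}

# No data argument: urllib treats the request as a GET.
get_req = request.Request(url=url, headers=headers)

# With a bytes data argument: the same class produces a POST.
form = parse.urlencode({"name": "your account"}).encode("utf-8")
post_req = request.Request(url=url, data=form, headers=headers)

print(get_req.get_method())
print(post_req.get_method())
print(post_req.get_header("Referer"))
```

Nothing is sent until the request is passed to `urlopen()` or `opener.open()`, so this is a cheap way to check a request before using it.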
2. Using the get and post functions of the requests library
- import requests
- import urllib.parse
- import json
- # GET request
- # -----------------------------------------------------------
- # method 1: build the query string by hand
- url = "https://www.saikr.com/"
- params = {'key1': 'value1', 'key2': 'value2'}
- real_url = url + "?" + urllib.parse.urlencode(params)
- # real_url == "https://www.saikr.com/?key1=value1&key2=value2"
- response = requests.get(real_url)
- # method 2: let requests build the query string
- response = requests.get(url, params=params)
- print(response.text)     # <class 'str'>
- print(response.content)  # <class 'bytes'>
- # POST request
- login_url = "https://www.saikr.com/login"
- postdata = {
-     "name": "1324802616@qq.com", "pass": "my password",
- }
- header = {
-     "Accept": "application/json, text/javascript, */*; q=0.01",
-     "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
-     "Connection": "keep-alive",
-     "Host": "www.saikr.com",
-     "Referer": "https://www.saikr.com/login",
-     "Cookie": "mycookie",
-     "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
-     "TE": "Trailers", "X-Requested-With": "XMLHttpRequest"
- }
- # data passed to requests.post() does not need to be URL-encoded by hand
- #login_postdata = urllib.parse.urlencode(postdata).encode('utf8')
- response = requests.post(url=login_url, data=postdata, headers=header)  # <class 'requests.models.Response'>
- # any of the following three ways reads the result
- json1 = response.json()  # <class 'dict'>
- json2 = json.loads(response.text)  # <class 'dict'>
- json_str = response.content.decode('utf-8')  # <class 'str'>
- # Use a session to keep the login state while visiting other sections
- # --------------------------------------------------------------------
- login_url = "https://www.saikr.com/login"
- postdata = {
-     "name": "1324802616@qq.com", "pass": "my password",
- }
- header = {
-     "Accept": "application/json, text/javascript, */*; q=0.01",
-     "Connection": "keep-alive",
-     "Referer": "https://www.saikr.com/login",
-     "Cookie": "mycookie",
- }
- session = requests.Session()
- response = session.post(url=login_url, data=postdata, headers=header)
- my_url = "https://www.saikr.com/u/5598522"
- response1 = session.get(url=my_url, headers=header)
- print(response1.text)  # use response1.json() only if the endpoint returns JSON
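Method 1 in the GET part of the listing above builds the query string by hand, where the `?` separator between the base URL and the encoded parameters is easy to forget. A standard-library sketch that builds the URL and parses it back, using the same placeholder parameters:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "https://www.saikr.com/"
params = {"key1": "value1", "key2": "value2"}

# The "?" marks where the path ends and the query string begins.
real_url = base_url + "?" + urlencode(params)
print(real_url)

# Parsing the URL back confirms that both parameters landed in the query part.
query = parse_qs(urlsplit(real_url).query)
print(query)
```

Passing `params=params` to `requests.get()`, as in method 2, leaves this step to the library.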