Web crawler - Why does a simulated Python login to appannie always return a 503 code
Problem description
# -*- encoding: utf-8 -*-
import requests, xlwt, sys
from bs4 import BeautifulSoup

reload(sys)
referer = 'https://www.appannie.com/account/login/?_ref=header'
user_agent = ('Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36')
sys.setdefaultencoding('utf-8')
header = {
    'User-Agent': user_agent,
    'Referer': referer,
    'Host': 'www.appannie.com',
    'Connection': 'keep-alive',
    'Accept': 'application/json, text/plain,*/*',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'X-NewRelic-ID': 'VwcPUFJXGwEBUlJSDgc=',
    'X-Requested-With': 'XMLHttpRequest',
}

def main():
    url = 'https://www.appannie.com/account/login/'
    # content = requests.get(url, headers=header).content
    # soup = BeautifulSoup(content, 'lxml')
    # key = soup.select()
    s = requests.Session()
    s.get(url, headers=header)
    key = s.cookies['csrftoken']
    data = {
        'csrfmiddlewaretoken': key,
        'next': '/dashboard/home/',
        'username': '1195615991@qq.com',
        'password': 'xxxxx'
    }
    req = s.post(url, data=data)
    if 2 != req.status_code / 100:
        raise Exception('Error while logging in, code: %d' % (req.status_code))
    cookies = req.cookies
    n = '2017-04-11'
    url_1 = 'https://www.appannie.com/apps/google-play/top-chart/?country=US&category=game&device=&date={}'.format(n)
    req_1 = s.get(url_1, headers=header, cookies=cookies).content
    # print req_1
    soup = BeautifulSoup(req_1, 'lxml')
    print soup
    # ids = soup.find_all('span')
    # for id in ids:
    #     name = id.get('title')
    #     print name

if __name__ == '__main__':
    main()
Answers
Answer 1: There are two key points:
1. the user-agent in the headers
2. the csrfmiddlewaretoken parameter
# coding: utf-8
import requests

url = 'https://www.appannie.com/account/login'
session = requests.Session()
# Set the user-agent on the session so every request (GET and POST) sends it
session.headers['user-agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'
# The first GET makes the server set the csrftoken cookie
session.get(url)
token = session.cookies.get('csrftoken')
data = {
    'csrfmiddlewaretoken': token,
    'next': '/dashboard/home/',
    'username': 'XXXX',
    'password': 'XXXX'
}
r = session.post(url, data)
print r.status_code
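One difference worth noting: the question's code passes `headers=header` to the GET requests but not to `s.post(...)`, so its login POST goes out with the default `python-requests` User-Agent, whereas the answer sets the header on the session itself. The sketch below wraps the answer's flow in reusable functions, written for Python 3. The names `build_login_payload` and `login` are hypothetical helpers introduced here for illustration, and `session` is assumed to be a `requests.Session` (or any object with the same `headers`, `cookies`, `get` and `post` members). It is a minimal sketch under those assumptions and has not been verified against the live site.

```python
LOGIN_URL = "https://www.appannie.com/account/login/"

# Browser User-Agent string copied from the answer above; any current
# desktop browser string is assumed to work equally well.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
)


def build_login_payload(csrf_token, username, password, next_path="/dashboard/home/"):
    """Build the login form fields; the field names come from the thread."""
    return {
        "csrfmiddlewaretoken": csrf_token,
        "next": next_path,
        "username": username,
        "password": password,
    }


def login(session, username, password, url=LOGIN_URL):
    """Log in with a requests.Session-like object and return the POST response."""
    # Set the header on the session so the GET and the POST both carry it.
    session.headers["User-Agent"] = USER_AGENT
    # The first GET lets the server set the csrftoken cookie.
    session.get(url)
    token = session.cookies.get("csrftoken")
    if token is None:
        raise RuntimeError("no csrftoken cookie was set by %s" % url)
    response = session.post(url, data=build_login_payload(token, username, password))
    # With requests, this raises requests.HTTPError on 4xx/5xx such as 503.
    response.raise_for_status()
    return response
```

With the `requests` package installed, calling `login(requests.Session(), 'XXXX', 'XXXX')` returns the login response, and the same session can then be reused for later pages such as the top-chart URL in the question, since the session keeps the cookies without passing `cookies=` by hand.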
