Scraping audio files from a stock-asset website with Python
Basic environment setup
Python 3.6, PyCharm, requests, parsel. The modules can be installed with pip.
Target page
Requesting the page
import requests

url = 'https://www.tukuppt.com/peiyue/zonghe_0_0_0_0_0_0_1.html'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
}
response = requests.get(url=url, headers=headers)
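The request above fetches a single listing page. If more than one page is needed, the URLs could be generated up front. The sketch below assumes that the trailing `1` in the URL is the page index, which the article does not confirm; `BASE` and `page_urls` are hypothetical names introduced for illustration.

```python
# Assumption: the last number in the listing URL is the page index.
BASE = 'https://www.tukuppt.com/peiyue/zonghe_0_0_0_0_0_0_{page}.html'


def page_urls(first, last):
    """Return the listing URLs for pages first..last (inclusive)."""
    return [BASE.format(page=n) for n in range(first, last + 1)]


urls_to_fetch = page_urls(1, 3)
```

Each URL in `urls_to_fetch` could then be passed to `requests.get` with the same `headers` as above, ideally with a short pause between requests.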
Parsing the page and extracting the data
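import parsel

selector = parsel.Selector(response.text)
# Note: '#audio850995' is the id of one specific audio element, so this
# selector yields at most one URL; a broader selector is needed to match every track.
urls = selector.css('#audio850995 source::attr(src)').getall()
titles = selector.css('.b-box .info .title::text').getall()
data = zip(urls, titles)
for i in data:
    # The src attribute is protocol-relative, so prepend the scheme.
    mp3_url = 'https:' + i[0]
    title = i[1]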
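The loop above builds each absolute URL by prepending `'https:'`, which only works when every `src` value starts with `//`. A slightly more defensive variant, sketched here with the standard library's `urljoin`, resolves protocol-relative and site-relative values alike and trims whitespace from the titles. `pair_tracks` and the sample values are hypothetical and are not taken from the real page.

```python
from urllib.parse import urljoin

PAGE_URL = 'https://www.tukuppt.com/peiyue/zonghe_0_0_0_0_0_0_1.html'


def pair_tracks(srcs, titles, base=PAGE_URL):
    """Pair each src with its title, resolving src against the page URL."""
    tracks = []
    for src, title in zip(srcs, titles):
        # urljoin handles '//host/x.mp3', '/x.mp3' and full URLs.
        tracks.append((urljoin(base, src.strip()), title.strip()))
    return tracks


# Made-up sample values, for illustration only.
sample = pair_tracks(
    ['//img.example.com/a.mp3', '/audio/b.mp3'],
    [' Morning ', 'Rain\n'],
)
```

Because `zip` stops at the shorter list, a mismatch between the number of URLs and titles silently drops entries, so comparing `len(srcs)` and `len(titles)` before pairing is worth considering.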
Saving the data
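def download(url, title):
    # Fetch the audio file itself; response.content holds the raw bytes.
    response = requests.get(url=url, headers=headers)
    # The path separators appear to have been lost from this string;
    # replace it with a real directory on your machine.
    path = 'D:pythondemo熊貓辦公素材背景音樂' + title + '.mp3'
    with open(path, mode='wb') as f:
        f.write(response.content)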
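Titles scraped from a page can contain characters that are not allowed in Windows file names, and the target directory may not exist yet. The following sketch shows one way to handle both using only the standard library. `build_mp3_path` and `save_bytes` are hypothetical helpers, not part of the original script, and the directory name is a placeholder.

```python
import re
from pathlib import Path


def build_mp3_path(directory, title):
    """Return directory/<sanitized title>.mp3 as a Path."""
    # Replace characters that Windows forbids in file names.
    safe = re.sub(r'[\\/:*?"<>|]', '_', title).strip() or 'untitled'
    return Path(directory) / (safe + '.mp3')


def save_bytes(path, content):
    """Write bytes to path, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)


target = build_mp3_path('music', 'a/b:c?')
```

Inside `download`, `save_bytes(build_mp3_path(some_dir, title), response.content)` could replace the manual string concatenation and `open` call, where `some_dir` is whatever directory you choose.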
That is all for this article. I hope it helps with your studies, and thank you for supporting 好吧啦网.
