Setting headers in Requests-html

Published: 2023-09-06 02:32 | Editor: 熊小新 | Keywords: none

Requires Requests-html to be installed and Python 3.6 or later.

# -*- coding: utf-8 -*-

from requests_html import HTMLSession


def get_web_page_elements(url, headers=None, xpath_expression=''):
    '''Fetch a page and return the elements matching an XPath expression.'''
    session = HTMLSession()
    # Fall back to an empty dict instead of using a mutable default argument.
    response = session.get(url, headers=headers or {})
    elements_list = response.html.xpath(xpath_expression)
    return elements_list


if __name__ == '__main__':
    url = 'https://www.liaoxuefeng.com/wiki/0014316089557264a6b348958f449949df42a6d3a2e542c000'
    # Header values
    referer = url
    # Only the cookie value goes here; the header name is the dict key below.
    cookie = 'atsp=1548864427226_1548863599220; Hm_lvt_2efddd14a5f2b304677462d06fb4f964=1548863599; Hm_lpvt_2efddd14a5f2b304677462d06fb4f964=1548863599'
    user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36'
    headers = {
        'Referer': referer,
        'Cookie': cookie,
        'User-Agent': user_agent
    }
    # Get the table of contents
    index_xpath_expression = "//a[@class='x-wiki-index-item']"
    index_data = get_web_page_elements(url, headers=headers, xpath_expression=index_xpath_expression)
    for each_index in index_data:
        # each_index.url is the URL of the page the element came from;
        # the link target itself (possibly relative) is in the href attribute.
        print(each_index.text + '\t\t' + each_index.attrs.get('href', ''))
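The listing above passes headers on a single request. As a rough sketch of an alternative, the headers can be built once and attached to the session so that every later request sends them. The helper names `normalize_cookie`, `build_headers` and `make_session` are hypothetical and are not part of Requests-html. The sketch assumes only that `HTMLSession` inherits the `headers` attribute from `requests.Session`. It also strips a stray `Cookie:` prefix, which is easy to copy along with the value from the browser's developer tools.

```python
def normalize_cookie(raw_cookie):
    """Return the cookie string without a leading 'Cookie:' header name."""
    value = raw_cookie.strip()
    if value.lower().startswith('cookie:'):
        value = value[len('cookie:'):].strip()
    return value


def build_headers(referer=None, cookie=None, user_agent=None):
    """Build a headers dict, skipping any field that was not provided."""
    headers = {}
    if referer:
        headers['Referer'] = referer
    if cookie:
        headers['Cookie'] = normalize_cookie(cookie)
    if user_agent:
        headers['User-Agent'] = user_agent
    return headers


def make_session(headers):
    """Create an HTMLSession that sends `headers` with every request."""
    # Imported lazily so the helpers above work without requests_html installed.
    from requests_html import HTMLSession
    session = HTMLSession()
    # HTMLSession extends requests.Session, so session.headers is available.
    session.headers.update(headers)
    return session
```

With these helpers, `session = make_session(build_headers(referer=url, cookie=cookie, user_agent=user_agent))` followed by `session.get(url)` would send the same three headers without passing `headers=` on each call.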

Original article: https://www.cnblogs.com/mcgill0217/p/10340310.html
