Fixing a 503 Error When Scraping Gaokao Scores

October 31 | Python, Web Scraping | Comments closed | Accept-Encoding, headers, requests

Key point: a 503 error is most often caused by incorrect headers; adding an Accept-Encoding header in addition to User-Agent is usually enough to fix it. import requests url="https://api.eol.cn/gkcx/api/?access_token=&page=1&province_id=33&s...
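
The header fix described above can be sketched roughly as follows. This is a minimal sketch, not the post's own code: the `User-Agent` and `Accept-Encoding` values and the helper name `build_request` are assumptions, and the URL is a placeholder because the original one is truncated. The request is only prepared, not sent, so the headers can be inspected without network access.

```python
import requests

# Assumed header values for illustration; the original post does not list them.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept-Encoding": "gzip, deflate",
}


def build_request(url):
    """Prepare a GET request that carries both headers, without sending it."""
    return requests.Request("GET", url, headers=HEADERS).prepare()


# Placeholder URL (hypothetical); substitute the real API endpoint.
prepared = build_request("https://example.com/api")
print(prepared.headers["User-Agent"])
print(prepared.headers["Accept-Encoding"])
```

To actually send the request, the same dictionary can be passed directly, for example `requests.get(url, headers=HEADERS, timeout=10)`.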

Web Scraping 6: Driving a Browser with selenium to Scrape Data

October 22 | Python, Web Scraping | No comments | selenium

# Each of the following methods can extract the text '你好，蜘蛛侠！' from the page. find_element_by_tag_name: select by the element's tag name # e.g. <h1>你好，蜘蛛侠！</h1> # can be selected with find_element_by_tag_name('h1') find_element_b...

Using the json Module

October 22 | Python, Web Scraping | No comments | json.dumps, json.loads

import requests, json url = 'http://ictclas.nlpir.org/nlpir/index6/getWord2Vec.do' headers={'origin':'http://ictclas.nlpir.org', 'referer':'http://ictclas.nlpir.org/nlpir/', 'user-agent':'...
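
Since the excerpt is cut off before `json.dumps` and `json.loads` appear, here is a minimal sketch of the round trip those two functions perform. The sample dictionary is hypothetical and is not taken from the original post.

```python
import json

# Hypothetical sample data, not taken from the original post.
record = {"word": "蜘蛛侠", "related": ["钢铁侠", "雷神"]}

# json.dumps: Python object -> JSON string.
# ensure_ascii=False keeps the Chinese characters readable in the output.
text = json.dumps(record, ensure_ascii=False)

# json.loads: JSON string -> Python object.
restored = json.loads(text)

print(type(text).__name__)
print(restored == record)
```

The same `json.loads` call applies to a response body obtained with requests, although `res.json()` is usually the shorter route.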

Chatting with a Robot via a Scraper

October 22 | Python, Web Scraping | No comments

Basic chat version: import requests, json url = 'http://openapi.tuling123.com/openapi/api/v2' robotName = 'Robot' apiKey = '61f2cfc8b00b41c4a2605b11671367a2' while True: text=(input('发言：')) d...

Web Scraping 5: requests.post Requests with cookies

October 22 | Python, Web Scraping | No comments | cookies, post, requests

import requests # Import requests. url = 'https://wordpress-edu-3autumn.localprod.oc.forchange.cn/wp-login.php' # Assign the login URL to url. headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 10....

Completing the Shanbay Vocabulary Test with a Scraper

October 21 | Python, Web Scraping | No comments | open(), requests

import requests # First, request the link with requests link = requests.get('https://www.shanbay.com/api/v1/vocabtest/category/') # Parse the response as JSON js_link = link.json() # Let the user choose which word list to test by entering the numeric ...

Web Scraping 4: Creating and Reading Excel and CSV Files

October 21 | Python, Web Scraping | No comments | csv, excel

Creating and reading Excel files: import openpyxl # Code for writing: wb = openpyxl.Workbook() sheet = wb.active sheet.title = 'new title' sheet['A1'] = '漫威宇宙' rows = [['美国队长','钢铁侠','蜘蛛侠','雷神'],['...
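
The excerpt only reaches the Excel half of the post, so the CSV half might look something like the sketch below, using the standard-library `csv` module. Only the first data row comes from the excerpt (the second is truncated there); the file name, the temporary directory, and the header row are assumptions made for illustration.

```python
import csv
import os
import tempfile

# First row taken from the excerpt; the second row is truncated there and omitted.
rows = [["美国队长", "钢铁侠", "蜘蛛侠", "雷神"]]

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Write: newline="" prevents blank lines between rows on Windows.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["A", "B", "C", "D"])  # hypothetical header row
    writer.writerows(rows)

# Read the file back as a list of rows (each row is a list of strings).
with open(path, newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))

print(read_back)
```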

Web Scraping 3: Using params and headers (Scraping with Query Parameters and Request Headers)

October 21 | Python, Web Scraping | No comments | header, params

Key points: params = {'page':'2'} res = requests.get('https://lvnvl.cn',params = params) import requests # Import the requests module url = 'https://c.y.qq.com/base/fcgi-bin/fcg_global_comment_h5.fc...
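
To see what the `params` argument does to the URL, the request from the key point above can be prepared without being sent. This is a small sketch rather than the post's own code; the variable name `prepared` is introduced here for illustration.

```python
import requests

# Same params dictionary and URL as in the key point above.
params = {"page": "2"}

# Prepare the request without sending it, then inspect the final URL
# that requests builds from the base URL plus the query parameters.
prepared = requests.Request("GET", "https://lvnvl.cn", params=params).prepare()

print(prepared.url)
```

The printed URL ends with `?page=2`, which is the query string requests appends on the caller's behalf.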

Web Scraping 2: Using BeautifulSoup

October 20 | Python, Web Scraping | No comments | BeautifulSoup

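Key points: Step 1, parse the data; Step 2, iterate over the tags with find_all; Step 3, extract the content with .text, ['src'], or ['href']. # Import the requests library import requests # Import the BeautifulSoup library from bs4 import BeautifulSoup # Fetch the data res...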
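
The three steps above can be sketched on an inline HTML string, so that no network request is needed. The HTML snippet and the example.com URLs are hypothetical stand-ins for `res.text`; the built-in `html.parser` is an assumed parser choice, since the excerpt is cut off before the parsing call.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the res.text of a real response.
html = """
<div>
  <a href="https://example.com/a">First link</a>
  <a href="https://example.com/b">Second link</a>
  <img src="https://example.com/pic.png">
</div>
"""

# Step 1: parse the data.
soup = BeautifulSoup(html, "html.parser")

# Step 2: iterate over the tags returned by find_all.
# Step 3: extract content with .text, ['href'], and ['src'].
links = [(a.text, a["href"]) for a in soup.find_all("a")]
images = [img["src"] for img in soup.find_all("img")]

print(links)
print(images)
```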

Web Scraping 1: Using the requests Module

October 20 | Python, Web Scraping | No comments | encoding, requests

1. Fetching text content with a scraper: import requests res=requests.get('https://lvnvl.cn') #print(res.status_code) res.encoding = 'utf-8' novle = res.text print(novle[:]) 2. Fetching image content (binary) with a scraper: i...