分享web开发知识

注册/登录|最近发布|今日推荐

主页 IT知识网页技术软件开发前端开发代码编程运营维护技术分享教程案例
当前位置:首页 > 运营维护

au3抓取糗事百科网站

发布时间:2023-09-06 01:12责任编辑:郭大石关键词:暂无标签

au3抓取糗事百科网站

网址:‘http://www.qiushibaike.com/8hr/page/‘ & $pagenum & ‘?s=4512150‘

#include <IE.au3>#include <File.au3>#include <String.au3>#include <Array.au3>#include <Debug.au3>#include <Date.au3>;code try to collect Qiushibaike stories in qiushibaike.comLocal $strUrl1 = "http://www.qiushibaike.com/8hr/page/2?s=4512150"Local $filename1 = "qiushibaike"$filename1 = $filename1 & ‘_‘ & @MON$filename1 = $filename1 ?& @MDAY$filename1 = $filename1 & ‘.txt‘Local $filesave = @TempDir & "\qb.html"Local $pageindexLocal $startindex = 2Local $endindex = 10Local $sHTMLLocal $storycount = 0_FileCreate($filename1)Local $file = FileOpen($filename1, 1)If $file = -1 Then ???MsgBox(0, "Error", "Unable to open file.") ???Exit EndIfFor $pageindex = $startindex To $endindex Step 1 ??$strUrl1 = MakeUpUrl($pageindex) ??Local $hDownload = InetGet($strUrl1, $filesave, 1, 1) ??Do ??????Sleep(250) ??Until InetGetInfo($hDownload, 2) ??Local $nBytes = InetGetInfo($hDownload, 0) ??InetClose($hDownload) ?ConsoleWrite ($pageindex & ‘/‘ & ?$endindex &" --- down bytes = " &$nBytes & @LF) ?$fsize = $nBytes ?;ConsoleWrite($pageindex & ‘- filesize = ‘& $fsize & @LF) ?$ftemp = FileOpen($filesave, 0) ?$getsize= ???FileGetSize ($filesave) ?$sHTML = FileRead($ftemp, $getsize) ?FileClose($ftemp) ?FileDelete($filesave) ?Local $aArray = StringRegExp($sHTML, ‘(?<=<span>)\n+[^/]+\n+(?=</span>)‘, 3) ?ConsoleWrite(" ???array size = " & UBound($aArray) & @CRLF) ?For $i = 0 To (UBound($aArray) - 1) Step 1 ????Local $item = $aArray[$i] ????If StringLen($item) > 0 Then ???$strnum = $storycount +1 ???$strnum = $strnum & "." &@CRLF ???FileWrite($file, $strnum) ???$storycontent = StringReplace($item, @LF, ‘‘) ????$storycontent = $storycontent & @CRLF ????FileWrite($file, $storycontent) ????$storycount = $storycount + 1 ????EndIf ????Next ??NextFileClose($file)MsgBox(0, "QSBK", "Complete, story count = "&$storycount & ‘, story=‘ & $filename1)ExitFunc MakeUpUrl($pagenum) ??$strUrl = ‘http://www.qiushibaike.com/8hr/page/‘ & $pagenum & ‘?s=4512150‘ ??return $strUrl ??EndFunc

au3抓取糗事百科网站

原文地址:http://www.cnblogs.com/greershk/p/7541363.html

知识推荐

我的编程学习网——分享web前端后端开发技术知识。 垃圾信息处理邮箱 tousu563@163.com 网站地图
icp备案号 闽ICP备2023006418号-8 不良信息举报平台 互联网安全管理备案 Copyright 2023 www.wodecom.cn All Rights Reserved