分享web开发知识

注册/登录|最近发布|今日推荐

主页 IT知识网页技术软件开发前端开发代码编程运营维护技术分享教程案例
当前位置:首页 > 运营维护

爬网页获取电话号码

发布时间:2023-09-06 02:08责任编辑:蔡小小关键词:暂无标签

明天补充

 1 import java.io.BufferedReader; 2 import java.io.IOException; 3 import java.io.InputStream; 4 import java.io.InputStreamReader; 5 import java.net.URL; 6 import java.net.URLConnection; 7 import java.util.regex.Matcher; 8 import java.util.regex.Pattern; 9 10 public class URLTest {11 12 ????private URLConnection connection = null;13 ????14 ????public String getDoucument(String url) throws IOException {15 ????????16 ????????URL newUrl = new URL(url);17 ????????connection = newUrl.openConnection();18 ????????InputStream is = connection.getInputStream();19 ????????InputStreamReader isr = new InputStreamReader(is);20 ????????BufferedReader br = new BufferedReader(isr);21 ????????22 ????????String line = "";23 ????????StringBuffer sb = new StringBuffer();24 ????????25 ????????while( (line = ?br.readLine()) != null) {26 ????????????sb.append(line +"\n");27 ????????}28 ????????29 ????????return sb.toString();30 ????????31 ????}32 ????33 ????public String MyFilter( String string ) {//内容过滤器-获取网页上的电话,没有去重34 ????????35 ????????String regex = "18[\\d]{9}";36 ????????Pattern pattern = Pattern.compile(regex);37 ????????Matcher matcher = pattern.matcher(string);38 ????????39 ????????String result = "";40 ????????while(matcher.find()) {41 ????????????result += matcher.group()+"\n";42 ????????}43 ????????44 ????????return result;45 ????}46 ????47 ????public static void main(String[] args) throws IOException {48 ????????49 ????????URLTest test = new URLTest();50 ????????String page = test.getDoucument("https://power.baidu.com/question/1511959129473915060.html?qbl=relate_question_1");51 ????????String result = test.MyFilter(page);52 // ???????System.out.println(page);53 ????????54 ????????if ( result != null ) {55 ????????????System.out.println("从该网页上找到的号码:\n"+result);56 ????????}57 ????????else {58 ????????????System.out.println("该网页上没有电话号码");59 ????????}60 ????????61 ????}62 }

程序效果

爬网页获取电话号码

原文地址:https://www.cnblogs.com/ynhwl/p/9434230.html

知识推荐

我的编程学习网——分享web前端后端开发技术知识。 垃圾信息处理邮箱 tousu563@163.com 网站地图
icp备案号 闽ICP备2023006418号-8 不良信息举报平台 互联网安全管理备案 Copyright 2023 www.wodecom.cn All Rights Reserved