Scraping Specific Web Page Content with Java

Scraping text from a web page:

import org.jsoup.Jsoup;

import java.io.IOException;

public class Crawling {

    // Fetch the page and print the text of every element with class "list-item".
    public static void printListItemText() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item") // alternatively, class="list-item-title"
                .forEach(e -> System.out.println(e.text()));
    }

    public static void main(String[] args) {
        try {
            printListItemText();
        } catch (IOException e) {
            // Network or HTTP failure while fetching the page
            e.printStackTrace();
        }
    }

}

Scraping image URLs from a web page:

import org.jsoup.Jsoup;

import java.io.IOException;

public class Crawling {

    // Fetch the page and print the src attribute of every element with class "list-item-img".
    public static void printImageUrls() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item-img")
                // The src attribute holds the image URL; e.absUrl("src") returns it in absolute form
                .forEach(e -> System.out.println(e.attr("src")));
    }

    public static void main(String[] args) {
        try {
            printImageUrls();
        } catch (IOException e) {
            // Network or HTTP failure while fetching the page
            e.printStackTrace();
        }
    }

}
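
The src values printed above are whatever the page's HTML contains, so they may be relative ("/img/a.png") or protocol-relative ("//cdn.example.com/c.png") rather than full URLs. The following is a minimal sketch, using only the JDK, of resolving such values against the page URL. The class name ImageUrlResolver, the resolve helper, and the sample src values are all hypothetical and chosen for illustration; they are not taken from the page being scraped.

```java
import java.net.URI;
import java.util.List;

public class ImageUrlResolver {

    // Hypothetical helper (not part of jsoup): resolve a raw src value
    // against the URL of the page it was found on.
    static String resolve(String pageUrl, String src) {
        // URI.create throws IllegalArgumentException if either string is not a valid URI
        return URI.create(pageUrl).resolve(src.trim()).toString();
    }

    public static void main(String[] args) {
        String pageUrl = "https://soccer.hupu.com/";

        // Made-up src values, for illustration only
        List<String> srcValues = List.of(
                "/img/a.png",                      // root-relative
                "img/b.png",                       // relative to the page path
                "//cdn.example.com/c.png",         // protocol-relative
                "https://cdn.example.com/d.png");  // already absolute

        for (String src : srcValues) {
            System.out.println(resolve(pageUrl, src));
        }
    }
}
```

When jsoup is already on the classpath, e.absUrl("src") does the same resolution against the document's base URI, so a helper like this is mainly useful when the src strings have been stored or passed around separately from the parsed document.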

Source: https://www.cnblogs.com/subtlman/p/15958233.html