- Scraping text from a web page:
import org.jsoup.Jsoup;

import java.io.IOException;

public class Crawling {
    // Fetch the page and print the text of every element with class "list-item".
    public static void printListItemText() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item") // see also class="list-item-title"
                .forEach(e -> System.out.println(e.text()));
    }

    public static void main(String[] args) {
        try {
            printListItemText();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

- Scraping image URLs from a web page:
import org.jsoup.Jsoup;

import java.io.IOException;

public class Crawling {
    // Fetch the page and print the src attribute of every element with class "list-item-img".
    public static void printImageUrls() throws IOException {
        Jsoup.connect("https://soccer.hupu.com/").get().body()
                .getElementsByClass("list-item-img")
                .forEach(e -> System.out.println(e.attr("src"))); // src attribute holds the image URL
    }

    public static void main(String[] args) {
        try {
            printImageUrls();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
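
The src values printed by the image snippet are whatever the page's markup contains, so they may be relative ("/img/a.jpg") or protocol-relative ("//host/a.jpg") rather than full URLs. The following is a minimal sketch of turning such values into absolute URLs using only java.net.URI. The class name ImageUrls, the method toAbsolute, and the example.com hosts are hypothetical and only illustrate the idea. Jsoup itself offers absUrl("src") for the same purpose.

```java
import java.net.URI;

class ImageUrls {
    // Resolve a possibly relative src value against the URL of the page it came from.
    // Returns an empty string when src is missing or blank.
    // Note: URI.create throws IllegalArgumentException for values with illegal
    // characters (for example, unencoded spaces).
    static String toAbsolute(String pageUrl, String src) {
        if (src == null || src.trim().isEmpty()) {
            return "";
        }
        return URI.create(pageUrl).resolve(src.trim()).toString();
    }

    public static void main(String[] args) {
        String page = "https://soccer.hupu.com/";
        // Protocol-relative: inherits the scheme of the page.
        System.out.println(toAbsolute(page, "//i1.example.com/a.jpg"));       // https://i1.example.com/a.jpg
        // Root-relative: inherits scheme and host of the page.
        System.out.println(toAbsolute(page, "/img/a.jpg"));                   // https://soccer.hupu.com/img/a.jpg
        // Already absolute: returned unchanged.
        System.out.println(toAbsolute(page, "https://cdn.example.com/x.png")); // https://cdn.example.com/x.png
    }
}
```

In the image snippet, the call e.attr("src") could be wrapped as toAbsolute("https://soccer.hupu.com/", e.attr("src")) so that every printed address can be opened directly.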
Source: https://www.cnblogs.com/subtlman/p/15958233.html