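As stated in the title.
Java can read and parse HTML with tools such as jsoup and htmlparser. Details follow:
1. jsoup is an HTML parser for Java that can parse a URL or a string of HTML directly. It provides a convenient API for extracting and manipulating data through DOM traversal, CSS selectors, and jQuery-like methods. It is released under the MIT license.
The main features of jsoup are:
Parse HTML from a URL, a file, or a string;
Find and extract data using DOM traversal or CSS selectors;
Manipulate HTML elements, attributes, and text.
Example code: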
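// Needs java.io.File, org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element, org.jsoup.select.Elements
// input is the HTML file to parse; this path is only a placeholder
File input = new File("input.html");
// Parse the file as UTF-8; the third argument is the base URI used to resolve relative links
Document doc = Jsoup.parse(input, "UTF-8", "http://www.dangdang.com");
Element content = doc.getElementById("content"); // null if the page has no element with id="content"
Elements links = content.getElementsByTag("a");  // every <a> element inside that element
for (Element link : links) {
    String linkHref = link.attr("href"); // value of the href attribute
    String linkText = link.text();       // visible text of the link
}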
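The third argument to Jsoup.parse in the example above is a base URI, which jsoup uses when a caller asks for an absolute link (for instance through link.absUrl("href")). As a rough illustration of what that resolution means, here is a minimal sketch that uses only the JDK's java.net.URI rather than jsoup itself. The class name BaseUriDemo, the page URL, and the href values are made up for the example.

```java
import java.net.URI;

final class BaseUriDemo {
    // Resolve an href found in a page against the URI the page was loaded from.
    static String resolve(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Hypothetical page URL; any absolute URL with a path works the same way.
        String base = "http://www.dangdang.com/list/index.html";

        // A relative href is resolved against the directory of the base page.
        System.out.println(resolve(base, "detail.html"));          // http://www.dangdang.com/list/detail.html
        // An href starting with "/" is resolved against the site root.
        System.out.println(resolve(base, "/cart"));                // http://www.dangdang.com/cart
        // An absolute href is returned unchanged.
        System.out.println(resolve(base, "http://example.com/x")); // http://example.com/x
    }
}
```

Without a base URI, a relative href such as "detail.html" cannot be turned into something that can be fetched, which is presumably why the example passes the site address.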
2. htmlparser is a pure-Java HTML parsing library with no dependencies on other Java libraries, used mainly to transform or extract HTML. It is designed to parse HTML quickly and robustly. At the time of this answer, the latest htmlparser version was 2.0. It has been described as one of the better tools for parsing and analyzing HTML, whether you want to scrape web page data or rewrite HTML content.
Online documentation: http://www.osctools.net/apidocs/apidoc?api=HTMLParser; http://htmlparser.sourceforge.net/project-info.html
Example code:
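// Needs org.htmlparser.Parser, org.htmlparser.Node, org.htmlparser.util.NodeList; these calls can throw org.htmlparser.util.ParserException
Parser parser = new Parser("http://www.dangdang.com"); // open and parse the page at this URL
NodeList list = parser.parse(null);                     // null filter: return all top-level nodes
Node node = list.elementAt(0);                          // first top-level node
NodeList sublist = node.getChildren();                  // may be null if the node has no children
System.out.println(sublist == null ? 0 : sublist.size());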
Answer 1 (recommended on 2018-02-28)
As follows:
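// Requires Apache HttpClient 4.x (org.apache.http.*) plus java.io.IOException, java.nio.charset.StandardCharsets, java.util.List.
// DefaultHttpClient is deprecated since HttpClient 4.3; newer code would use HttpClients.createDefault().

// Send a form POST and return the response body as a string.
public static String do_post(String url, List<NameValuePair> name_value_pair) throws IOException {
    String body = "{}"; // returned unchanged if the response has no entity
    DefaultHttpClient httpclient = new DefaultHttpClient();
    try {
        HttpPost httpost = new HttpPost(url);
        // Send the name/value pairs as a URL-encoded form body in UTF-8
        httpost.setEntity(new UrlEncodedFormEntity(name_value_pair, StandardCharsets.UTF_8));
        HttpResponse response = httpclient.execute(httpost);
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            // UTF-8 is only the fallback when the response declares no charset
            body = EntityUtils.toString(entity, StandardCharsets.UTF_8);
        }
    } finally {
        httpclient.getConnectionManager().shutdown(); // release the connection
    }
    return body;
}

// Send a GET request and return the response body (for example, a page's HTML) as a string.
public static String do_get(String url) throws IOException {
    String body = "{}"; // returned unchanged if the response has no entity
    DefaultHttpClient httpclient = new DefaultHttpClient();
    try {
        HttpGet httpget = new HttpGet(url);
        HttpResponse response = httpclient.execute(httpget);
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            // UTF-8 is only the fallback when the response declares no charset
            body = EntityUtils.toString(entity, StandardCharsets.UTF_8);
        }
    } finally {
        httpclient.getConnectionManager().shutdown(); // release the connection
    }
    return body;
}
This answer was accepted by the asker and other users.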
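To make it clearer what the POST helper above puts on the wire, here is a minimal sketch of form encoding using only the JDK (Java 10 or later). It does not use Apache HttpClient, so it illustrates the general application/x-www-form-urlencoded format rather than reproducing UrlEncodedFormEntity byte for byte. The class name FormBodyDemo and the field names are invented for the example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class FormBodyDemo {
    // Join name=value pairs with '&', percent-encoding each name and value as UTF-8.
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("keyword", "java html"); // hypothetical form fields
        fields.put("page", "1");
        System.out.println(encodeForm(fields)); // keyword=java+html&page=1
    }
}
```

The server receives this string as the request body with a form content type, which is why the helper takes a list of name/value pairs rather than a raw string.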
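Answer 2 (2018-07-31)
You can open the HTML file in Notepad to view it, or view it in a browser; either works.
Taking viewing the HTML source in a browser as an example:
1. Right-click on the page whose HTML source you want to view and choose "View source".
2. After clicking, the page's source code is displayed.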
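Since the question is about Java, the programmatic counterpart of opening an HTML file in Notepad is to read it as plain text. The sketch below assumes Java 11 or later and a UTF-8 encoded file. The class name ReadHtmlFile and the temporary file it creates are only there to keep the demo self-contained.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReadHtmlFile {
    // Return the raw source of an HTML file, exactly as a text editor would show it.
    static String readSource(Path htmlFile) throws IOException {
        return Files.readString(htmlFile, StandardCharsets.UTF_8);
    }

    // Write a small page to a temporary file, read it back, and clean up.
    static String demo() throws IOException {
        Path tmp = Files.createTempFile("page", ".html");
        try {
            Files.writeString(tmp, "<html><body><p>hi</p></body></html>", StandardCharsets.UTF_8);
            return readSource(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // <html><body><p>hi</p></body></html>
    }
}
```

This only returns the markup as a string; extracting anything from it still needs a parser such as the ones described in the question.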
Answer 3 (2014-11-01)
HttpClient can send a network request and retrieve the HTML file.
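The answer most likely refers to Apache HttpClient. As an alternative that needs no extra dependency, the sketch below uses the JDK's own java.net.http.HttpClient (Java 11 or later) to fetch a page's HTML. To stay runnable offline it starts a throwaway local server with com.sun.net.httpserver.HttpServer. The class name JdkHttpGet, the helper names, and the served page are all invented for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class JdkHttpGet {
    // Send one GET request and return the response body decoded as UTF-8.
    static String doGet(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        return response.body();
    }

    // Serve a tiny page on a free local port, fetch it with doGet, then stop the server.
    static String fetchFromLocalDemoServer() throws IOException, InterruptedException {
        byte[] page = "<html><body><h1>hello</h1></body></html>".getBytes(StandardCharsets.UTF_8);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().add("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, page.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(page);
            }
        });
        server.start();
        try {
            return doGet("http://127.0.0.1:" + server.getAddress().getPort() + "/");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchFromLocalDemoServer());
    }
}
```

For a real site, calling doGet with the page's URL is enough; the returned string can then be handed to an HTML parser.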