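Preface
I recently ran into a problem at work: a back-end scheduled task has to determine in Java, every day, whether the day is a statutory holiday, a weekend day off, or a working day.
Working out China's statutory holiday schedule by logic alone is basically impossible, because the arrangement can differ from year to year and is set by official decision;
so other means are needed. The reasonably dependable options I could think of are: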
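- Web APIs: some data providers offer one, but they either charge a fee or limit the number of calls, among other problems; the results are not ideal and you have little control. I have not tried them myself. Examples: https://www.juhe.cn/docs/api/id/177/aid/601 or http://apistore.baidu.com/apiworks/servicedetail/1116.html
- Parsing a web page online to obtain the holiday schedule: this depends heavily on the page being parsed, so pick a reasonably reliable site;
- Entering the officially announced holiday schedule into the system every year: if the customer does not mind the chore, this is quite dependable;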
This demo implements the second option;
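For comparison, the third option (a manually maintained table) could look roughly like the sketch below. The class name ManualHolidayTable and both sets are assumptions made for illustration and are not part of the demo; the dates in main are sample entries only, and real entries would come from the schedule announced each year.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: classifies a date from a hand-maintained table. */
final class ManualHolidayTable {
    private final Set<LocalDate> holidays;       // days off entered for the year
    private final Set<LocalDate> makeupWorkdays; // weekend days worked in exchange

    ManualHolidayTable(Set<LocalDate> holidays, Set<LocalDate> makeupWorkdays) {
        this.holidays = new HashSet<>(holidays);
        this.makeupWorkdays = new HashSet<>(makeupWorkdays);
    }

    /** A listed holiday is never a workday; a listed make-up day always is. */
    boolean isWorkday(LocalDate date) {
        if (holidays.contains(date)) {
            return false;
        }
        if (makeupWorkdays.contains(date)) {
            return true;
        }
        DayOfWeek day = date.getDayOfWeek();
        return day != DayOfWeek.SATURDAY && day != DayOfWeek.SUNDAY;
    }

    public static void main(String[] args) {
        // Sample entries for illustration only, not an official schedule.
        Set<LocalDate> holidays = new HashSet<>(Arrays.asList(LocalDate.of(2024, 1, 1)));
        Set<LocalDate> makeup = new HashSet<>(Arrays.asList(LocalDate.of(2024, 1, 6)));
        ManualHolidayTable table = new ManualHolidayTable(holidays, makeup);
        System.out.println("2024-01-01 workday? " + table.isWorkday(LocalDate.of(2024, 1, 1)));
        System.out.println("2024-01-06 workday? " + table.isWorkday(LocalDate.of(2024, 1, 6)));
    }
}
```

The trade-off is the one described in the list: nothing depends on a third-party page, but someone has to update the table every year.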
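Parsing the web page online with htmlunit to obtain the holiday schedule
I started out with jsoup, but the results were poor: jsoup does not execute JavaScript, so dynamically generated pages caused all sorts of problems. I therefore switched to htmlunit, which is on the whole quite powerful: it can simulate a running browser and is often described as an open-source Java implementation of a browser;
First download the jars from the official site and read the documentation:
http://htmlunit.sourceforge.net/
The page I parse here is 360's perpetual calendar (http://hao.360.cn/rili/):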
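The calendar page looks like this:
The HTML being parsed has the following form (the parsing code relies on an element with id M-dates whose li children carry a date attribute, a class attribute containing markers such as vacation, weekend, last and work, and child nodes span.lunar and div.solar):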
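Implementation steps:
1. Load the page;
2. Poll until the page has finished loading (parts of it may be generated dynamically by JavaScript);
3. Parse the HTML according to the page structure, extract the key information and store it in a wrapper object;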
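Step 2 can be sketched as a small polling helper that is independent of htmlunit. The name PageWait.waitUntil and its parameters are assumptions for illustration; the demo code inlines an equivalent loop with a 60-second limit and a 1-second pause.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition until it holds or a timeout expires. */
final class PageWait {
    /**
     * Returns true as soon as the condition holds, false if the timeout
     * passes first. The condition is always checked at least once.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(intervalMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true after roughly 50 ms.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 50, 1000, 10);
        System.out.println("ready: " + ready);
    }
}
```

With htmlunit, the condition would be the check used in the demo: the element returned by page.getElementById("M-dates") is not null and its asText() is not empty. Returning a boolean lets the caller distinguish a timeout from a successful load instead of parsing a half-loaded page.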
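Points to note:
1. The hard part is deciding whether a day is a day off and which holiday it belongs to. The page does not label each day with its holiday type, so this logic has to be implemented by hand; see the code for details;
2. The static variable latestVocationName is a fallback for when the holiday name cannot be read from the page, as in the situation below (this is very unlikely; PS: the method must be called once a day for the variable to be of any use):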
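To make note 1 concrete, the sketch below shows the idea on plain data: consecutive days off are treated as one run, and the whole run takes the name of the first day in it whose label is a known holiday name. The Day class, the nameRuns method and the English labels are hypothetical stand-ins; the demo works on htmlunit elements and additionally maps 除夕 to 春節.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: names each run of consecutive days off. */
final class VacationRuns {
    /** Stand-in for one calendar cell. */
    static final class Day {
        final String date;
        final boolean vacation;
        final String label;

        Day(String date, boolean vacation, String label) {
            this.date = date;
            this.vacation = vacation;
            this.label = label;
        }
    }

    /**
     * Returns one entry per day: the holiday name of the run the day belongs to,
     * or "" for non-vacation days and for runs without any known label.
     */
    static List<String> nameRuns(List<Day> days, List<String> knownNames) {
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < days.size()) {
            if (!days.get(i).vacation) {
                names.add("");
                i++;
                continue;
            }
            // Scan to the end of the run and remember the first known label.
            int end = i;
            String runName = "";
            while (end < days.size() && days.get(end).vacation) {
                String label = days.get(end).label;
                if (runName.isEmpty() && knownNames.contains(label)) {
                    runName = label;
                }
                end++;
            }
            for (; i < end; i++) {
                names.add(runName);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Day> days = Arrays.asList(
                new Day("d1", false, "x"),
                new Day("d2", true, "y"),
                new Day("d3", true, "National Day"),
                new Day("d4", false, "z"));
        System.out.println(nameRuns(days, Arrays.asList("National Day")));
    }
}
```

A run with an empty name is exactly the case note 2 covers: no visible day carries a recognizable label, so a previously remembered name has to be used instead.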
Code:
Define a China date class:
package com.pichen.tools.getDate;

import java.util.Date;

public class ChinaDate {
    /** Gregorian date */
    private Date solarDate;
    /** Lunar day text */
    private String lunar;
    /** Gregorian day text */
    private String solar;
    /** Whether the day is marked 休 (day off) */
    private boolean isVacation = false;
    /** Holiday name when the day is marked 休; defaults to "非假期" (not a holiday) */
    private String vacationName = "非假期";
    /** Whether the day is marked 班 (working day) */
    private boolean isWorkFlag = false;
    private boolean isSaturday = false;
    private boolean isSunday = false;

    public Date getSolarDate() {
        return solarDate;
    }

    public void setSolarDate(Date solarDate) {
        this.solarDate = solarDate;
    }

    public String getLunar() {
        return lunar;
    }

    public void setLunar(String lunar) {
        this.lunar = lunar;
    }

    public String getSolar() {
        return solar;
    }

    public void setSolar(String solar) {
        this.solar = solar;
    }

    public boolean isVacation() {
        return isVacation;
    }

    public void setVacation(boolean isVacation) {
        this.isVacation = isVacation;
    }

    public String getVacationName() {
        return vacationName;
    }

    public void setVacationName(String vacationName) {
        this.vacationName = vacationName;
    }

    public boolean isWorkFlag() {
        return isWorkFlag;
    }

    public void setWorkFlag(boolean isWorkFlag) {
        this.isWorkFlag = isWorkFlag;
    }

    public boolean isSaturday() {
        return isSaturday;
    }

    public void setSaturday(boolean isSaturday) {
        this.isSaturday = isSaturday;
    }

    public boolean isSunday() {
        return isSunday;
    }

    public void setSunday(boolean isSunday) {
        this.isSunday = isSunday;
    }
}
Parse the page and run the demo, printing the details for the current month and for today:
package com.pichen.tools.getDate;

import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomNodeList;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Main {
    // Most recent holiday name seen for a date that is not in the future
    private static String latestVocationName = "";

    public String getVocationName(DomNodeList<HtmlElement> htmlElements, String date) throws ParseException {
        String rst = "";
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        Date paramDate = dateFormat.parse(date);
        // true when the requested date is not in the future
        boolean pastTimeFlag = new Date().getTime() >= paramDate.getTime();

        // Step 1: try to read the holiday name from the page
        for (int i = 0; i < htmlElements.size(); i++) {
            HtmlElement element = htmlElements.get(i);
            if (!element.getAttribute("class").contains("vacation")) {
                continue;
            }
            // Walk the run of consecutive vacation days that starts here
            boolean hitFlag = false;
            String vacationName = "";
            for (; i < htmlElements.size(); i++) {
                HtmlElement elementTmp = htmlElements.get(i);
                // Stop at the first non-vacation day, before inspecting it
                if (!elementTmp.getAttribute("class").contains("vacation")) {
                    break;
                }
                String liDate = elementTmp.getAttribute("date");
                List<HtmlElement> lunar = elementTmp.getElementsByAttribute("span", "class", "lunar");
                String lunarText = lunar.get(0).asText();
                // These literals must match the text shown on the page
                if (lunarText.equals("元旦")) {
                    vacationName = "元旦";
                } else if (lunarText.equals("除夕") || lunarText.equals("春節")) {
                    vacationName = "春節";
                } else if (lunarText.equals("清明")) {
                    vacationName = "清明";
                } else if (lunarText.equals("國際勞動節")) {
                    vacationName = "國際勞動節";
                } else if (lunarText.equals("端午節")) {
                    vacationName = "端午節";
                } else if (lunarText.equals("中秋節")) {
                    vacationName = "中秋節";
                } else if (lunarText.equals("國慶節")) {
                    vacationName = "國慶節";
                }
                if (liDate.equals(date)) {
                    hitFlag = true;
                }
            }
            if (hitFlag && !vacationName.equals("")) {
                rst = vacationName;
                break;
            }
        }

        // Step 2: if step 1 failed (rare), fall back to the latest known holiday name
        if (rst.equals("")) {
            System.out.println("warning: fail to get vocation name from html page.");
            // a simple rule could be applied here instead
            rst = Main.latestVocationName;
        } else if (pastTimeFlag) {
            // Remember the most recent visible holiday name whose date is <= now
            Main.latestVocationName = rst;
        }
        return rst;
    }

    public List<ChinaDate> getCurrentDateInfo() {
        WebClient webClient = null;
        List<ChinaDate> dateList = new ArrayList<ChinaDate>();
        try {
            DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
            webClient = new WebClient();
            HtmlPage page = webClient.getPage("http://hao.360.cn/rili/");
            // Wait up to 60 seconds for the JavaScript-generated calendar to appear
            for (int k = 0; k < 60; k++) {
                if (page.getElementById("M-dates") != null
                        && !page.getElementById("M-dates").asText().equals("")) {
                    break;
                }
                Thread.sleep(1000);
            }
            // A fixed 8-second sleep was tried first; the page was sometimes still missing, so it was unreliable
            // Thread.sleep(8000);
            DomNodeList<HtmlElement> htmlElements = page.getElementById("M-dates").getElementsByTagName("li");
            for (HtmlElement element : htmlElements) {
                ChinaDate chinaDate = new ChinaDate();
                String cssClass = element.getAttribute("class");
                List<HtmlElement> lunar = element.getElementsByAttribute("span", "class", "lunar");
                List<HtmlElement> solar = element.getElementsByAttribute("div", "class", "solar");
                chinaDate.setLunar(lunar.get(0).asText());
                chinaDate.setSolar(solar.get(0).asText());
                chinaDate.setSolarDate(dateFormat.parse(element.getAttribute("date")));
                if (cssClass.contains("vacation")) {
                    chinaDate.setVacation(true);
                    chinaDate.setVacationName(this.getVocationName(htmlElements, element.getAttribute("date")));
                }
                // "weekend" without "last" marks Saturday; "last weekend" marks Sunday
                if (cssClass.contains("weekend") && !cssClass.contains("last")) {
                    chinaDate.setSaturday(true);
                }
                if (cssClass.contains("last weekend")) {
                    chinaDate.setSunday(true);
                }
                // Working day: explicitly marked 班, or an ordinary weekday that is not a day off
                if (cssClass.contains("work")) {
                    chinaDate.setWorkFlag(true);
                } else {
                    chinaDate.setWorkFlag(!chinaDate.isSaturday() && !chinaDate.isSunday() && !chinaDate.isVacation());
                }
                dateList.add(chinaDate);
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("get date from http://hao.360.cn/rili/ error~");
        } finally {
            // webClient is still null if its constructor threw
            if (webClient != null) {
                webClient.close();
            }
        }
        return dateList;
    }

    public ChinaDate getTodayInfo() {
        List<ChinaDate> dateList = this.getCurrentDateInfo();
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        String today = dateFormat.format(new Date());
        for (ChinaDate date : dateList) {
            if (dateFormat.format(date.getSolarDate()).equals(today)) {
                return date;
            }
        }
        return new ChinaDate();
    }

    public static void main(String[] args) {
        List<ChinaDate> dateList = new Main().getCurrentDateInfo();
        ChinaDate today = new Main().getTodayInfo();
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        System.out.println("This month:");
        for (ChinaDate date : dateList) {
            System.out.println(dateFormat.format(date.getSolarDate()) + " " + date.getVacationName());
        }
        System.out.println("------------------------------------------------------------------------");
        System.out.println("Today:");
        System.out.println("Date: " + today.getSolarDate());
        System.out.println("Lunar: " + today.getLunar());
        System.out.println("Gregorian: " + today.getSolar());
        System.out.println("Holiday name: " + today.getVacationName());
        System.out.println("Saturday: " + today.isSaturday());
        System.out.println("Sunday: " + today.isSunday());
        System.out.println("Day off: " + today.isVacation());
        System.out.println("Working day: " + today.isWorkFlag());
        System.out.println("Most recent past holiday: " + Main.latestVocationName);
    }
}
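The Saturday and Sunday flags in the code above are read from CSS class names, which could change if the page is redesigned. As a possible cross-check, the weekday can be derived from the date attribute itself. The sketch below is an assumption layered on top of the demo, not part of it; DayFlags and its methods are hypothetical names, and the yyyy/MM/dd format is taken from the SimpleDateFormat pattern in the demo.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: derives weekend and workday flags from the date text. */
final class DayFlags {
    // "M" and "d" accept both one-digit and two-digit values when parsing.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy/M/d");

    static boolean isWeekend(String dateAttr) {
        DayOfWeek day = LocalDate.parse(dateAttr, FORMAT).getDayOfWeek();
        return day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
    }

    /**
     * Same rule as the demo: a day marked as a make-up workday always counts as
     * a workday; otherwise it is a workday only if it is neither a day off
     * nor a weekend day.
     */
    static boolean isWorkday(boolean markedWork, boolean markedVacation, String dateAttr) {
        if (markedWork) {
            return true;
        }
        return !markedVacation && !isWeekend(dateAttr);
    }

    public static void main(String[] args) {
        System.out.println("2024/01/06 weekend? " + isWeekend("2024/01/06"));
        System.out.println("2024/01/02 workday? " + isWorkday(false, false, "2024/01/02"));
    }
}
```

Only the vacation and work markers still have to come from the page; the weekend test no longer depends on its styling.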
Running the program gives the correct result:
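Follow-up improvements
When the page fails to load, retry several times;
Consider parsing calendars from several sites, switching to another site when one of them throws an exception;
Consider adding e-mail or SMS notification so that any exception is reported to the system administrator in real time;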
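The first two improvements can be sketched with two small generic helpers. ResilientFetch, retry and firstSuccessful are hypothetical names introduced for illustration; each source would wrap one site-specific parser, for example a call to getCurrentDateInfo changed to rethrow its exception instead of swallowing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical helpers: retry one source, then fall back to the next one. */
final class ResilientFetch {
    /** Calls the task up to maxAttempts times, pausing between failed attempts. */
    static <T> T retry(Callable<T> task, int maxAttempts, long pauseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                throw e; // do not retry after an interrupt
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }

    /** Tries each source in order (with retries) and returns the first result. */
    static <T> T firstSuccessful(List<Callable<T>> sources, int attemptsPerSource, long pauseMillis)
            throws Exception {
        Exception last = new IllegalArgumentException("no sources given");
        for (Callable<T> source : sources) {
            try {
                return retry(source, attemptsPerSource, pauseMillis);
            } catch (InterruptedException e) {
                throw e;
            } catch (Exception e) {
                last = e; // remember the failure and move on to the next source
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> sources = new ArrayList<>();
        sources.add(() -> { throw new java.io.IOException("primary site unavailable"); });
        sources.add(() -> "result from backup site");
        System.out.println(firstSuccessful(sources, 2, 10));
    }
}
```

The exception thrown when every source fails is a natural place to hook in the e-mail or SMS notification from the third item.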