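Preface

I recently ran into a problem at work: a back-end scheduled task needs to determine in Java, every day, whether the date is a statutory holiday, a weekend day off, a working day, and so on.

Working out China's statutory holiday schedule through logic alone is essentially impossible, because the arrangement can differ from year to year and is set by official decision.

So the task has to rely on other means. The more reliable ones I could think of are: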

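  1. A web API: some data providers offer one, but they either charge a fee or limit the number of calls, among other problems. The results are not ideal and there is little control; I have not tried them myself. Examples: https://www.juhe.cn/docs/api/id/177/aid/601 or http://apistore.baidu.com/apiworks/servicedetail/1116.html
  2. Parsing a web page online to get the holiday information: this depends heavily on the page being parsed, so pick a reasonably reliable site;
  3. Entering the officially announced holiday schedule into the system every year: if the customer does not mind the extra work, this is fairly reliable;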
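For comparison, the third option reduces to a table lookup. The sketch below is only an illustration under assumptions: the class name ManualHolidayTable and both sets are hypothetical, and the dates would have to be keyed in from each year's official notice.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for option 3: the schedule is entered by hand once a year. */
final class ManualHolidayTable {
    private final Set<LocalDate> daysOff;        // days off taken from the annual notice
    private final Set<LocalDate> makeUpWorkdays; // weekend days worked to compensate

    ManualHolidayTable(Set<LocalDate> daysOff, Set<LocalDate> makeUpWorkdays) {
        this.daysOff = new HashSet<>(daysOff);
        this.makeUpWorkdays = new HashSet<>(makeUpWorkdays);
    }

    /** Explicit entries win; any other date falls back to the plain weekday/weekend rule. */
    boolean isWorkday(LocalDate date) {
        if (makeUpWorkdays.contains(date)) {
            return true;
        }
        if (daysOff.contains(date)) {
            return false;
        }
        DayOfWeek dow = date.getDayOfWeek();
        return dow != DayOfWeek.SATURDAY && dow != DayOfWeek.SUNDAY;
    }
}
```

The lookup itself is trivial; the cost of this option is the yearly data entry, which is why the demo tries to read the schedule from a web page instead.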

 This demo uses the second approach;

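Using htmlunit to parse the page online and get the holiday information

At first I used jsoup to parse the page, but the results were poor: jsoup parses the HTML it is given and does not execute JavaScript, so it ran into all sorts of problems when the page was generated dynamically. I therefore switched to htmlunit. Overall htmlunit is quite powerful: it can simulate a running browser and is often described as an open-source Java implementation of a browser;

First, download the jars from the official site and read the documentation: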

http://htmlunit.sourceforge.net/

The page parsed here is 360's perpetual calendar:

http://hao.360.cn/rili/

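The calendar page looks like this:

[screenshot of the calendar page]

The HTML being parsed has the following shape, as relied on by the code further down: the element with id M-dates contains one li per day; each li carries a date attribute (yyyy/MM/dd) and a class attribute that may contain vacation, weekend, last or work, and it holds a div with class solar and a span with class lunar.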

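Steps:

1. Load the page;

2. Poll until the page has finished loading (parts of the page may be generated dynamically by JavaScript);

3. Parse the HTML according to the page's structure, extract the key information, and store it in a wrapper object;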
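Step 2 is a bounded polling loop. A minimal, library-independent sketch of that idea follows; PollingWait and waitUntil are hypothetical names, and the condition is whatever "the page is ready" means for the caller.

```java
import java.util.function.BooleanSupplier;

final class PollingWait {
    /**
     * Checks the condition up to maxAttempts times, sleeping between failed checks.
     * Returns true as soon as the condition holds, false if it never did
     * or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
                return false;
            }
        }
        return false;
    }
}
```

With the page used in this demo, the condition would be "the M-dates element exists and its text is not empty". Returning a boolean lets the caller tell a timeout apart from a loaded page instead of continuing regardless.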

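Notes:

1. The hard part is deciding whether a day is a day off and which holiday it belongs to. The page does not label each day with its holiday, so this logic has to be implemented by hand; see the code for details;

2. The static latestVocationName variable guards against the case where the holiday name cannot be read from the current page (this is very unlikely; note that the method has to be called once a day for the variable to take effect).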
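The idea behind both notes can be sketched on plain data, without HtmlUnit: a run of consecutive days off takes its name from any cell in the run that shows a known festival label, and a fallback name is used when no such label is visible. DayCell, HolidayRunNamer and nameFor are hypothetical names for illustration, not the types used in the real code.

```java
import java.util.List;
import java.util.Set;

final class HolidayRunNamer {
    /** Simplified stand-in for one calendar cell (hypothetical, not an HtmlUnit type). */
    static final class DayCell {
        final String date;      // e.g. "2016/02/08"
        final String label;     // text shown under the day number
        final boolean vacation; // true if the cell is marked as a day off

        DayCell(String date, String label, boolean vacation) {
            this.date = date;
            this.label = label;
            this.vacation = vacation;
        }
    }

    /** Returns the name of the run of days off containing date, or fallback if none is visible. */
    static String nameFor(List<DayCell> cells, String date, Set<String> knownNames, String fallback) {
        int i = 0;
        while (i < cells.size()) {
            if (!cells.get(i).vacation) {
                i++;
                continue;
            }
            String runName = "";
            boolean containsDate = false;
            // Consume one run of consecutive days off.
            while (i < cells.size() && cells.get(i).vacation) {
                DayCell cell = cells.get(i);
                if (knownNames.contains(cell.label)) {
                    runName = cell.label;
                }
                if (cell.date.equals(date)) {
                    containsDate = true;
                }
                i++;
            }
            if (containsDate && !runName.isEmpty()) {
                return runName;
            }
        }
        return fallback; // plays the role of latestVocationName
    }
}
```

The fallback matters when a run of days off straddles the edge of the visible month and the cell carrying the festival label is not on the page.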

Implementation:

Define a China date class:

package com.pichen.tools.getDate;

import java.util.Date;


public class ChinaDate {

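    /**
     * Gregorian date
     */
    private Date solarDate;
    
    /**
     * Day in the lunar calendar
     */
    private String lunar;
    
    /**
     * Day in the Gregorian calendar
     */
    private String solar;

    
    /**
     * Whether the day is marked "休" (day off)
     */
    private boolean isVacation = false;
    /**
     * Name of the holiday when the day is marked "休"
     */
    private String VacationName = "not a holiday";
    /**
     * Whether the day is marked "班" (make-up workday)
     */
    private boolean isWorkFlag = false;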
    
    private boolean isSaturday = false;
    private boolean isSunday = false;
    /**
     * @return the solarDate
     */
    public Date getSolarDate() {
        return solarDate;
    }
    /**
     * @param solarDate the solarDate to set
     */
    public void setSolarDate(Date solarDate) {
        this.solarDate = solarDate;
    }
    /**
     * @return the lunar
     */
    public String getLunar() {
        return lunar;
    }
    /**
     * @param lunar the lunar to set
     */
    public void setLunar(String lunar) {
        this.lunar = lunar;
    }
    /**
     * @return the solar
     */
    public String getSolar() {
        return solar;
    }
    /**
     * @param solar the solar to set
     */
    public void setSolar(String solar) {
        this.solar = solar;
    }

    /**
     * @return the isVacation
     */
    public boolean isVacation() {
        return isVacation;
    }
    /**
     * @param isVacation the isVacation to set
     */
    public void setVacation(boolean isVacation) {
        this.isVacation = isVacation;
    }
    /**
     * @return the vacationName
     */
    public String getVacationName() {
        return VacationName;
    }
    /**
     * @param vacationName the vacationName to set
     */
    public void setVacationName(String vacationName) {
        VacationName = vacationName;
    }
    /**
     * @return the isWorkFlag
     */
    public boolean isWorkFlag() {
        return isWorkFlag;
    }
    /**
     * @param isWorkFlag the isWorkFlag to set
     */
    public void setWorkFlag(boolean isWorkFlag) {
        this.isWorkFlag = isWorkFlag;
    }
    /**
     * @return the isSaturday
     */
    public boolean isSaturday() {
        return isSaturday;
    }
    /**
     * @param isSaturday the isSaturday to set
     */
    public void setSaturday(boolean isSaturday) {
        this.isSaturday = isSaturday;
    }
    /**
     * @return the isSunday
     */
    public boolean isSunday() {
        return isSunday;
    }
    /**
     * @param isSunday the isSunday to set
     */
    public void setSunday(boolean isSunday) {
        this.isSunday = isSunday;
    }
    

}

Parse the page and run the demo, printing the details for the current month and for today:

package com.pichen.tools.getDate;
import java.io.IOException;
import java.net.MalformedURLException;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomNodeList;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;


public class Main {
    

    private static String latestVocationName="";
    
    public String getVocationName(DomNodeList<HtmlElement> htmlElements, String date) throws ParseException{
        String rst = "";
        
        boolean pastTimeFlag = false;
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        Date paramDate = dateFormat.parse(date);
        if(new Date().getTime() >= paramDate.getTime()){
            pastTimeFlag = true;
        }
        
        // Step 1: try to read the holiday name from the page itself.
        // Each run of consecutive "vacation" cells takes its name from a cell
        // in that run whose lunar text is a known festival label.
        for(int i = 0; i < htmlElements.size(); i++){
            HtmlElement element = htmlElements.get(i);
            if(element.getAttribute("class").indexOf("vacation")==-1){
                continue;
            }

            boolean hitFlag = false;
            String vacationName = "";
            for(; i < htmlElements.size(); i++){
                HtmlElement elementTmp = htmlElements.get(i);
                // Stop before reading the first cell after the run, so that a
                // non-holiday cell cannot supply a name or match the date.
                if(elementTmp.getAttribute("class").indexOf("vacation")==-1){
                    break;
                }
                String liDate = elementTmp.getAttribute("date");

                List<HtmlElement> lunar = elementTmp.getElementsByAttribute("span", "class", "lunar");
                String lunarText = lunar.isEmpty() ? "" : lunar.get(0).asText();

                // These literals must match the text shown on the page exactly.
                if(lunarText.equals("元旦")){
                    vacationName = "元旦";
                }else if(lunarText.equals("除夕")||lunarText.equals("春節")){
                    vacationName = "春節";
                }else if(lunarText.equals("清明")){
                    vacationName = "清明";
                }else if(lunarText.equals("國際勞動節")){
                    vacationName = "國際勞動節";
                }else if(lunarText.equals("端午節")){
                    vacationName = "端午節";
                }else if(lunarText.equals("中秋節")){
                    vacationName = "中秋節";
                }else if(lunarText.equals("國慶節")){
                    vacationName = "國慶節";
                }

                if(liDate.equals(date)){
                    hitFlag = true;
                }
            }

            if(hitFlag && !vacationName.equals("")){
                rst = vacationName;
                break;
            }
        }
        
        
        
        // If step 1 fails (rare), fall back to the most recent holiday name seen
        if(rst.equals("")){
            System.out.println("warning: failed to get the vacation name from the html page.");

            // a simple rule-based guess could be added here

            // fall back to the most recent holiday name
            rst = Main.latestVocationName;
        }else if(pastTimeFlag){
            // remember the name of the most recent visible holiday dated no later than now
            Main.latestVocationName = rst;
        }
        return rst;
    }
    
    
    public List<ChinaDate> getCurrentDateInfo(){
        WebClient webClient = null;
        List<ChinaDate> dateList = null;
        
        try{
            DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
            dateList = new ArrayList<ChinaDate>();

            webClient = new WebClient();
            HtmlPage page = webClient.getPage("http://hao.360.cn/rili/");
            
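            // wait up to 60 seconds for the dates to be rendered
            for(int k = 0; k < 60; k++){
                // the element may not exist yet while the page's scripts are still running
                if(page.getElementById("M-dates") != null &&
                   !page.getElementById("M-dates").asText().equals("")) break;
                Thread.sleep(1000);
            }
            
            // sleeping a fixed 8 seconds to wait for the page to load; sometimes the page still could not be fetched, so this was unreliable
            //Thread.sleep(8000);

            DomNodeList<HtmlElement> htmlElements = page.getElementById("M-dates").getElementsByTagName("li");
            //System.out.println(htmlElements.size());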
            
            
            for(HtmlElement element : htmlElements){
                ChinaDate chinaDate = new ChinaDate();
                
                List<HtmlElement> lunar = element.getElementsByAttribute("span", "class", "lunar");
                List<HtmlElement> solar = element.getElementsByAttribute("div", "class", "solar");

                chinaDate.setLunar(lunar.get(0).asText());
                chinaDate.setSolar(solar.get(0).asText());
                chinaDate.setSolarDate(dateFormat.parse(element.getAttribute("date")));
                

                if(element.getAttribute("class").indexOf("vacation")!=-1){
                    chinaDate.setVacation(true);
                    chinaDate.setVacationName(this.getVocationName(htmlElements, element.getAttribute("date")));
                    
                    

                    
                }
                
                if(element.getAttribute("class").indexOf("weekend")!=-1 && 
                   element.getAttribute("class").indexOf("last")==-1){
                    chinaDate.setSaturday(true);
                }
                if(element.getAttribute("class").indexOf("last weekend")!=-1){
                    chinaDate.setSunday(true);
                }
                if(element.getAttribute("class").indexOf("work")!=-1){
                    chinaDate.setWorkFlag(true);
                }else if(chinaDate.isSaturday() == false &&
                         chinaDate.isSunday() == false && 
                         chinaDate.isVacation() == false ){
                    chinaDate.setWorkFlag(true);
                }else{
                    chinaDate.setWorkFlag(false);
                }
                
                dateList.add(chinaDate);
            }
            
            
        }catch(Exception e){
            e.printStackTrace();
            System.out.println("get date from http://hao.360.cn/rili/ error~");
        }finally{
            // webClient is still null if its constructor threw
            if(webClient != null){
                webClient.close();
            }
        }
        return dateList;
    }
    
    
    public ChinaDate getTodayInfo(){
        List<ChinaDate> dateList = this.getCurrentDateInfo();
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        for(ChinaDate date: dateList){
            if(dateFormat.format(date.getSolarDate()).equals(dateFormat.format(new Date()))){
                return date;
            }
        }
        return new ChinaDate();
    }
    

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {

        List<ChinaDate> dateList = new Main().getCurrentDateInfo();
        ChinaDate today = new Main().getTodayInfo();
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        
        System.out.println("This month:");
        for(ChinaDate date: dateList){
            System.out.println(dateFormat.format(date.getSolarDate()) + " " + date.getVacationName());
        }

        System.out.println("------------------------------------------------------------------------");
        System.out.println("Today:");
        System.out.println("Date: " + today.getSolarDate());
        System.out.println("Lunar: "+today.getLunar());
        System.out.println("Solar: "+today.getSolar());
        System.out.println("Holiday name: "+today.getVacationName());
        System.out.println("Saturday: "+today.isSaturday());
        System.out.println("Sunday: "+today.isSunday());
        System.out.println("Day off: "+today.isVacation());
        System.out.println("Workday: "+today.isWorkFlag());
        
        System.out.println("Most recent past holiday: " + Main.latestVocationName);
    }

}
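The weekend and workday flags above are derived from the li element's class attribute by substring search, which depends on the order of the class names (for example "last weekend"). A token-based variant is sketched below. It assumes the class names weekend, last, vacation and work seen in the code above; DayClassifier is a hypothetical helper, and the exact class values on the live page have not been verified here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class DayClassifier {
    /** Splits a class attribute such as "last weekend vacation" into its names. */
    static Set<String> tokens(String classAttr) {
        return new HashSet<>(Arrays.asList(classAttr.trim().split("\\s+")));
    }

    static boolean isSaturday(String classAttr) {
        Set<String> t = tokens(classAttr);
        return t.contains("weekend") && !t.contains("last");
    }

    static boolean isSunday(String classAttr) {
        Set<String> t = tokens(classAttr);
        return t.contains("weekend") && t.contains("last");
    }

    /** A day is worked if it is an explicit make-up day, or neither a weekend nor a day off. */
    static boolean isWorkday(String classAttr) {
        Set<String> t = tokens(classAttr);
        if (t.contains("work")) {
            return true;
        }
        return !t.contains("weekend") && !t.contains("vacation");
    }
}
```

Matching whole names also avoids accidental hits when one class name happens to contain another as a substring.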

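Running the program gives the correct result.

Possible improvements

When the page fails to load, retry several times;

Consider parsing the calendars of several sites, switching to another site when one of them throws an exception;

Consider adding e-mail or SMS notification so that any exception is reported to the system administrator in real time;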
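The first two improvements can be combined into one small loop: try each source a few times, then move on to the next. The sketch below is an assumption-laden illustration: FallbackFetcher and fetchFirst are hypothetical names, and each Callable stands for "parse one calendar site", for example a wrapper around getCurrentDateInfo.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;

final class FallbackFetcher {
    /**
     * Tries each source in order, up to attemptsPerSource times each,
     * and returns the first non-null result; empty if every source failed.
     */
    static <T> Optional<T> fetchFirst(List<Callable<T>> sources, int attemptsPerSource) {
        for (Callable<T> source : sources) {
            for (int attempt = 1; attempt <= attemptsPerSource; attempt++) {
                try {
                    T result = source.call();
                    if (result != null) {
                        return Optional.of(result);
                    }
                } catch (Exception e) {
                    // A real job could send the e-mail or SMS notification from here.
                    System.err.println("attempt " + attempt + " failed: " + e);
                }
            }
        }
        return Optional.empty();
    }
}
```

An empty result means every site failed, which is the point at which notifying the administrator is most useful.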

 


    Posted by 大師兄 on 痞客邦 (PIXNET)