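I. Preface
Recently at work I ran into a problem: a back-end scheduled task needs to determine in Java, every day, whether the day is a statutory holiday, a weekend day off, a working day, and so on. Working out China's statutory holiday schedule purely by program logic is basically impossible, because the arrangement can differ from year to year and is set by official decision.
So other means are needed. The more reliable options I could think of are:
1. A web API: some data providers offer one, but they either charge a fee or limit the number of calls, among other problems. The results are not ideal and you have little control. I have not tried these myself. For example:
https://www.juhe.cn/docs/api/id/177/aid/601
or
http://apistore.baidu.com/apiworks/servicedetail/1116.html
2. Parse a web page online to obtain the holiday information: this depends heavily on the page being parsed, so pick a reasonably reliable site;
3. Enter the officially announced holiday schedule into the system every year: if the customer does not mind the extra work, this is fairly reliable.
This demo uses the second approach.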
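For comparison, option 3 (a manually maintained table) could be sketched roughly as follows. This is only an illustration and not part of the demo: the class name HolidayTable and its methods are hypothetical, and the dates in main are arbitrary placeholders rather than a real holiday schedule.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: holiday data is entered by hand once a year.
final class HolidayTable {
    private final Set<LocalDate> holidays;        // days off
    private final Set<LocalDate> makeupWorkdays;  // weekend days that are worked

    HolidayTable(Set<LocalDate> holidays, Set<LocalDate> makeupWorkdays) {
        this.holidays = new HashSet<>(holidays);
        this.makeupWorkdays = new HashSet<>(makeupWorkdays);
    }

    // A make-up workday wins over everything; a listed holiday is a day off;
    // otherwise Monday to Friday are working days.
    boolean isWorkday(LocalDate date) {
        if (makeupWorkdays.contains(date)) {
            return true;
        }
        if (holidays.contains(date)) {
            return false;
        }
        DayOfWeek day = date.getDayOfWeek();
        return day != DayOfWeek.SATURDAY && day != DayOfWeek.SUNDAY;
    }

    public static void main(String[] args) {
        // Placeholder dates only, NOT an official schedule.
        HolidayTable table = new HolidayTable(
                Collections.singleton(LocalDate.of(2024, 1, 2)),
                Collections.singleton(LocalDate.of(2024, 1, 6)));
        System.out.println(table.isWorkday(LocalDate.of(2024, 1, 2)));
        System.out.println(table.isWorkday(LocalDate.of(2024, 1, 6)));
    }
}
```

The trade-off is the one described above: the lookup itself is trivial and does not depend on any third-party site, but someone has to update the two sets every year.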
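II. Using HtmlUnit to parse a web page online and obtain holiday information
At first I used jsoup to parse the page, but the results were poor: when the page is generated dynamically, jsoup ran into all sorts of problems, so I switched to HtmlUnit. Overall HtmlUnit is quite powerful. It can simulate a running browser and has been described as an open-source Java implementation of a browser.
First download the required jars from the official site and read the documentation:
http://htmlunit.sourceforge.net/
The page parsed here is the 360 perpetual calendar:
http://hao.360.cn/rili/
The calendar page looks like this:
The HTML being parsed has the following format (as the parsing code relies on: the element with id M-dates holds one li per day; each li has a date attribute in yyyy/MM/dd form and a class attribute that may contain vacation, weekend, last or work, and it contains a div with class solar and a span with class lunar):
Implementation steps:
1. Load the page;
2. Poll until the page has finished loading (parts of the page may be dynamic and generated by JavaScript);
3. Parse the HTML according to the page format, extract the key information and store it in a wrapper object.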
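Step 2 is essentially "poll a condition with a timeout". A minimal sketch of that idea, independent of HtmlUnit, might look like the following; PageWait and waitUntil are hypothetical names introduced here for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: poll a condition a bounded number of times.
final class PageWait {

    // Returns true as soon as ready reports true; returns false if the
    // condition is still false after maxAttempts checks or if the thread
    // is interrupted while sleeping.
    static boolean waitUntil(BooleanSupplier ready, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (ready.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt status
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long readyAt = System.currentTimeMillis() + 50;
        // Stand-in condition; with HtmlUnit it would inspect the loaded page.
        boolean loaded = waitUntil(() -> System.currentTimeMillis() >= readyAt, 20, 10L);
        System.out.println("loaded = " + loaded);
    }
}
```

In the demo, the condition is that the text of the M-dates element is no longer empty, checked once per second for at most 60 seconds.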
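Notes:
1. The hard part is deciding whether a day is a day off and which holiday it belongs to. The source page does not label each day with its holiday, so this logic has to be implemented by hand; see the code for details.
2. The static latestVocationName variable guards against the following rare case: the holiday name cannot be determined from the page, and the most recently seen holiday name is used instead (the method has to be called once a day for this variable to stay current):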
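The idea behind both notes can be sketched without HtmlUnit. The example below is a simplified illustration rather than the demo's code: HolidayNameResolver and Day are hypothetical names, the label-to-holiday mapping is passed in by the caller, and English labels stand in for the Chinese text shown on the real page. Unlike the demo, it updates the remembered name on every successful lookup, not only for dates that have already passed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: name a day off by scanning its run of consecutive
// days off for a festival label, falling back to the last known name.
final class HolidayNameResolver {

    static final class Day {
        final String label;      // festival label shown in the cell, may be empty
        final boolean vacation;  // true if the cell is marked as a day off
        Day(String label, boolean vacation) {
            this.label = label;
            this.vacation = vacation;
        }
    }

    private final Map<String, String> labelToHoliday;
    private String latestName = "";

    HolidayNameResolver(Map<String, String> labelToHoliday) {
        this.labelToHoliday = new HashMap<>(labelToHoliday);
    }

    // Precondition: days.get(index) is a day off.
    String resolve(List<Day> days, int index) {
        int start = index;
        while (start > 0 && days.get(start - 1).vacation) {
            start--; // walk back to the first day of the run
        }
        String name = "";
        for (int i = start; i < days.size() && days.get(i).vacation; i++) {
            String mapped = labelToHoliday.get(days.get(i).label);
            if (mapped != null) {
                name = mapped;
            }
        }
        if (name.isEmpty()) {
            return latestName; // no label in this run: use the fallback
        }
        latestName = name;
        return name;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new HashMap<>();
        mapping.put("National Day", "National Day"); // illustrative label
        HolidayNameResolver resolver = new HolidayNameResolver(mapping);
        List<Day> days = Arrays.asList(
                new Day("National Day", true), new Day("", true), new Day("", false));
        System.out.println(resolver.resolve(days, 1));
    }
}
```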
Code:
Define a class representing a Chinese calendar date:
package com.pichen.tools.getDate;

import java.util.Date;

public class ChinaDate {

    /** Gregorian (solar) date. */
    private Date solarDate;

    /** Lunar-calendar day text. */
    private String lunar;

    /** Solar-calendar day text. */
    private String solar;

    /** Whether the day is marked as a day off on the page. */
    private boolean isVacation = false;

    /** Holiday name when the day is a day off; the default means "not a holiday". */
    private String vacationName = "非假期";

    /** Whether the day is a working day. */
    private boolean isWorkFlag = false;

    private boolean isSaturday = false;

    private boolean isSunday = false;

    public Date getSolarDate() {
        return solarDate;
    }

    public void setSolarDate(Date solarDate) {
        this.solarDate = solarDate;
    }

    public String getLunar() {
        return lunar;
    }

    public void setLunar(String lunar) {
        this.lunar = lunar;
    }

    public String getSolar() {
        return solar;
    }

    public void setSolar(String solar) {
        this.solar = solar;
    }

    public boolean isVacation() {
        return isVacation;
    }

    public void setVacation(boolean isVacation) {
        this.isVacation = isVacation;
    }

    public String getVacationName() {
        return vacationName;
    }

    public void setVacationName(String vacationName) {
        this.vacationName = vacationName;
    }

    public boolean isWorkFlag() {
        return isWorkFlag;
    }

    public void setWorkFlag(boolean isWorkFlag) {
        this.isWorkFlag = isWorkFlag;
    }

    public boolean isSaturday() {
        return isSaturday;
    }

    public void setSaturday(boolean isSaturday) {
        this.isSaturday = isSaturday;
    }

    public boolean isSunday() {
        return isSunday;
    }

    public void setSunday(boolean isSunday) {
        this.isSunday = isSunday;
    }
}
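The flags on ChinaDate combine into a single working-day decision. As a small sketch, the rule the parser uses to fill isWorkFlag can be written as a pure function; WorkdayRule is a hypothetical name used only for this illustration.

```java
// Hypothetical helper mirroring how the demo derives isWorkFlag.
final class WorkdayRule {

    // A day explicitly marked "work" on the page is a working day (this covers
    // make-up workdays on weekends). Otherwise it is a working day only if it
    // is neither a Saturday, nor a Sunday, nor marked as a day off.
    static boolean isWorkday(boolean markedWork, boolean saturday,
                             boolean sunday, boolean vacation) {
        if (markedWork) {
            return true;
        }
        return !saturday && !sunday && !vacation;
    }

    public static void main(String[] args) {
        System.out.println(isWorkday(true, true, false, false));   // make-up Saturday
        System.out.println(isWorkday(false, false, false, true));  // weekday holiday
    }
}
```

Keeping the rule separate from the HTML parsing makes it easy to check on its own, without loading the page.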
Parse the page and run the demo, printing the details for the current month and for today:
package com.pichen.tools.getDate;

import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomNodeList;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Main {

    // Most recent holiday name seen for a date that has already passed;
    // used as a fallback when the page yields no name.
    private static String latestVocationName = "";

    public String getVocationName(DomNodeList<HtmlElement> htmlElements, String date) throws ParseException {
        String rst = "";
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        Date paramDate = dateFormat.parse(date);
        boolean pastTimeFlag = new Date().getTime() >= paramDate.getTime();

        // Step 1: try to read the holiday name from the page. Each run of
        // consecutive "vacation" cells is scanned for a festival label.
        for (int i = 0; i < htmlElements.size(); i++) {
            HtmlElement element = htmlElements.get(i);
            if (!element.getAttribute("class").contains("vacation")) {
                continue;
            }
            boolean hitFlag = false;
            String vacationName = "";
            for (; i < htmlElements.size(); i++) {
                HtmlElement elementTmp = htmlElements.get(i);
                // Stop at the first cell after the run, before reading its label.
                if (!elementTmp.getAttribute("class").contains("vacation")) {
                    break;
                }
                String liDate = elementTmp.getAttribute("date");
                List<HtmlElement> lunar = elementTmp.getElementsByAttribute("span", "class", "lunar");
                String lunarText = lunar.isEmpty() ? "" : lunar.get(0).asText();
                // The literals below must match the label text shown on the page.
                switch (lunarText) {
                    case "元旦":       // New Year's Day
                        vacationName = "元旦";
                        break;
                    case "除夕":       // Spring Festival Eve
                    case "春节":       // Spring Festival
                        vacationName = "春节";
                        break;
                    case "清明":       // Qingming
                        vacationName = "清明";
                        break;
                    case "国际劳动节": // International Labour Day
                        vacationName = "国际劳动节";
                        break;
                    case "端午节":     // Dragon Boat Festival
                        vacationName = "端午节";
                        break;
                    case "中秋节":     // Mid-Autumn Festival
                        vacationName = "中秋节";
                        break;
                    case "国庆节":     // National Day
                        vacationName = "国庆节";
                        break;
                    default:
                        break;
                }
                if (liDate.equals(date)) {
                    hitFlag = true;
                }
            }
            if (hitFlag && !vacationName.equals("")) {
                rst = vacationName;
                break;
            }
        }

        // Step 2: if step 1 failed (rare), fall back to the latest known holiday name.
        if (rst.equals("")) {
            System.out.println("warning: failed to get vacation name from html page.");
            // A simple rule could be applied here instead of the plain fallback.
            rst = Main.latestVocationName;
        } else if (pastTimeFlag) {
            // Remember the most recent visible holiday name for a date <= now.
            Main.latestVocationName = rst;
        }
        return rst;
    }

    public List<ChinaDate> getCurrentDateInfo() {
        WebClient webClient = null;
        List<ChinaDate> dateList = new ArrayList<ChinaDate>();
        try {
            DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
            webClient = new WebClient();
            HtmlPage page = webClient.getPage("http://hao.360.cn/rili/");
            // Wait at most 60 seconds for the dynamic content to appear.
            for (int k = 0; k < 60; k++) {
                if (!page.getElementById("M-dates").asText().equals("")) {
                    break;
                }
                Thread.sleep(1000);
            }
            // A fixed sleep (e.g. 8 seconds) was tried earlier, but it was unstable:
            // sometimes the page content was still missing.
            DomNodeList<HtmlElement> htmlElements = page.getElementById("M-dates").getElementsByTagName("li");
            for (HtmlElement element : htmlElements) {
                ChinaDate chinaDate = new ChinaDate();
                String cssClass = element.getAttribute("class");
                List<HtmlElement> lunar = element.getElementsByAttribute("span", "class", "lunar");
                List<HtmlElement> solar = element.getElementsByAttribute("div", "class", "solar");
                chinaDate.setLunar(lunar.get(0).asText());
                chinaDate.setSolar(solar.get(0).asText());
                chinaDate.setSolarDate(dateFormat.parse(element.getAttribute("date")));
                if (cssClass.contains("vacation")) {
                    chinaDate.setVacation(true);
                    chinaDate.setVacationName(this.getVocationName(htmlElements, element.getAttribute("date")));
                }
                if (cssClass.contains("weekend") && !cssClass.contains("last")) {
                    chinaDate.setSaturday(true);
                }
                if (cssClass.contains("last weekend")) {
                    chinaDate.setSunday(true);
                }
                // Marked "work" (make-up workday), or an ordinary weekday that is not a day off.
                if (cssClass.contains("work")) {
                    chinaDate.setWorkFlag(true);
                } else {
                    chinaDate.setWorkFlag(!chinaDate.isSaturday() && !chinaDate.isSunday() && !chinaDate.isVacation());
                }
                dateList.add(chinaDate);
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("get date from http://hao.360.cn/rili/ error~");
        } finally {
            // webClient is still null if its construction failed.
            if (webClient != null) {
                webClient.close();
            }
        }
        return dateList;
    }

    public ChinaDate getTodayInfo() {
        List<ChinaDate> dateList = this.getCurrentDateInfo();
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        String today = dateFormat.format(new Date());
        for (ChinaDate date : dateList) {
            if (dateFormat.format(date.getSolarDate()).equals(today)) {
                return date;
            }
        }
        return new ChinaDate();
    }

    public static void main(String[] args) {
        List<ChinaDate> dateList = new Main().getCurrentDateInfo();
        ChinaDate today = new Main().getTodayInfo();
        DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd");
        System.out.println("This month:");
        for (ChinaDate date : dateList) {
            System.out.println(dateFormat.format(date.getSolarDate()) + " " + date.getVacationName());
        }
        System.out.println("------------------------------------------------------------------------");
        System.out.println("Today:");
        System.out.println("Date: " + today.getSolarDate());
        System.out.println("Lunar: " + today.getLunar());
        System.out.println("Solar: " + today.getSolar());
        System.out.println("Holiday name: " + today.getVacationName());
        System.out.println("Saturday: " + today.isSaturday());
        System.out.println("Sunday: " + today.isSunday());
        System.out.println("Day off: " + today.isVacation());
        System.out.println("Working day: " + today.isWorkFlag());
        System.out.println("Most recent past holiday: " + Main.latestVocationName);
    }
}
Running the program gives the correct result: