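The example below bins sales figures into intervals with pandas and computes the frequency of each interval: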
import pandas as pd
import matplotlib.pyplot as plt

path = 'F:/python/python数据分析与挖掘实战/图书配套数据、代码/chapter3/demo/data/catering_fish_congee.xls'
data = pd.read_excel(path, header=None, index_col=0)
data.index.name = '日期'         # date
data.columns = ['销售额(元)']    # sales (yuan)
xse = data['销售额(元)']

print(xse.max())
print(xse.min())
print(xse.max() - xse.min())    # range of the sales values

fanwei = list(range(0, 4500, 500))                # bin edges: 0, 500, ..., 4000
fenzu = pd.cut(xse.values, fanwei, right=False)   # left-closed intervals, one per value (length 91)
print(fenzu.codes)         # integer bin label of each value
print(fenzu.categories)    # the bin intervals (length 8)
pinshu = fenzu.value_counts()    # Series: interval -> count
print(pinshu.index)

pinshu.plot(kind='bar')
# plt.text(0, 29, str(29))

qujian = pd.cut(xse, fanwei, right=False)
data['区间'] = qujian.values      # interval each row falls into
data.groupby('区间').median()     # median per interval
data.groupby('区间').mean()       # mean per interval

# Frequency table: 频数 = count, 频率 = relative frequency, 累计频率 = cumulative frequency
pinshu_df = pinshu.to_frame(name='频数')
pinshu_df['频率f'] = pinshu_df['频数'] / pinshu_df['频数'].sum()
pinshu_df['频率%'] = pinshu_df['频率f'].map(lambda x: '%.2f%%' % (x * 100))
pinshu_df['累计频率f'] = pinshu_df['频率f'].cumsum()
pinshu_df['累计频率%'] = pinshu_df['累计频率f'].map(lambda x: '%.4f%%' % (x * 100))

In[158]: pinshu_df
Out[158]:
              频数     频率f      频率%    累计频率f    累计频率%
[0, 500)      29   0.318681   31.87%   0.318681    31.8681%
[500, 1000)   20   0.219780   21.98%   0.538462    53.8462%
[1000, 1500)  12   0.131868   13.19%   0.670330    67.0330%
[1500, 2000)  12   0.131868   13.19%   0.802198    80.2198%
[2000, 2500)   8   0.087912    8.79%   0.890110    89.0110%
[2500, 3000)   3   0.032967    3.30%   0.923077    92.3077%
[3000, 3500)   4   0.043956    4.40%   0.967033    96.7033%
[3500, 4000)   3   0.032967    3.30%   1.000000   100.0000%
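The listing above depends on an Excel file that readers may not have, so here is a minimal sketch of the same binning and frequency steps on a handful of made-up numbers. The `sales` values and the English column names (`count`, `freq`, `cum_freq`, `freq_pct`) are assumptions for illustration only and are not the article's data.

```python
import pandas as pd

# Hypothetical sales figures standing in for the Excel column (not the article's data).
sales = pd.Series([120, 480, 500, 999, 1000, 1750, 2300, 3999], name="sales")

# Left-closed, right-open bins: [0, 500), [500, 1000), ..., [3500, 4000).
edges = list(range(0, 4500, 500))
bins = pd.cut(sales, edges, right=False)

# Count per interval; sort_index() puts the rows back in bin order.
# Empty bins stay in the table with a count of 0.
freq = bins.value_counts().sort_index().to_frame(name="count")
freq["freq"] = freq["count"] / freq["count"].sum()   # relative frequency
freq["cum_freq"] = freq["freq"].cumsum()             # cumulative frequency
freq["freq_pct"] = freq["freq"].map(lambda x: "%.2f%%" % (x * 100))

print(freq)
```

With `right=False`, a value sitting exactly on an edge, such as 500, is counted in the bin that starts there, `[500, 1000)`, rather than in `[0, 500)`.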
Original article: https://blog.csdn.net/castingA3T/article/details/79075240