
In time series modeling tasks, models tend to be sensitive to missing data, and a large share of missing values can leave a trained model completely unusable. I covered data imputation in an earlier post, which you can read if you are interested:

《python 基于滑动平均思想实现缺失数据填充》 (Python: missing-data imputation based on moving averages)

The motivation for this post is a real project that needs time series forecasting, so the data has to be prepared, extracted, and cleaned up front. Here I bring together several common techniques and implement a set of imputation algorithms to integrate into the project.

The sample data looks like this:

01/01/2011,12,6.97,98,40.5,6.36,2.28,0.09,0.17
01/02/2011,11.7,6.97,98,40.8,6.4,2.06,0.09,0.21
01/03/2011,11.4,6.97,93,53.4,6.64,1.81,0.08,0.15
01/04/2011,9.9,6.96,95,33.5,6.39,2.38,0.09,0.2
01/05/2011,9.2,7.01,99,32.2,6.5,2.23,0.08,0.22
01/06/2011,9.9,6.97,98,32.9,6.74,1.74,0.07,0.13
01/07/2011,9.2,6.93,102,22.4,7.02,1.69,0.09,0.19
01/08/2011,9.6,6.97,104,35.1,7.26,1.79,0.07,0.27
01/09/2011,11.9,6.92,103,25.5,6.13,1.61,0.08,0.18
01/10/2011,12.3,6.96,102,30.9,6.66,1.9,0.06,0.08
01/11/2011,10.7,6.99,97,36.1,7.15,3.73,0.09,0.12
01/12/2011,9.3,6.95,97,34.5,7.66,2.33,0.08,0.13
01/13/2011,9.2,6.98,95,42,8.01,3.05,0.1,0.14
01/14/2011,10.5,6.95,98,30.9,7.01,3.05,0.07,0.13
01/15/2011,11.2,6.94,98,27.2,6.61,3.41,0.08,0.13
01/16/2011,9.1,6.93,93,39.6,7.51,3.4,0.12,0.23
01/17/2011,9.1,6.92,96,31.9,7.07,2.97,0.08,0.22
01/18/2011,10,6.93,95,37.6,7.55,2.64,0.08,0.11
01/19/2011,10.8,6.9,99,33.5,7.4,2.96,0.09,0.13
01/20/2011,10.6,6.86,100,31.8,7,3,0.08,0.12
01/21/2011,9.2,6.8,99,32.6,7.32,2.92,0.07,0.07
01/22/2011,9.4,6.76,99,35.8,7.44,3.62,0.12,0.14
01/23/2011,9.9,6.7,97,35.6,7.19,3.35,0.09,0.15
01/24/2011,10,6.66,99,35.9,7.16,3.18,0.07,0.08
01/25/2011,9.7,6.61,98,34.8,7.31,3.27,0.07,0.12
01/26/2011,9.5,6.54,101,33.5,7.08,3.56,0.08,0.21
01/27/2011,9.9,6.51,102,34.8,6.55,3.54,0.08,0.15
01/28/2011,10.4,6.47,98,20.5,6.46,3.38,0.07,0.09
01/29/2011,9.3,6.52,101,29.8,7.39,3.74,0.08,0.13
01/30/2011,8.2,6.53,102,33.8,7.83,3.51,0.08,0.13
01/31/2011,8.7,6.54,101,27.8,7.65,3.3,0.07,0.15
02/01/2011,9.8,6.58,102,31.4,7.02,3.25,0.07,0.11
02/02/2011,9.8,6.63,102,32.5,7.37,3.93,0.09,0.19
02/03/2011,9.9,6.69,102,32,7.27,3.8,0.08,0.14
02/04/2011,11.9,6.72,99,26.5,6.61,3.53,0.07,0.09
02/05/2011,13.7,6.75,97,24.9,6.31,3.37,0.07,0.09
02/06/2011,15.2,6.77,97,26.2,6.04,4.03,0.09,0.14
02/07/2011,16.5,6.76,92,23.2,5.82,3.61,0.07,0.1
02/08/2011,18.3,6.7,89,21.4,4.93,3.93,0.09,0.22
02/09/2011,18.5,6.72,84,17.5,5.33,3.33,0.07,0.1
02/10/2011,18,6.7,85,21.4,5.31,3.71,0.07,0.13
02/11/2011,15,6.72,88,22.1,6.08,3.49,0.06,0.06
02/12/2011,12.8,6.66,84,23.9,7.15,3.52,0.07,0.13
02/13/2011,12.2,6.61,81,26.9,7.39,3.5,0.07,0.11
02/14/2011,10.7,6.57,83,23.8,7.62,3.57,0.08,0.14
02/15/2011,9.5,6.53,84,27.1,7.88,3.53,0.08,0.12
02/16/2011,9.1,6.51,87,35.2,8.35,3.64,0.09,0.17
02/17/2011,9.8,6.46,94,31,7.87,3.38,0.08,0.15
02/18/2011,10.4,6.45,94,35.4,8.13,3.63,0.1,0.22
02/19/2011,10.6,6.39,86,33.5,7.97,3.5,0.1,0.2
02/20/2011,11.3,6.38,88,37,8.41,3.31,0.08,0.11
02/21/2011,12.5,6.37,89,32.1,7.24,3.34,0.08,0.11
02/22/2011,13.2,6.39,87,37.5,8.09,3.93,0.12,0.12
02/23/2011,14.6,6.4,89,25.6,6.87,3.71,0.08,0.14
02/24/2011,15,6.38,87,19.2,6.19,3.6,0.07,0.12
02/25/2011,16.2,6.36,86,19.5,5.57,3.54,0.07,0.13
02/26/2011,16.4,5.61,79,16.8,4.19,3.68,0.07,0.17
02/27/2011,8.9,2.54,29,15,2.42,3.29,0.07,0.09
02/28/2011,23,6.29,86,26.4,5.45,3.85,0.09,0.12
03/01/2011,22.4,6.43,92,27.4,5.71,1.78,0.07,0.13
03/02/2011,17.5,6.33,89,30.2,6.68,2.2,0.07,0.11
03/03/2011,15.4,6.36,91,29.8,7.01,1.97,0.07,0.07
03/04/2011,13.6,6.31,89,29,7.48,1.81,0.07,0.08
03/05/2011,13.2,6.3,92,25.9,6.89,2.54,0.07,0.1
03/06/2011,13.9,6.3,99,29.2,7.16,1.83,0.06,0.08
03/07/2011,14.4,6.27,98,26,7.05,1.62,0.05,0.07
03/08/2011,14.2,6.25,100,30.1,7.21,1.47,0.06,
03/09/2011,14.6,6.2,102,29.5,7.02,1.46,0.06,
03/10/2011,15.2,6.16,105,24.1,6.69,1.57,0.05,0.28
03/11/2011,15.2,6.13,107,32.5,6.78,1.74,0.07,0.43
03/12/2011,14.4,6.1,105,28.1,7.24,1.64,0.06,0.09
03/13/2011,15.2,6.05,102,27,6.97,1.73,0.06,0.09
03/14/2011,18,6,102,26.5,6.34,1.92,0.06,0.1
03/15/2011,19.5,5.99,99,26.4,6.14,2.17,0.06,0.1
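03/16/2011,15.1,6.15,111,32.5,7.01,2.83,0.08,0.13
03/17/2011,14.6,6.33,118,33.2,7.25,2.44,0.07,0.06
03/18/2011,14.6,6.38,122,30.1,7.34,2.88,0.08,0.11
03/19/2011,13.5,6.35,124,32.4,7.66,2.69,0.09,
03/20/2011,15.6,6.26,108,53.9,6.79,2.9,0.14,
03/21/2011,20.8,6.17,95,44.2,5.2,2.31,0.1,
03/22/2011,19.9,6.23,99,47.8,6.87,3.09,0.12,0.11
03/23/2011,15.3,6.31,112,48.2,8.74,2.47,0.11,0.07
03/24/2011,14.6,6.22,114,43.5,8.95,2.68,0.13,0.13
03/25/2011,15.8,6.2,113,32.9,8.6,2.63,0.12,0.08
03/26/2011,15.7,6.16,119,35.6,8.97,2.51,0.1,0.06
03/27/2011,12.7,5.35,108,31.8,8.95,2.21,0.09,0.05
03/28/2011,14.2,6.05,126,25.7,6.67,2.23,0.06,
03/29/2011,,,,,,,,
03/30/2011,,,,,,,,
03/31/2011,,,,,,,,
04/01/2011,0.25,0.25,0.25,0.25,0.25,,0.08,0.51
04/02/2011,7.8,6.36,39,8.4,3.83,0.25,0.03,0.19
04/03/2011,17.2,7.56,147,77.8,8.7,1.13,0.08,0.2
04/04/2011,11.9,7.29,148,56.5,6.06,1.99,0.07,0.28
04/05/2011,14.9,7.12,181,96.6,6.15,2.38,0.08,0.44
04/06/2011,15.5,7.12,189,75.3,6.07,2.43,0.08,0.45
04/07/2011,16.3,7.12,199,13.8,5.53,2.46,0.07,0.38
04/08/2011,16.4,7.19,192,124.7,4.61,2.37,0.08,0.17
04/09/2011,16.3,7.1,198,286.6,5.19,2.62,0.07,0.17

As the rows above show, the series has obvious missing values: several rows lack the last column, and 03/29/2011 through 03/31/2011 are empty in every column.

Let's start with the most basic approach, zero filling. The core implementation is:

    import numpy as np
    from sklearn.impute import SimpleImputer

    SI = SimpleImputer(missing_values=np.nan, strategy="constant", fill_value=0)
    result = SI.fit_transform(data)

This is, of course, also the least recommended approach.

Next is mean filling:

    SI = SimpleImputer(missing_values=np.nan, strategy="mean")
    result = SI.fit_transform(data)

Both of these fills are built on the SimpleImputer class that ships with sklearn. Its signature is:

    class sklearn.impute.SimpleImputer(*, missing_values=nan, strategy='mean', fill_value=None, verbose=0, copy=True, add_indicator=False)

Parameters:

missing_values: int, float, str, np.nan (default), or None. The placeholder that marks a value as missing.

strategy: the fill strategy. There are four options: mean (default), median, most_frequent, and constant. mean fills a column's missing values with that column's mean; median uses the median and most_frequent uses the mode. constant fills with a user-defined value, which must be supplied through fill_value.

fill_value: str or numeric, default None. When strategy == "constant", fill_value replaces every occurrence of missing_values. If fill_value is None, missing values are replaced with 0 for numeric data and with the string "missing_value" for string or object data.

verbose: int, default 0. Controls the verbosity of the imputer. (This parameter was deprecated in scikit-learn 1.1 and removed in 1.3, so omit it on recent versions.)

copy: boolean, default True. True works on a copy of the data; False modifies the data in place.

add_indicator: boolean, default False. If True, indicator columns of 0s and 1s are appended to the output, where 0 marks a position that was not missing and 1 marks a position that was missing. In scikit-learn, one indicator column is added per feature that had missing values during fit.

Following the same pattern, you can also build median, mode, and custom-constant fills:

    # Median
    SI = SimpleImputer(missing_values=np.nan, strategy="median")
    result = SI.fit_transform(data)

    # Mode (most frequent value)
    SI = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
    result = SI.fit_transform(data)

    # Custom constant (fill_value is not set here, so numeric data is filled
    # with 0; pass fill_value to choose another value)
    SI = SimpleImputer(missing_values=np.nan, strategy="constant")
    result = SI.fit_transform(data)

Besides these fills built on sklearn's statistical strategies, you can also impute with a model. The idea is to fill the easiest dimension first (the column with the fewest missing values) and then loop over the remaining columns. A basic implementation:

    # Assumes X is a DataFrame with integer column labels 0..n-1 and a default
    # RangeIndex, and that `model` is any regressor exposing fit/predict.
    sortInds = np.argsort(X.isnull().sum(axis=0).values)
    for i in sortInds:
        df = X
        fillc = df.iloc[:, i]
        df = df.iloc[:, df.columns != i]
        # Temporarily zero-fill the other columns so they can serve as features.
        dfs = SimpleImputer(missing_values=np.nan, strategy="constant", fill_value=0).fit_transform(df)
        Ytrain = fillc[fillc.notnull()]
        Ytest = fillc[fillc.isnull()]
        if Ytest.empty:
            continue  # nothing to fill in this column
        Xtrain = dfs[Ytrain.index, :]
        Xtest = dfs[Ytest.index, :]
        model.fit(Xtrain, Ytrain)
        Ypredict = model.predict(Xtest)
        X.loc[X.iloc[:, i].isnull(), i] = Ypredict

Next comes moving-average imputation. The earlier post covers the implementation in more detail, so I won't expand on it here. The moving-average strategy comes in two forms, a simple average and a weighted average, and the only difference is that the weighted variant applies weights to the neighbouring values.

Here is how the two differ:

    # Simple average
    one_index_list = list(range(i - tmp, i)) + list(range(i + 1, i + tmp + 1))
    one_value = [new_col_list[h] for h in one_index_list]
    one_value = [O for O in one_value if not math.isnan(O)]
    new_col_list[i] = sum(one_value) / len(one_value)

    # Weighted average
    one_index_list = list(range(i - tmp, i)) + list(range(i + 1, i + tmp + 1))
    # Keep only neighbours that are not NaN so values and weights stay aligned.
    valid_index_list = [h for h in one_index_list if not math.isnan(one_col_list[h])]
    one_value = [one_col_list[h] for h in valid_index_list]
    # Closer neighbours get larger weights (inverse distance to position i).
    weight_list = [abs(1 / (h - i)) for h in valid_index_list]
    one_w = weightGenerate(weight_list)  # the author's helper, not shown in this post
    one_weight_value = [one_value[j] * one_w[j] for j in range(len(one_w))]
    new_col_list[i] = sum(one_weight_value)

The last method is Kalman-filter imputation. I implemented it mainly with the open-source pykalman module. It is simple, and there are plenty of examples online if you want to explore it yourself.

With the different imputation methods implemented, let's compare their results on real data. The dataset has 8 feature dimensions. Applying each of the imputation algorithms above to the original dataset in turn shows that the differences between them are fairly pronounced.

Because there is a lot of data, the plots can be hard to read, so I thin the dataset by a factor of 100 and compare the results visually.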
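At that point the data becomes very sparse. Next, the data is densified by a factor of 10 and the imputation algorithms are compared visually again.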
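To make the SimpleImputer strategies described earlier concrete, here is a minimal sketch that applies all four to a tiny array. The toy `data` array and the `impute` helper are assumptions made for illustration and are not part of the original project code.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data: two feature columns, NaN marks a missing value.
data = np.array([
    [1.0, 10.0],
    [np.nan, 20.0],
    [3.0, np.nan],
    [3.0, 40.0],
])


def impute(data, strategy, fill_value=None):
    """Fill NaNs column by column using the given SimpleImputer strategy."""
    imputer = SimpleImputer(
        missing_values=np.nan, strategy=strategy, fill_value=fill_value
    )
    return imputer.fit_transform(data)


zero_filled = impute(data, "constant", fill_value=0)  # NaN -> 0
mean_filled = impute(data, "mean")                    # NaN -> column mean
median_filled = impute(data, "median")                # NaN -> column median
mode_filled = impute(data, "most_frequent")           # NaN -> column mode

print(mean_filled)
```

Each strategy computes its statistic per column from the non-missing entries only, so a column that is mostly empty gets filled from very little evidence. That is worth keeping in mind for rows such as 03/29/2011 to 03/31/2011 in the sample data.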
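The simple and weighted moving-average fills described earlier can be sketched as one self-contained function. This is an illustrative version under stated assumptions, not the author's implementation. The name `fill_moving_average` is hypothetical, the weights are inverse distances normalized to sum to 1 (a guess at what the unshown `weightGenerate` helper does), and neighbours are read from the original series rather than from already-filled values.

```python
import math


def fill_moving_average(values, window=2, weighted=False):
    """Fill NaN entries from up to `window` valid neighbours on each side."""
    filled = list(values)
    n = len(values)
    for i, v in enumerate(values):
        if not math.isnan(v):
            continue
        # Non-NaN neighbours inside the window, clipped to the series bounds.
        neighbours = [
            h
            for h in range(max(0, i - window), min(n, i + window + 1))
            if h != i and not math.isnan(values[h])
        ]
        if not neighbours:
            continue  # nothing to average; leave the gap as NaN
        if weighted:
            # Assumption: closer neighbours weigh more (inverse distance).
            raw = [1.0 / abs(h - i) for h in neighbours]
        else:
            raw = [1.0] * len(neighbours)
        total = sum(raw)
        filled[i] = sum(values[h] * w / total for h, w in zip(neighbours, raw))
    return filled


series = [1.0, 2.0, float("nan"), 4.0, 8.0]
simple = fill_moving_average(series, window=2)
weighted = fill_moving_average(series, window=2, weighted=True)
print(simple[2], weighted[2])  # simple average is 3.75, weighted is about 3.5
```

Filtering the NaN neighbours before computing the weights keeps each value paired with its own weight, which matters when a gap sits next to another gap.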
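The model-based fill described earlier (fewest-missing column first, then loop) can be written as a small reusable function. This is a sketch under assumptions: the function name `model_impute` is hypothetical, `LinearRegression` stands in for the unspecified `model` in the post, and boolean masks replace index lookups so it does not depend on the DataFrame's index or column labels.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression


def model_impute(X, make_model=LinearRegression):
    """Fill each column's NaNs with predictions from the other columns."""
    X = X.copy()
    # Visit columns from fewest to most missing values.
    order = np.argsort(X.isnull().sum(axis=0).to_numpy())
    for col_pos in order:
        col = X.columns[col_pos]
        target = X[col]
        missing = target.isnull().to_numpy()
        if not missing.any() or missing.all():
            continue  # nothing to fill, or nothing to train on
        # Temporarily zero-fill the other columns so they can act as features.
        others = X.drop(columns=col)
        features = SimpleImputer(strategy="constant", fill_value=0).fit_transform(others)
        model = make_model()
        model.fit(features[~missing], target[~missing])
        X.loc[missing, col] = model.predict(features[missing])
    return X


# Hypothetical toy frame where b = 2 * a and c = a + 1.
frame = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 4.0, np.nan, 8.0, 10.0, 12.0],
    "c": [2.0, 3.0, 4.0, 5.0, np.nan, 7.0],
})
filled_frame = model_impute(frame)
print(filled_frame)
```

Any regressor with `fit` and `predict` can be passed as `make_model`. A linear model is used here only because it makes the toy example easy to check by hand.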
