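I am trying to set a column of a nested tibble as the row names and then run a kmeans model, but I am not sure how to proceed with this part.
The data looks like this: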
value_1 year_1 value_2 year_2 id_key
1 0.8572629 2006_2007 0.8352446 2006_2007 2006_2007_21267
2 0.9955628 2017_2018 0.9851993 2017_2018 2017_2018_1111711
3 0.9336878 2012_2013 0.9865080 2012_2013 2012_2013_1140536
4 0.8965862 2017_2018 0.9877127 2017_2018 2017_2018_832988
5 0.9659160 2012_2013 0.9855530 2012_2013 2012_2013_715096
6 0.9788319 2012_2013 0.5560681 2012_2013 2012_2013_875045
I apply the following code (where x is the data given at the end of the post):
library(tidyverse)

kclust <- x %>%
  as_tibble() %>%
  group_by(year_1, year_2) %>%
  nest(.key = "value")

kclust
Which gives me this output:
# A tibble: 13 x 3
year_1 year_2 value
<chr> <chr> <list>
1 2006_2007 2006_2007 <tibble [3 x 3]>
2 2017_2018 2017_2018 <tibble [11 x 3]>
3 2012_2013 2012_2013 <tibble [11 x 3]>
4 2010_2011 2010_2011 <tibble [11 x 3]>
5 2014_2015 2014_2015 <tibble [12 x 3]>
6 2011_2012 2011_2012 <tibble [9 x 3]>
7 2013_2014 2013_2014 <tibble [11 x 3]>
8 2016_2017 2016_2017 <tibble [7 x 3]>
9 2009_2010 2009_2010 <tibble [6 x 3]>
10 2008_2009 2008_2009 <tibble [3 x 3]>
11 2007_2008 2007_2008 <tibble [7 x 3]>
12 2015_2016 2015_2016 <tibble [5 x 3]>
13 2018_2019 2018_2019 <tibble [4 x 3]>
Inspecting kclust$value gives me:
[[12]]
# A tibble: 5 x 3
value_1 value_2 id_key
<dbl> <dbl> <chr>
1 0.943 0.887 2015_2016_1024478
2 0.861 0.571 2015_2016_816284
3 0.759 0.959 2015_2016_1260221
4 0.756 0.921 2015_2016_101829
5 0.981 0.936 2015_2016_709519
[[13]]
# A tibble: 4 x 3
value_1 value_2 id_key
<dbl> <dbl> <chr>
1 0.927 0.959 2018_2019_6201
2 0.888 0.950 2018_2019_1274494
3 0.962 0.995 2018_2019_1011657
4 0.982 0.921 2018_2019_78814
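In each of these I want to set id_key as the row names. So every nested tibble would have id_key as its row names, and the kmeans model would be run on the columns value_1 and value_2.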
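To make the goal concrete, here is a minimal sketch of what I am after. It assumes a made-up toy data frame (toy) with the same column layout as x, the default list-column name data from nest(), and a helper column name mat that I invented for illustration. Tibbles do not keep row names, so each nested piece is converted to a plain data.frame before id_key is moved into the row names.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# Toy stand-in for `x`: made-up values, same columns as the real data
set.seed(1)
toy <- data.frame(
  value_1 = runif(12),
  value_2 = runif(12),
  year_1  = rep(c("2012_2013", "2013_2014"), each = 6),
  stringsAsFactors = FALSE
)
toy$year_2 <- toy$year_1
toy$id_key <- paste(toy$year_1, 1:12, sep = "_")

nested <- toy %>%
  group_by(year_1, year_2) %>%
  nest() %>%      # the list-column is called `data` by default
  ungroup() %>%
  mutate(
    # tibbles drop row names, so go through a plain data.frame
    mat = map(data, ~ column_to_rownames(as.data.frame(.x), "id_key")),
    # cluster on the two numeric columns only
    kmeans = map(mat, ~ kmeans(.x[, c("value_1", "value_2")], centers = 2))
  )

rownames(nested$mat[[1]])          # the id_key values of the first group
names(nested$kmeans[[1]]$cluster)  # kmeans carries the row names over
```

If this is on the right track, the cluster vector of each fit is named by id_key, which is the reason I want the row names in the first place.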
The code I currently have is:
library(broom)  # tidy(), glance(), augment()

k_means_centers = 2

kclust <- x %>%
  as_tibble() %>%
  group_by(year_1, year_2) %>%
  nest(.key = "value") %>%
  filter(map_int(value, nrow) > 4) %>%
  mutate(kmeans = map(value, ~kmeans(.x[[1]],
                                     centers = k_means_centers, iter.max = 10, nstart = 1)),
         tidied = map(kmeans, tidy),
         glanced = map(kmeans, glance),
         augmented = map2(kmeans, value, augment))
But this is wrong, because the id_key column should be set as the row names.
Any ideas on how to move forward would be great!
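One thing I noticed while writing this up: .x[[1]] extracts only the first column of each nested tibble (value_1), so the model above never sees value_2. A possible alternative that avoids row names altogether is sketched below, assuming the same kind of made-up toy data (toy) and two column names I invented for illustration (clustered, cluster). It keeps id_key as an ordinary column and attaches the cluster assignment next to it.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Toy stand-in for `x`: made-up values, same columns as the real data
set.seed(42)
toy <- data.frame(
  value_1 = runif(12),
  value_2 = runif(12),
  year_1  = rep(c("2012_2013", "2013_2014"), each = 6),
  stringsAsFactors = FALSE
)
toy$year_2 <- toy$year_1
toy$id_key <- paste(toy$year_1, 1:12, sep = "_")

fits <- toy %>%
  group_by(year_1, year_2) %>%
  nest() %>%
  ungroup() %>%
  filter(map_int(data, nrow) > 4) %>%
  mutate(
    # select both numeric columns instead of .x[[1]]
    kmeans = map(data, ~ kmeans(select(.x, value_1, value_2),
                                centers = 2, iter.max = 10, nstart = 1)),
    # put the cluster label back beside id_key
    clustered = map2(data, kmeans, ~ mutate(.x, cluster = .y$cluster))
  )

fits$clustered[[1]]
```

Each element of clustered then has one row per id_key with its cluster number, but I would still like to know how to do it with row names.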
Data:
x <- structure(list(value_1 = c(0.857262918412708, 0.995562776151855,
0.93368775296229, 0.896586197519892, 0.965915992432594, 0.978831872921186,
0.931391986938977, 0.92860612171699, 0.942462944742556, 0.762633664061804,
0.929314203239609, 0.857555211754759, 0.942672735583934, 0.975237093000455,
0.472863198177383, 0.83842400391849, 0.526669740477171, 0.952190151229782,
0.519623395661802, 0.981457763792911, 0.91428464980769, 0.954400181141033,
0.840051106034647, 0.867699181854421, 0.89115631348807, 0.729514655086613,
0.659568217442908, 0.955200325383147, 0.88820423579156, 0.777402590109491,
0.943172514612716, 0.944933061146504, 0.476284928268558, 0.946901135343463,
0.780230224813699, 0.909629399821505, 0.865760792222491, 0.773621382484436,
0.836554542942252, 0.850980529158788, 0.527814114655505, 0.90791799831592,
0.882265024462087, 0.952685269154299, 0.891211744870977, 0.976456274127145,
0.90126924436977, 0.969111672067112, 1, 1, 0.968113370208161,
0.916126244980983, 0.933883953373501, 0.980900126347656, 0.924480004726964,
0.967149874304775, 1, 0.933612247290514, 0.982568394222027, 0.987764537202365,
0.898088994752593, 0.943973029406048, 0.926659428845797, 0.982663249602368,
0.0116889359524374, 0.985030805938289, 0.888240767289578, 0.779528122930639,
0.99485244406698, 0.82816655776856, 0.861023758791347, 1, 0.664694109407606,
0.960818825051411, 0.75945031696856, 0.763886276158968, 0.835629553075541,
0.846110310875411, 0.755847711756697, 0.67196568780797, 0.961888544525641,
0.969861418360086, 1, 0.974427258663929, 0.831055915618247, 0.93600722049079,
0.966106024456998, 0.901338903666627, 0.683877040965164, 0.979457749060513,
0.943143340661096, 1, 0.98087163513475, 0.769732004988478, 0.968733750777874,
0.937393276158036, 0.982135855939805, 1, 0.00987183002576516,
0.764641730481701), year_1 = c("2006_2007", "2017_2018", "2012_2013",
"2017_2018", "2012_2013", "2012_2013", "2012_2013", "2010_2011",
"2014_2015", "2011_2012", "2013_2014", "2016_2017", "2013_2014",
"2011_2012", "2011_2012", "2013_2014", "2009_2010", "2009_2010",
"2013_2014", "2010_2011", "2016_2017", "2016_2017", "2017_2018",
"2013_2014", "2014_2015", "2014_2015", "2017_2018", "2008_2009",
"2007_2008", "2012_2013", "2015_2016", "2017_2018", "2007_2008",
"2012_2013", "2014_2015", "2009_2010", "2006_2007", "2010_2011",
"2010_2011", "2014_2015", "2012_2013", "2017_2018", "2006_2007",
"2013_2014", "2009_2010", "2007_2008", "2012_2013", "2010_2011",
"2014_2015", "2017_2018", "2017_2018", "2011_2012", "2013_2014",
"2016_2017", "2013_2014", "2014_2015", "2011_2012", "2013_2014",
"2010_2011", "2012_2013", "2007_2008", "2017_2018", "2018_2019",
"2013_2014", "2012_2013", "2014_2015", "2018_2019", "2009_2010",
"2008_2009", "2010_2011", "2015_2016", "2010_2011", "2011_2012",
"2007_2008", "2015_2016", "2008_2009", "2011_2012", "2013_2014",
"2015_2016", "2010_2011", "2018_2019", "2016_2017", "2016_2017",
"2011_2012", "2016_2017", "2010_2011", "2017_2018", "2014_2015",
"2007_2008", "2014_2015", "2009_2010", "2017_2018", "2015_2016",
"2010_2011", "2014_2015", "2012_2013", "2018_2019", "2007_2008",
"2011_2012", "2014_2015"), value_2 = c(0.83524458245376, 0.985199346676161,
0.98650800423171, 0.987712680219121, 0.985552973109259, 0.55606807703455,
0.993081550565629, 0.942324451054759, 0.874951001978959, 0.972242235849801,
0.960561835073607, 0.745948805820105, 0.797055662541724, 0.977508894088148,
0.712233681864871, 0.285060053385682, 0.905730331400375, 0.93571084346821,
0.790305033705714, 0.958722926473936, 0.962776635511766, 0.992608325470545,
0.474283965476535, 0.806366773701265, 0.904730345643149, 0.862254279087857,
0.984488707157245, 0.892241046229236, 0.714442964628943, 0.807622124741829,
0.887170731681905, 0.954684589806249, 0.9211778417945, 0.948974567771373,
0.965125469708914, 0.886108424878785, 0.942065878654209, 0.66663307765255,
0.90331177434957, 0.976829922293502, 0.95848533971269, 0.956127315051688,
0.650750852737616, 0.9999724828739, 0.826005013210071, 0.959980346940766,
0.978304048122191, 0.975422514331076, 0.792199553496305, 0.461104040127036,
0.997962170857627, 0.968897881428091, 0.820571356084491, 0.99183854174536,
0.937073215517585, 0.993271661681666, 0.862602069969553, 0.941823773454386,
0.984268864412331, 0.983876968226894, 0.760177556170661, 0.926285514876429,
0.959350334184441, 0.996409752091077, 0.0403289662596409, 0.994749999506845,
0.950154051313514, 0.916520797550305, 0.728271849187279, 0.89835975825379,
0.571018894293857, 0.971731331958454, 0.810499095029711, 0.887497351434693,
0.958925181726699, 0.893189038016587, 0.875143741543741, 0.833284214217249,
0.921240338805686, 0.926586130283117, 0.994798572238072, 0.980971763292719,
0.964016005572769, 0.989376580856801, 0.935519257737914, 0.922845605574439,
0.996381524259124, 0.0351359902186695, 0.953643584869029, 0.937802352885434,
0.902249244386311, 0.719887783612443, 0.936028294902931, 0.809292272844584,
0.974049350800454, 0.781649033147858, 0.920733566350649, 0.998781417653825,
0.0617975853732401, 0.883026179946989), year_2 = c("2006_2007",
"2017_2018", "2012_2013", "2017_2018", "2012_2013", "2012_2013",
"2012_2013", "2010_2011", "2014_2015", "2011_2012", "2013_2014",
"2016_2017", "2013_2014", "2011_2012", "2011_2012", "2013_2014",
"2009_2010", "2009_2010", "2013_2014", "2010_2011", "2016_2017",
"2016_2017", "2017_2018", "2013_2014", "2014_2015", "2014_2015",
"2017_2018", "2008_2009", "2007_2008", "2012_2013", "2015_2016",
"2017_2018", "2007_2008", "2012_2013", "2014_2015", "2009_2010",
"2006_2007", "2010_2011", "2010_2011", "2014_2015", "2012_2013",
"2017_2018", "2006_2007", "2013_2014", "2009_2010", "2007_2008",
"2012_2013", "2010_2011", "2014_2015", "2017_2018", "2017_2018",
"2011_2012", "2013_2014", "2016_2017", "2013_2014", "2014_2015",
"2011_2012", "2013_2014", "2010_2011", "2012_2013", "2007_2008",
"2017_2018", "2018_2019", "2013_2014", "2012_2013", "2014_2015",
"2018_2019", "2009_2010", "2008_2009", "2010_2011", "2015_2016",
"2010_2011", "2011_2012", "2007_2008", "2015_2016", "2008_2009",
"2011_2012", "2013_2014", "2015_2016", "2010_2011", "2018_2019",
"2016_2017", "2016_2017", "2011_2012", "2016_2017", "2010_2011",
"2017_2018", "2014_2015", "2007_2008", "2014_2015", "2009_2010",
"2017_2018", "2015_2016", "2010_2011", "2014_2015", "2012_2013",
"2018_2019", "2007_2008", "2011_2012", "2014_2015"), id_key = c("2006_2007_21267",
"2017_2018_1111711", "2012_2013_1140536", "2017_2018_832988",
"2012_2013_715096", "2012_2013_875045", "2012_2013_891024", "2010_2011_815556",
"2014_2015_39911", "2011_2012_1123360", "2013_2014_916365", "2016_2017_26172",
"2013_2014_732485", "2011_2012_1551152", "2011_2012_1095073",
"2013_2014_709804", "2009_2010_65100", "2009_2010_1018963", "2013_2014_20388",
"2010_2011_1115222", "2016_2017_1646383", "2016_2017_1567892",
"2017_2018_1368007", "2013_2014_205520", "2014_2015_851968",
"2014_2015_46989", "2017_2018_1116521", "2008_2009_23217", "2007_2008_79879",
"2012_2013_709804", "2015_2016_1024478", "2017_2018_1062379",
"2007_2008_877890", "2012_2013_51396", "2014_2015_1064728", "2009_2010_1026214",
"2006_2007_58492", "2010_2011_820027", "2010_2011_2488", "2014_2015_1458891",
"2012_2013_40545", "2017_2018_1326801", "2006_2007_36270", "2013_2014_1140536",
"2009_2010_1396009", "2007_2008_42582", "2012_2013_1637459",
"2010_2011_794323", "2014_2015_4904", "2017_2018_701221", "2017_2018_934612",
"2011_2012_47111", "2013_2014_352947", "2016_2017_1613103", "2013_2014_34408",
"2014_2015_890801", "2011_2012_875570", "2013_2014_812074", "2010_2011_1466258",
"2012_2013_723612", "2007_2008_5513", "2017_2018_30625", "2018_2019_6201",
"2013_2014_1024305", "2012_2013_1466258", "2014_2015_814453",
"2018_2019_1274494", "2009_2010_2488", "2008_2009_106640", "2010_2011_33213",
"2015_2016_816284", "2010_2011_1267238", "2011_2012_1451505",
"2007_2008_38777", "2015_2016_1260221", "2008_2009_1001082",
"2011_2012_817473", "2013_2014_1274057", "2015_2016_101829",
"2010_2011_1020569", "2018_2019_1011657", "2016_2017_789388",
"2016_2017_1004434", "2011_2012_1156039", "2016_2017_1350031",
"2010_2011_205520", "2017_2018_42582", "2014_2015_812074", "2007_2008_1039684",
"2014_2015_751652", "2009_2010_1047699", "2017_2018_101829",
"2015_2016_709519", "2010_2011_861878", "2014_2015_832428", "2012_2013_74208",
"2018_2019_78814", "2007_2008_922224", "2011_2012_314808", "2014_2015_3673"
)), row.names = c(NA, -100L), class = "data.frame")
EDIT:
I tried rownames_to_column() without any luck.
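Possibly I had the direction backwards: in the tibble package, rownames_to_column() moves existing row names into a new column, while column_to_rownames() does the opposite. A small sketch of the difference, using the first three rows of the [[12]] group shown above (the names df, wrong and right are just for illustration):

```r
library(tibble)

# First three rows of the 2015_2016 group printed earlier
df <- data.frame(
  value_1 = c(0.943, 0.861, 0.759),
  value_2 = c(0.887, 0.571, 0.959),
  id_key  = c("2015_2016_1024478", "2015_2016_816284", "2015_2016_1260221"),
  stringsAsFactors = FALSE
)

# rownames_to_column(): row names -> new column (here only "1", "2", "3")
wrong <- rownames_to_column(df, var = "rowname")

# column_to_rownames(): existing column -> row names; the column is dropped
right <- column_to_rownames(df, var = "id_key")

rownames(right)
colnames(right)
```

Note that column_to_rownames() returns a plain data.frame, since a tibble cannot hold row names.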