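I have collected a dataset of players' games, each rated by game performance, and my goal is to group the players according to their density curves. Below is a plot with the density curves of five random participants. We can see that the red and purple density curves are similar, but is there a metric or function in R that could help me group the 50 curves in my plot? I would like to divide them into at least 3-4 groups; for example, the first group would contain the red and purple lines, representing players who have more good games than bad ones.
Simple code for the plot and sample data. Columns: TOTAL.QBR is the player's game rating; ID is unique for each player.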
library(sm)
newdata <- structure(list(TOTAL.QBR = c(16.9, 21.1, 22.1, 34.8, 53.6, 58.3,
82, 13, 13.6, 16.3, 18.2, 23.9, 27.3, 27.6, 28.4, 28.6, 29.1,
32.3, 33.1, 34.8, 36.4, 37.1, 37.2, 37.7, 37.8, 39.3, 40.4, 41.8,
42.4, 44.3, 44.5, 47, 47.3, 48.1, 48.7, 49, 49.3, 49.4, 49.8,
50.2, 51, 51.5, 53.5, 53.5, 53.5, 53.6, 55.4, 55.5, 56, 56.3,
56.5, 57, 57.3, 59.6, 60.5, 62.2, 62.8, 62.8, 63.3, 64.5, 65.4,
65.5, 66.4, 67.2, 67.3, 67.4, 67.6, 68.9, 68.9, 69.1, 69.3, 69.9,
71.5, 71.9, 72.5, 72.9, 73.4, 74.1, 74.4, 74.9, 75.2, 75.2, 75.8,
75.9, 76.6, 77.1, 78.2, 78.2, 78.4, 78.6, 78.7, 79.3, 79.8, 79.9,
81.6, 81.8, 82.4, 82.4, 82.9, 82.9, 83.2, 83.5, 83.6, 83.7, 84,
84.4, 84.7, 84.8, 84.9, 85.7, 87.4, 87.8, 88.1, 88.4, 88.5, 88.8,
89.1, 89.4, 89.4, 89.4, 89.5, 90.9, 92.2, 92.7, 93, 93.5, 93.6,
94.9, 95.1, 95.4, 95.6, 96.6, 96.7, 97.6, 97.8, 98.4, 98.6, 99.4,
26.2, 34.6, 42.7, 87, 1.4, 5, 6.8, 7.7, 8.3, 9.6, 9.9, 10.1,
11.1, 12.5, 12.7, 12.9, 13.1, 13.9, 14.3, 14.6, 15.8, 16.6, 19.3,
20.8, 22.8, 23, 23.5, 26.4, 27.2, 28, 30.3, 30.8, 30.9, 31.4,
31.5, 31.8, 32.1, 32.5, 32.6, 33.3, 33.3, 33.9, 34.6, 34.7, 34.7,
34.8, 35.8, 36.8, 37.2, 37.9, 39.9, 40.4, 40.8, 41.1, 41.8, 43.7,
44.2, 45, 45.9, 46.9, 47.2, 47.3, 47.6, 47.8, 49.1, 49.1, 49.3,
51.7, 53.3, 53.6, 54.2, 56.2, 56.4, 56.4, 56.6, 56.9, 59.2, 59.6,
59.7, 59.8, 60, 60.6, 60.8, 61.6, 62, 62.8, 62.9, 62.9, 63.7,
64.5, 65, 65, 65.9, 67.5, 67.6, 67.6, 68.4, 68.8, 69.3, 70.4,
71.2, 71.9, 74, 74.3, 75.4, 76.6, 77.4, 77.5, 77.7, 78, 79.2,
79.2, 79.4, 80.1, 80.7, 81.7, 81.8, 82.7, 83.6, 85, 85.5, 86.7,
87.5, 88.5, 90.1, 90.3, 90.6, 90.7, 92.4, 92.4, 93.1, 93.5, 95.3,
95.7, 95.9, 98.2, 8.5, 9.1, 13.7, 17.6, 23.3, 26.1, 26.4, 27.1,
29.7, 30.4, 33, 36.1, 37.9, 41.5, 42.7, 45, 45.5, 45.5, 46, 46.9,
49.9, 50.6, 50.7, 53.4, 53.5, 56.3, 57.4, 57.9, 58.6, 58.7, 60.2,
60.7, 61.8, 62.2, 62.9, 64.4, 65.6, 66.7, 67, 67.4, 68.2, 68.2,
68.8, 72, 72.9, 73.2, 74.5, 75.8, 76.4, 76.6, 77.6, 79.9, 80.4,
81.2, 81.4, 81.8, 82.8, 84.6, 85.1, 87.2, 87.9, 90, 91.7, 94.5,
94.8, 95.5, 98.4, 1.4, 1.6, 5.6, 9.4, 17.6, 19.5, 24.5, 34.9,
39.6, 42.4, 59, 1, 2.6, 3.1, 5.5, 7.1, 14.8, 15.9, 19.5, 20.1,
21.8, 21.9, 22, 22.8, 25.7, 26.5, 26.5, 26.6, 27.2, 27.8, 28.6,
31, 31.6, 32.4, 34.1, 36, 36.5, 37.6, 38.2, 38.2, 39, 39.1, 39.2,
39.7, 40.3, 40.3, 40.6, 40.6, 41.2, 44.4, 46.2, 47, 47.1, 47.2,
47.4, 47.7, 48, 48.3, 48.5, 48.7, 49.9, 50.4, 50.7, 51.2, 52,
52.2, 52.2, 52.6, 53.7, 55.7, 55.8, 57.6, 58.8, 61.3, 62.4, 62.4,
63.4, 63.6, 63.7, 63.9, 64.6, 66.3, 67.1, 68.5, 69.4, 69.6, 69.9,
70.5, 71.8, 75.8, 76.4, 77.2, 77.4, 78.8, 79.2, 80.5, 81.7, 82.1,
82.4, 83.2, 83.7, 85.4, 89, 90.5, 90.5, 91.1, 92.2, 92.4, 94.9,
96.9, 98.1, 4.1, 8.3, 10.3, 11.4, 12.2, 12.2, 12.9, 14.4, 18.5,
18.7, 18.7, 19.8, 20.9, 21.8, 22.7, 25.9, 25.9, 26, 28.3, 28.3,
28.5, 28.8, 30.9, 32.3, 33.7, 34, 34.5, 36.6, 36.7, 37.8, 38.3,
38.5, 39.9, 41.4, 43.4, 43.6, 44, 44.2, 45.1, 45.4, 45.5, 45.6,
46.3, 46.3, 47.1, 47.5, 48.8, 49.7, 49.9, 51.4, 52.2, 53.1, 54.5,
54.8, 55.8, 56, 56.2, 57.7, 58.2, 58.3, 58.3, 58.8, 58.9, 60.5,
60.6, 60.8, 61.4, 61.4, 61.6, 61.7, 63.6, 64.4, 64.5, 64.7, 64.8,
65, 65.4, 65.6, 65.6, 65.8, 66.2, 66.5, 69.6, 69.7, 69.8, 70.1,
70.4, 71, 71.3, 71.7, 71.7, 71.9, 72.5, 73.4, 73.4, 74.4, 74.7,
74.9, 75.2, 75.5, 76, 76.3, 76.5, 76.6, 76.8, 77.2, 78.1, 78.3,
78.7, 79, 79.7, 79.9, 81, 81.2, 81.3, 81.6, 82.6, 82.7, 82.9,
83, 83.5, 83.5, 83.6, 83.8, 83.8, 84, 84.4, 84.5, 85.3, 85.3,
85.3, 85.5, 86.2, 86.2, 86.6, 86.6, 86.7, 87, 87.1, 87.2, 87.4,
87.9, 88.3, 88.4, 88.5, 89, 89.4, 90.4, 90.5, 91.1, 91.4, 92.3,
93.8, 94.7, 95.2, 95.6, 95.9, 96, 96.2, 96.3, 97.4, 98.9, 99.1,
99.2), ID = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L)), row.names = c(5197L, 5571L, 5513L, 5633L, 5533L,
5476L, 5432L, 4493L, 2658L, 3748L, 1245L, 1357L, 1960L, 1800L,
631L, 3470L, 3778L, 4291L, 1064L, 2019L, 4458L, 3391L, 892L,
710L, 1374L, 1163L, 4697L, 1591L, 2537L, 1403L, 3986L, 3419L,
2617L, 4103L, 2705L, 4601L, 1473L, 3808L, 2399L, 48L, 3959L,
3437L, 1176L, 2426L, 4350L, 1982L, 2327L, 2867L, 734L, 4665L,
2366L, 528L, 2453L, 405L, 80L, 647L, 1914L, 4626L, 3050L, 4163L,
587L, 1083L, 3103L, 939L, 1281L, 3882L, 2038L, 1313L, 4009L,
3204L, 9L, 839L, 4538L, 1255L, 1882L, 1430L, 2920L, 4504L, 1817L,
2795L, 2759L, 2884L, 4067L, 4245L, 3167L, 2667L, 104L, 1127L,
1689L, 1488L, 999L, 3601L, 750L, 5022L, 1720L, 1195L, 3286L,
3925L, 810L, 3077L, 486L, 3315L, 1519L, 2965L, 4128L, 4031L,
967L, 1750L, 2549L, 3568L, 4559L, 3015L, 1023L, 3347L, 4365L,
665L, 2818L, 1547L, 4214L, 4304L, 3130L, 2725L, 3535L, 2936L,
777L, 3692L, 1655L, 3899L, 899L, 2469L, 4391L, 1840L, 3480L,
546L, 2059L, 3221L, 4183L, 1628L, 1380L, 1327L, 1295L, 1335L,
5370L, 2315L, 1742L, 4987L, 2495L, 3995L, 1415L, 1015L, 3689L,
544L, 4931L, 4958L, 2959L, 91L, 1044L, 2517L, 5546L, 4146L, 1991L,
2022L, 3968L, 3182L, 4112L, 1067L, 2876L, 4792L, 3304L, 5389L,
5510L, 3448L, 5309L, 313L, 4023L, 1957L, 1832L, 893L, 3120L,
2153L, 1698L, 4725L, 4818L, 5601L, 3367L, 741L, 2073L, 144L,
3329L, 283L, 378L, 2901L, 3932L, 1762L, 201L, 1324L, 5415L, 5188L,
2098L, 228L, 5444L, 4167L, 847L, 5628L, 1893L, 3417L, 501L, 1175L,
5251L, 1557L, 3052L, 5215L, 1086L, 1585L, 791L, 3028L, 1367L,
44L, 974L, 3639L, 1637L, 1262L, 555L, 1282L, 2206L, 3234L, 3085L,
1784L, 1912L, 2392L, 2172L, 673L, 423L, 455L, 3729L, 1466L, 4070L,
641L, 3135L, 3379L, 2035L, 1428L, 1663L, 3201L, 1098L, 1607L,
1196L, 395L, 2231L, 2356L, 2422L, 935L, 1485L, 4033L, 904L, 5554L,
156L, 2819L, 609L, 2990L, 2260L, 1332L, 2L, 1220L, 809L, 2910L,
747L, 2523L, 98L, 5263L, 324L, 3897L, 1120L, 2443L, 1838L, 1117L,
2130L, 689L, 1706L, 1040L, 2183L, 2220L, 1013L, 2489L, 1862L,
1563L, 2680L, 2100L, 2708L, 2737L, 848L, 1789L, 1134L, 1402L,
1588L, 561L, 1761L, 944L, 978L, 915L, 2766L, 2365L, 589L, 2007L,
2610L, 2269L, 2452L, 1915L, 2530L, 2326L, 1492L, 698L, 525L,
2236L, 618L, 2503L, 2294L, 2639L, 489L, 1429L, 1079L, 2139L,
1608L, 2384L, 2034L, 1722L, 1658L, 749L, 2420L, 2792L, 2581L,
2550L, 1942L, 1517L, 637L, 1971L, 869L, 1145L, 1452L, 1874L,
806L, 4495L, 5369L, 4617L, 5395L, 5228L, 5573L, 5287L, 5600L,
5418L, 5303L, 5328L, 31L, 2163L, 1653L, 416L, 2749L, 1804L, 2909L,
181L, 772L, 1565L, 60L, 2514L, 1836L, 2715L, 802L, 3152L, 381L,
1478L, 742L, 2491L, 2775L, 2461L, 3058L, 2375L, 256L, 2101L,
1136L, 1732L, 2123L, 596L, 682L, 1406L, 1926L, 438L, 470L, 1185L,
2215L, 1985L, 1955L, 1500L, 1857L, 2834L, 887L, 3113L, 310L,
2275L, 946L, 3174L, 84L, 535L, 503L, 3089L, 3267L, 2331L, 1440L,
1613L, 917L, 193L, 222L, 2866L, 2998L, 2611L, 1581L, 840L, 2673L,
554L, 1154L, 3233L, 2557L, 1003L, 1197L, 1692L, 106L, 2640L,
334L, 1053L, 2945L, 2387L, 2000L, 1753L, 2291L, 3020L, 1880L,
2916L, 2230L, 1253L, 639L, 266L, 968L, 1518L, 2577L, 2414L, 606L,
2030L, 1022L, 1218L, 1073L, 1654L, 807L, 2058L, 635L, 4442L,
5015L, 2782L, 4299L, 5261L, 715L, 862L, 5488L, 4464L, 4614L,
2077L, 4114L, 5227L, 1900L, 1356L, 1536L, 1445L, 4174L, 5512L,
1140L, 1649L, 2250L, 87L, 2844L, 5361L, 113L, 2339L, 3211L, 1733L,
3906L, 1860L, 3987L, 1161L, 2119L, 3442L, 141L, 764L, 2455L,
5597L, 1954L, 3860L, 2952L, 4513L, 3466L, 5095L, 5066L, 2704L,
5274L, 3522L, 3612L, 2561L, 3957L, 3580L, 558L, 306L, 4348L,
3264L, 5380L, 1495L, 4568L, 2093L, 4540L, 4484L, 1916L, 2997L,
1230L, 1399L, 3636L, 527L, 1467L, 336L, 2732L, 2296L, 5438L,
246L, 4661L, 3104L, 4247L, 3675L, 186L, 3137L, 813L, 3701L, 2140L,
877L, 2169L, 726L, 2886L, 1665L, 2587L, 7L, 3487L, 2001L, 2036L,
1026L, 4745L, 908L, 3022L, 4622L, 5023L, 271L, 3759L, 2357L,
4395L, 1308L, 2475L, 2499L, 2793L, 583L, 4064L, 1813L, 361L,
4004L, 3877L, 3046L, 157L, 3165L, 4833L, 487L, 419L, 451L, 5401L,
2419L, 3406L, 4966L, 4127L, 1361L, 34L, 966L, 1548L, 3789L, 1603L,
1777L, 4710L, 4803L, 5526L, 5612L, 1278L, 1747L, 2850L, 391L,
2911L, 4770L, 4884L, 4185L, 5550L, 3820L, 1168L, 2964L, 2195L,
5315L, 4912L, 3534L, 2520L, 775L, 2257L, 2381L, 1247L, 993L,
5291L, 3722L, 4934L, 1572L), class = "data.frame")
sm.density.compare(newdata$TOTAL.QBR, newdata$ID, ylim=c(0,0.030))
Answer 0: (score: 1)
If there are n curves, represent each curve by its values at k points, forming an n x k matrix X. Then run
set.seed(123)
kmeans(X, g, nstart = 25)
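where the input g is the desired number of groups. This runs the clustering 25 times from 25 different random starting points and returns the best of those runs, that is, the one with the lowest total within-cluster sum of squares.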
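As a rough illustration of how such an n x k matrix might be built by hand, the sketch below uses only base R. The `scores` list is made-up data standing in for TOTAL.QBR split by ID, and the shared bandwidth and the 0-100 grid are assumptions made so that the rows are comparable; none of this comes from the question's data.

```r
# Sketch: build the n x k matrix X with base R, then cluster the curves.
# `scores` is hypothetical data, not the data from the question.
set.seed(123)
scores <- list(
  a = rbeta(80, 5, 2) * 100,  # mostly high ratings
  b = rbeta(80, 5, 2) * 100,
  c = rbeta(80, 2, 5) * 100,  # mostly low ratings
  d = rbeta(80, 2, 5) * 100,
  e = runif(80, 0, 100),      # roughly flat
  f = runif(80, 0, 100)
)

k <- 50  # number of evaluation points per curve

# Use one grid (from/to/n) and one shared bandwidth for every player,
# so that column j means the same rating value in every row.
bw <- mean(sapply(scores, bw.nrd0))
X <- t(sapply(scores, function(x)
  density(x, bw = bw, from = 0, to = 100, n = k)$y))

g <- 3  # desired number of groups
km <- kmeans(X, centers = g, nstart = 25)

# List the players that ended up in each group.
print(split(names(scores), km$cluster))
```

Evaluating every curve on the same grid matters here: k-means compares rows column by column, so each column has to refer to the same rating value.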
Data and some code have been added to the question since the above was posted, so using those:
result <- sm.density.compare(newdata$TOTAL.QBR, newdata$ID, ylim=c(0, 0.030),
model = "equal")
X <- result$est
set.seed(123)
km <- kmeans(X, 3, nstart = 25)
# plot each group in a separate color
ts.plot(t(X), col = km$cl)
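An automated way to find g is to use the NbClust package. It estimates g using a number of different indices, and we can take the most frequently occurring value.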
library(NbClust)
res <- NbClust(t(X), distance = "euclidean", min.nc = 2, max.nc = 8,
method = "kmeans")
tab <- table(res$Best.nc[1, ])
g <- as.numeric(names(tab[which.max(tab)]))
g
## [1] 3
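For a quick sanity check without extra packages, one could also look at how the total within-cluster sum of squares drops as g grows (the "elbow" heuristic). This is a different and much cruder technique than what NbClust does, and the matrix X below is simulated from three made-up curve families rather than taken from the sm output.

```r
# Sketch: elbow check with base R only; X is simulated, not the sm output.
set.seed(123)
grid <- seq(0.01, 0.99, length.out = 50)
shapes <- list(c(5, 2), c(2, 5), c(3, 3))  # three hypothetical curve families

# 10 noisy curves per family, one curve per row: a 30 x 50 matrix.
X <- do.call(rbind, lapply(shapes, function(s)
  t(replicate(10, dbeta(grid, s[1], s[2]) + rnorm(50, sd = 0.05)))))

gs <- 2:8  # same range as min.nc / max.nc above
wss <- sapply(gs, function(g)
  kmeans(X, centers = g, nstart = 25)$tot.withinss)

# A sharp drop followed by a flat tail suggests where to stop adding groups.
print(data.frame(g = gs, wss = round(wss, 2)))
```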
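Answer 1: (score: 1)
In general, this can be treated as a time series clustering problem. All kinds of clustering and preprocessing approaches exist for it.
Since you are not asking for a specific algorithm (metric) but for functionality in R, here is a survey of many relevant packages that can handle the problem in R. Just make sure you understand exactly what your goal is and which clustering algorithm makes sense for your application: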
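The flexclust package (Leisch 2006) implements many partitional procedures, while the cluster package (Maechler et al. 2017) focuses more on hierarchical procedures and their evaluation; neither of them, however, is specifically targeted at time-series data. Packages like TSdist (Mori et al. 2016) and TSclust (Montero and Vilar 2014) focus solely on dissimilarity measures for time series, the latter of which includes a single clustering algorithm based on p values. Another example is the pdc package (Brandmaier 2015), which implements a specific clustering algorithm, namely one based on permutation distributions. The dtw package (Giorgino 2009) implements extensive functionality with respect to DTW, but does not include the lower bound techniques that can be very useful in time-series clustering. Therefore, the dtwclust package may be the way to go, as it provides a consistent and user-friendly way of interacting with classic and new clustering algorithms, taking into consideration the nuances of time-series data.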
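To make the hierarchical option concrete, here is a minimal sketch that uses only base R (`dist`, `hclust`, `cutree` from stats) instead of any of the packages surveyed above. The six curves are made-up beta densities on a common grid, and plain Euclidean distance with average linkage is an assumption; a time-series-specific dissimilarity such as DTW could be swapped in.

```r
# Sketch: hierarchical clustering of density curves with base R (stats only).
# The curves are hypothetical beta densities evaluated on a shared grid.
grid <- seq(0.01, 0.99, length.out = 50)
X <- rbind(
  high1 = dbeta(grid, 5, 2), high2 = dbeta(grid, 6, 2),  # mass at high ratings
  low1  = dbeta(grid, 2, 5), low2  = dbeta(grid, 2, 6),  # mass at low ratings
  mid1  = dbeta(grid, 4, 4), mid2  = dbeta(grid, 5, 5)   # mass in the middle
)

d  <- dist(X)                        # Euclidean distance between rows (curves)
hc <- hclust(d, method = "average")  # agglomerative, average linkage
groups <- cutree(hc, k = 3)          # cut the tree into 3 groups

print(groups)
# plot(hc) would draw the dendrogram for visual inspection.
```

Unlike k-means, the dendrogram can be inspected first and cut afterwards, so the number of groups does not have to be fixed in advance.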