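I have a dataset about movies. I hope the dput output below is enough for someone to recreate it. If this is not an effective way to share data, please let me know. I tried to use cor to compute the correlation matrix between the variables, but I get the following error:
Error in cor(movie.cpi) : 'x' must be numeric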
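I don't understand this error, because I have used cor on qualitative variables before. However, those earlier variables were factors with only two levels, whereas variables such as genres and content rating have more than two levels.
This is a very large dataset, and it contains qualitative variables such as genres and content rating. Could someone write a function that takes two column indices and returns the correlation between those two columns?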
> dput(droplevels(head(movie.cpi, 6)))
structure(list(num_critic_for_reviews = c(723L, 302L, 813L, 462L,
392L, 324L), director_facebook_likes = c(0L, 563L, 22000L, 475L,
0L, 15L), actor_3_facebook_likes = c(855L, 1000L, 23000L, 530L,
4000L, 284L), actor_1_facebook_likes = c(1000L, 40000L, 27000L,
640L, 24000L, 799L), gross = c(866161204.765035, 364628240.876025,
476821933.103659, 77736216.375, 396596010.723107, 224929913.372765
), genres = structure(c(2L, 1L, 5L, 4L, 3L, 6L), .Label = c("Action|Adventure|Fantasy",
"Action|Adventure|Fantasy|Sci-Fi", "Action|Adventure|Romance",
"Action|Adventure|Sci-Fi", "Action|Thriller", "Adventure|Animation|Comedy|Family|Fantasy|Musical|Romance"
), class = "factor"), num_voted_users = c(886204L, 471220L, 1144337L,
212204L, 383056L, 294810L), cast_total_facebook_likes = c(4834L,
48350L, 106759L, 1873L, 46055L, 2036L), facenumber_in_poster = c(0L,
0L, 0L, 1L, 0L, 1L), num_user_for_reviews = c(3054L, 1238L, 2701L,
738L, 1902L, 387L), content_rating = structure(c(2L, 2L, 2L,
2L, 2L, 1L), .Label = c("PG", "PG-13"), class = "factor"), budget = c(269925874.125874,
353545586.107091, 266006097.560976, 280583231.707317, 304049204.052098,
291233379.183861), title_year = c(2009L, 2007L, 2012L, 2012L,
2007L, 2010L), actor_2_facebook_likes = c(936L, 5000L, 23000L,
632L, 11000L, 553L), imdb_score = c(7.9, 7.1, 8.5, 6.6, 6.2,
7.8), movie_facebook_likes = c(33000L, 0L, 164000L, 24000L, 0L,
29000L), genre_1 = c("Action", "Action", "Action", "Action",
"Action", "Adventure")), .Names = c("num_critic_for_reviews",
"director_facebook_likes", "actor_3_facebook_likes", "actor_1_facebook_likes",
"gross", "genres", "num_voted_users", "cast_total_facebook_likes",
"facenumber_in_poster", "num_user_for_reviews", "content_rating",
"budget", "title_year", "actor_2_facebook_likes", "imdb_score",
"movie_facebook_likes", "genre_1"), na.action = structure(c(47L,
88L, 213L, 316L, 382L, 413L, 423L, 478L, 550L, 552L, 607L, 662L,
668L, 747L, 775L, 799L, 802L, 940L, 1011L, 1033L, 1073L, 1082L,
1085L, 1189L, 1240L, 1262L, 1316L, 1318L, 1320L, 1341L, 1382L,
1385L, 1387L, 1392L, 1433L, 1467L, 1469L, 1511L, 1521L, 1527L,
1531L, 1538L, 1546L, 1568L, 1573L, 1613L, 1656L, 1694L, 1717L,
1731L, 1749L, 1750L, 1795L, 1800L, 1801L, 1802L, 1817L, 1863L,
1864L, 1866L, 1870L, 1886L, 1887L, 1897L, 1929L, 1938L, 1941L,
1946L, 1958L, 1997L, 2002L, 2025L, 2036L, 2039L, 2045L, 2058L,
2076L, 2157L, 2163L, 2165L, 2166L, 2186L, 2188L, 2191L, 2192L,
2196L, 2216L, 2219L, 2224L, 2245L, 2252L, 2265L, 2281L, 2307L,
2310L, 2311L, 2316L, 2319L, 2320L, 2321L, 2322L, 2323L, 2326L,
2327L, 2328L, 2330L, 2340L, 2341L, 2342L, 2346L, 2352L, 2353L,
2358L, 2361L, 2362L, 2375L, 2378L, 2385L, 2388L, 2390L, 2404L,
2405L, 2406L, 2407L, 2411L, 2416L, 2417L, 2449L, 2452L, 2456L,
2478L, 2493L, 2500L, 2501L, 2502L, 2504L, 2507L, 2508L, 2511L,
2560L, 2563L, 2573L, 2577L, 2580L, 2582L, 2583L, 2584L, 2585L,
2587L, 2588L, 2589L, 2593L, 2597L, 2600L, 2602L, 2618L, 2622L,
2630L, 2631L, 2632L, 2638L, 2641L, 2642L, 2643L, 2644L, 2645L,
2646L, 2653L, 2659L, 2660L, 2664L, 2665L, 2668L, 2670L, 2681L,
2684L, 2688L, 2694L, 2698L, 2699L, 2701L, 2707L, 2708L, 2710L,
2712L, 2713L, 2714L, 2715L, 2716L, 2718L, 2719L, 2722L, 2728L,
2729L, 2739L, 2748L, 2750L, 2751L, 2752L, 2753L, 2756L, 2757L,
2762L, 2767L, 2773L, 2774L, 2777L, 2781L, 2782L, 2785L, 2786L,
2787L, 2788L, 2789L, 2790L, 2791L, 2792L, 2793L, 2794L, 2796L,
2801L, 2803L, 2804L, 2806L, 2810L, 2812L, 2813L, 2824L, 2825L,
2828L, 2829L, 2833L, 2837L, 2838L, 2840L, 2841L, 2842L, 2847L,
2848L, 2850L), .Names = c("47", "88", "213", "316", "382", "413",
"423", "478", "550", "552", "607", "662", "668", "747", "775",
"799", "802", "940", "1011", "1033", "1073", "1082", "1085",
"1189", "1240", "1262", "1316", "1318", "1320", "1341", "1382",
"1385", "1387", "1392", "1433", "1467", "1469", "1511", "1521",
"1527", "1531", "1538", "1546", "1568", "1573", "1613", "1656",
"1694", "1717", "1731", "1749", "1750", "1795", "1800", "1801",
"1802", "1817", "1863", "1864", "1866", "1870", "1886", "1887",
"1897", "1929", "1938", "1941", "1946", "1958", "1997", "2002",
"2025", "2036", "2039", "2045", "2058", "2076", "2157", "2163",
"2165", "2166", "2186", "2188", "2191", "2192", "2196", "2216",
"2219", "2224", "2245", "2252", "2265", "2281", "2307", "2310",
"2311", "2316", "2319", "2320", "2321", "2322", "2323", "2326",
"2327", "2328", "2330", "2340", "2341", "2342", "2346", "2352",
"2353", "2358", "2361", "2362", "2375", "2378", "2385", "2388",
"2390", "2404", "2405", "2406", "2407", "2411", "2416", "2417",
"2449", "2452", "2456", "2478", "2493", "2500", "2501", "2502",
"2504", "2507", "2508", "2511", "2560", "2563", "2573", "2577",
"2580", "2582", "2583", "2584", "2585", "2587", "2588", "2589",
"2593", "2597", "2600", "2602", "2618", "2622", "2630", "2631",
"2632", "2638", "2641", "2642", "2643", "2644", "2645", "2646",
"2653", "2659", "2660", "2664", "2665", "2668", "2670", "2681",
"2684", "2688", "2694", "2698", "2699", "2701", "2707", "2708",
"2710", "2712", "2713", "2714", "2715", "2716", "2718", "2719",
"2722", "2728", "2729", "2739", "2748", "2750", "2751", "2752",
"2753", "2756", "2757", "2762", "2767", "2773", "2774", "2777",
"2781", "2782", "2785", "2786", "2787", "2788", "2789", "2790",
"2791", "2792", "2793", "2794", "2796", "2801", "2803", "2804",
"2806", "2810", "2812", "2813", "2824", "2825", "2828", "2829",
"2833", "2837", "2838", "2840", "2841", "2842", "2847", "2848",
"2850"), class = "omit"), row.names = c(NA, 6L), class = "data.frame")
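
To make the request concrete, here is a rough sketch of the kind of function I have in mind. The names pair_cor, to_numeric and toy are hypothetical, and the behaviour is my own assumption rather than anything cor does for me: numeric columns are used as they are, a column with exactly two levels is coded 0/1, and a column with any other number of levels raises an error, since I don't know what a Pearson correlation would mean there. The toy data frame just copies a few columns from the six rows above.

```r
# Hypothetical helper: correlation between columns i and j of a data frame.
pair_cor <- function(df, i, j) {
  to_numeric <- function(x) {
    # Numeric and logical columns pass straight through.
    if (is.numeric(x) || is.logical(x)) return(as.numeric(x))
    # Assumption: a two-level qualitative column is coded as 0/1.
    x <- droplevels(as.factor(x))
    if (nlevels(x) == 2L) return(as.integer(x) - 1L)
    # Anything else is refused rather than silently coerced.
    stop("column has ", nlevels(x),
         " levels; Pearson correlation is not defined for it")
  }
  cor(to_numeric(df[[i]]), to_numeric(df[[j]]), use = "complete.obs")
}

# A few columns taken from the dput output above.
toy <- data.frame(
  imdb_score      = c(7.9, 7.1, 8.5, 6.6, 6.2, 7.8),
  num_voted_users = c(886204L, 471220L, 1144337L, 212204L, 383056L, 294810L),
  content_rating  = factor(c("PG-13", "PG-13", "PG-13", "PG-13", "PG-13", "PG")),
  genre_1         = c("Action", "Action", "Action", "Action", "Action", "Adventure"),
  genres          = factor(c("Action|Adventure|Fantasy|Sci-Fi",
                             "Action|Adventure|Fantasy",
                             "Action|Thriller",
                             "Action|Adventure|Sci-Fi",
                             "Action|Adventure|Romance",
                             "Adventure|Animation|Comedy|Family|Fantasy|Musical|Romance")),
  stringsAsFactors = FALSE
)

pair_cor(toy, 1, 2)  # two numeric columns
pair_cor(toy, 1, 3)  # numeric vs. two-level factor

# Correlation matrix restricted to the numeric columns only.
numeric_cols <- vapply(toy, is.numeric, logical(1))
cor(toy[numeric_cols])
```

Calling pair_cor(toy, 1, 5) stops with an error in this sketch, because genres has six levels here. What I am really unsure about is what the function should return in that case.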