Splitting a data frame produces strange output

Asked: 2012-07-20 15:37:03

Tags: r list dataframe split

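I am simply trying to split my data frame into several data frames based on the values of my intervention column, but when I do, I get unexpected output.

To confirm that I really do have a data frame named raw: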

print(class(raw));

which prints

[1] "data.frame"

Here is the data frame before splitting:

  position                                   id equation      intervention
1       -2 D9E4262D-5B6D-ADB8-D605-B97D63437064      9,5            corral
2        1 B2FFB0B0-210E-022F-293A-0ABFDDB3DC4B      2,3            corral
3        1 85905A69-50F7-AF73-7A51-08B8FDCFAF2D      1,2         horseshoe
4       -2 76A55530-5A39-6A73-3216-D276EABFA2F6      3,4 test_intervention
5       -1 4CFA5D1B-EA32-8584-A1C9-540D9FFB24CB      3,4 test_intervention

Then I use

groups <- split(raw, raw$intervention);

When I print the result with

print(groups);

I get:

$corral.corral.horseshoe.test_intervention.test_intervention
  position                                   id equation      intervention
1       -2 D9E4262D-5B6D-ADB8-D605-B97D63437064      9,5            corral
2        1 B2FFB0B0-210E-022F-293A-0ABFDDB3DC4B      2,3            corral
3        1 85905A69-50F7-AF73-7A51-08B8FDCFAF2D      1,2         horseshoe
4       -2 76A55530-5A39-6A73-3216-D276EABFA2F6      3,4 test_intervention
5       -1 4CFA5D1B-EA32-8584-A1C9-540D9FFB24CB      3,4 test_intervention

This does not look like a list of data frames grouped by intervention. Also note the strange line

$corral.corral.horseshoe.test_intervention.test_intervention

EDIT

Output of dput(raw):

structure(list(position = list(-2, 1, 1, -2, -1), id = list("D9E4262D-5B6D-ADB8-D605-B97D63437064",
    "B2FFB0B0-210E-022F-293A-0ABFDDB3DC4B", "85905A69-50F7-AF73-7A51-08B8FDCFAF2D",
    "76A55530-5A39-6A73-3216-D276EABFA2F6", "4CFA5D1B-EA32-8584-A1C9-540D9FFB24CB"),
    equation = list("9,5", "2,3", "1,2", "3,4", "3,4"), intervention = list(
        "corral", "corral", "horseshoe", "test_intervention",
        "test_intervention")), .Names = c("position", "id", "equation",
"intervention"), row.names = c(NA, -5L), class = "data.frame")

EDIT: Here is my complete code; it is quite short.

#!/usr/local/bin/Rscript --slave
require("rjson", quietly=TRUE);

# First we need to grab the items from the R api, and save them into a data frame
raw = fromJSON(file="http://some/url.com");

#reformats data into dataframe
raw <- as.data.frame(do.call(rbind,raw));
#we need to create a new dataframe formatted according to the needs of catR
groups <- split(raw, raw$intervention);
print(groups); 
#saveRDS(object=fromJSON(file="http://some/url.com"),file="/home/bitnami/IRT_data/core_standard.rda");

I run my code like this:

~/Rscript my_R_file.R

1 Answer:

Answer 0 (score: 2)

raw1 <- read.table(header=TRUE, text="  position                                   id equation      intervention
1       -2 D9E4262D-5B6D-ADB8-D605-B97D63437064      9,5            corral
2        1 B2FFB0B0-210E-022F-293A-0ABFDDB3DC4B      2,3            corral
3        1 85905A69-50F7-AF73-7A51-08B8FDCFAF2D      1,2         horseshoe
4       -2 76A55530-5A39-6A73-3216-D276EABFA2F6      3,4 test_intervention
5       -1 4CFA5D1B-EA32-8584-A1C9-540D9FFB24CB      3,4 test_intervention")

groups <- split(raw1, raw1$intervention)

> groups
$corral
  position                                   id equation intervention
1       -2 D9E4262D-5B6D-ADB8-D605-B97D63437064      9,5       corral
2        1 B2FFB0B0-210E-022F-293A-0ABFDDB3DC4B      2,3       corral

$horseshoe
  position                                   id equation intervention
3        1 85905A69-50F7-AF73-7A51-08B8FDCFAF2D      1,2    horseshoe

$test_intervention
  position                                   id equation      intervention
4       -2 76A55530-5A39-6A73-3216-D276EABFA2F6      3,4 test_intervention
5       -1 4CFA5D1B-EA32-8584-A1C9-540D9FFB24CB      3,4 test_intervention

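This differs from your result: with the same data read via read.table, split() works as expected.

Here raw is your data frame and raw1 is the read.table version: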

> str(raw1$intervention)
 Factor w/ 3 levels "corral","horseshoe",..: 1 1 2 3 3
> str(raw$intervention)
List of 5
 $ : chr "corral"
 $ : chr "corral"
 $ : chr "horseshoe"
 $ : chr "test_intervention"
 $ : chr "test_intervention"

When the rjson output is converted to a data.frame, the result is a data frame whose columns are lists rather than atomic vectors.
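As a rough illustration of why a list column produces that single, oddly named group: when the grouping argument of split() is a list, it is treated as several grouping factors that get crossed via interaction(). The sketch below uses a hand-built list `f` as a stand-in for raw$intervention (an assumption made for illustration, not the poster's actual object).

```r
# Hypothetical stand-in for raw$intervention: a list of five length-one strings
f <- list("corral", "corral", "horseshoe",
          "test_intervention", "test_intervention")

# A list is interpreted as five separate grouping factors, which are crossed
combined <- interaction(f)
length(combined)   # a single value instead of five
levels(combined)   # one level that joins all five strings with "."

# Unlisting first yields the atomic vector that split() is meant to receive
levels(factor(unlist(f)))
```

That single level is then recycled across all rows, which is why every row ends up in one group.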

This line is your problem:

#reformats data into dataframe
raw <- as.data.frame(do.call(rbind,raw));
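To see what that line does, here is a minimal sketch. The `records` object is hypothetical: it assumes fromJSON() returned a list of records, each a named list, which is consistent with the dput() output in the question but is not confirmed by it. The ids are shortened placeholders.

```r
# Hypothetical records mimicking a parsed JSON array of objects (assumption)
records <- list(
  list(position = -2, id = "a", equation = "9,5", intervention = "corral"),
  list(position =  1, id = "b", equation = "2,3", intervention = "corral"),
  list(position =  1, id = "c", equation = "1,2", intervention = "horseshoe")
)

# rbind() on lists builds a matrix whose cells are list elements
m <- do.call(rbind, records)
typeof(m)
dim(m)

# as.data.frame() keeps each column as a list instead of an atomic vector
raw <- as.data.frame(m)
sapply(raw, class)
```

Each column of `raw` is therefore a list with one element per row, matching the `List of 5` shown by str() above.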

Use this instead:

 raw2<-data.frame(lapply(raw,unlist))
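A minimal end-to-end check of this fix, again on hypothetical records (placeholder ids; the interventions are copied from the question). It assumes every cell holds exactly one value: if a record contained a NULL, unlist() would drop it, the columns would differ in length, and data.frame() would fail.

```r
# Hypothetical records standing in for the parsed JSON (assumption)
records <- list(
  list(position = -2, id = "a", equation = "9,5", intervention = "corral"),
  list(position =  1, id = "b", equation = "2,3", intervention = "corral"),
  list(position =  1, id = "c", equation = "1,2", intervention = "horseshoe"),
  list(position = -2, id = "d", equation = "3,4", intervention = "test_intervention"),
  list(position = -1, id = "e", equation = "3,4", intervention = "test_intervention")
)
raw <- as.data.frame(do.call(rbind, records))   # list columns, as in the question

# Flatten each list column into an atomic vector, then rebuild the data frame
raw2 <- data.frame(lapply(raw, unlist))
str(raw2)

# split() now sees an ordinary vector and groups by its distinct values
groups <- split(raw2, raw2$intervention)
names(groups)
sapply(groups, nrow)
```

Because each column is unlisted on its own, position stays numeric while the other columns remain text (or become factors in older versions of R).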