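In R: I have a data frame with many rows but only one column. Each row is a long character string punctuated periodically by |. I want to split the string at every | so that I end up with many columns.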
1995-01-01|33.399999999999999|40.299999999999997|35.399999999999999|35.0|37.200000000000003|23.399999999999999|23.199999999999999|47.399999999999999|49.200000000000003|49.200000000000003|48.100000000000001|42.299999999999997|58.200000000000003|17.399999999999999|50.700000000000003|5.2999999999999998|20.600000000000001|38.5|43.299999999999997 etc.
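Each string begins with a date and then contains numbers corresponding to cities. The variable names are also given as a single string, and they need to be split at each ".":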
date.abilene_tx.akron_oh.albany_ny.albuquerque_nm.allentown_pa.amarillo_tx.anchorage_ak.asheville_nc.atlanta_ga etc.
Any help is much appreciated!
Answer 0 (score: 1)
Here is a data.frame with one column and 10 rows that may be similar to yours:
dat <- "1995-01-01|33.399999999999999|40.299999999999997|35.399999999999999|35.0|37.200000000000003|23.399999999999999|23.199999999999999|47.399999999999999|49.200000000000003|49.200000000000003|48.100000000000001|42.299999999999997|58.200000000000003|17.399999999999999|50.700000000000003|5.2999999999999998|20.600000000000001|38.5|43.299999999999997 "
df <- data.frame(col1 = rep(dat, 10))
And here is a data.frame with new columns created by splitting col1 on "|":
foo <- data.frame(do.call('rbind', strsplit(as.character(df$col1),'|',fixed=TRUE)))
foo
X1 X2 X3 X4 X5 X6
1 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
2 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
3 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
4 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
5 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
6 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
7 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
8 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
9 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
10 1995-01-01 33.399999999999999 40.299999999999997 35.399999999999999 35.0 37.200000000000003
X7 X8 X9 X10 X11
1 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
2 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
3 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
4 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
5 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
6 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
7 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
8 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
9 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
10 23.399999999999999 23.199999999999999 47.399999999999999 49.200000000000003 49.200000000000003
X12 X13 X14 X15 X16
1 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
2 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
3 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
4 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
5 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
6 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
7 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
8 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
9 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
10 48.100000000000001 42.299999999999997 58.200000000000003 17.399999999999999 50.700000000000003
X17 X18 X19 X20
1 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
2 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
3 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
4 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
5 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
6 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
7 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
8 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
9 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
10 5.2999999999999998 20.600000000000001 38.5 43.299999999999997
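The split above leaves every column as text with the default names X1, X2, and so on. A minimal sketch of the remaining steps (naming the columns from the dot-separated string and converting the types) might look like the following. The three-column inputs and the variable `names_string` are shortened, hypothetical stand-ins for the real data, not taken from the question.

```r
# Hypothetical, shortened inputs: 3 fields instead of the full set of cities
names_string <- "date.abilene_tx.akron_oh"
df <- data.frame(col1 = rep("1995-01-01|33.399999999999999|40.299999999999997", 3),
                 stringsAsFactors = FALSE)

# Split each row on a literal "|" and bind the pieces into one row per string
parts <- strsplit(as.character(df$col1), "|", fixed = TRUE)
foo <- data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)

# Split the names on a literal "."; fixed = TRUE matters because "." is a
# wildcard in a regular expression
names(foo) <- strsplit(names_string, ".", fixed = TRUE)[[1]]

# Convert the first column to Date and all remaining columns to numeric
foo$date <- as.Date(foo$date)
foo[-1] <- lapply(foo[-1], as.numeric)
str(foo)
```

This assumes the number of names matches the number of fields in every row; if it does not, the `names(foo) <-` assignment will fail or leave columns unnamed.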
Answer 1 (score: 1)
You should load the data from the file with:
dat <- read.table(filename, sep="|")
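This handles the lines delimited by "|". However, you said the name "string" is separated by ".", so if the two are somehow mixed in that text file, you may need to read it in with readLines() first and do some preprocessing.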
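As a rough sketch of that preprocessing, suppose (this is an assumption, since the question does not show the file layout) that the first line of the file holds the dot-separated names and every later line holds pipe-separated values. The temporary file below is a made-up stand-in for the real one.

```r
# Write a small temporary file with the assumed layout:
# line 1 = dot-separated names, remaining lines = pipe-separated values
tmp <- tempfile(fileext = ".txt")
writeLines(c("date.abilene_tx.akron_oh",
             "1995-01-01|33.4|40.3",
             "1995-01-02|35.0|37.2"), tmp)

# Read the raw lines, then treat the header and the data separately
lines <- readLines(tmp)
col_names <- strsplit(lines[1], ".", fixed = TRUE)[[1]]

# Parse the remaining lines as pipe-delimited data, supplying the names
dat <- read.table(text = lines[-1], sep = "|", col.names = col_names,
                  stringsAsFactors = FALSE)
dat$date <- as.Date(dat$date)
str(dat)
```

If the names live in a separate file or object instead, only the `col_names` line needs to change; the `read.table()` call stays the same.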