I have a data.table:
head(LocalCodes, n= 20)
Local Codes
1: Crane, Indiana 0189
2: Rutland, Vermont 0401
3: NA 5003
4: Naval Air Station Patuxent River, Maryland 5001
5: Williamsburg, Virginia 7408
6: District of Columbia, District of Columbia 0132
7: Newport, Rhode Island 1702
8: NA 1805
9: NA 5306
10: Washington DC, District of Columbia / Kansas City, Missouri 2210
11: Kansas City, Missouri 0503
12: Arlington, Virginia 0501
13: Phoenix, Arizona 0301
14: Washington DC, District of Columbia 0132
15: NA 5001
16: Collbran, Colorado 0303
17: Washington DC, District of Columbia / Norfolk, Virginia 1102
18: Minot, North Dakota 1802
19: Washington DC, District of Columbia 2005
20: Pine Knot, Kentucky 4749
I am trying to split Local on " / " while keeping the same Codes in a new data.table, using:
Good <- LocalCodes[ , list( LocalCodes$Local <- unlist( strsplit( LocalCodes$Local , " / " ) ) , by=LocalCodes$Codes)]
I keep getting the error message:
Error in strsplit(LocalCodes$Local, " / ") : non-character argument
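I did try using as.character(LocalCodes$Local) in Good to get rid of the error, but then the data.table no longer works properly. It splits Local, but the Codes no longer line up with it, because Local is now a character vector.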
Is there a way to split Local and keep each resulting piece paired with the correct Codes?
Example of the desired output:
Local Codes
8: NA 1805
9: NA 5306
10: Kansas City, Missouri 2210
11: Washington DC, District of Columbia 2210
12: Kansas City, Missouri 0503
13: Arlington, Virginia 0501
14: Phoenix, Arizona 0301
15: Washington DC, District of Columbia 0132
16: NA 5001
17: Collbran, Colorado 0303
18: Norfolk, Virginia 1102
19: Washington DC, District of Columbia 1102
Using: plyr, dplyr, data.table
Edit: here is the dput output:
dput(head(LocalCodes, n= 20))
structure(list(Local = list("Crane, Indiana", "Rutland, Vermont",
"NA", "Naval Air Station Patuxent River, Maryland", "Williamsburg, Virginia",
"District of Columbia, District of Columbia", "Newport, Rhode Island",
"NA", "NA", "Washington DC, District of Columbia / Kansas City, Missouri",
"Kansas City, Missouri", "Arlington, Virginia", "Phoenix, Arizona",
"Washington DC, District of Columbia", "NA", "Collbran, Colorado",
"Washington DC, District of Columbia / Norfolk, Virginia",
"Minot, North Dakota", "Washington DC, District of Columbia",
"Pine Knot, Kentucky"), Codes = list("0189", "0401", "5003",
"5001", "7408", "0132", "1702", "1805", "5306", "2210", "0503",
"0501", "0301", "0132", "5001", "0303", "1102", "1802", "2005",
"4749")), class = c("data.table", "data.frame"), row.names = c(NA,
-20L))
Answer 0 (score: 1)
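My original answer failed to handle multiple items containing "/". I have strategies for handling variants of data.table objects, but along the way I found that your structure is unfortunately non-standard. Note that the dput output begins with
structure(list(Local = list("Crane, Indiana",
A typical data.table is not a list of lists. Because Local is a list column rather than a character vector, strsplit reports a non-character argument. This structure is notorious for messing up data.frame operations, and it evidently messes up data.table operations as well. The following will fix your data object so that it looks like an "ordinary" data.table: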
LocalCodes[ , names(LocalCodes) := lapply(LocalCodes,unlist)]
#> dput(LocalCodes)
# structure(list(Local = c("Crane, Indiana", ...
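To see why the original strsplit call failed, it can help to compare the column types before and after the fix. This is only a minimal sketch on a small made-up table: the name dt and its three rows are hypothetical, and it assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical three-row table built with list columns, mimicking the dput output
dt <- data.table(
  Local = list("Crane, Indiana", "NA",
               "Washington DC, District of Columbia / Norfolk, Virginia"),
  Codes = list("0189", "5003", "1102")
)

# Both columns are stored as lists rather than character vectors
col_types_before <- sapply(dt, typeof)

# strsplit() refuses a list column; capture the error message instead of stopping
split_error <- tryCatch(strsplit(dt$Local, " / "),
                        error = function(e) conditionMessage(e))

# Flatten every column to an atomic vector, in the same spirit as the fix above
dt[, names(dt) := lapply(.SD, unlist)]

col_types_after <- sapply(dt, typeof)
print(col_types_before)
print(col_types_after)
```

After the conversion, strsplit(dt$Local, " / ") should run without the error, since Local is then a plain character vector.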
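Now it is no longer a list of lists. So now try handling the rows whose strings contain a slash separately from the rows that do not, and then bind the two pieces together: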
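# Rows containing "/" are split into one row per place; the other rows are kept unchanged
rbind( LocalCodes[grepl("/", Local),
         # split on the literal " / " so no stray spaces are left around the names
         data.table(Local = unlist(strsplit(Local, split = " / ", fixed = TRUE)),
                    # each = 2 assumes exactly one "/" (two places) per matching row
                    Codes = rep(Codes, each = 2))],
       LocalCodes[!grepl("/", Local)] )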
Local Codes
1: Washington DC, District of Columbia 2210
2: Kansas City, Missouri 2210
3: Washington DC, District of Columbia 1102
4: Norfolk, Virginia 1102
5: Crane, Indiana 0189
6: Rutland, Vermont 0401
7: NA 5003
8: Naval Air Station Patuxent River, Maryland 5001
9: Williamsburg, Virginia 7408
10: District of Columbia, District of Columbia 0132
11: Newport, Rhode Island 1702
snipped-----
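The approach above moves the split rows to the top and relies on there being exactly two places per split row. As a possible alternative, here is a minimal sketch that keeps the original row order and allows any number of " / " separators by repeating each code once per piece. The table dt and its values (including the code 9999 and the three-part row) are made up for illustration, and the columns are assumed to be plain character vectors already.

```r
library(data.table)

# Hypothetical sample with zero, one, and two separators per row
dt <- data.table(
  Local = c("Crane, Indiana",
            "Washington DC, District of Columbia / Kansas City, Missouri",
            "NA",
            "A, X / B, Y / C, Z"),
  Codes = c("0189", "2210", "5003", "9999")
)

# Split each Local string on the literal " / "
parts <- strsplit(dt$Local, " / ", fixed = TRUE)

# Repeat each code once per piece so that places and codes stay aligned
long <- data.table(
  Local = unlist(parts),
  Codes = rep(dt$Codes, times = lengths(parts))
)
print(long)
```

One caveat with this sketch: strsplit returns a zero-length result for an empty string, so a row whose Local is "" would be dropped from the output.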