I have a list, mm:
head(mm)
[[1]]
[1] "8 1901 - 1908 >>Primus<< sbk"
[[2]]
[1] "12 1901 - 1912 A & B:s skofabriks arbetares sbk."
[[3]]
[1] "5 1907 - 1911 A. B. Elevators arberates sbk"
[[4]]
[1] "5 1901 - 1905 Abk. N.K.B. (Nya Klöfverbladet)"
[[5]]
[1] "2 1904 - 1905 absolutisternas sbk"
[[6]]
[1] "12 1901 - 1912 Aftonbladets personals sbk"
length(mm)
[1] 429
dput(head(mm))
list("8 1901 - 1908 >>Primus<< sbk", "12 1901 - 1912 A & B:s skofabriks arbetares sbk.",
"5 1907 - 1911 A. B. Elevators arberates sbk", "5 1901 - 1905 Abk. N.K.B. (Nya Klöfverbladet)",
"2 1904 - 1905 absolutisternas sbk", "12 1901 - 1912 Aftonbladets personals sbk")
I also have the company names:
head(unique(data$Name))
[1] ">>Primus<< sbk" "A & B:s skofabriks arbetares sbk." "A. B. Elevators arberates sbk"
[4] "Abk. N.K.B. (Nya Klöfverbladet)" "absolutisternas sbk" "Aftonbladets personals sbk"
length(unique(data$Name))
[1] 429
I am trying to build a new list in which each element of mm is repeated as many times as its company appears in the data frame:
data[1:20,1:2]
Year Name
1 1901 >>Primus<< sbk
185 1902 >>Primus<< sbk
382 1903 >>Primus<< sbk
597 1904 >>Primus<< sbk
822 1905 >>Primus<< sbk
1059 1906 >>Primus<< sbk
1310 1907 >>Primus<< sbk
1567 1908 >>Primus<< sbk
2 1901 A & B:s skofabriks arbetares sbk.
186 1902 A & B:s skofabriks arbetares sbk.
383 1903 A & B:s skofabriks arbetares sbk.
598 1904 A & B:s skofabriks arbetares sbk.
823 1905 A & B:s skofabriks arbetares sbk.
1060 1906 A & B:s skofabriks arbetares sbk.
1311 1907 A & B:s skofabriks arbetares sbk.
1568 1908 A & B:s skofabriks arbetares sbk.
1827 1909 A & B:s skofabriks arbetares sbk.
2090 1910 A & B:s skofabriks arbetares sbk.
2355 1911 A & B:s skofabriks arbetares sbk.
2602 1912 A & B:s skofabriks arbetares sbk.
dput(data[1:20,1:2])
structure(list(Year = c(1901L, 1902L, 1903L, 1904L, 1905L, 1906L,
1907L, 1908L, 1901L, 1902L, 1903L, 1904L, 1905L, 1906L, 1907L,
1908L, 1909L, 1910L, 1911L, 1912L), Name = c(">>Primus<< sbk",
">>Primus<< sbk", ">>Primus<< sbk", ">>Primus<< sbk", ">>Primus<< sbk",
">>Primus<< sbk", ">>Primus<< sbk", ">>Primus<< sbk", "A & B:s skofabriks arbetares sbk.",
"A & B:s skofabriks arbetares sbk.", "A & B:s skofabriks arbetares sbk.",
"A & B:s skofabriks arbetares sbk.", "A & B:s skofabriks arbetares sbk.",
"A & B:s skofabriks arbetares sbk.", "A & B:s skofabriks arbetares sbk.",
"A & B:s skofabriks arbetares sbk.", "A & B:s skofabriks arbetares sbk.",
"A & B:s skofabriks arbetares sbk.", "A & B:s skofabriks arbetares sbk.",
"A & B:s skofabriks arbetares sbk.")), .Names = c("Year", "Name"
), row.names = c(1L, 185L, 382L, 597L, 822L, 1059L, 1310L, 1567L,
2L, 186L, 383L, 598L, 823L, 1060L, 1311L, 1568L, 1827L, 2090L,
2355L, 2602L), class = "data.frame")
So, for example, mm[[1]] should be repeated 8 times, because the company >>Primus<< sbk appears 8 times:
length(data[data$Name==">>Primus<< sbk",2])
[1] 8
My approach is:
mm=lapply(1:length(maxz),function(x) paste(diffz[[x]]+1,"",minz[[x]],"-",maxz[[x]],"",names(maxz)[[x]]))
hb=lapply(seq_along(mm),function(x,m) rep(mm[[x]],length(data[data$Name==m,2])),m=unique(data$Name))
But after running the hb line above, I get these warnings:
There were 50 or more warnings (use warnings() to see the first 50)
head(warnings())
$`longer object length is not a multiple of shorter object length`
data$Name == m
$`longer object length is not a multiple of shorter object length`
data$Name == m
What am I doing wrong? :(
EDIT:
Here is a workaround that works. Best regards!
mm=lapply(1:length(maxz),function(x) paste(diffz[[x]]+1,"",minz[[x]],"-",maxz[[x]],"",names(maxz)[[x]]))
names(mm)=names(minz)
hb=lapply(names(mm),function(x) rep(mm[[x]],length(data[data$Name==x,2])))
where
head(names(minz))
[1] ">>Primus<< sbk" "A & B:s skofabriks arbetares sbk." "A. B. Elevators arberates sbk"
[4] "Abk. N.K.B. (Nya Klöfverbladet)" "absolutisternas sbk" "Aftonbladets personals sbk"
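A minimal sketch of why the first attempt probably warns while the workaround does not. In the first attempt, m = unique(data$Name) is handed to the function as a whole vector, so data$Name == m compares two vectors of different lengths and R recycles the shorter one. The workaround compares against a single name per call. The vector name_col below is a hypothetical toy stand-in for data$Name, not the real data.

```r
# Toy stand-in for data$Name (hypothetical values, not the real data)
name_col <- c("a", "a", "a", "b", "b")
all_names <- unique(name_col)  # c("a", "b")

# Pattern from the first attempt: the whole vector of names is compared at
# once, so R recycles the shorter vector and warns because 5 is not a
# multiple of 2. tryCatch() captures the warning text instead of printing it.
cmp <- tryCatch(name_col == all_names,
                warning = function(w) conditionMessage(w))

# Pattern from the workaround: compare against one name at a time
counts <- sapply(all_names, function(nm) sum(name_col == nm))
```

Here cmp holds the captured warning message and counts holds one row count per name, which is the number that rep() needs.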
Answer 0 (score: 3)
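You will find life easier if you store your data in the standard R data structure: the data.frame. The following code converts your input to a data frame and then uses subsetting to repeat rows.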
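# Assumption: fields are separated by two spaces, which is what the paste() call
# in the question produces (the empty "" arguments add an extra space)
mm <- list("8  1901 - 1908  >>Primus<< sbk", "12  1901 - 1912  A & B:s skofabriks arbetares sbk.",
"5  1907 - 1911  A. B. Elevators arberates sbk", "5  1901 - 1905  Abk. N.K.B. (Nya Klöfverbladet)",
"2  1904 - 1905  absolutisternas sbk", "12  1901 - 1912  Aftonbladets personals sbk")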
# Convert to a character vector
m <- unlist(mm)
# Convert multiple character separator to single
m2 <- gsub(" {2,}", ",", m)
# Parse with read.csv
df <- read.csv(text = m2, header = FALSE)
names(df) <- c("n", "years", "company")
# Finally, duplicate each row
df[rep(1:nrow(df), df$n), -1]
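If you would rather keep the list form from the question, a possible sketch is to count the rows per company once and then repeat each label that many times. The mm and data objects below are small hypothetical stand-ins for the question's objects, and naming mm by company is an assumption borrowed from the question's own workaround.

```r
# Small stand-ins for the question's objects (hypothetical subset)
mm <- list("8  1901 - 1908  >>Primus<< sbk",
           "2  1904 - 1905  absolutisternas sbk")
names(mm) <- c(">>Primus<< sbk", "absolutisternas sbk")
data <- data.frame(
  Year = c(1901:1908, 1904:1905),
  Name = rep(names(mm), times = c(8, 2)),
  stringsAsFactors = FALSE
)

# Count the rows for each company, one name at a time
n_rows <- vapply(names(mm), function(nm) sum(data$Name == nm), integer(1))

# Repeat each label as many times as its company occurs in the data frame
hb <- Map(rep, mm, n_rows)

lengths(hb)
```

Map() pairs each label with its own count, so no comparison ever sees more than one company name at a time.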