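I have a dataset with two columns, AccountName and AccountNumber, and 35 rows. I want to create a new data frame with AccountName, AccountNumber, and LocationNumber. LocationNumber is stored in another data frame that has one column and 350 rows.
So basically, for each account name and number, and for each location number, I want to add a row containing account name + account number + location number. With 35 accounts and 350 locations, the end result should have 12,250 rows (35 × 350). I tried using a for loop, to no avail.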
Accounts (Name | Number):
STR EXP-VACATION ESTIMATE-0200900 200900
STR EXP-HOLIDAY PAY-0200920 200920
STR EXP-SICK PAY-0200930 200930
STR EXP-MISC TIME PAID,NOT WORKED-0200990 200990
Locations:
Lo.702-002
Lo.702-003
Lo.702-004
Lo.702-005
Final result for each account number:
STR EXP-VACATION ESTIMATE-0200900 200900 Lo.702-002
STR EXP-VACATION ESTIMATE-0200900 200900 Lo.702-003
STR EXP-VACATION ESTIMATE-0200900 200900 Lo.702-004
STR EXP-VACATION ESTIMATE-0200900 200900 Lo.702-005
PHP code that would produce the result I want:
foreach ($accounts as $name => $number) {
    foreach ($locations as $location) {
        echo sprintf("%s,%s,%s\n", $name, $number, $location);
    }
}
Answer 0 (score: 0)
My solution:
acc.run <- function() {
  locFileName <- 'location-list.csv'
  accFileName <- 'account-list.csv'
  locations <- read.csv(locFileName, sep=',', quote='\"', header=TRUE)
  accounts <- read.csv(accFileName, sep=',', quote='\"', header=TRUE)
  # Add row numbers so the original account order can be restored after the merge
  accounts$rowNum <- seq_len(nrow(accounts))
  # Assuming the two files share no column names, merge() returns the Cartesian product
  merged <- merge(accounts, locations)
  sorted <- merged[order(merged$rowNum), ]
  final <- sorted[, !(names(sorted) %in% c('rowNum'))]
  # Random numeric suffix in the file name to make overwriting an earlier output unlikely
  rExt <- paste(round(runif(6,10,100)), sep='', collapse='')
  write.csv(final, paste('accounts-concat', rExt, '.csv', sep='', collapse=''), row.names=FALSE)
}
Let me know how this could be improved.
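The core of this approach is a cross join, which can be tried without the CSV files. The following is only a minimal sketch: the inline data frames are copied from the question's sample rows, and the column name LocationNumber and the variable name crossed are assumptions for illustration. Passing by = NULL asks merge() for the Cartesian product explicitly, rather than relying on the two data frames having no column names in common.

```r
# Sample data copied from the question (stands in for the CSV files)
accounts <- data.frame(
  AccountName = c("STR EXP-VACATION ESTIMATE-0200900",
                  "STR EXP-HOLIDAY PAY-0200920",
                  "STR EXP-SICK PAY-0200930",
                  "STR EXP-MISC TIME PAID,NOT WORKED-0200990"),
  AccountNumber = c(200900, 200920, 200930, 200990),
  stringsAsFactors = FALSE
)
# The column name LocationNumber is an assumption
locations <- data.frame(
  LocationNumber = c("Lo.702-002", "Lo.702-003", "Lo.702-004", "Lo.702-005"),
  stringsAsFactors = FALSE
)

# by = NULL requests the Cartesian product of the two data frames
crossed <- merge(accounts, locations, by = NULL)

# Group the rows by account, in the order the accounts were listed
crossed <- crossed[order(match(crossed$AccountNumber, accounts$AccountNumber)), ]
rownames(crossed) <- NULL

nrow(crossed)  # one row per account/location pair
```

With the real files, the same two steps would apply to the data frames returned by read.csv(); the helper rowNum column is then not needed, because match() recovers the original account order.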
Answer 1 (score: 0)
This is an edited version of my original answer, modified to include your test data. Does it fit your needs?
# Generate some usable test data
accounts <- read.csv(text = "
AccountName|AccountNumber
STR EXP-VACATION ESTIMATE-0200900|200900
STR EXP-HOLIDAY PAY-0200920|200920
STR EXP-SICK PAY-0200930|200930
STR EXP-MISC TIME PAID,NOT WORKED-0200990|200990
", sep = "|")
locations <- read.table(header = TRUE, text = "
Location
Lo.702-002
Lo.702-003
Lo.702-004
Lo.702-005
")$Location
# Combine the data into wide format
df <- cbind(accounts, locations = t(locations))
# Restructure the data in long format
reshape(df, varying = grep("locations", names(df)), direction = "long" )
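A possible alternative that skips the wide-to-long round trip is to build the long table directly with rep(). This is only a sketch under the same test data; the names n_acc, n_loc, result, and the column name LocationNumber are assumptions for illustration. Unlike reshape(), which by default adds time and id bookkeeping columns, this leaves only the three requested columns.

```r
# Test data from the question
accounts <- data.frame(
  AccountName = c("STR EXP-VACATION ESTIMATE-0200900",
                  "STR EXP-HOLIDAY PAY-0200920",
                  "STR EXP-SICK PAY-0200930",
                  "STR EXP-MISC TIME PAID,NOT WORKED-0200990"),
  AccountNumber = c(200900, 200920, 200930, 200990),
  stringsAsFactors = FALSE
)
locations <- c("Lo.702-002", "Lo.702-003", "Lo.702-004", "Lo.702-005")

n_acc <- nrow(accounts)
n_loc <- length(locations)

# Repeat every account row once per location, then cycle through the
# locations once per account so the two line up row by row
result <- data.frame(
  accounts[rep(seq_len(n_acc), each = n_loc), ],
  LocationNumber = rep(locations, times = n_acc),
  stringsAsFactors = FALSE
)
rownames(result) <- NULL  # drop the row names left over from the repeated indexing

head(result, n_loc)  # all locations for the first account
```

For the sizes in the question (35 accounts, 350 locations) the same code would produce the 12,250 rows asked for, since the row count is always n_acc * n_loc.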