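I have two data sets. I want to create a list of new time series based on the different strings given in df1$col5. For that I have a second object, xts2, whose column names (headers) match the strings in df1$col5. Each column of xts2 contains one time series. The name of each time series in the list is given in df1$col1. In addition, I have to modify each new time series using the value in df1$col3.
That sounds confusing, but it is the best way I can describe my problem.
How can I match the strings in df1$col5 against the column headers of xts2?
I have searched for a long time but have not found a solution.
df1 looks like this: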
MKZ Elbekilometer MNW91-2010 Fix Pegel Distanz_Pegel_m
1 43420072 178.2 70.70 70.37300 Torgau 18816.2011
2 43435074 172.0 72.09 71.81900 Torgau 11926.9932
3 43435084 177.0 70.93 70.61100 Torgau 15620.9210
4 43435086 171.8 72.14 71.87100 Torgau 11478.2310
5 44425470 172.2 72.05 71.76767 Torgau 17400.7983
6 44425476 172.2 72.05 71.76767 Torgau 14448.9073
7 45426142 154.0 75.91 75.72300 Torgau 21065.9449
8 45440655 125.4 84.18 83.94567 Riesa 17019.6066
9 46431146 117.0 87.17 87.04500 Riesa 23594.2523
10 46440130 116.0 87.45 87.32600 Riesa 10471.5078
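MKZ should be the name of each new time series in the list. The corresponding string is given in Pegel, and it matches one of the column names of xts2, which looks like this: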
Schöna Pirna Dresden Meißen Riesa Torgau
1980-01-01 119.583 110.576 105.742 98.522 92.036 79.038
1980-01-02 119.523 111.426 105.652 98.412 91.926 78.908
1980-01-03 119.413 111.316 105.592 98.362 91.876 78.848
1980-01-04 119.123 111.126 105.382 98.222 91.756 78.748
1980-01-05 119.103 110.956 105.282 98.032 91.536 78.488
1980-01-06 118.823 110.786 105.062 97.802 91.396 78.348
1980-01-07 118.783 110.726 104.972 97.722 91.276 78.128
1980-01-08 118.923 110.866 105.102 97.852 91.336 78.088
In the end it should look like this:
data<-list()
data[[1]]
file[1] <- 43420072
Torgau newcol
1980-01-01 79.038 79.038+df1/col3
1980-01-02 78.908 78.908+df1/col3
1980-01-03 78.848 78.848+df1/col3
1980-01-04 78.748 78.748+df1/col3
1980-01-05 78.488 78.488+df1/col3
1980-01-06 78.348 78.348+df1/col3
1980-01-07 78.128 78.128+df1/col3
1980-01-08 78.088 78.088+df1/col3
Update
Basically, the answer works for all time series based on Pegel Schöna. My code now looks like this:
library(dplyr)
library(tidyr)
library(tibble)
pegeluntereinander <- Pegelganglinien %>%
  rownames_to_column('date') %>%
  gather('Pegel', 'value', -date)
zus <- pegeluntereinander %>%
  left_join(Werte) %>% # WITHOUT running df1 %>% filter(MKZ == 43420072)
  select(date, Pegel, value, MKZ, `MNW91.2010`) %>% # Added MKZ as column for clarity
  mutate(new.col = value + `MNW91.2010`)
files <- Werte$MKZ
workspace<-...
for (i in seq_along(files)) {
  filename <- files[i]
  # Rows of the joined data frame (zus) that belong to this MKZ
  zeilen <- which(zus$MKZ == filename)
  mkz <- zus[zeilen, ]
  filename <- as.character(filename)
  dateinamen_csv <- paste(filename, "csv", sep = ".")
  speicherpfad_inkl_namen <- paste(workspace, dateinamen_csv, sep = "/")
  write.csv(mkz, file = speicherpfad_inkl_namen, row.names = FALSE)
}
To make the data frame names easier to follow, this is how they map to the names used in the answer:
Werte = df1
Pegelganglinien = df2
pegeluntereinander = df2.new
zus = new.df
Answer 0 (score: 1)
1. Load the data
df1 <- read.table(text = ' MKZ Elbekilometer MNW91-2010 Fix Pegel Distanz_Pegel_m
1 43420072 178.2 70.70 70.37300 Torgau 18816.2011
2 43435074 172.0 72.09 71.81900 Torgau 11926.9932
3 43435084 177.0 70.93 70.61100 Torgau 15620.9210
4 43435086 171.8 72.14 71.87100 Torgau 11478.2310
5 44425470 172.2 72.05 71.76767 Torgau 17400.7983
6 44425476 172.2 72.05 71.76767 Torgau 14448.9073
7 45426142 154.0 75.91 75.72300 Torgau 21065.9449
8 45440655 125.4 84.18 83.94567 Riesa 17019.6066
9 46431146 117.0 87.17 87.04500 Riesa 23594.2523
10 46440130 116.0 87.45 87.32600 Riesa 10471.5078')
df2 <- read.table(text = ' Schöna Pirna Dresden Meißen Riesa Torgau
1980-01-01 119.583 110.576 105.742 98.522 92.036 79.038
1980-01-02 119.523 111.426 105.652 98.412 91.926 78.908
1980-01-03 119.413 111.316 105.592 98.362 91.876 78.848
1980-01-04 119.123 111.126 105.382 98.222 91.756 78.748
1980-01-05 119.103 110.956 105.282 98.032 91.536 78.488
1980-01-06 118.823 110.786 105.062 97.802 91.396 78.348
1980-01-07 118.783 110.726 104.972 97.722 91.276 78.128
1980-01-08 118.923 110.866 105.102 97.852 91.336 78.088')
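We first reshape the second data frame so that we can easily join it using left_join from dplyr. Then we filter the first data frame on the MKZ you gave (43420072) and merge the data:
2. Reshape the second data frame (df2) so that we can easily join it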
library(dplyr)
library(tidyr)
library(tibble)
df2.new <- df2 %>%
rownames_to_column('date') %>%
gather('Pegel','value', -date)
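As a side note, gather() is superseded in current tidyr releases. A minimal sketch of the same reshape with pivot_longer() could look like the block below. The object names df2_demo and df2_long are made up for illustration, and the two-row data frame is just a cut-down stand-in for df2.

```r
library(tidyr)
library(tibble)

# Cut-down stand-in for df2 (hypothetical name): dates as row names,
# one column per gauge
df2_demo <- data.frame(
  Riesa  = c(92.036, 91.926),
  Torgau = c(79.038, 78.908),
  row.names = c("1980-01-01", "1980-01-02")
)

# Move the row names into a 'date' column, then stack all gauge columns
# into one (date, Pegel, value) row per observation
df2_long <- pivot_longer(
  rownames_to_column(df2_demo, "date"),
  cols      = -date,
  names_to  = "Pegel",
  values_to = "value"
)

print(df2_long)
```

The row order differs from gather() (pivot_longer() goes row by row rather than column by column), but that does not matter for the join that follows.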
3. Build the new data frame
# We pick the correct data from the first dataframe
df1 <- df1 %>% filter(MKZ == 43420072)
# Build the new df
new.df <- df2.new %>%
  left_join(df1, by = "Pegel") %>%
  select(date, Pegel, value, `MNW91.2010`) %>%
  mutate(new.col = value + `MNW91.2010`)
Output
The new data frame looks the way you described (shown here filtered for Pegel == Torgau):
date Pegel value MNW91.2010 new.col
1 1980-01-01 Torgau 79.038 70.7 149.738
2 1980-01-02 Torgau 78.908 70.7 149.608
3 1980-01-03 Torgau 78.848 70.7 149.548
4 1980-01-04 Torgau 78.748 70.7 149.448
5 1980-01-05 Torgau 78.488 70.7 149.188
6 1980-01-06 Torgau 78.348 70.7 149.048
7 1980-01-07 Torgau 78.128 70.7 148.828
8 1980-01-08 Torgau 78.088 70.7 148.788
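The output has to be filtered because a left join keeps every row of the reshaped gauge table, including gauges that have no matching row in df1, and those rows end up with NA in the joined columns. A small sketch of that behaviour follows. It uses base R merge() as a stand-in for the dplyr joins, and long_demo and stations_demo are hypothetical cut-down versions of df2.new and df1.

```r
# Hypothetical cut-down version of the reshaped gauge data (df2.new)
long_demo <- data.frame(
  date  = c("1980-01-01", "1980-01-01"),
  Pegel = c("Riesa", "Torgau"),
  value = c(92.036, 79.038)
)

# Hypothetical cut-down version of df1, already filtered to one MKZ
stations_demo <- data.frame(
  MKZ        = 43420072,
  MNW91.2010 = 70.70,
  Pegel      = "Torgau"
)

# all.x = TRUE behaves like a left join: the Riesa row is kept with NA
left_like <- merge(long_demo, stations_demo, by = "Pegel", all.x = TRUE)

# The default behaves like an inner join: only matching gauges remain
inner_like <- merge(long_demo, stations_demo, by = "Pegel")
inner_like$new.col <- inner_like$value + inner_like$MNW91.2010

print(left_like)
print(inner_like)
```

In the dplyr pipeline, swapping left_join for inner_join, or adding filter(!is.na(MKZ)) afterwards, should give the same effect.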
Edit
Output without filtering on MKZ
new.df <- df2.new %>%
left_join(df1) %>% #WITHOUT running df1 %>% filter(MKZ == 43420072)
select(date, Pegel, value, MKZ, `MNW91.2010`) %>% #Added MKZ as column for clarity
mutate(new.col = value + `MNW91.2010`)
This produces the following data frame for Torgau:
date Pegel value MKZ MNW91.2010 new.col
1 1980-01-01 Torgau 79.038 43420072 70.70 149.738
2 1980-01-01 Torgau 79.038 43435074 72.09 151.128
3 1980-01-01 Torgau 79.038 43435084 70.93 149.968
4 1980-01-01 Torgau 79.038 43435086 72.14 151.178
5 1980-01-01 Torgau 79.038 44425470 72.05 151.088
6 1980-01-01 Torgau 79.038 44425476 72.05 151.088
7 1980-01-01 Torgau 79.038 45426142 75.91 154.948
8 1980-01-02 Torgau 78.908 43420072 70.70 149.608
9 1980-01-02 Torgau 78.908 43435074 72.09 150.998
10 1980-01-02 Torgau 78.908 43435084 70.93 149.838
11 1980-01-02 Torgau 78.908 43435086 72.14 151.048
12 1980-01-02 Torgau 78.908 44425470 72.05 150.958
13 1980-01-02 Torgau 78.908 44425476 72.05 150.958
14 1980-01-02 Torgau 78.908 45426142 75.91 154.818
15 1980-01-03 Torgau 78.848 43420072 70.70 149.548
16 1980-01-03 Torgau 78.848 43435074 72.09 150.938
17 1980-01-03 Torgau 78.848 43435084 70.93 149.778
18 1980-01-03 Torgau 78.848 43435086 72.14 150.988
19 1980-01-03 Torgau 78.848 44425470 72.05 150.898
20 1980-01-03 Torgau 78.848 44425476 72.05 150.898
21 1980-01-03 Torgau 78.848 45426142 75.91 154.758
22 1980-01-04 Torgau 78.748 43420072 70.70 149.448
23 1980-01-04 Torgau 78.748 43435074 72.09 150.838
24 1980-01-04 Torgau 78.748 43435084 70.93 149.678
25 1980-01-04 Torgau 78.748 43435086 72.14 150.888
26 1980-01-04 Torgau 78.748 44425470 72.05 150.798
27 1980-01-04 Torgau 78.748 44425476 72.05 150.798
28 1980-01-04 Torgau 78.748 45426142 75.91 154.658
29 1980-01-05 Torgau 78.488 43420072 70.70 149.188
30 1980-01-05 Torgau 78.488 43435074 72.09 150.578
31 1980-01-05 Torgau 78.488 43435084 70.93 149.418
32 1980-01-05 Torgau 78.488 43435086 72.14 150.628
33 1980-01-05 Torgau 78.488 44425470 72.05 150.538
34 1980-01-05 Torgau 78.488 44425476 72.05 150.538
35 1980-01-05 Torgau 78.488 45426142 75.91 154.398
36 1980-01-06 Torgau 78.348 43420072 70.70 149.048
37 1980-01-06 Torgau 78.348 43435074 72.09 150.438
38 1980-01-06 Torgau 78.348 43435084 70.93 149.278
39 1980-01-06 Torgau 78.348 43435086 72.14 150.488
40 1980-01-06 Torgau 78.348 44425470 72.05 150.398
41 1980-01-06 Torgau 78.348 44425476 72.05 150.398
42 1980-01-06 Torgau 78.348 45426142 75.91 154.258
43 1980-01-07 Torgau 78.128 43420072 70.70 148.828
44 1980-01-07 Torgau 78.128 43435074 72.09 150.218
45 1980-01-07 Torgau 78.128 43435084 70.93 149.058
46 1980-01-07 Torgau 78.128 43435086 72.14 150.268
47 1980-01-07 Torgau 78.128 44425470 72.05 150.178
48 1980-01-07 Torgau 78.128 44425476 72.05 150.178
49 1980-01-07 Torgau 78.128 45426142 75.91 154.038
50 1980-01-08 Torgau 78.088 43420072 70.70 148.788
51 1980-01-08 Torgau 78.088 43435074 72.09 150.178
52 1980-01-08 Torgau 78.088 43435084 70.93 149.018
53 1980-01-08 Torgau 78.088 43435086 72.14 150.228
54 1980-01-08 Torgau 78.088 44425470 72.05 150.138
55 1980-01-08 Torgau 78.088 44425476 72.05 150.138
56 1980-01-08 Torgau 78.088 45426142 75.91 153.998
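Since the question asked for a list with one series per MKZ, the combined data frame can be split into exactly that. The following is only a sketch under assumptions: new_df_demo is a hand-built, two-date, two-MKZ stand-in for new.df, and series_by_mkz is a name I made up.

```r
# Hand-built stand-in for new.df (hypothetical name), two dates x two MKZ
new_df_demo <- data.frame(
  date       = rep(c("1980-01-01", "1980-01-02"), each = 2),
  Pegel      = "Torgau",
  value      = rep(c(79.038, 78.908), each = 2),
  MKZ        = rep(c(43420072, 43435074), times = 2),
  MNW91.2010 = rep(c(70.70, 72.09), times = 2)
)
new_df_demo$new.col <- new_df_demo$value + new_df_demo$MNW91.2010

# One list element per MKZ; the element names are the MKZ values
series_by_mkz <- split(new_df_demo, new_df_demo$MKZ)

print(names(series_by_mkz))
print(series_by_mkz[["43420072"]])
```

If real xts objects are needed and the xts package is installed, each element could then be converted with something like xts::xts(x[, c("value", "new.col")], order.by = as.Date(x$date)).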
Edit: writing the data to files
I think this can be written more simply:
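write.mkz <- function(df, workspace) {
  # Write one MKZ group to <workspace>/<MKZ>.csv and return it so do() can collect the result
  write.csv(df, file.path(workspace, paste0(unique(df$MKZ), ".csv")), row.names = FALSE)
  return(df)
}
workspace = ...
new.df %>%
  group_by(MKZ) %>%
  do(write.mkz(., workspace))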
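If you would rather not depend on do(), the same per-MKZ export can be sketched in base R with split() and a loop. Everything here is illustrative: export_demo is a tiny made-up data frame, and out_dir points at a temporary folder rather than your real workspace.

```r
# Tiny made-up stand-in for new.df
export_demo <- data.frame(
  date    = rep(c("1980-01-01", "1980-01-02"), times = 2),
  MKZ     = rep(c(43420072, 43435074), each = 2),
  new.col = c(149.738, 149.608, 151.128, 150.998)
)

# Temporary output folder, used here instead of the real workspace path
out_dir <- file.path(tempdir(), "mkz_demo")
dir.create(out_dir, showWarnings = FALSE)

# split() drops rows whose MKZ is NA, so unmatched gauges are skipped
parts <- split(export_demo, export_demo$MKZ)

# Write one CSV per MKZ, named <MKZ>.csv
for (mkz in names(parts)) {
  write.csv(parts[[mkz]], file.path(out_dir, paste0(mkz, ".csv")), row.names = FALSE)
}

print(list.files(out_dir))
```

One design difference worth knowing: group_by() keeps NA as its own group, so the do() version would also write a file for rows without an MKZ unless those are filtered out first.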