My data is in the following format, with three vectors of dates:
X <- c("24/02/2016", "25/02/2016", "26/02/2016", "29/02/2016", "01/03/2016", "02/03/2016", "03/03/2016", "04/03/2016", "07/03/2016", "08/03/2016", "09/03/2016", "10/03/2016", "11/03/2016", "14/03/2016", "15/03/2016")
Y <- c("26/08/2014", "10/09/2014", "24/09/2014", "09/10/2014", "24/02/2016", "09/03/2016", "24/03/2016", "11/04/2016", "26/04/2016")
Z <- c("15/08/2014", "29/08/2014", "15/09/2014", "30/09/2014", "12/02/2016", "29/02/2016", "15/03/2016", "31/03/2016", "15/04/2016")
The output I want is as follows:
X Output
24/02/2016 12/02/2016
25/02/2016 NA
26/02/2016 NA
29/02/2016 NA
01/03/2016 NA
02/03/2016 NA
03/03/2016 NA
04/03/2016 NA
07/03/2016 NA
08/03/2016 NA
09/03/2016 29/02/2016
10/03/2016 NA
11/03/2016 NA
14/03/2016 NA
15/03/2016 NA
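Basically, wherever a value in X matches a value in Y, I need the Z value at the same position as that Y value in a new column next to X. I'm not very good at R, so I can't figure out how to approach this. Any ideas?
Answer 0 (score: 1)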
You can do this with match in base R, but I find it clearer to use left_join from the dplyr package.
library(dplyr)
# make a data frame with X as a column
X.df <- data.frame(X = c("24/02/2016", "25/02/2016", "26/02/2016", "29/02/2016", "01/03/2016", "02/03/2016", "03/03/2016", "04/03/2016", "07/03/2016", "08/03/2016", "09/03/2016", "10/03/2016", "11/03/2016", "14/03/2016", "15/03/2016"), stringsAsFactors = FALSE)
# make a data frame with Y and Z as columns
YZ.df <- data.frame(Y = c("26/08/2014", "10/09/2014", "24/09/2014", "09/10/2014", "24/02/2016", "09/03/2016", "24/03/2016", "11/04/2016", "26/04/2016"), Z = c("15/08/2014", "29/08/2014", "15/09/2014", "30/09/2014", "12/02/2016", "29/02/2016", "15/03/2016", "31/03/2016", "15/04/2016"), stringsAsFactors = FALSE)
# do a left join, specifying variables X and Y
left_join(X.df, YZ.df, by = c("X" = "Y"))
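Note that if Y contains duplicate values that match an X value, the join above returns one row per match, so that X value will appear more than once in the result.
Answer 1 (score: 1)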
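The duplication is easy to see on a toy example. The sketch below uses base R's merge with all.x = TRUE, which also performs a left join, so it runs without dplyr. The small x_df and yz_df data frames and the "13/02/2016" value are made up for illustration and are not part of the question's data.

```r
# Toy data (hypothetical): Y contains "24/02/2016" twice
x_df  <- data.frame(X = c("24/02/2016", "25/02/2016"),
                    stringsAsFactors = FALSE)
yz_df <- data.frame(Y = c("24/02/2016", "24/02/2016"),
                    Z = c("12/02/2016", "13/02/2016"),
                    stringsAsFactors = FALSE)

# Left join: every row of x_df is kept, once per matching row of yz_df
joined <- merge(x_df, yz_df, by.x = "X", by.y = "Y", all.x = TRUE)
nrow(joined)   # 3 rows: "24/02/2016" appears twice

# One way to avoid this: keep only the first row for each Y before joining
first_only <- yz_df[!duplicated(yz_df$Y), ]
deduped <- merge(x_df, first_only, by.x = "X", by.y = "Y", all.x = TRUE)
nrow(deduped)  # 2 rows: one per value of X
```

Keeping only the first row per Y value mirrors what match does, since match returns the position of the first match only. Note that merge sorts its result by the key column by default, unlike left_join, which preserves the row order of the left table.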
For the sake of completeness, here is a data.table version to complement gatsky's answer:
library(data.table)
data.table(Y, Z)[data.table(X), on = .(Y == X), .(X, Z)]
             X          Z
 1: 24/02/2016 12/02/2016
 2: 25/02/2016         NA
 3: 26/02/2016         NA
 4: 29/02/2016         NA
 5: 01/03/2016         NA
 6: 02/03/2016         NA
 7: 03/03/2016         NA
 8: 04/03/2016         NA
 9: 07/03/2016         NA
10: 08/03/2016         NA
11: 09/03/2016 29/02/2016
12: 10/03/2016         NA
13: 11/03/2016         NA
14: 14/03/2016         NA
15: 15/03/2016         NA
# Data (X, Y and Z must be defined before running the join above)
Z <- c("15/08/2014", "29/08/2014", "15/09/2014", "30/09/2014", "12/02/2016", "29/02/2016", "15/03/2016", "31/03/2016", "15/04/2016")
Y <- c("26/08/2014", "10/09/2014", "24/09/2014", "09/10/2014", "24/02/2016", "09/03/2016", "24/03/2016", "11/04/2016", "26/04/2016")
X <- c("24/02/2016", "25/02/2016", "26/02/2016", "29/02/2016", "01/03/2016", "02/03/2016", "03/03/2016", "04/03/2016", "07/03/2016", "08/03/2016", "09/03/2016", "10/03/2016", "11/03/2016", "14/03/2016", "15/03/2016")
Answer 2 (score: 0)
Use match:
# Construct data
Z <- c("15/08/2014", "29/08/2014", "15/09/2014", "30/09/2014", "12/02/2016", "29/02/2016", "15/03/2016", "31/03/2016", "15/04/2016")
Y <- c("26/08/2014", "10/09/2014", "24/09/2014", "09/10/2014", "24/02/2016", "09/03/2016", "24/03/2016", "11/04/2016", "26/04/2016")
df <- data.frame(X = c("24/02/2016", "25/02/2016", "26/02/2016", "29/02/2016", "01/03/2016", "02/03/2016", "03/03/2016", "04/03/2016", "07/03/2016", "08/03/2016", "09/03/2016", "10/03/2016", "11/03/2016", "14/03/2016", "15/03/2016"), stringsAsFactors = FALSE)
# match() gives, for each df$X, the position of its first match in Y (NA if none);
# indexing Z with those positions returns the corresponding Z value
df$Output <- Z[match(df$X, Y)]
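To make the indexing step easier to follow, here is a minimal sketch on a shortened version of the data. The three-element X and the variable name idx are chosen only for illustration.

```r
# Shortened vectors taken from the question's data
Y <- c("24/02/2016", "09/03/2016")
Z <- c("12/02/2016", "29/02/2016")
X <- c("24/02/2016", "25/02/2016", "09/03/2016")

# For each element of X, match() returns the position of its first
# occurrence in Y, or NA when the value does not occur in Y
idx <- match(X, Y)
idx      # 1 NA 2

# Indexing Z with those positions picks the paired Z value;
# an NA index yields NA
Z[idx]   # "12/02/2016" NA "29/02/2016"
```

Because match only reports the first occurrence, this approach always returns exactly one value per element of X, even when Y contains duplicates.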