The following data.frame should be subset by inverse pairs and a few conditions:
> foo
   ID Day  Period            Start              End
1  11   1 morning     Central Park Alphabet Village
2  11   1 morning     Central Park Alphabet Village
3  11   1 evening Alphabet Village        Grammercy
4  54   1 morning     Union Square        Chinatown
5  67   1 morning          Midtown           Harlem
6  67   1 morning           Harlem          Midtown
7  69   1 morning       Greenpoint Prospect Heights
8  54   1 evening        Chinatown     Union Square
9  77   1 morning       Park Slope     Williamsburg
10 73   1 evening     Williamsburg       Park Slope
11 88   2 morning        Grammercy     Battery Park
12 88   2 morning     Battery Park             SoHo
13 88   2 evening     Battery Park        Grammercy
14 69   2 evening Prospect Heights       Greenpoint
15 88   2 evening        Grammercy     Battery Park
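For example, an inverse pair of Start and End stations must fall on the same Day and have the same ID, and the first trip must occur in the morning and the second in the evening. *Edit: It should be noted that only one Start-End can be used to pair with an End-Start. That is, once a pair is formed, the original Start-End cannot be used again to form another pair. For example, record 15 cannot be paired with record 13, because 13 is already "taken".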
The subset will always have an even number of rows. In this case it would be:
   ID Day  Period        Start          End
3  54   1 morning Union Square    Chinatown
7  54   1 evening    Chinatown Union Square
10 88   2 morning    Grammercy Battery Park
11 88   2 evening Battery Park    Grammercy
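I'm not sure whether I should use the subset() function together with a for loop, or how to construct the loop. It should say: if the Start and End equal the End and Start of the following row, ID = ID, Day = Day, and the Period of the first record = "morning" while that of the second record = "evening".
I think the code should start with something like this:
if (foo[i-1, "Start"] == foo[i, "End"] && foo[i-1, "End"] == foo[i, "Start"])
but I'm not sure. The idea is to keep all inverse pairs that satisfy these conditions. Any guidance and an explanation of the steps would be appreciated.
Sample data: the foo data frame printed above.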
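The data is only available here as a printout, so one way to rebuild foo for experimenting is sketched below. The split of each printed row into Start and End is a reading of the table, not something the printout states outright, and the factor columns are an assumption based on the (fctr) type reported in the dplyr output of the first answer.

```r
# Reconstructed from the printed table; the Start/End split is an assumption.
foo <- data.frame(
  ID     = c(11, 11, 11, 54, 67, 67, 69, 54, 77, 73, 88, 88, 88, 69, 88),
  Day    = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  Period = c("morning", "morning", "evening", "morning", "morning",
             "morning", "morning", "evening", "morning", "evening",
             "morning", "morning", "evening", "evening", "evening"),
  Start  = c("Central Park", "Central Park", "Alphabet Village",
             "Union Square", "Midtown", "Harlem", "Greenpoint",
             "Chinatown", "Park Slope", "Williamsburg", "Grammercy",
             "Battery Park", "Battery Park", "Prospect Heights",
             "Grammercy"),
  End    = c("Alphabet Village", "Alphabet Village", "Grammercy",
             "Chinatown", "Harlem", "Midtown", "Prospect Heights",
             "Union Square", "Williamsburg", "Park Slope", "Battery Park",
             "SoHo", "Grammercy", "Greenpoint", "Battery Park"),
  stringsAsFactors = TRUE  # assumed: text columns stored as factors
)
```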
Answer 0 (score: 2)
Group by 'ID' and 'Day', filter the groups in which the number of unique elements of 'Period' is greater than 1 (n_distinct), then change the factor columns to character and filter the rows that match the condition in the OP's post.
library(dplyr)
foo %>%
  group_by(ID, Day) %>%
  filter(n_distinct(Period) > 1) %>%
  mutate(Start = as.character(Start), End = as.character(End)) %>%
  filter(Start[1] == End[n()] & Start[n()] == End[1])
#     ID   Day  Period        Start          End
#  (int) (int)  (fctr)        (chr)        (chr)
#1    54     1 morning Union Square    Chinatown
#2    54     1 evening    Chinatown Union Square
#3    88     2 morning    Grammercy Battery Park
#4    88     2 evening Battery Park    Grammercy
In dplyr version 0.5.0 and later, we can use mutate_if for the factor-to-character conversion.
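A sketch of what that variant might look like, assuming the rest of the pipeline stays unchanged. The two-row trips data frame is a hypothetical stand-in (it mirrors records 4 and 8 of foo) so that the snippet runs on its own.

```r
library(dplyr)

# Hypothetical stand-in for records 4 and 8 of foo, with factor columns
trips <- data.frame(
  ID     = c(54, 54),
  Day    = c(1, 1),
  Period = factor(c("morning", "evening")),
  Start  = factor(c("Union Square", "Chinatown")),
  End    = factor(c("Chinatown", "Union Square"))
)

res <- trips %>%
  group_by(ID, Day) %>%
  filter(n_distinct(Period) > 1) %>%
  # convert every non-grouping factor column in one step
  mutate_if(is.factor, as.character) %>%
  filter(Start[1] == End[n()] & Start[n()] == End[1])
```

Unlike the explicit mutate() call, mutate_if(is.factor, as.character) also turns Period into character. The Start[1]/End[n()] comparison only looks at the first and last row of each ID/Day group, so it implicitly assumes one morning and one evening trip per group.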
Answer 1 (score: 0)
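In SQL, you would use a self-join inside a union query. Consider the same approach in base R: split the data into morning and evening subsets, merge them on ID, Day and Start, End (paired in reverse), and finally rbind the corresponding columns of each half back together: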
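# df is assumed to be the question's data frame (foo);
# suffix the column names so the two halves stay distinguishable
mdf <- setNames(df[df$Period=='morning',], paste0(colnames(df), "_m"))
edf <- setNames(df[df$Period=='evening',], paste0(colnames(df), "_e"))
# reverse join: a morning Start/End must equal an evening End/Start
rbind(setNames(merge(mdf, edf,
                     by.x=c("ID_m", "Day_m", "Start_m", "End_m"),
                     by.y=c("ID_e", "Day_e", "End_e", "Start_e"))[colnames(mdf)],
               colnames(df)),
      # evening half: same keys, with Start and End swapped back
      setNames(merge(mdf, edf,
                     by.x=c("ID_m", "Day_m", "Start_m", "End_m"),
                     by.y=c("ID_e", "Day_e", "End_e", "Start_e"))[c("ID_m", "Day_m", "Period_e", "End_m", "Start_m")],
               colnames(df)))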
#   ID Day  Period        Start          End
# 1 54   1 morning Union Square    Chinatown
# 2 88   2 morning    Grammercy Battery Park
# 3 54   1 evening    Chinatown Union Square
# 4 88   2 evening Battery Park    Grammercy
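A reverse merge pairs every matching morning/evening combination, so a morning trip that is recorded twice could be matched to the same evening trip twice. One possible way to respect the use-once rule from the question's edit is to number repeated trips and add that number to the join key. This is a minimal sketch, assuming both periods occur in the data; pair_once, occ, k, row and the toy trips data are illustrative names and values, not part of the original answer.

```r
# Hypothetical toy data: the morning trip of ID 11 is recorded twice,
# but there is only one matching evening return.
trips <- data.frame(
  ID     = c(11, 11, 11, 54, 54),
  Day    = c(1, 1, 1, 1, 1),
  Period = c("morning", "morning", "evening", "morning", "evening"),
  Start  = c("Central Park", "Central Park", "Alphabet Village",
             "Union Square", "Chinatown"),
  End    = c("Alphabet Village", "Alphabet Village", "Central Park",
             "Chinatown", "Union Square"),
  stringsAsFactors = FALSE
)

# pair_once() is an illustrative helper, not part of the original answer.
pair_once <- function(df) {
  df$row <- seq_len(nrow(df))  # remember the original row positions
  # occ(): running count of identical trips within the same ID and Day
  occ <- function(d) ave(d$row, d$ID, d$Day, d$Start, d$End, FUN = seq_along)
  m <- df[df$Period == "morning", ]
  e <- df[df$Period == "evening", ]
  m$k <- occ(m)
  e$k <- occ(e)
  # Reverse join plus k: the k-th copy of a morning trip can only meet
  # the k-th copy of the matching evening return, so no row is used twice.
  hits <- merge(m, e,
                by.x = c("ID", "Day", "Start", "End", "k"),
                by.y = c("ID", "Day", "End", "Start", "k"))
  df[sort(c(hits$row.x, hits$row.y)), names(df) != "row"]
}

paired <- pair_once(trips)
```

On the toy data this keeps rows 1, 3, 4 and 5; the duplicated morning trip in row 2 stays unpaired because its evening counterpart is already used.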
SQL counterpart (works in MS Access and returns exactly the same output):
SELECT t1.*
FROM
  (SELECT m.ID, m.Day, m.Period, m.[Start], m.[End]
   FROM RDataSet AS m
   WHERE (((m.Period)='morning'))) As t1
INNER JOIN
  (SELECT e.ID, e.Day, e.Period, e.[Start], e.[End]
   FROM RDataSet AS e
   WHERE (((e.Period)='evening'))) As t2
ON t1.ID = t2.ID AND t1.Day = t2.Day AND t1.[Start] = t2.[End] AND t1.[End] = t2.[Start]
UNION
SELECT t2.*
FROM
  (SELECT m.ID, m.Day, m.Period, m.[Start], m.[End]
   FROM RDataSet AS m
   WHERE (((m.Period)='morning'))) As t1
INNER JOIN
  (SELECT e.ID, e.Day, e.Period, e.[Start], e.[End]
   FROM RDataSet AS e
   WHERE (((e.Period)='evening'))) As t2
ON t1.ID = t2.ID AND t1.Day = t2.Day AND t1.[Start] = t2.[End] AND t1.[End] = t2.[Start]