I have a practical problem: given two (or more) data frames, I want to assign a unique ID to each matching observation across the data sets.
Any help on how to get to df2.2 would be much appreciated. Thanks.
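The example data frames are not reproduced in the question as it appears here, but both are printed in full in the answers that follow, so they can be rebuilt from that output. A minimal sketch, assuming the columns are plain numeric and character vectors (the original column types are not shown):

```r
# Reconstructed from the tables printed in the answers; the column types
# (numeric, character) are an assumption.
df1 <- data.frame(
  a1 = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1),
  b1 = c(1, 5, 3, 2, 3, 4, 5, 1, 5, 2),
  c1 = c("white", "red", "black", "white", "red",
         "white", "black", "silver", "red", "green"),
  stringsAsFactors = FALSE
)

df2 <- data.frame(
  a2 = c(2, 2, 1, 1, 2, 2, 2, 2, 2, 2),
  b2 = c(3, 1, 3, 2, 1, 3, 4, 5, 3, 5),
  c2 = c("black", "blue", "black", "white", "silver",
         "green", "green", "red", "blue", "white"),
  stringsAsFactors = FALSE
)
```

Rows 2 and 9 of df1 are identical, and three rows (1 3 black, 1 2 white, 2 1 silver) occur in both data frames, which is what the shared IDs are meant to capture.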
Answer 0: (score: 5)
A simple way to solve this is to build a hash of each row:
library(dplyr)
library(digest)
# Paste the key columns of each row into one string and take its MD5 hash;
# identical rows therefore get identical ids.
df1 %>%
  rowwise() %>%
  mutate(id = digest(paste(a1, b1, c1), algo = "md5")) %>%
  ungroup()
df2 %>%
  rowwise() %>%
  mutate(id = digest(paste(a2, b2, c2), algo = "md5")) %>%
  ungroup()
This produces the following for df1:
   a1 b1     c1                               id
1   1  1  white b86fbb78b27f7db2ee50af2d68cce452
2   1  5    red 68d47f544832989834517630e4a2764c
3   1  3  black 724e37192140cb2009cf3d982f2be1e4
4   1  2  white f731b8b38255b8c312543283f8e1c634
5   2  3    red 2d50b86902056a51faad04d2c566faf2
6   2  4  white 9396667cd51d1e1b61b0b22a7767d3d9
7   2  5  black 9ba1f3e04c61c006d3c5382fcad098e6
8   2  1 silver 38dcd29d200c8b33cd38ac78ef9dd751
9   1  5    red 68d47f544832989834517630e4a2764c
10  1  2  green 7d9b1aadfd79de142b234b83d7867b9b
and the following for df2:
   a2 b2     c2                               id
1   2  3  black d285febc8ab08e99b11609b98f077e66
2   2  1   blue bfa0405276406ac4bc596daf957dfa11
3   1  3  black 724e37192140cb2009cf3d982f2be1e4
4   1  2  white f731b8b38255b8c312543283f8e1c634
5   2  1 silver 38dcd29d200c8b33cd38ac78ef9dd751
6   2  3  green 67eefe9ee2d82486ded30a268289296b
7   2  4  green d773f58cf144eab15ef459e326494a2f
8   2  5    red 0724318a9f59d3960edfe4e90f9c4eff
9   2  3   blue 6883420cc137ba45b773f642176e9ce6
10  2  5  white 5dea9e63b5fbfb31fb81260cb5a5f41c
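Because equal rows hash to equal strings, matches between the two data frames can be found by comparing the id columns. A minimal sketch in base R plus digest, without dplyr; the data frames are reconstructed from the tables above (column types are an assumption), and `hash_rows` is a hypothetical helper, not part of either package:

```r
library(digest)

# Data reconstructed from the printed tables (column types are an assumption).
df1 <- data.frame(
  a1 = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1),
  b1 = c(1, 5, 3, 2, 3, 4, 5, 1, 5, 2),
  c1 = c("white", "red", "black", "white", "red",
         "white", "black", "silver", "red", "green"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  a2 = c(2, 2, 1, 1, 2, 2, 2, 2, 2, 2),
  b2 = c(3, 1, 3, 2, 1, 3, 4, 5, 3, 5),
  c2 = c("black", "blue", "black", "white", "silver",
         "green", "green", "red", "blue", "white"),
  stringsAsFactors = FALSE
)

# hash_rows() is a hypothetical helper: paste each row into one key string,
# then compute the MD5 digest of every key.
hash_rows <- function(df) {
  keys <- do.call(paste, df)
  vapply(keys, digest, character(1), algo = "md5", USE.NAMES = FALSE)
}

df1$id <- hash_rows(df1)
df2$id <- hash_rows(df2)

# Positions of the df1 rows whose id also occurs in df2
matched <- which(df1$id %in% df2$id)
df1[matched, ]
```

With this data, `matched` picks out rows 3, 4 and 8 of df1, the three rows that also occur in df2.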
Answer 1: (score: 0)
You can accomplish what you want by writing a function that generates unique IDs and then applying it to the combination of df1 and df2.
# Inspiration: http://stackoverflow.com/questions/24119599/how-to-assign-a-unique-id-number-to-each-group-of-identical-values-in-a-column
unique.id <- function(x) as.numeric(factor(x))
# Paste the columns of each row into a single key string
(df1.info <- do.call(paste, df1))
# [1] "1 1 white"  "1 5 red"    "1 3 black"  "1 2 white"  "2 3 red"
# [6] "2 4 white"  "2 5 black"  "2 1 silver" "1 5 red"    "1 2 green"
df2.info <- do.call(paste, df2)
ids <- unique.id(c(df1.info, df2.info))
df1$id <- head(ids, nrow(df1))
df1
#    a1 b1     c1 id
# 1   1  1  white  1
# 2   1  5    red  5
# 3   1  3  black  4
# 4   1  2  white  3
# 5   2  3    red 11
# 6   2  4  white 13
# 7   2  5  black 14
# 8   2  1 silver  7
# 9   1  5    red  5
# 10  1  2  green  2
df2$id <- tail(ids, nrow(df2))
df2
#    a2 b2     c2 id
# 1   2  3  black  8
# 2   2  1   blue  6
# 3   1  3  black  4
# 4   1  2  white  3
# 5   2  1 silver  7
# 6   2  3  green 10
# 7   2  4  green 12
# 8   2  5    red 15
# 9   2  3   blue  9
# 10  2  5  white 16
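The question mentions two or more data frames, and the same factor-based idea extends to any number of them. A sketch under that assumption; `assign_shared_ids` and the three toy data frames `x`, `y`, `z` are hypothetical and not from the original post:

```r
# assign_shared_ids() is a hypothetical helper, not part of the answer above.
# It takes a list of data frames with the same number of columns and returns
# the same list with an integer id column shared across all of them.
assign_shared_ids <- function(dfs) {
  # One key string per row, per data frame
  keys <- lapply(dfs, function(df) do.call(paste, df))
  # Number the distinct keys across all data frames at once
  ids <- as.integer(factor(unlist(keys)))
  # Record which data frame each id belongs to, then split the ids back
  owner <- factor(rep(seq_along(dfs), lengths(keys)), levels = seq_along(dfs))
  Map(function(df, id) { df$id <- id; df }, dfs, split(ids, owner))
}

# Toy data, made up for illustration
x <- data.frame(a = c(1, 2), b = c("red", "blue"), stringsAsFactors = FALSE)
y <- data.frame(a = c(2, 3), b = c("blue", "green"), stringsAsFactors = FALSE)
z <- data.frame(a = c(1, 3, 4), b = c("red", "green", "black"),
                stringsAsFactors = FALSE)

out <- assign_shared_ids(list(x, y, z))
out[[3]]
```

Splitting by an explicit owner factor, rather than by `head()` and `tail()`, is what lets the helper handle more than two inputs.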
Answer 2: (score: 0)
Assuming your columns are exactly the same, the simplest approach is probably:
df.all <- rbind(df1, df2)
(You may need to rename the columns so that they match.)
Now apply the same data table trick to the whole data set, then split the data set again:
df1 <- df.all[seq_len(nrow(df1)), ]
df2 <- df.all[-seq_len(nrow(df1)), ]
Note: I am not saying the data table trick is the ideal way to generate numbers for unique combinations! But you already have it written.
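The steps in this answer can be sketched end to end. The data table trick from the question is not shown here, so this sketch substitutes `match()` against the unique row keys as a stand-in. The common column names `a`, `b`, `c` are an assumption, and only the first four rows of each data frame (as printed in the answers above) are used to keep it short:

```r
# First four rows of each data frame, reconstructed from the printed output
df1 <- data.frame(a1 = c(1, 1, 1, 1), b1 = c(1, 5, 3, 2),
                  c1 = c("white", "red", "black", "white"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(a2 = c(2, 2, 1, 1), b2 = c(3, 1, 3, 2),
                  c2 = c("black", "blue", "black", "white"),
                  stringsAsFactors = FALSE)

# rbind() needs identical column names, so rename both to a common set
common <- c("a", "b", "c")
df.all <- rbind(setNames(df1, common), setNames(df2, common))

# Stand-in for the trick from the question: number each distinct row
# in order of first appearance
key <- do.call(paste, df.all)
df.all$id <- match(key, unique(key))

# Copy the ids back, keeping the original column names of df1 and df2
in_df1 <- seq_len(nrow(df.all)) <= nrow(df1)
df1$id <- df.all$id[in_df1]
df2$id <- df.all$id[!in_df1]
```

Indexing with the logical vector `in_df1` instead of a negative range also behaves sensibly when one of the data frames has no rows.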