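In R

Time: 2018-08-16 00:06:39

Tags: r

Sorry for the newbie question! I am trying to count the number of elements that match between basket x and basket y. I have the following data: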

user_id basket.x basket.y
1         1,2,3    2,3,4
2         5,6,7    1,2,7

I have tried the following loop, but it does not work:

df["total"] <- 0
df["TP"] <- 0
for(i in 1:nrow(df)){
 for(j in 1:nrow(df)){
  if(all(df$basket.x[i] %in% df$basket.y[j])){
     df$total <- total + 1
     df$TP <- TP + 1
  }
 }
}

It returns this:

user_id basket.x basket.y   total TP
1         1,2,3    2,3,4     0    0
2         5,6,7    1,2,7     0    0

However, the expected result is:

user_id basket.x basket.y   total TP
1         1,2,3    2,3,4     3    2
2         5,6,7    1,2,7     3    1

Can anyone point out where I went wrong? Thanks

Output of dput():

structure(list(user_id = c(2957L, 7306L, 10219L, 11290L, 13222L, 
13554L), basket.x = c("13870,22963,1158,18362"),basket.y = 
c("24852,432,47626,33647,6015,1158,24852,24852,24852")
), row.names = c(NA, 
6L), class = "data.frame")

2 Answers:

Answer 0: (score: 4)

As @JohnColeman pointed out, there is something wrong with your dput, so I combined it with your original example.

df = structure(list(user_id = c(2957L, 7306L, 10219L), 
basket.x = c("13870,22963,1158,18362", "1,2,3", "5,6,7"),
basket.y = c("24852,432,47626,33647,6015,1158,24852,24852,24852",
"2,3,4", "1,2,7")
), row.names = c(1L,2L,3L), class = "data.frame")
df
  user_id               basket.x
1    2957 13870,22963,1158,18362
2    7306                  1,2,3
3   10219                  5,6,7
                                           basket.y
1 24852,432,47626,33647,6015,1158,24852,24852,24852
2                                             2,3,4
3                                             1,2,7

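With this data, we can use strsplit to get the individual elements of each list. Once we have the elements, we can use intersect to find the elements that are in both basket.x and basket.y. To get how many elements the two baskets have in common, we take the length of the intersection. Of course, we need to apply this to every row of df. Putting it together, we get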

sapply(1:nrow(df), function(i) 
    length(intersect(strsplit(df$basket.x, ",")[[i]],
            strsplit(df$basket.y, ",")[[i]])))
[1] 1 2 1

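Edit: Thanks to @thelatemail for noticing that the way I wrote this code was inefficient (it splits the entire column again on every iteration). Better is: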

sapply(1:nrow(df), function(i) 
    length(intersect(unlist(strsplit(df$basket.x[[i]], ",")),
            unlist(strsplit(df$basket.y[[i]], ",")))))
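
To get the two columns shown in the question's expected output, the same idea can be written back into df. This is only a minimal sketch: it assumes total means the number of items in basket.x and TP means the number of items shared with basket.y, which is inferred from the expected output rather than stated in the question. The small df below is rebuilt from the question's two-row example.

```r
# Example data rebuilt from the question (two users, comma-separated baskets)
df <- data.frame(
  user_id  = c(1L, 2L),
  basket.x = c("1,2,3", "5,6,7"),
  basket.y = c("2,3,4", "1,2,7"),
  stringsAsFactors = FALSE
)

# Split each basket string once into a character vector of items
x_items <- strsplit(df$basket.x, ",")
y_items <- strsplit(df$basket.y, ",")

# total: number of items in basket.x (assumed meaning, see above)
df$total <- lengths(x_items)

# TP: number of distinct items present in both baskets of the same row
df$TP <- sapply(seq_len(nrow(df)), function(i)
  length(intersect(x_items[[i]], y_items[[i]])))

df
```

Note that intersect drops duplicates, so an item repeated within a basket is counted once.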

Answer 1: (score: 0)

@G5W's answer can be reworked so that the loop over each row index is replaced (well, hidden).

Although you have to store the intermediate Map result, this should be noticeably faster if you are working with a larger dataset.
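
A possible sketch of this approach, assuming Map combined with intersect is what is meant; it is an illustration under that assumption and not necessarily the answerer's exact code. The names common and matches are hypothetical, and df is the data from answer 0.

```r
# Data from answer 0
df <- data.frame(
  user_id  = c(2957L, 7306L, 10219L),
  basket.x = c("13870,22963,1158,18362", "1,2,3", "5,6,7"),
  basket.y = c("24852,432,47626,33647,6015,1158,24852,24852,24852",
               "2,3,4", "1,2,7"),
  stringsAsFactors = FALSE
)

# Split each column once; Map then pairs the i-th basket.x with the i-th basket.y
common <- Map(intersect, strsplit(df$basket.x, ","), strsplit(df$basket.y, ","))

# lengths() counts the shared items in each row
matches <- lengths(common)
matches
```

Each column is split only once here, and the row-wise loop happens inside Map instead of an explicit index.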