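Is there a way to treat NA differently within the same variable?

Time: 2019-08-08 01:31:48

Tags: r

I have a dataset in which the same ID may have multiple records for the same variable. Some IDs have NA in only some of those records, and some have NA in all of them.

I want to sum the variable by ID so that an ID whose values are all NA gets NA, while an ID with only some NA values gets a sum (treating NA as 0 in that case). Is there a way to do this?

When summing the variable I tried na.rm = T, and the all-NA groups became 0, which is not what I want.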

Dataset:

ID V1
5  120
5  300
5  NA
8  NA
8  NA
8  NA

Want this:
ID V1
5  420
8  NA

I did this and all NA became 0:

df <- df %>% group_by(ID) %>% transmute(V1 = sum(V1, na.rm = T))

2 answers:

Answer 0 (score: 2)

Most approaches would either drop the all-NA group or turn it into 0. Perhaps we can use a custom condition with dplyr

library(dplyr)
df %>% group_by(ID) %>% summarise(V1 = if (all(is.na(V1))) NA else sum(V1, na.rm = TRUE))
# A tibble: 2 x 2
#     ID    V1
#  <int> <int>
#1     5   420
#2     8    NA

The same condition can also be applied in base R with aggregate.
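A minimal sketch of how the same condition might be written with aggregate. This is an assumed formulation, not necessarily the answerer's own code; the helper name sum_unless_all_na is hypothetical, and df is rebuilt from the table in the question.

```r
# Data rebuilt from the table in the question (integer columns are an assumption).
df <- data.frame(
  ID = c(5L, 5L, 5L, 8L, 8L, 8L),
  V1 = c(120L, 300L, NA, NA, NA, NA)
)

# Hypothetical helper: NA when every value in the group is missing,
# otherwise the sum of the non-missing values.
sum_unless_all_na <- function(x) {
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
}

# na.action = na.pass keeps the NA rows; the formula interface would
# otherwise drop them before the helper ever sees them.
res <- aggregate(V1 ~ ID, data = df, FUN = sum_unless_all_na, na.action = na.pass)
print(res)
```

Without na.action = na.pass, the formula interface removes rows with missing V1 first, so ID 8 would disappear from the result instead of showing NA.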

Answer 1 (score: 0)

We can use sum_ from hablar, which automatically returns NA if all elements are NA. With data.table syntax, it would be

library(data.table)
library(hablar)
setDT(df)[, .(V1 = sum_(V1)), .(ID)]
#   ID  V1
#1:  5 420
#2:  8  NA
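For readers who do not have hablar installed, the rule described above (NA only when every element is NA, otherwise the sum of the rest) can be approximated in base R. The helper name sum_or_na is hypothetical, a stand-in written for illustration and not part of hablar.

```r
# Hypothetical stand-in for the behavior described for hablar::sum_.
sum_or_na <- function(x) {
  if (all(is.na(x))) return(NA)
  sum(x, na.rm = TRUE)
}

# Values taken from the table in the question.
ids    <- c(5, 5, 5, 8, 8, 8)
values <- c(120, 300, NA, NA, NA, NA)

# One total per ID; the result is a named array keyed by ID.
totals <- tapply(values, ids, sum_or_na)
print(totals)
```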

Or with dplyr

library(dplyr)
library(hablar)
df %>% group_by(ID) %>% summarise(V1 = sum_(V1))
# A tibble: 2 x 2
#     ID    V1
#  <int> <int>
#1     5   420
#2     8    NA

Or using sum without any if/else

df %>% group_by(ID) %>% summarise(V1 = sum(V1, na.rm = TRUE) * NA^all(is.na(V1)))
# A tibble: 2 x 2
#     ID    V1
#  <int> <dbl>
#1     5   420
#2     8    NA
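The multiplication trick relies on how R evaluates powers of NA: NA^0 is 1, while NA^1 is NA. Because all(is.na(x)) is coerced to 1 or 0, the factor is NA for an all-NA group and 1 otherwise. A small base R illustration, with vector names made up for this example:

```r
all_na  <- c(NA, NA, NA)
some_na <- c(120, 300, NA)

# TRUE is coerced to 1 and FALSE to 0 inside the exponent.
mask_all  <- NA^all(is.na(all_na))   # NA^TRUE
mask_some <- NA^all(is.na(some_na))  # NA^FALSE

# Multiplying by the mask either keeps the sum or turns it into NA.
total_all  <- sum(all_na, na.rm = TRUE) * mask_all
total_some <- sum(some_na, na.rm = TRUE) * mask_some

print(c(total_all, total_some))
```

The mask is a double, which is why the summarised column above is shown as <dbl> rather than <int>.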

Or with base R

out <- rowsum(df$V1, df$ID, na.rm = TRUE)
(NA^!out) * out
#  [,1]
#5  420
#8   NA
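One thing to keep in mind with the rowsum variant: it marks a group as NA whenever its sum is 0, not only when all of its values are missing. That is harmless for the data in the question, but a group whose real values cancel out would also become NA. A small sketch with made-up data (df0 is hypothetical):

```r
# Group 1 has real values that sum to zero; group 2 is entirely missing.
df0 <- data.frame(ID = c(1, 1, 2, 2), V1 = c(-5, 5, NA, NA))

out <- rowsum(df0$V1, df0$ID, na.rm = TRUE)

# !out is TRUE wherever the sum equals 0, so both groups are masked.
masked <- (NA^!out) * out
print(masked)
```

If genuine zero sums are possible, a check based on all(is.na(...)) is the safer choice.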

Or with by

by(df$V1, df$ID, FUN = sum_)

NOTE: All of the code above is compact.

Data
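A data frame matching the table in the question can be rebuilt as follows. This is a reconstruction from the question, with integer columns assumed from the <int> types in the printed output above.

```r
# Reconstructed from the table in the question; column types are an assumption.
df <- data.frame(
  ID = c(5L, 5L, 5L, 8L, 8L, 8L),
  V1 = c(120L, 300L, NA, NA, NA, NA)
)

str(df)
```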