Question

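I have a dataset in which the same ID may have multiple records for the same variable. Some IDs have NA in some of their records, and some have NA in all of them.

I want to sum the variable by ID. An ID whose records are all NA should get NA, while an ID with only some NA records should get a sum (treating NA as 0 in that case). Is there a way to do this?

When summing the variable I tried na.rm = T, and the all-NA groups became 0, which is not what I want.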

Dataset:

ID V1
5  120
5  300
5  NA
8  NA
8  NA
8  NA

Want this:
ID V1
5  420
8  NA

I did this and all NA became 0:

df <- df %>% group_by(ID) %>% transmute(V1 = sum(V1, na.rm = T))

Answer 1

Most methods would either drop the all-NA groups or set them to 0. Maybe we can use a custom condition:

library(dplyr)
df %>%
  group_by(ID) %>%
  summarise(V1 = if (all(is.na(V1))) NA else sum(V1, na.rm = TRUE))

# A tibble: 2 x 2
#      ID    V1
#   <int> <int>
# 1     5   420
# 2     8    NA

The same condition can also be applied in base R with aggregate.
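The aggregate call itself is not shown here, so the following is only a sketch of how the same condition might look in base R. The data frame is rebuilt from the table in the question (integer columns are an assumption), and sum_or_na is a hypothetical helper name used for illustration.

```r
# Reconstruction of the question's table (integer columns are an assumption)
df <- data.frame(
  ID = c(5L, 5L, 5L, 8L, 8L, 8L),
  V1 = c(120L, 300L, NA, NA, NA, NA)
)

# Hypothetical helper: NA for an all-NA group, otherwise the sum ignoring NA
sum_or_na <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)

# na.action = na.pass keeps the NA rows; the formula method drops them by default
res <- aggregate(V1 ~ ID, data = df, FUN = sum_or_na, na.action = na.pass)
print(res)
```

Without na.action = na.pass, the rows with NA would be removed before FUN is called, and ID 8 would disappear from the result entirely.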

Answer 2

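We can use sum_ from hablar, which automatically returns NA if all the elements are NA. With data.table syntax, it would be

library(data.table)
library(hablar)
setDT(df)[, .(V1 = sum_(V1)), .(ID)]
#    ID  V1
# 1:  5 420
# 2:  8  NA

Or with dplyr

library(dplyr)
df %>%
  group_by(ID) %>%
  summarise(V1 = sum_(V1))
# A tibble: 2 x 2
#      ID    V1
#   <int> <int>
# 1     5   420
# 2     8    NA

Or without using any if/else

# NA^TRUE is NA and NA^FALSE is 1, so the sum is kept unless the whole group is NA
df %>%
  group_by(ID) %>%
  summarise(V1 = sum(V1, na.rm = TRUE) * NA^all(is.na(V1)))
# A tibble: 2 x 2
#      ID    V1
#   <int> <dbl>
# 1     5   420
# 2     8    NA

Or using base R

out <- rowsum(df$V1, df$ID, na.rm = TRUE)
# caution: !out is TRUE whenever a group's sum is 0, so a group whose
# non-NA values add up to 0 is also turned into NA
(NA^!out) * out
#   [,1]
# 5  420
# 8   NA

Or with by

by(df$V1, df$ID, FUN = sum_)

NOTE: All of the code is compact.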
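The data block for this answer did not survive, so here is a reconstruction from the table in the question (integer columns are an assumption). It is followed by a sketch of a variant of the rowsum approach that marks a group as NA only when it has no non-NA records, rather than when its sum happens to be 0. The name n_obs is illustrative, not part of the original answer.

```r
# Reconstruction of the question's table (integer columns are an assumption)
df <- data.frame(
  ID = c(5L, 5L, 5L, 8L, 8L, 8L),
  V1 = c(120L, 300L, NA, NA, NA, NA)
)

# Per-group sums with NA dropped; an all-NA group comes out as 0 here
out <- rowsum(df$V1, df$ID, na.rm = TRUE)

# Per-group count of non-NA records (n_obs is an illustrative name)
n_obs <- rowsum(as.integer(!is.na(df$V1)), df$ID)

# Only groups with no observed values at all are set to NA
out[n_obs == 0] <- NA
print(out)
```

Counting the observed values separately keeps a legitimate sum of 0 (for example from values -1 and 1) distinct from an all-NA group.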
