I have a data frame, df, and I want to keep only the rows where at least one column is greater than 0.50.
The data looks like this:
Name  Clust1     Clust2      Clust3
AA    0.0662421  0.01742827  0.02286026
BB    0.7694628  0.03241972  0.02935754
CC    0.1099033  0.52170750  0.28385905
DD    0.2769453  0.30376152  0.24822205
I am trying the following command:
new.df <- df %>% mutate(confident = ifelse(rowSums(.[,c(1:4)] >= 0.5)>0, 'yes', 'no'))
I get a warning and no output. Is there a way to fix my code so that it returns only the rows that meet the condition? Thanks.
Answer 0 (score: 1)
We can use rowSums directly:
df[rowSums(df[2:4] >= 0.5) > 0, ]
# Name Clust1 Clust2 Clust3
#2 BB 0.76946 0.03242 0.029358
#3 CC 0.10990 0.52171 0.283859
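To make the one-liner above easier to follow, here is a minimal step-by-step sketch in base R. The data frame is rebuilt from the values in the question; the intermediate names hits and n_hits are illustrative only and do not appear in the original answer.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Name   = c("AA", "BB", "CC", "DD"),
  Clust1 = c(0.0662421, 0.7694628, 0.1099033, 0.2769453),
  Clust2 = c(0.01742827, 0.03241972, 0.52170750, 0.30376152),
  Clust3 = c(0.02286026, 0.02935754, 0.28385905, 0.24822205)
)

# Step 1: comparing the numeric columns yields a logical matrix,
# one TRUE/FALSE per cell.
hits <- df[2:4] >= 0.5

# Step 2: TRUE counts as 1, so rowSums() gives the number of
# columns in each row that pass the threshold.
n_hits <- rowSums(hits)

# Step 3: keep the rows where at least one column passed.
result <- df[n_hits > 0, ]
print(result)
```

Only the BB and CC rows remain, because those are the only rows with a value of at least 0.5 in one of the Clust columns.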
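Or a dplyr version using filter_at with any_vars:
library(dplyr)
df %>%
  filter_at(vars(starts_with("Clust")), any_vars(. >= 0.5))
As for fixing your code, as @thelatemail mentioned, you are including column 1 of df, the Name column, in rowSums, so you want to subset to columns 2:4 instead. Also, we can use filter directly instead of creating a new variable with mutate, so the following should work:
df %>% filter(rowSums(.[,c(2:4)] >= 0.5) > 0)
A row-wise version, for example with apply, is also possible, but it would be slow for larger datasets.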
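For comparison, a row-wise variant can be sketched with base R's apply. This is an assumption about what such a version might look like, not code taken from the answer. The sketch also checks that it selects the same rows as the rowSums test. The final part is likewise an addition: filter_at is superseded in current dplyr releases, and if_any (available from dplyr 1.0.4) expresses the same condition. That part runs only if a suitable dplyr is installed.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Name   = c("AA", "BB", "CC", "DD"),
  Clust1 = c(0.0662421, 0.7694628, 0.1099033, 0.2769453),
  Clust2 = c(0.01742827, 0.03241972, 0.52170750, 0.30376152),
  Clust3 = c(0.02286026, 0.02935754, 0.28385905, 0.24822205)
)

# Row-wise version: apply() visits each row of the logical matrix
# and asks whether any value in that row is TRUE.
keep_apply <- apply(df[-1] >= 0.5, 1, any)
by_apply <- df[keep_apply, ]
print(by_apply)

# Vectorised version: the same test expressed with rowSums().
keep_rowsums <- rowSums(df[-1] >= 0.5) > 0
by_rowsums <- df[keep_rowsums, ]

# Both approaches should flag exactly the same rows.
print(identical(unname(keep_apply), unname(keep_rowsums)))

# Optional: the same filter with dplyr's if_any(), run only when
# dplyr >= 1.0.4 is available.
if (requireNamespace("dplyr", quietly = TRUE) &&
    utils::packageVersion("dplyr") >= "1.0.4") {
  by_if_any <- dplyr::filter(
    df,
    dplyr::if_any(dplyr::starts_with("Clust"), ~ .x >= 0.5)
  )
  print(by_if_any)
}
```

The apply call runs an R-level function once per row, whereas rowSums works on the whole matrix in compiled code, which is why the row-wise form tends to be slower as the number of rows grows.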