Question

I have a dataset like the one below.

import pandas as pd

df = pd.DataFrame({"a": [4, 5], "b": [7, 8]}, index=[1, 2])

   a  b
1  4  7
2  5  8

To select the first and last record for each ID, I can do the following, based on @akarun's answer.

   a  b  c
1  4  7  14
2  4  7  15
3  5  8  15
4  5  8  16

But how do I add a condition to this in data.table? For example, I want to select the first record that has the value "In" in the Group field.

Answer 1

Do you need something like this?

// Select the node that will be observed for mutations
const targetNode = document.getElementById('myDiv');

// Options for the observer (which mutations to observe)
const config = { attributes: true };

// Callback function to execute when mutations are observed
const callback = function(mutationsList, observer) {
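    // For IE 11, use a traditional 'for' loop instead of for...of
    for (const mutation of mutationsList) {
        if (mutation.type === 'attributes') {
            // Assumption: myDivAttr is the new value of the attribute that changed
            const myDivAttr = mutation.target.getAttribute(mutation.attributeName);
            if (myDivAttr === "type-1") {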
              typeOneFunction();
             }
             else if(myDivAttr == "type-2"){
               typeTwoFunction();
             }
        }
    }
};

// Create an observer instance linked to the callback function
const observer = new MutationObserver(callback);

// Start observing the target node for configured mutations
observer.observe(targetNode, config);

// Later, you can stop observing
observer.disconnect();

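Or, using data.table:

library(data.table)
# df must be a data.table here, e.g. after setDT(df)
df[, .SD[c(which.max(Group == "In"), .N)], by = ID]
#     ID    Use Group
# 1: 13A Sheet2    In
# 2: 13A Sheet5   Out

In base R, ave can be used for the same purpose.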
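As a worked illustration of the data.table expression above, here is a minimal sketch on made-up data. The rows in `df` are hypothetical; only the column names `ID`, `Use` and `Group` come from the output shown in the answer. Note that `which.max()` returns 1 when a group contains no "In" row at all, so in that case the first row of the group is picked instead.

```r
library(data.table)

# Hypothetical sample data; only the column names come from the answer above
df <- data.table(
  ID    = c("13A", "13A", "13A", "13A", "14B", "14B"),
  Use   = c("Sheet1", "Sheet2", "Sheet3", "Sheet5", "Sheet1", "Sheet4"),
  Group = c("Out", "In", "In", "Out", "In", "Out")
)

# For each ID: the first row where Group == "In", followed by the last row (.N)
result <- df[, .SD[c(which.max(Group == "In"), .N)], by = ID]

# Variant that keeps a row only once when the first "In" row is also the last row
result_unique <- df[, .SD[unique(c(which.max(Group == "In"), .N))], by = ID]

print(result)
```

Wrapping the two indices in `unique()` is one possible way to avoid returning the same row twice for groups with a single record.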

Answer 2

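I think this will work. Note that it creates duplicate records if the last record is also the first record with Group == 'In', or if an ID has only one record: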

library('tidyverse')

first_ins = df %>% 
  filter(Group == 'In') %>% 
  group_by(ID) %>% 
  slice(1) %>% 
  ungroup()

output = df %>% 
  group_by(ID) %>% 
  slice(n()) %>% 
  ungroup() %>% 
  bind_rows(first_ins) %>% 
  arrange(ID, Group)
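If the duplicate rows are unwanted, one possible variation is to pick both row positions inside a single `slice()` call and de-duplicate the positions first. This is only a sketch, not the answer's own method: the sample data is hypothetical, and it assumes that an ID without any "In" row should still return its last row.

```r
library(dplyr)

# Hypothetical sample data: ID "15C" has a single row, which is both
# its first "In" row and its last row
df <- tibble(
  ID    = c("13A", "13A", "13A", "15C"),
  Use   = c("Sheet1", "Sheet2", "Sheet5", "Sheet7"),
  Group = c("Out", "In", "Out", "In")
)

output <- df %>%
  group_by(ID) %>%
  # Position of the first "In" row (none if the group has no "In") plus the
  # last row; unique() drops the repeat when both are the same row
  slice(unique(c(head(which(Group == "In"), 1), n()))) %>%
  ungroup()

print(output)
```

Here `head(which(...), 1)` returns an empty vector when a group has no "In" row, so only the last row is kept for that group.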

Select the first and last records based on a condition
