Remove leap year day from pandas dataframe

Time: 2016-01-23 17:17:20

Tags: python pandas dataframe

I have the following dataframe:

datetime
2012-01-01    125.5010
2012-01-02    125.5010
2012-01-03    125.5010
2012-02-04    125.5010
2012-02-05    125.5010
2012-02-29    125.5010
2012-02-28    125.5010
2016-01-07    125.5010
2016-01-08    125.5010
2016-02-29     81.6237

I want to remove all rows corresponding to Feb 29th, resulting in the following dataframe:

datetime
2012-01-01    125.5010
2012-01-02    125.5010
2012-01-03    125.5010
2012-02-04    125.5010
2012-02-05    125.5010
2012-02-28    125.5010
2016-01-07    125.5010
2016-01-08    125.5010

Right now I am doing this manually, by specifying the row index of each row to drop. How can I make it work for all years without having to specify the row index by hand?

3 Answers:

Answer 0 (score: 14)

If your dataframe already has the datetime column as its index, you can filter on the index itself.

This should remove the rows that fall on Feb 29th, for every year.
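As a minimal sketch of this kind of index filter (not necessarily the exact code this answer used), assuming the index is a DatetimeIndex: the frame name `df` and the column name `value` are assumptions made for illustration, and the sample rows mirror the question.

```python
import pandas as pd

# Sample data mirroring the question; `df` and the column name `value`
# are hypothetical names chosen for this sketch.
dates = pd.to_datetime([
    "2012-01-01", "2012-01-02", "2012-01-03", "2012-02-04", "2012-02-05",
    "2012-02-29", "2012-02-28", "2016-01-07", "2016-01-08", "2016-02-29",
])
df = pd.DataFrame({"value": [125.5010] * 9 + [81.6237]}, index=dates)
df.index.name = "datetime"

# True for rows whose index falls on Feb 29th of any year.
is_feb29 = (df.index.month == 2) & (df.index.day == 29)

# Keep every row that is not Feb 29th.
result = df[~is_feb29]
print(result)
```

Because Feb 29th only exists in leap years, checking month and day is enough; no explicit leap-year test is needed.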

Answer 1 (score: 5)

IIUC, you can build a mask and drop the matching rows with loc:

def is_leap_and_29Feb(s):
    # The outer parentheses let the expression span several lines.
    return ((s.index.year % 4 == 0) &
            ((s.index.year % 100 != 0) | (s.index.year % 400 == 0)) &
            (s.index.month == 2) & (s.index.day == 29))

mask = is_leap_and_29Feb(df)
print(mask)
#[False False False False False  True False False False  True]

# Keep only the rows where the mask is False.
print(df.loc[~mask])
#            datetime
#2012-01-01   125.501
#2012-01-02   125.501
#2012-01-03   125.501
#2012-02-04   125.501
#2012-02-05   125.501
#2012-02-28   125.501
#2016-01-07   125.501
#2016-01-08   125.501

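Answer 2 (score: 5)

You can treat the dates as strings and check whether they end with 02-29 (this assumes the index holds date strings such as '2012-02-29'; a DatetimeIndex has no .str accessor and would need converting to strings first):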

df = df[~df.index.str.endswith('02-29')]

With this approach you can use any string comparison method, such as contains.
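If the index is a DatetimeIndex rather than plain strings, the same idea can still be applied by formatting the dates as text first. The following is only a sketch under that assumption; `df`, `value` and `as_text` are hypothetical names, and the sample rows are a shortened version of the question's data.

```python
import pandas as pd

# Small sample with a DatetimeIndex; names are illustrative only.
dates = pd.to_datetime(["2012-02-28", "2012-02-29", "2016-01-08", "2016-02-29"])
df = pd.DataFrame({"value": [125.5010, 125.5010, 125.5010, 81.6237]}, index=dates)

# Format each timestamp as 'YYYY-MM-DD' text so the .str accessor is available.
as_text = df.index.strftime("%Y-%m-%d")

# Drop the rows whose formatted date ends with '02-29'.
result = df[~as_text.str.endswith("02-29")]
print(result)
```

Comparing `month` and `day` on the index directly avoids the string conversion, so the string route is mainly useful when the dates are already stored as text.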