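I am trying to write a for loop that iterates over my index and keeps only the index values that are duplicated.
My current dataframe is two dataframes concatenated together: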
0.0102700 0.0308099 0.0616199 0.123240 \
5000000000010 4.330760e-05 4.442720e-05 9.232970e-05 1.994190e-04
5000000000238 6.006910e-04 6.041130e-04 1.220220e-03 2.500240e-03
...
0.00902317 0.0270695 0.0451159 0.0631622 \
5000000000010 6.962980e-05 7.063750e-05 7.165970e-05 7.269680e-05
5000000000234 4.638970e-04 4.716010e-04 4.794320e-04 4.873930e-04
New = pd.concat([SFR_low, SFR_high])
New = New.sort_index()
print(New)
0.00902317 0.0102700 0.0270695 0.0308099 \
5000000000010 6.962980e-05 NaN 7.063750e-05 NaN
5000000000010 NaN 4.330760e-05 NaN 4.442720e-05
5000000000081 6.299210e-05 NaN 6.299320e-05 NaN
5000000000082 NaN 8.176550e-04 NaN 8.172630e-04
I need a new dataframe that keeps only the rows whose index is duplicated.
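Answer 0 (score: 0)
Use Index.duplicated with the parameter keep=False, which marks every occurrence of a duplicated value rather than all but the first: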
print (df.index[df.index.duplicated(keep=False)])
Int64Index([1000, 1000, 1002, 1002], dtype='int64')
for i in df.index[df.index.duplicated(keep=False)]:
    print (i)
1000
1000
1002
1002
If you need to filter the rows that have a duplicated index, use boolean indexing:
print (New.index.duplicated(keep=False))
[ True True False False]
print (New[New.index.duplicated(keep=False)])
0.00902317 0.0102700 0.0270695 0.0308099 0.0451159 \
5000000000010 NaN 0.000043 NaN 0.000044 NaN
5000000000010 0.00007 NaN 0.000071 NaN 0.000072
0.0616199 0.0631622 0.123240
5000000000010 0.000092 NaN 0.000199
5000000000010 NaN 0.000073 NaN
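As a self-contained illustration of the same approach, here is a minimal sketch. The frames `low` and `high` are hypothetical stand-ins for `SFR_low` and `SFR_high`, and the column names and values are made up for the example; only `pd.concat`, `sort_index`, and `Index.duplicated(keep=False)` come from the question and answer above.

```python
import pandas as pd

# Hypothetical stand-ins for SFR_low and SFR_high (values are invented).
low = pd.DataFrame({"a": [1.0, 2.0]}, index=[10, 238])
high = pd.DataFrame({"b": [3.0, 4.0]}, index=[10, 234])

# Same steps as in the question: concatenate, then sort by index.
combined = pd.concat([low, high]).sort_index()

# keep=False flags every row whose index value occurs more than once.
mask = combined.index.duplicated(keep=False)

# Boolean indexing keeps only the flagged rows (here, the two rows with index 10).
dupes = combined[mask]
print(dupes)
```

No explicit for loop is needed: the boolean mask does the filtering in one step.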
Answer 1 (score: 0)
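li = [1000, 1000, 1001, 1002, 1002]
for i in li:
    count = 0
    for j in li:
        # Compare by value with ==; `is` tests object identity and is unreliable for ints.
        if j == i:
            count += 1
    # Print the value only if it occurs more than once in the list.
    if count > 1:
        print(i)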
Does this solve your requirement?
Answer 2 (score: 0)
Try some code before asking; there are many duplicates of this question.
a = [1000,1000,1001,1002,1002]
c = [x for x in a if a.count(x) > 1]
print(c)
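Calling `a.count(x)` inside the comprehension rescans the list for every element, so the work grows quadratically with the list length. A possible alternative, sketched here with a hypothetical helper name `keep_duplicates`, counts every value once with `collections.Counter` and then filters:

```python
from collections import Counter

def keep_duplicates(values):
    """Return the items of `values` that occur more than once, in original order."""
    counts = Counter(values)  # one pass to count each value
    return [v for v in values if counts[v] > 1]

a = [1000, 1000, 1001, 1002, 1002]
print(keep_duplicates(a))  # [1000, 1000, 1002, 1002]
```

For short lists the difference is negligible; the Counter version mainly helps when the list is long.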