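Using shift in pandas to fill in the values of a new column

Asked: 2017-02-22 16:53:00

Tags: python pandas

In an IP range table I have the name of each location and that location's starting IP address. The rule is: if the next row falls in the same address range, the location's end IP address is the next row's start value minus 1; otherwise it is the last address of its own range. Here is some sample data: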

Name    StartRange
loc1    172.28.10.15
loc2    172.28.10.128
loc3    172.28.12.0
loc4    172.28.12.58

The expected result is:

Name    StartRange      EndIP
loc1    172.28.10.15    172.28.10.127
loc2    172.28.10.128   172.28.10.255
loc3    172.28.12.0     172.28.12.57
loc4    172.28.12.58    172.28.12.255
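For a single pair of rows, the rule above can be sketched as follows. This is only an illustration: it assumes "same address range" means the same /24 block (the 255.255.255.0 mask used in the code further down), and end_ip is a hypothetical helper name, not part of the original code.

```python
import ipaddress

def end_ip(start, next_start=None):
    """Hypothetical helper: end address for a range starting at `start`.

    Assumes a /24 block, i.e. a 255.255.255.0 mask.
    """
    start_int = int(ipaddress.IPv4Address(start))
    # Last address of the /24 block that contains `start`.
    last_in_block = (start_int & 0xFFFFFF00) + 255
    if next_start is not None:
        next_int = int(ipaddress.IPv4Address(next_start))
        # The next range begins inside the same block: stop just before it.
        if next_int - 1 < last_in_block:
            return str(ipaddress.IPv4Address(next_int - 1))
    return str(ipaddress.IPv4Address(last_in_block))

print(end_ip("172.28.10.15", "172.28.10.128"))
print(end_ip("172.28.10.128", "172.28.12.0"))
print(end_ip("172.28.12.58"))
```

The last call has no following row, so it falls back to the end of its own block.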

Here is the code I tried:

from socket import inet_aton
from struct import unpack

import pandas as pd

mask = unpack(">L", inet_aton('255.255.255.0'))[0]

def getEndIP(startIP, endIP):
    hi = (startIP['StartIP'] & mask) + 255
    return hi if hi < endIP['StartIP'] else endIP['StartIP'] - 1

xls = pd.read_excel("E:\\TEMP\\AllScope.xlsx")
xls['StartIP'] = xls['StartRange'].map(lambda a: unpack(">L", inet_aton(a))[0])
xls = xls.sort_values('StartIP')
xls['EndIP'] = getEndIP(xls['StartIP'], xls['StartIP'].shift(-1))

print xls[['Name', 'StartRange', 'StartIP', 'EndIP']]

But I get a KeyError:

KeyError: 'StartIP'

What am I doing wrong? (I am not very familiar with pandas.)

Update: here is the traceback:

runfile('E:/Documents/Projects/Python/Egyéb progik/Network/network.py', wdir='E:/Documents/Projects/Python/Egyéb progik/Network')
Traceback (most recent call last):

  File "<ipython-input-67-6caaa536457c>", line 1, in <module>
    runfile('E:/Documents/Projects/Python/Egyéb progik/Network/network.py', wdir='E:/Documents/Projects/Python/Egyéb progik/Network')

  File "C:\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "C:\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "E:/Documents/Projects/Python/Egyéb progik/Network/network.py", line 15, in <module>

  File "E:/Documents/Projects/Python/Egyéb progik/Network/network.py", line 9, in getEndIP

  File "C:\Anaconda2\lib\site-packages\pandas\core\series.py", line 603, in __getitem__
    result = self.index.get_value(self, key)

  File "C:\Anaconda2\lib\site-packages\pandas\indexes\base.py", line 2169, in get_value
    tz=getattr(series.dtype, 'tz', None))

  File "pandas\index.pyx", line 98, in pandas.index.IndexEngine.get_value (pandas\index.c:3557)

  File "pandas\index.pyx", line 106, in pandas.index.IndexEngine.get_value (pandas\index.c:3240)

  File "pandas\index.pyx", line 156, in pandas.index.IndexEngine.get_loc (pandas\index.c:4363)

KeyError: 'StartIP'
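The traceback ends inside getEndIP at Series.__getitem__, which is consistent with the function receiving two Series (xls['StartIP'] and its shifted copy) and then looking up the label 'StartIP' in each of them. Below is a minimal sketch of one possible fix that works on the Series directly. It assumes the same /24 mask as the question, uses an in-memory DataFrame in place of the Excel file, and the names MASK, hi, and nxt are illustrative only.

```python
import ipaddress

import numpy as np
import pandas as pd

# Stand-in for the spreadsheet contents (assumption: same columns as the question).
df = pd.DataFrame({
    "Name": ["loc1", "loc2", "loc3", "loc4"],
    "StartRange": ["172.28.10.15", "172.28.10.128", "172.28.12.0", "172.28.12.58"],
})

MASK = 0xFFFFFF00  # 255.255.255.0

# Dotted string -> integer, then sort so that shift(-1) really is "the next range".
df["StartIP"] = df["StartRange"].map(lambda a: int(ipaddress.IPv4Address(a)))
df = df.sort_values("StartIP")

hi = (df["StartIP"] & MASK) + 255   # last address of each row's /24 block
nxt = df["StartIP"].shift(-1)       # next row's start; NaN on the last row

# Use next start - 1 when it lies inside the block, otherwise the block end.
end = np.where(nxt.notna() & (nxt - 1 < hi), nxt - 1, hi)
df["EndIP"] = [str(ipaddress.IPv4Address(int(v))) for v in end]

print(df[["Name", "StartRange", "EndIP"]])
```

Because shift(-1) introduces NaN, nxt is a float column; the addresses are small enough to be represented exactly, and the NaN row is never selected by np.where.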

1 answer:

Answer 0: (score: 0)

Here is a pandas solution. Assume dfl is:

         StartRange
Name               
loc1   172.28.10.15
loc2  172.28.10.128
loc3    172.28.12.0
loc4   172.28.12.58

First, convert each address string into a list of ints so that we can do arithmetic on it:

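# Turn each dotted string into a list of four ints.
dfl['StartRange'] = dfl.StartRange.apply(lambda s: [int(x) for x in s.split('.')])
# Provisional end of each row: the start of the next row.
dfl['EndIP'] = dfl.StartRange.shift(-1)
# The last row has no successor; use a sentinel above any block end.
dfl.at[dfl.index[-1], 'EndIP'] = [255, 255, 255, 255]

def adjust(row):
    start, end = row
    # Lists compare element by element, so min() picks the earlier address.
    return min(start[:3] + [255], end[:3] + [end[3] - 1])

dfl['EndIP'] = dfl.apply(adjust, axis=1)
# Join the int lists back into dotted strings.
dfend = dfl.applymap(lambda l: '.'.join([str(x) for x in l]))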

Then dfend is:

         StartRange          EndIP
Name                              
loc1   172.28.10.15  172.28.10.127
loc2  172.28.10.128  172.28.10.255
loc3    172.28.12.0   172.28.12.57
loc4   172.28.12.58  172.28.12.255
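The adjust step in this answer relies on Python comparing lists element by element, so min() returns whichever candidate address comes first. A small pure-Python sketch of that comparison, using a hypothetical adjust_pair function that takes the two lists directly instead of a DataFrame row:

```python
def adjust_pair(start, nxt):
    """Hypothetical stand-alone version of the answer's adjust()."""
    block_end = start[:3] + [255]          # last address of start's /24 block
    before_next = nxt[:3] + [nxt[3] - 1]   # address just before the next start
    # Lexicographic list comparison: the earlier address wins.
    return min(block_end, before_next)

# Next range starts in the same /24 block: stop one address before it.
same_block = adjust_pair([172, 28, 10, 15], [172, 28, 10, 128])
# Next range starts in a later block: keep the end of the current block.
other_block = adjust_pair([172, 28, 10, 128], [172, 28, 12, 0])

print(same_block)   # [172, 28, 10, 127]
print(other_block)  # [172, 28, 10, 255]
```

In the second call the candidate [172, 28, 12, -1] is not a valid address, but it compares greater than the block end on the third element and is therefore never returned.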