Is it possible to use a larger list in Python?

Time: 2019-08-04 09:51:12

Tags: python pandas

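For school, I have to create a project about Wi-Fi signals, and I am trying to put the data into a DataFrame. There are 208,000 rows of data. When I run the code below, it never finishes; it looks as if it is stuck in an infinite loop. With only 1,000 rows, however, the program runs fine. So I suspect that my list is too small, if that is even possible. Is there a larger kind of list in Python, or am I coding this the wrong way? Thanks in advance.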

Edit 1: (data is the original DataFrame, and WifiInfo is one of its columns.) I have this format:

df = pd.DataFrame(columns=['Sender','Time','Date','Place','X','Y','Bezetting','SSID','BSSID','Signal'])

I am trying to fill SSID, BSSID and Signal from the WifiInfo column, and to do that I have to split the data.

This is what one WifiInfo value looks like:

ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88-1d-fc-2c-c0-00:-72,ODISEE@88-1d-fc-41-d2-d0:-82,CiscoC5976@58-6d-8f-19-14-38:-78,CiscoC5959@58-6d-8f-19-13-f4:-93,SNB@c8-d7-19-6f-be-b7:-99,ODISEE@88-1d-fc-2c-c5-70:-94,HackingDemo@58-6d-8f-19-11-48:-156,ODISEE@88-1d-fc-30-d4-40:-85,ODISEE@88-1d-fc-41-ac-50:-100

My current approach is as follows:

for index, row in data.iterrows():
    bezettingList = list()
    ssidList = list()
    bssidList = list()
    signalList = list()

    #WifiInfo splitting  
    wifis = row.WifiInfo.split(',')
    for wifi in wifis:
        #split wifi and add to List
        ssid, bssid = wifi.split('@')
        bssid, signal = bssid.split(':')
        ssidList.append(ssid)
        bssidList.append(bssid)
        signalList.append(int(signal))

    #add bezettingen to List 
    bezettingen = row.Bezetting.split(',')
    for bezetting in bezettingen:
        bezettingList.append(bezetting) 

    #add list to dataframe
    df.loc[index,'SSID'] = ssidList
    df.loc[index,'BSSID'] = bssidList
    df.loc[index,'Signal'] = signalList
    df.loc[index,'Bezetting'] = bezettingList

df.head()

1 Answer:

Answer 0: (score: 0)

IIUC, you first need to explode each row on the commas, so that this:

    SSID    BSSID   Signal  WifiInfo
0   NaN     NaN     NaN     ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88- ...

becomes this:

    SSID    BSSID   Signal  WifiInfo
0   NaN     NaN     NaN     ODISEE@88-1d-fc-41-dc-50:-83
1   NaN     NaN     NaN     ODISEE@88-1d-fc-2c-c0-00:-72
2   NaN     NaN     NaN     ODISEE@88-1d-fc-41-d2-d0:-82
3   NaN     NaN     NaN     CiscoC5976@58-6d-8f-19-14-38:-78
4   NaN     NaN     NaN     CiscoC5959@58-6d-8f-19-13-f4:-93
5   NaN     NaN     NaN     SNB@c8-d7-19-6f-be-b7:-99
6   NaN     NaN     NaN     ODISEE@88-1d-fc-2c-c5-70:-94
7   NaN     NaN     NaN     HackingDemo@58-6d-8f-19-11-48:-156
8   NaN     NaN     NaN     ODISEE@88-1d-fc-30-d4-40:-85
9   NaN     NaN     NaN     ODISEE@88-1d-fc-41-ac-50:-100
# use `.explode`
data = data.assign(WifiInfo=data.WifiInfo.str.split(',')).explode('WifiInfo')
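
The explode step above can be tried on its own with a small sample. This is a minimal sketch, not the asker's real data: the two-row frame and its `Place` values are made up for illustration, and `DataFrame.explode` needs pandas 0.25 or later. Note that `explode` repeats the original index label for every new row, so the sketch resets the index to get a running row number.

```python
import pandas as pd

# Hypothetical two-row sample in the question's WifiInfo format.
data = pd.DataFrame({
    "Place": ["A", "B"],
    "WifiInfo": [
        "ODISEE@88-1d-fc-41-dc-50:-83,SNB@c8-d7-19-6f-be-b7:-99",
        "CiscoC5976@58-6d-8f-19-14-38:-78",
    ],
})

# Split each string into a list, then give every list element its own row.
exploded = (
    data.assign(WifiInfo=data["WifiInfo"].str.split(","))
        .explode("WifiInfo")
        .reset_index(drop=True)  # explode keeps the original (repeated) index labels
)

print(exploded)
```

The other columns (`Place` here) are copied onto each new row, so no per-row Python loop or repeated `df.loc[...]` assignment is needed.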

Now you can use .str.extract:

data['SSID'] = data['WifiInfo'].str.extract(r'(.*)@')
data['BSSID'] = data['WifiInfo'].str.extract(r'@(.*):')
data['Signal'] = data['WifiInfo'].str.extract(r':(.*)')
    SSID        BSSID               Signal  WifiInfo
0   ODISEE      88-1d-fc-41-dc-50   -83     ODISEE@88-1d-fc-41-dc-50:-83
1   ODISEE      88-1d-fc-2c-c0-00   -72     ODISEE@88-1d-fc-2c-c0-00:-72
2   ODISEE      88-1d-fc-41-d2-d0   -82     ODISEE@88-1d-fc-41-d2-d0:-82
3   CiscoC5976  58-6d-8f-19-14-38   -78     CiscoC5976@58-6d-8f-19-14-38:-78
4   CiscoC5959  58-6d-8f-19-13-f4   -93     CiscoC5959@58-6d-8f-19-13-f4:-93
5   SNB         c8-d7-19-6f-be-b7   -99     SNB@c8-d7-19-6f-be-b7:-99
6   ODISEE      88-1d-fc-2c-c5-70   -94     ODISEE@88-1d-fc-2c-c5-70:-94
7   HackingDemo 58-6d-8f-19-11-48   -156    HackingDemo@58-6d-8f-19-11-48:-156
8   ODISEE      88-1d-fc-30-d4-40   -85     ODISEE@88-1d-fc-30-d4-40:-85
9   ODISEE      88-1d-fc-41-ac-50   -100    ODISEE@88-1d-fc-41-ac-50:-100
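
As a variation on the three separate extracts, the fields can also be pulled out in one pass with named groups. This is only a sketch under assumptions: the pattern assumes every entry has the shape `SSID@BSSID:signal` with an integer signal, and the names `exploded`, `pattern`, `parts` and `result` are illustrative. It also converts Signal to a number, since `.str.extract` returns strings.

```python
import pandas as pd

# Hypothetical sample: one WifiInfo entry per row, as after the explode step.
exploded = pd.DataFrame({"WifiInfo": [
    "ODISEE@88-1d-fc-41-dc-50:-83",
    "HackingDemo@58-6d-8f-19-11-48:-156",
]})

# Named groups become the column names of the extracted frame.
pattern = r"^(?P<SSID>.*)@(?P<BSSID>[^@:]+):(?P<Signal>-?\d+)$"
parts = exploded["WifiInfo"].str.extract(pattern)

# extract() yields strings; make Signal numeric so it can be compared or averaged.
parts["Signal"] = pd.to_numeric(parts["Signal"])

# Attach the three new columns to the original rows (indexes are aligned).
result = exploded.join(parts)
print(result)
```

Entries that do not match the assumed shape come back as NaN in all three columns, which makes malformed rows easy to spot.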

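If you want to keep the data grouped after exploding the column, I would first, before exploding, assign an ID to each group of entries: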

data['Group'] = pd.factorize(data['WifiInfo'])[0]+1
    SSID    BSSID   Signal  WifiInfo                                     Group 
0   NaN     NaN     NaN     ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88- ...  1
1   NaN     NaN     NaN     ASD@22-1d-fc-41-dc-50:-83,QWERTY@88-    ...  2
# after you explode the column
SSID        BSSID           Signal  WifiInfo                        Group 
ODISEE      88-1d-fc-41-dc-50   -83 ODISEE@88-1d-fc-41-dc-50:-83    1
ODISEE      88-1d-fc-2c-c0-00   -72 ODISEE@88-1d-fc-2c-c0-00:-72    1
...
...
ASD         22-1d-fc-41-dc-50   -83 ASD@22-1d-fc-41-dc-50:-83       2
QWERTY      88-1d-fc-2c-c0-00   -72 QWERTY@88-1d-fc-2c-c0-00:-72    2
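
The grouping idea can be sketched end to end like this. It is a minimal example on a made-up two-row frame: the second row's full BSSID values are filled in only for illustration, and the name `exploded` is an assumption. The important detail is the order: the ID is assigned while each row still holds the full comma-separated string, and only then is the column exploded.

```python
import pandas as pd

# Hypothetical sample: two original rows, each with two Wi-Fi entries.
data = pd.DataFrame({"WifiInfo": [
    "ODISEE@88-1d-fc-41-dc-50:-83,ODISEE@88-1d-fc-2c-c0-00:-72",
    "ASD@22-1d-fc-41-dc-50:-83,QWERTY@88-1d-fc-2c-c0-00:-72",
]})

# Assign the group ID BEFORE exploding, while one row is still one group.
data["Group"] = pd.factorize(data["WifiInfo"])[0] + 1

# Explode afterwards; the Group value is copied onto every resulting row.
exploded = (
    data.assign(WifiInfo=data["WifiInfo"].str.split(","))
        .explode("WifiInfo")
        .reset_index(drop=True)
)

print(exploded)
```

One caveat: `pd.factorize` gives identical strings the same ID, so two original rows with exactly the same WifiInfo value would share a group. If every original row must stay distinct, a running row number (for example `range(1, len(data) + 1)`) could be used for Group instead.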