Good afternoon,
I am reading in a pcap and am basically trying to get a deduplicated list of BSSIDs and ESSIDs. I am still getting duplicates with this code and cannot for the life of me figure out what I am doing wrong:
if not (t[0] in ssid_pairs and ssid_pairs[t[0]] == t[1]):
    ssid_pairs[t[0]] = t[1]
    of.write(t[0] + ',' + t[1] + ((',' + f + '\n') if verbose else '\n'))
ssid_pairs is a dictionary, t[0] is the bssid & t[1] is the essid. An example of the dictionary is:
{'FF:FF:FF:FF:FF:FF':'MyWIFI',...}
I am still seeing multiple instances of the same key->value pair being written to the file. I put some debugging print statements in, and sometimes it recognizes a duplicate and sometimes it does not. The data comes from a pcap parsed with scapy.
Thanks for any help.
*** EDIT: Thanks everyone, I am not really solving my problem the right way with a dictionary. Time to think this through a bit more clearly...
Answer 0 (score: 1)
Dictionaries can't have duplicate keys:
some_data = [('foo', 'bar'),
             ('bang', 'quux'),
             ('foo', 'bar'),
             ('zappo', 'whoo'),
             ]
mydict = {}
for data in some_data:
    mydict[data[0]] = data[1]
import pprint; pprint.pprint(mydict)
The only way you're going to rewrite the same data is if you aren't opening your file in 'w' mode. But
with open('outfile.txt', 'w') as of:
    for key in mydict:
        of.write('{},{}{}'.format(key, mydict[key], (',' + f + '\n') if verbose else '\n'))
will never write the same line twice, because each key appears only once in the dictionary.
Answer 1 (score: 1)
Let's say you get:
t[0] = 'foo'
t[1] = 'bar'
Then we hit your code above:
if not (t[0] in ssid_pairs and ssid_pairs[t[0]] == t[1]):
    ssid_pairs[t[0]] = t[1]
    of.write(t[0] + ',' + t[1] + ((',' + f + '\n') if verbose else '\n'))
The condition passes (because t[0] is not in ssid_pairs), so we set:
ssid_pairs[t[0]] = t[1]
Which gives us:
ssid_pairs = {
    'foo': 'bar',
}
In our next iteration of the loop, we read:
t[0] = 'foo'
t[1] = 'gizmo'
Your condition passes (because ssid_pairs[t[0]] != t[1]), so we set:
ssid_pairs[t[0]] = t[1]
Which gives us:
ssid_pairs = {
    'foo': 'gizmo',
}
Then we read the same data we encountered in our first iteration:
t[0] = 'foo'
t[1] = 'bar'
But because we just set ssid_pairs['foo'] = 'gizmo', your condition will pass again, and you will once again write out the same data.
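The walkthrough above can be reproduced with a small sketch. The function name write_with_dict and the sample list are made up for illustration; the lines are collected in a list instead of being written to a file, and the verbose suffix is left out:

```python
def write_with_dict(pairs):
    """Apply the question's dictionary check and return the lines it would write."""
    ssid_pairs = {}
    written = []
    for bssid, essid in pairs:
        if not (bssid in ssid_pairs and ssid_pairs[bssid] == essid):
            # Overwrites whatever ESSID was stored for this BSSID before,
            # so the earlier pair is forgotten.
            ssid_pairs[bssid] = essid
            written.append(bssid + ',' + essid)
    return written


# Hypothetical input: the same three reads as in the walkthrough.
sample = [('foo', 'bar'), ('foo', 'gizmo'), ('foo', 'bar')]
lines = write_with_dict(sample)
print(lines)  # ['foo,bar', 'foo,gizmo', 'foo,bar']
```

The dictionary only remembers the most recent ESSID per BSSID, so 'foo,bar' comes out twice as soon as a different ESSID shows up in between.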
If you are trying to get a list of unique pairs, then maybe create a set of (bssid, essid) tuples:
seen_pairs = set()
...
if (t[0], t[1]) not in seen_pairs:
    seen_pairs.add((t[0], t[1]))
    of.write(t[0] + ',' + t[1] + ((',' + f + '\n') if verbose else '\n'))
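For anyone who wants to try the set-based approach in isolation, here is a minimal self-contained sketch. The helper write_unique_pairs, its source_name parameter (standing in for the question's f and verbose), and the sample data are assumptions for illustration; an in-memory buffer replaces the real output file:

```python
import io


def write_unique_pairs(pairs, out, source_name=None):
    """Write each distinct (bssid, essid) pair to `out` once, in first-seen order."""
    seen_pairs = set()
    for bssid, essid in pairs:
        pair = (bssid, essid)
        if pair not in seen_pairs:
            seen_pairs.add(pair)
            # Optional third column, playing the role of the verbose output.
            suffix = ',' + source_name if source_name else ''
            out.write(bssid + ',' + essid + suffix + '\n')
    return seen_pairs


# Hypothetical input: the same reads that tripped up the dictionary version.
sample = [('foo', 'bar'), ('foo', 'gizmo'), ('foo', 'bar')]
buf = io.StringIO()
seen = write_unique_pairs(sample, buf)
result = buf.getvalue()
print(result, end='')
```

Because the set stores the whole pair rather than one value per key, a BSSID that advertises several ESSIDs yields one line per distinct combination, and a repeated combination is skipped.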