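So right now I have a dictionary dataDict whose keys have the form 'node-key', where node is a specific csv file header and key is a field that may or may not exist in each file. I have normalDict['Time'], which takes all of the times from each dataDict[node-'Time'] and puts them in order. I want to make a normalized dictionary entry for every 'node-key' entry in dataDict. So I iterate over the values of normalDict['Time'], and if the value i is in dataDict[nodeTime], I want to append the value of dataDict[nodeKey] that sits at the same position as in dataDict[nodeTime] to normalDict[nodeKey]; if the value from normalDict['Time'] is not in dataDict[nodeTime], I want to append 'nan' to normalDict[nodeKey].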
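(My script currently graphs dataDict[nodeKey] vs dataDict[nodeTime]. I want to normalize the time so that a single axis contains the times of all nodes, and add 'nan' to dataDict[nodeKey] at the positions where a node has no value.)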
Edit, to clarify: say I have
dataDict['1-Time'] = ['12:00','1:00','2:00']
dataDict['2-Time'] = ['12:30','1:30','2:30','3:30']
I also have values for keys with the same number of items as the two Time lists above:
dataDict['1-lattitude']=['0','1','2']
dataDict['2-lattitude']= ['1','2','3','3']
and
normalDict['Time'] = ['12:00','12:30','1:00','1:30','2:00','2:30','3:30']
I want
normalDict['1-lattitude'] = ['0', 'nan', '1', 'nan', '2', 'nan', 'nan']
and
normalDict['2-lattitude'] = ['nan', '1', 'nan', '2', 'nan', '3', '3']
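so that every key in normalDict has the same length as normalDict['Time'].
Here is my method. I have commented the specific line where my idea of how to access the individual items of a particular key made sense in my head, but I know it is not correct because of the unpacking error. Any help is appreciated, since I will probably run into this problem again in later scripts.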
def normalizeDataByField(self, fileName, keyNames):
    #this normalizes time how I want
    setTimes = set()
    listTimes = []
    tupleTime = []
    for i in range(len(fileName)):
        node = self.deriveNodeName(fileName[i])
        nodeTime = '%s-Time' % (node)
        for key in dataDict:
            if 'Time' in key and node in key:
                for i in dataDict[key]:
                    setTimes.add(i)
    listTimes += setTimes
    listTimes.sort()
    normalDict['Time'] = listTimes
    for a in range(len(fileName)):
        node = self.deriveNodeName(fileName[a])
        nodeTime = '%s-Time' % (node)
        for key in keyNames:
            nodeKey = '%s-%s' % (node, key)
            for i,j in normalDict['Time'],dataDict[nodeKey]: #this is my flaw in logic as I get ValueError: too many values to unpack
                print("looking for %s in dataDict[nodeTime]" % (i))
                if i in dataDict[nodeTime]:
                    print("%s found in dataDict[%s]" % (i, nodeTime))
                    normalDict[nodeKey].append(j)
                else:
                    print("%s not found in dataDict[%s]. Appending 'nan'" % (i, nodeTime))
                    normalDict[nodeKey].append('nan')
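For reference, the ValueError on the marked line can be reproduced in isolation. `for i, j in a, b` loops over the two-element tuple `(a, b)`, so the first pass tries to unpack the whole of `a` into `i, j`. A minimal sketch using the sample data from above (the names `times` and `values` are only for illustration):

```python
times = ['12:00', '12:30', '1:00', '1:30', '2:00', '2:30', '3:30']
values = ['0', '1', '2']

# "for i, j in times, values" iterates over the tuple (times, values).
# Its first element is the 7-item list, which cannot be unpacked into two names.
try:
    for i, j in times, values:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# zip() pairs elements by position instead, stopping at the shorter list.
pairs = list(zip(times, values))
print(pairs)
```

Note that zip() on its own would not give the desired result here, because the positions in the two lists do not correspond; it only shows why the original loop header fails.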
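Answer 0 (score: 0)
So I believe I was able to answer my own question, but it is a bit messy. I decided to store each dataDict[nodeKey] in a list dataList and append values to normalDict[nodeKey] according to a list index j. I also found a bug in the for loop that creates normalDict['Time'].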
def normalizeDataByField(self, fileName, keyNames):
    # Collect every timestamp from every node's Time list.
    setTimes = set()
    listTimes = []
    tupleTime = []
    for i in range(len(fileName)):
        node = self.deriveNodeName(fileName[i])
        nodeTime = '%s-Time' % (node)
        for key in dataDict:
            if nodeTime in key:
                for t in dataDict[key]:
                    setTimes.add(t)
    listTimes += setTimes
    listTimes.sort()
    normalDict['Time'] = listTimes
    for a in range(len(fileName)):
        node = self.deriveNodeName(fileName[a])
        nodeTime = '%s-Time' % (node)
        for key in keyNames:
            if 'Time' not in key:
                nodeKey = '%s-%s' % (node, key)
                dataList = dataDict[nodeKey]
                j = 0  # index of the next unused value in dataList
                for i in normalDict['Time']:
                    print("looking for %s in dataDict[nodeTime]" % (i))
                    if i in dataDict[nodeTime]:
                        print("%s found in dataDict[%s]. Appending value." % (i, nodeTime))
                        try:
                            normalDict[nodeKey].append(dataList[j])
                            j += 1
                            print("j = %d" % j)
                        except IndexError:
                            print("Index out of range. j = %d" % j)
                    else:
                        print("%s not found in dataDict[%s]. Appending 'nan'" % (i, nodeTime))
                        normalDict[nodeKey].append('nan')
    # Every key should end up with the same length as normalDict['Time'].
    for key in normalDict:
        print("len(normalDict[%s]) = %d" % (key, len(normalDict[key])))
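The last for loop shows that all keys in normalDict have the same length.
Maybe this is not the prettiest solution, but I was able to create a dictionary that contains the values I need, with 'nan' where necessary, lined up with normalDict['Time'].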
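The same alignment can probably be written more compactly by building a time-to-value lookup per node and reading it with a default. This is only a sketch under assumptions: `align_to_times` is a hypothetical helper name, the inputs are plain lists like the sample data in the question, and each node's times are assumed to be unique.

```python
def align_to_times(all_times, node_times, node_values, fill='nan'):
    """Lay node_values out along all_times, using fill where the node has no reading."""
    # Map each of the node's timestamps to its value
    # (assumes a node never lists the same time twice).
    lookup = dict(zip(node_times, node_values))
    return [lookup.get(t, fill) for t in all_times]


# Sample data copied from the question.
dataDict = {
    '1-Time': ['12:00', '1:00', '2:00'],
    '2-Time': ['12:30', '1:30', '2:30', '3:30'],
    '1-lattitude': ['0', '1', '2'],
    '2-lattitude': ['1', '2', '3', '3'],
}
normalDict = {
    'Time': ['12:00', '12:30', '1:00', '1:30', '2:00', '2:30', '3:30'],
}

for node in ('1', '2'):
    node_key = '%s-lattitude' % node
    normalDict[node_key] = align_to_times(
        normalDict['Time'], dataDict['%s-Time' % node], dataDict[node_key])

print(normalDict['1-lattitude'])
print(normalDict['2-lattitude'])
```

Unlike the running index j, the lookup does not depend on the node's times being in the same order as normalDict['Time'].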
Answer 1 (score: 0)
time_key = '1-Time'
lattitude_key = '1-lattitude'

def normalize(normalDict, dataDict, lattitude_key, time_key):
    # Start with 'nan' in every slot of the shared time axis.
    normalDict[lattitude_key] = ['nan'] * len(normalDict['Time'])
    # Overwrite the slots whose time appears in this node's Time list.
    for i, e in enumerate(dataDict[time_key]):
        normalDict[lattitude_key][normalDict['Time'].index(e)] = dataDict[lattitude_key][i]
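To check the function above against the question's sample data, it can be called once per node. The function is repeated here in equivalent form so the snippet runs on its own; the dictionaries are the sample values from the question.

```python
def normalize(normalDict, dataDict, lattitude_key, time_key):
    # Fill the whole time axis with 'nan', then overwrite the known slots.
    normalDict[lattitude_key] = ['nan'] * len(normalDict['Time'])
    for i, e in enumerate(dataDict[time_key]):
        normalDict[lattitude_key][normalDict['Time'].index(e)] = dataDict[lattitude_key][i]


dataDict = {
    '1-Time': ['12:00', '1:00', '2:00'],
    '2-Time': ['12:30', '1:30', '2:30', '3:30'],
    '1-lattitude': ['0', '1', '2'],
    '2-lattitude': ['1', '2', '3', '3'],
}
normalDict = {
    'Time': ['12:00', '12:30', '1:00', '1:30', '2:00', '2:30', '3:30'],
}

normalize(normalDict, dataDict, '1-lattitude', '1-Time')
normalize(normalDict, dataDict, '2-lattitude', '2-Time')

print(normalDict['1-lattitude'])
print(normalDict['2-lattitude'])
```

Two things to keep in mind: list.index() scans the list on every call, which is fine for short lists but slow for long ones, and it raises ValueError if a node's time is missing from normalDict['Time'].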