I have a dict:
{'Hours Outside Sprint': [5.25, 5.0, 0.0],
'Sprint End': ['2017-02-14', '2017-02-14', '2017-02-14'],
'Sprint Start': ['2017-01-31', '2017-01-31', '2017-01-31'],
'Status': ['done', 'done', 'done'],
'Story': ['SPGC-14075', 'SPGC-9456', 'SPGC-9445'],
'Story Actual (Hrs)': [11.0, 12.75, 0.0],
'Story Estimate (Hrs)': [16.0, 12.0, 0.0]}
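I thought this would be a fairly simple task, but the solution isn't obvious to me right now. What I want to do is iterate over this dict and produce the following: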
[[done, 2017-02-14, SPGC-14075, 16.0, 5.25, 2017-01-31, 11.0], ... ]
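So the first elements of every list end up together, all the second elements together, and so on, until I have a list of lists. How can I do this?
Edit:
Here is what the pandas DataFrame that produced the above dict looks like: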
        Story  Status  Story Estimate (Hrs)  Story Actual (Hrs)  Hours Outside Sprint  Sprint Start  Sprint End
0  SPGC-14075    done                  16.0               11.00                  5.25    2017-01-31  2017-02-14
1   SPGC-9456    done                  12.0               12.75                  5.00    2017-01-31  2017-02-14
2   SPGC-9445    done                   0.0                0.00                  0.00    2017-01-31  2017-02-14
Would iterrows work?
Answer 0 (score: 1)
Here is how I would do it in Python:
df_dict = {'Status': [u'done', u'done', u'done'], 'Sprint End': ['2017-02-14', '2017-02-14', '2017-02-14'], 'Story': [u'SPGC-14075', u'SPGC-9456', u'SPGC-9445'], 'Story Estimate (Hrs)': [16.0, 12.0, 0.0], 'Hours Outside Sprint': [5.25, 5.0, 0.0], 'Sprint Start': ['2017-01-31', '2017-01-31', '2017-01-31'], 'Story Actual (Hrs)': [11.0, 12.75, 0.0]}

result = []
# All lists are assumed to have the same length, so measure the first one.
# (dict.keys() cannot be indexed in Python 3, hence next(iter(...)).)
lengthOfFirstArrInDict = len(next(iter(df_dict.values())))
for i in range(lengthOfFirstArrInDict):
    nestedList = []
    # Collect the i-th element of every column into one row
    for key in df_dict:
        nestedList.append(df_dict[key][i])
    result.append(nestedList)
print(result)
Here is the output:
[['done', '2017-02-14', 'SPGC-14075', 16.0, 5.25, '2017-01-31', 11.0], ['done', '2017-02-14', 'SPGC-9456', 12.0, 5.0, '2017-01-31', 12.75], ['done', '2017-02-14', 'SPGC-9445', 0.0, 0.0, '2017-01-31', 0.0]]
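The nested loop above can also be sketched as a nested list comprehension. This is only a minimal sketch, assuming every list in the dict has the same length; the names `columns`, `n_rows`, and `rows` are illustrative and not from the original answer, and the dict is trimmed to three columns for brevity.

```python
# Illustrative data: a trimmed version of the dict from the question.
columns = {
    'Status': ['done', 'done', 'done'],
    'Story': ['SPGC-14075', 'SPGC-9456', 'SPGC-9445'],
    'Story Estimate (Hrs)': [16.0, 12.0, 0.0],
}

# Assumption: all lists share the length of the first one.
n_rows = len(next(iter(columns.values())))

# The outer loop walks row positions, the inner loop walks the keys.
rows = [[columns[key][i] for key in columns] for i in range(n_rows)]

print(rows)
```

The comprehension visits the keys in the dict's iteration order, so the column order within each row matches the order in which the keys were inserted (on Python 3.7+).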
Answer 1 (score: 1)
df.iterrows offers a very neat solution. Make sure to slice off the row index (i[0] is the row index; i[1] holds the row values):
import pandas as pd

df = pd.DataFrame(df_dict)
# Re-order columns (may not be necessary depending on your original df)
df = df[['Status','Sprint End','Story','Story Estimate (Hrs)','Hours Outside Sprint','Sprint Start','Story Actual (Hrs)']]
# iterrows() yields (index, row) pairs; keep only the row values
values = [i[1].tolist() for i in df.iterrows()]
Answer 2 (score: 1)
Whenever you need to combine successive elements from two or more sequences in Python, think of zip():
from pprint import pprint

data = {'Hours Outside Sprint': [5.25, 5.0, 0.0],
        'Sprint End': ['2017-02-14', '2017-02-14', '2017-02-14'],
        'Sprint Start': ['2017-01-31', '2017-01-31', '2017-01-31'],
        'Status': ['done', 'done', 'done'],
        'Story': ['SPGC-14075', 'SPGC-9456', 'SPGC-9445'],
        'Story Actual (Hrs)': [11.0, 12.75, 0.0],
        'Story Estimate (Hrs)': [16.0, 12.0, 0.0]}

# desired order of items in the result
key_order = ('Status', 'Sprint End', 'Story', 'Story Estimate (Hrs)',
             'Hours Outside Sprint', 'Sprint Start', 'Story Actual (Hrs)')

# Unpack the columns with * so that zip() pairs up their i-th elements
pprint([list(row) for row in zip(*(data[k] for k in key_order))])
Output:
[['done', '2017-02-14', 'SPGC-14075', 16.0, 5.25, '2017-01-31', 11.0],
 ['done', '2017-02-14', 'SPGC-9456', 12.0, 5.0, '2017-01-31', 12.75],
 ['done', '2017-02-14', 'SPGC-9445', 0.0, 0.0, '2017-01-31', 0.0]]
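One thing to keep in mind with zip() is that it stops at the shortest input, so a column with a missing value silently drops rows. The sketch below uses made-up, deliberately ragged data to show this, alongside itertools.zip_longest from the standard library, which pads instead. The variable names are illustrative.

```python
from itertools import zip_longest

# Hypothetical ragged data: 'Status' is one entry short.
ragged = {
    'Story': ['SPGC-14075', 'SPGC-9456', 'SPGC-9445'],
    'Status': ['done', 'done'],
}

# zip() stops at the shortest list, so the third story is dropped.
truncated = [list(row) for row in zip(*ragged.values())]

# zip_longest() pads the short list with fillvalue instead.
padded = [list(row) for row in zip_longest(*ragged.values(), fillvalue=None)]

print(truncated)
print(padded)
```

If the columns come straight from a DataFrame they will all have the same length, so this only matters for hand-built dicts.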
Answer 3 (score: 0)
list(map(list, zip(*(v for k, v in df_dict.items()))))
or
list(map(list, zip(*df_dict.values())))
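You can strip away the outermost calls one at a time to see what each step produces.
It is nothing more than a transformation of the data.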
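To make the step-by-step idea concrete, here is a small sketch that builds the expression up from the inside out. It assumes Python 3, where zip() and map() return lazy iterators, hence the list() calls; the two-column dict and the names `step1` to `step3` are illustrative.

```python
# Illustrative two-column dict.
df_dict = {
    'Status': ['done', 'done'],
    'Story': ['SPGC-14075', 'SPGC-9456'],
}

# Step 1: the column lists on their own.
step1 = list(df_dict.values())

# Step 2: zip(*...) groups the i-th element of every column into a tuple.
step2 = list(zip(*step1))

# Step 3: map(list, ...) converts each tuple into a list.
step3 = list(map(list, step2))

for step in (step1, step2, step3):
    print(step)
```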
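*df_dict.values()
unpacks the list of values so that each element is passed as a separate positional argument to a function that expects several arguments, like this: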
def fun(arg1, arg2, arg3 ...)
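As a small illustration of the unpacking described above, the sketch below calls a three-parameter function with a single list. The function `fun` mirrors the signature in the answer, but its body and the `args` list are invented for this example.

```python
def fun(arg1, arg2, arg3):
    # Return the arguments as a tuple so the unpacking is visible.
    return (arg1, arg2, arg3)

args = ['done', '2017-02-14', 'SPGC-14075']

# The * spreads the list into three positional arguments,
# so this is equivalent to fun(args[0], args[1], args[2]).
unpacked = fun(*args)

print(unpacked)
```

zip(*df_dict.values()) works the same way: each column list becomes one positional argument to zip().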
Answer 4 (score: 0)
You can try this:
df_dict = {'Status': [u'done', u'done', u'done'], 'Sprint End': ['2017-02-14', '2017-02-14', '2017-02-14'], 'Story': [u'SPGC-14075', u'SPGC-9456', u'SPGC-9445'], 'Story Estimate (Hrs)': [16.0, 12.0, 0.0], 'Hours Outside Sprint': [5.25, 5.0, 0.0], 'Sprint Start': ['2017-01-31', '2017-01-31', '2017-01-31'], 'Story Actual (Hrs)': [11.0, 12.75, 0.0]}
vals = df_dict.values()
final_data = list(map(list, zip(*vals)))
print(final_data)
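Output (the order of values within each row follows the dict's iteration order, which was arbitrary before Python 3.7):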
[[16.0, 5.25, 11.0, '2017-02-14', 'done', 'SPGC-14075', '2017-01-31'], [12.0, 5.0, 12.75, '2017-02-14', 'done', 'SPGC-9456', '2017-01-31'], [0.0, 0.0, 0.0, '2017-02-14', 'done', 'SPGC-9445', '2017-01-31']]
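Since the values in each row follow the dict's iteration order, it may help to capture the keys in the same pass so every position can be labeled. This is a minimal sketch, assuming Python 3.7+ (where dicts preserve insertion order) and using a trimmed dict; `header` and `labeled` are illustrative names, not part of the original answer.

```python
df_dict = {
    'Status': ['done', 'done'],
    'Story': ['SPGC-14075', 'SPGC-9456'],
    'Story Actual (Hrs)': [11.0, 12.75],
}

# keys() and values() iterate in the same order, so positions line up.
header = list(df_dict.keys())
final_data = list(map(list, zip(*df_dict.values())))

# Pair each row with the header to see which value belongs to which column.
labeled = [dict(zip(header, row)) for row in final_data]

print(header)
print(final_data)
print(labeled)
```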