I'm a Python beginner. I'm working on a project where my data has the following format.
The data in the JSON file looks like this:
"price_time":[1398823200,1403154000,1403247600,1403301600,1403380800], "price_value":[901,909,918,927,936], "salesRank_value":[2176,2318,2192,1801,1829]
The df.head() command shows the following:
>>> df.head()
1974-12-11 20:55:21
price_time [1398823200, 1403154000, 1403247600, 140330160...
price_value [901, 909, 918, 927, 936, 945, 954, 963, 972, ...
rating_time [1475972640]
rating_value [43]
review_count [6558, 6560, 6561, 6562, 6564, 6566, 6568, 656...
df = pd.read_json('results.json')
In [] : print(df.head())
output :
price_time [1398823200, 1403154000, 1403247600, 140330160...
price_value [901, 909, 918, 927, 936, 945, 954, 963, 972, ...
salesRank_value [2176, 2318, 2192, 1801, 1829, 2207, 1757, 177...
I want to convert this data into the following format:
price_time price_value salesRank_value
1398823200 901 2176
1403154000 909 2318
1403247600 918 2192
and so on. Here is the code I wrote, but it doesn't give me the desired result:
import pandas as pd
df1={}
df1['price_time'] = df.loc['price_time']
df1['price_value'] = df.loc['price_value']
print(df1)
output:
{'price_value': 1974-12-11 20:55:21 [901, 909, 918, 927, 936, 945, 954, 963, 972, ...
Name: price_value, dtype: object, 'price_time': 1974-12-11 20:55:21 [1398823200, 1403154000, 1403247600, 140330160...
Name: price_time, dtype: object}
Answer 0: (score: 0)
price_time = [1398823200, 1403154000, 1403247600, 140330160]
price_value = [901, 909, 918, 927]
salesRank_value = [2176, 2318, 2192, 1801]
# zip() is lazy in Python 3, so materialize it as a list of (time, price, rank) tuples
listdata = list(zip(price_time, price_value, salesRank_value))
print(listdata)
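Since the question asks for a table with one column per field, the same three lists could also be passed straight to pandas. This is only a minimal sketch, assuming pandas is installed and that all lists have the same length; the variable name `table` is made up for illustration, and the fourth timestamp is taken from the JSON in the question rather than from the truncated display.

```python
import pandas as pd

# Sample values from the question; the lists must all have the same length
price_time = [1398823200, 1403154000, 1403247600, 1403301600]
price_value = [901, 909, 918, 927]
salesRank_value = [2176, 2318, 2192, 1801]

# Each dict key becomes a column name, each list becomes that column's values
table = pd.DataFrame({
    "price_time": price_time,
    "price_value": price_value,
    "salesRank_value": salesRank_value,
})
print(table)
```

If the lists differ in length, the DataFrame constructor raises a ValueError, so fields such as rating_time (which has a single entry in the question) would need separate handling.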
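Answer 1: (score: 0)
I'm guessing your data is either in a single string (with rows separated by newline characters) or in a file; in either case you can use one of the one-liners below. Assuming the data is in a single string variable, data = df.head(), that looks like this: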
'price_time [1398823200, 1403154000, 1403247600]\nprice_value [901, 909, 918]\nsalesRank_value [2176, 2318, 2192]'
You can use the following to get the desired array:
array=[a.split() for a in data.replace("[","").replace(",","").replace("]","").split('\n')]
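Output (a 2D array in which each inner array holds one row, with the row name as the first element and the data as the remaining elements):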
[['price_time', '1398823200', '1403154000', '1403247600'], ['price_value', '901', '909', '918'], ['salesRank_value', '2176', '2318', '2192']]
If you have the data in a file data.txt like this:
price_time [1398823200, 1403154000, 1403247600]
price_value [901, 909, 918]
salesRank_value [2176, 2318, 2192]
then use the following:
array=[line.replace("[","").replace(",","").replace("]","").split() for line in open('data.txt')]
The output is again a 2D array:
[['price_time', '1398823200', '1403154000', '1403247600'], ['price_value', '901', '909', '918'], ['salesRank_value', '2176', '2318', '2192']]
For the JSON file data you provided:
"price_time":[1398823200,1403154000,1403247600,1403301600,1403380800],"price_value":[901,909,918,927,936],"salesRank_value":[2176,2318,2192,1801,1829]
use this, which doesn't need pandas:
array=[b.split() for b in open('data.json').read().replace('"',"").replace(":["," ").replace("],","\n").replace(","," ").replace("]","").split('\n')]
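print(array)
(There are cleaner ways to strip non-alphanumeric characters, but I needed the string formatted in a specific way for the steps that follow.) The output is a 2D array like the earlier ones: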
[['price_time', '1398823200', '1403154000', '1403247600', '1403301600', '1403380800'], ['price_value', '901', '909', '918', '927', '936'], ['salesRank_value', '2176', '2318', '2192', '1801', '1829']]
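As one possible cleaner route, the standard-library json module can do the parsing instead of chained replace() calls. The sketch below is not the approach used above; it assumes the text becomes a valid JSON object once wrapped in braces (the snippet in the question is shown without them), and the names `raw`, `text` and `data` are made up for illustration.

```python
import json

# Sample text copied from the question; in practice it would come from the file
raw = ('"price_time":[1398823200,1403154000,1403247600,1403301600,1403380800],'
       '"price_value":[901,909,918,927,936],'
       '"salesRank_value":[2176,2318,2192,1801,1829]')

# Assumption: the snippet lacks the outer braces of a JSON object, so add them if missing
text = raw.strip()
if not text.startswith("{"):
    text = "{" + text + "}"

data = json.loads(text)  # dict mapping each field name to its list of numbers

# Rebuild the same 2D layout as above: row name first, then the values as strings
array = [[name] + [str(value) for value in values] for name, values in data.items()]
print(array)
```

Keeping `data` as a dict of lists also preserves the numbers as integers, which is convenient if they are needed for further calculation.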
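To view the result in tabular form:
for z in range(len(array[0])):
    temp = ''
    # Take element z from every inner list to build one tab-separated output row
    for y in range(len(array)):
        temp += array[y][z] + '\t'
    print(temp)  # print() already ends the line, so no extra '\n' is appended
Output: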
price_time price_value salesRank_value
1398823200 901 2176
1403154000 909 2318
1403247600 918 2192
1403301600 927 1801
1403380800 936 1829
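The same row-to-column flip can also be written with zip(*array), which avoids the index arithmetic. This is a small sketch that assumes every inner list has the same length; the name `lines` is made up for illustration.

```python
# Same 2D array as produced above: one inner list per field, row name first
array = [
    ['price_time', '1398823200', '1403154000', '1403247600', '1403301600', '1403380800'],
    ['price_value', '901', '909', '918', '927', '936'],
    ['salesRank_value', '2176', '2318', '2192', '1801', '1829'],
]

# zip(*array) pairs up the i-th element of every inner list, i.e. it transposes the rows
lines = ['\t'.join(row) for row in zip(*array)]
print('\n'.join(lines))
```

Note that zip() stops at the shortest inner list, so fields with fewer entries would silently truncate the table.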
For prettier output, use:
s = [[str(e) for e in row] for row in array]
lens = [max(map(len, col)) for col in zip(*s)]
fmt = ' '.join('{{:{}}}'.format(x) for x in lens)
table = [fmt.format(*row) for row in s]
print('\n'.join(table))
Output:
price_time      1398823200 1403154000 1403247600 1403301600 1403380800
price_value     901        909        918        927        936
salesRank_value 2176       2318       2192       1801       1829