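I'm trying to create a CSV dataset with 4 columns ('name', 'age', 'weight', 'height') and 100 rows of random data for those columns, but the code in my first step gives me one row instead of 100. How can I fix this, and how do I write the result to a CSV file?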
`import random
import pandas as pd
import numpy as np
person="person"
personList =[person+str(i) for i in range(100)]
ageList=[random.randint(1,90) for i in range(100)]
weightList=[random.randint(40,150) for i in range(100)]
heightList=[random.randint(140,210) for i in range(100)]
raw_data={'Name':[personList],
'Age':[ageList],
'Weight':[weightList],
'Height':[heightList]}
df = pd.DataFrame([raw_data])
print(df)`
Answer 0 (score: 1)
Don't pass the values as a "list of lists", i.e. remove the outer [ ]:
raw_data={'Name': personList,
'Age': ageList,
'Weight': weightList,
'Height': heightList}
df = pd.DataFrame(raw_data)
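To see why the outer brackets matter, here is a minimal sketch that compares the two shapes. It uses a small n and only two columns for brevity, and the names `wrapped` and `flat` are illustrative, not from the question.

```python
import random
import pandas as pd

n = 5  # small n for illustration; the question uses 100
names = ["person" + str(i) for i in range(n)]
ages = [random.randint(1, 90) for _ in range(n)]

# Wrapping each list in another list makes every column hold ONE element
# (the whole inner list), so the frame has a single row.
wrapped = pd.DataFrame({"Name": [names], "Age": [ages]})

# Passing the lists directly yields one row per list element.
flat = pd.DataFrame({"Name": names, "Age": ages})

print(wrapped.shape)  # (1, 2)
print(flat.shape)     # (5, 2)
```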
To write the result to a CSV file, use:
df.to_csv('./filename.csv')
[Out]
        Name  Age  Weight  Height
0    person0   23      59     158
1    person1   50      66     199
2    person2   18     100     183
3    person3    4      60     144
4    person4   14     123     188
5    person5   12      40     141
6    person6   44      65     171
7    person7   50      96     166
8    person8   82     114     166
9    person9   86     142     178
10  person10   51      93     142
11  person11    1      59     166
12  person12   61     138     152
13  person13   46      92     164
14  person14   25     103     195
15  person15   24      42     150
16  person16   33     123     186
17  person17   44      64     193
18  person18   40     118     159
19  person19   25     134     196
20  person20    5     117     178
...
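One detail worth knowing about the CSV step: by default `to_csv` also writes the row index as an unnamed first column. If only the four data columns are wanted, passing `index=False` should do it. The sketch below is a small round trip under that assumption; the two sample rows are copied from the output above, and the temporary directory and the file name `people.csv` are made up for the example.

```python
import os
import tempfile
import pandas as pd

# Two rows taken from the sample output above, just to keep the example short.
df = pd.DataFrame({
    "Name": ["person0", "person1"],
    "Age": [23, 50],
    "Weight": [59, 66],
    "Height": [158, 199],
})

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "people.csv")

# index=False leaves the 0, 1, 2, ... row labels out of the file.
df.to_csv(path, index=False)

# Read the file back to check that only the four columns were written.
back = pd.read_csv(path)
print(list(back.columns))
```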
Another approach is to use numpy.random, where most methods take a size parameter:
import random
import pandas as pd
import numpy as np
person="person"
n = 100
personList = [person+str(i) for i in range(n)]
# np.random.randint excludes the upper bound (unlike random.randint), hence the +1
ageList = np.random.randint(1, 91, size=n)
weightList = np.random.randint(40, 151, size=n)
heightList = np.random.randint(140, 211, size=n)
raw_data={'Name': personList,
'Age': ageList,
'Weight': weightList,
'Height': heightList}
df = pd.DataFrame(raw_data)
Answer 1 (score: 0)
numpy is very good at building random arrays, and pandas uses numpy arrays internally. So my suggestion is to use:
...
ageList=np.random.randint(1,91,100) # note the +1 on highest value for np.random.randint
weightList=np.random.randint(40,151,100)
heightList=np.random.randint(140,211,100)
# pass each list/array directly; wrapping it in [ ] would give a single row again
raw_data={'Name': personList,
'Age': ageList,
'Weight': weightList,
'Height': heightList}
df = pd.DataFrame(raw_data) # note passing a mapping and not a sequence
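As a variation on the same idea, NumPy's newer `Generator` interface (`np.random.default_rng`) has an `integers` method whose `endpoint=True` option includes the upper bound, so the +1 adjustment is not needed. This is only a sketch and not part of either answer; the seed value is an arbitrary choice to make the run repeatable.

```python
import numpy as np
import pandas as pd

n = 100
rng = np.random.default_rng(seed=0)  # arbitrary seed, assumed here for reproducibility

df = pd.DataFrame({
    "Name": ["person" + str(i) for i in range(n)],
    # endpoint=True makes the upper bound inclusive, matching random.randint
    "Age": rng.integers(1, 90, size=n, endpoint=True),
    "Weight": rng.integers(40, 150, size=n, endpoint=True),
    "Height": rng.integers(140, 210, size=n, endpoint=True),
})

print(df.shape)  # (100, 4)
```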