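I want to populate a Pandas DataFrame with 3 columns and 20 rows, using random values drawn from the 3 lists below. I can't figure out what I'm doing wrong. Any suggestions?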
import random
import pandas as pd
import numpy as np
tests= ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]
df = pd.DataFrame()
for i in range(1,21):
    df = df.append(
        {'TEST': random.choice(tests),
         'PROJ': random.choice(projects),
         'NUMBER': random.choice(number)})
Answer 0 (score: 2)
You can use np.random.choice:
import numpy as np
import pandas as pd
tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]
num_rows = 20
# Seed for repeatability; drop this line in real code
np.random.seed(1)
# Draw each column in one vectorized call instead of appending row by row
df = pd.DataFrame({
    'TEST': np.random.choice(tests, size=num_rows),
    'PROJ': np.random.choice(projects, size=num_rows),
    'NUMBER': np.random.choice(number, size=num_rows)
})
Output:
     TEST  PROJ  NUMBER
0   TestB    JH     100
1   TestD    AA     100
2   TestA    JH     100
3   TestA    AK     100
4   TestD    WM      10
5   TestB    AK    2000
6   TestD    JH     100
7   TestB    AK      10
8   TestD    AA      10
9   TestA    JH    1000
10  TestA    JH     200
11  TestB    AK     100
12  TestA    WM      10
13  TestD    WM    1000
14  TestB    AA     100
15  TestA    AA     100
16  TestC    WM    1000
17  TestB    JH    2000
18  TestC    AK      10
19  TestA    JH     100
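Newer NumPy releases recommend the Generator API over the global state set by np.random.seed. A minimal sketch of the same idea, assuming NumPy 1.17 or later; the name rng and the seed value 1 are illustrative choices, and the rows it draws will not match the output above because the generator uses a different algorithm:

```python
import numpy as np
import pandas as pd

tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]
num_rows = 20

# Seeded only for repeatability (the seed value is arbitrary); omit it in real code
rng = np.random.default_rng(1)

# One vectorized draw per column, sampling with replacement by default
df = pd.DataFrame({
    'TEST': rng.choice(tests, size=num_rows),
    'PROJ': rng.choice(projects, size=num_rows),
    'NUMBER': rng.choice(number, size=num_rows),
})
print(df.shape)  # (20, 3)
```

A local generator keeps the seeding from affecting other code that relies on np.random.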
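Answer 1 (score: 2)
Ignore the index when appending. A dict can only be appended to a DataFrame when ignore_index=True is passed: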
for i in range(1,21):
    df = df.append(
        {'TEST': random.choice(tests),
         'PROJ': random.choice(projects),
         'NUMBER': random.choice(number)},
        ignore_index=True)
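Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above only runs on older versions. A rough sketch of one way to keep the row-by-row style on current pandas: collect the rows in a plain list and build the DataFrame once at the end. The variable name rows is just an illustration, not part of the original answer:

```python
import random
import pandas as pd

tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]

rows = []
for _ in range(20):
    # list.append here, not DataFrame.append
    rows.append({'TEST': random.choice(tests),
                 'PROJ': random.choice(projects),
                 'NUMBER': random.choice(number)})

# Build the frame in one step; columns= fixes the column order explicitly
df = pd.DataFrame(rows, columns=['TEST', 'PROJ', 'NUMBER'])
print(df.shape)  # (20, 3)
```

Building the frame once also avoids copying the whole DataFrame on every iteration, which is what repeated appends did.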
Answer 2 (score: 1)
Very similar to @quang-hoang's version, except that it uses random.choices:
import random
import pandas as pd
tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]
# Call random.seed(...) first if you want reproducibility
_t = random.choices(tests, k=20)
_p = random.choices(projects, k=20)
_n = random.choices(number, k=20)
# Assign the result so the frame can be used afterwards
df = pd.DataFrame({'Test': _t, 'Project': _p, 'Number': _n})
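For the reproducibility mentioned in the comment, one option is a private random.Random instance rather than seeding the module-level generator. The sketch below is only an illustration: make_frame is a hypothetical helper name and the seed 42 is arbitrary.

```python
import random
import pandas as pd

tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]

def make_frame(seed, k=20):
    """Hypothetical helper: draw k rows from a privately seeded generator."""
    rng = random.Random(seed)
    return pd.DataFrame({
        'Test': rng.choices(tests, k=k),
        'Project': rng.choices(projects, k=k),
        'Number': rng.choices(number, k=k),
    })

df_a = make_frame(42)
df_b = make_frame(42)
print(df_a.equals(df_b))  # True: same seed, same draws
```

Calling random.seed(42) before the random.choices calls would have the same effect, at the cost of changing global state.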