I have a DataFrame with a column in the following format:
col1 col2
A [{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]
B [{'Id':47,'prices':['30','78']},{'Id':94,'prices':['20']},{'Id':84,'prices':['20','98']}]
How can I convert it to the following?
col1 Id price
A 42 ['30','78']
A 44 ['20','47','89']
B 47 ['30','78']
B 94 ['20']
B 84 ['20','98']
I was thinking of using apply with a lambda, but I'm not sure how.
Edit: to recreate this DataFrame, I use the following code:
import pandas as pd

data = [['A', "[{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]"],
        ['B', "[{'Id':47,'prices':['30','78']},{'Id':94,'prices':['20']},{'Id':84,'prices':['20','98']}]"]]
df = pd.DataFrame(data, columns = ['col1', 'col2'])
Answer 0 (score: 5)
Solution if the col2 column contains lists:
print (type(df['col2'].iat[0]))
<class 'list'>
L = [{**{'col1': a}, **x} for a, b in df[['col1','col2']].to_numpy() for x in b]
df = pd.DataFrame(L)
print (df)
col1 Id prices
0 A 42 [30, 78]
1 A 44 [20, 47, 89]
2 B 47 [30, 78]
3 B 94 [20]
4 B 84 [20, 98]
If it contains strings:
print (type(df['col2'].iat[0]))
<class 'str'>
import ast
L = [{**{'col1': a}, **x} for a, b in df[['col1','col2']].to_numpy() for x in ast.literal_eval(b)]
df = pd.DataFrame(L)
print (df)
col1 Id prices
0 A 42 [30, 78]
1 A 44 [20, 47, 89]
2 B 47 [30, 78]
3 B 94 [20]
4 B 84 [20, 98]
For better understanding, the list comprehension can be written as an explicit loop:
import ast
L = []
for a, b in df[['col1','col2']].to_numpy():
    for x in ast.literal_eval(b):
        d = {'col1': a}
        out = {**d, **x}
        L.append(out)
df = pd.DataFrame(L)
print (df)
col1 Id prices
0 A 42 [30, 78]
1 A 44 [20, 47, 89]
2 B 47 [30, 78]
3 B 94 [20]
4 B 84 [20, 98]
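If it is not known in advance whether col2 holds strings or real lists, the two variants above can be folded into one helper. This is only a minimal sketch: the function name flatten_records and its parameters are made up for illustration, and it assumes every cell is either a list of dicts or a string that ast.literal_eval can parse into one.

```python
import ast

import pandas as pd


def flatten_records(df, key_col="col1", records_col="col2"):
    """Return one row per dict found in records_col, keeping key_col.

    Hypothetical helper: assumes each cell is a list of dicts, or a
    string that ast.literal_eval can turn into a list of dicts.
    """
    rows = []
    for key, cell in zip(df[key_col], df[records_col]):
        # Parse string cells; use list cells as they are.
        records = ast.literal_eval(cell) if isinstance(cell, str) else cell
        for record in records:
            # Put the key first so it becomes the first column.
            rows.append({key_col: key, **record})
    return pd.DataFrame(rows)


# Mixed input: the first cell is a string, the second a real list.
df = pd.DataFrame(
    [
        ["A", "[{'Id': 42, 'prices': ['30', '78']}, {'Id': 44, 'prices': ['20', '47', '89']}]"],
        ["B", [{"Id": 47, "prices": ["30", "78"]}]],
    ],
    columns=["col1", "col2"],
)
result = flatten_records(df)
print(result)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for cells that come from a file or another untrusted source.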
Answer 1 (score: 2)
Treat the second element of each row in data as a list.
import pandas as pd

data = [
    ['A', [{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]],
    ['B', [{'Id':47,'prices':['30','78']}, {'Id':94,'prices':['20']},{'Id':84,'prices':['20','98']}]]
]
t_list = []
for i in range(len(data)):
    # data[i][0] is the col1 value, data[i][1] is the list of dicts
    for j in range(len(data[i][1])):
        t_list.append((data[i][0], data[i][1][j]['Id'], data[i][1][j]['prices']))
df = pd.DataFrame(t_list, columns=['col1', 'id', 'price'])
print(df)
col1 id price
0 A 42 [30, 78]
1 A 44 [20, 47, 89]
2 B 47 [30, 78]
3 B 94 [20]
4 B 84 [20, 98]
Answer 2 (score: 2)
You can use df.explode here, together with pd.Series.apply, df.set_index and df.reset_index:
df.set_index('col1').explode('col2')['col2'].apply(pd.Series).reset_index()
col1 Id prices
0 A 42 [30, 78]
1 A 44 [20, 47, 89]
2 B 47 [30, 78]
3 B 94 [20]
4 B 84 [20, 98]
When col2 holds strings, parse it with ast.literal_eval first.
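The code for the string case is not shown in this answer, so the following is only a sketch of how it might look, assuming a pandas version that provides DataFrame.explode (0.25 or later) and the string-valued df from the question's edit. The intermediate names parsed, exploded and out are chosen here for illustration.

```python
import ast

import pandas as pd

# Recreate the DataFrame from the question, with col2 stored as strings.
data = [
    ["A", "[{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]"],
    ["B", "[{'Id':47,'prices':['30','78']},{'Id':94,'prices':['20']},{'Id':84,'prices':['20','98']}]"],
]
df = pd.DataFrame(data, columns=["col1", "col2"])

# Step 1: turn each string into a real list of dicts.
parsed = df.assign(col2=df["col2"].apply(ast.literal_eval))

# Step 2: one row per dict, with col1 carried along as the index.
exploded = parsed.set_index("col1").explode("col2")

# Step 3: expand each dict into columns and restore col1 as a column.
out = exploded["col2"].apply(pd.Series).reset_index()
print(out)
```

Calling apply(pd.Series) builds one Series per row, which is convenient but can be slow on large frames; the list-comprehension approach from Answer 0 avoids that overhead.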