I have a question about how to structure my code. I have the following CSV:
name product country
A game1 USA
A game2 USA
B bis World
.
.
Basically, each vendor's name appears multiple times (once for every product the vendor has). My goal is to create a CSV that contains the vendor's name, the number of products, and a value for the country (5 if the value is "World", otherwise 1). So far I have not managed to do this in a more algorithmic way. Instead, I have used the following code:
import pandas as pd

df = pd.read_csv("testtest.csv")
# Count how many rows (products) each vendor has
num_listings = df['vendor_name'].value_counts().to_dict()
print(num_listings)
and then I converted the dictionary to a CSV file. I assume a for loop could make my code simpler, since I could keep a counter and increment it for as long as the name stays the same. I do not know how I should approach it. I already tried the following, but it did not work:
ds = pd.read_csv("testtest.csv", index_col = 'vendor_name')
x=0
for index in ds:
    if ds['index'] == ds['index']:
        x=x+1
print(x)
Any help?
Answer 0 (score: 1)
Use groupby.agg with a dictionary of aggregation functions, one for each column:
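import pandas as pd

# df is the DataFrame from the question, with columns name, product, country
d = {'product': pd.Series.nunique,  # number of distinct products per vendor
     'country': lambda x: 5 if (x == 'World').any() else 1}  # 5 if any row is "World", else 1
df.groupby('name').agg(d).reset_index()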
  name  product  country
0    A        2        1
1    B        1        5
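For completeness, here is a minimal end-to-end sketch that also covers the loop idea from the question and the final CSV step. It assumes the columns are called name, product and country as in the sample, and it reads the sample rows from an in-memory string instead of testtest.csv. The helper names summarize_with_groupby and summarize_with_loop and the output column names num_products and country_score are made up for illustration. The groupby version uses named aggregation, which needs pandas 0.25 or later.

```python
import io

import pandas as pd

# Sample rows copied from the question (stand-in for testtest.csv).
SAMPLE_CSV = """name,product,country
A,game1,USA
A,game2,USA
B,bis,World
"""


def summarize_with_groupby(df):
    """One row per vendor: distinct product count and a country score."""
    return (
        df.groupby("name")
        .agg(
            num_products=("product", "nunique"),
            # 5 if any of the vendor's rows is "World", otherwise 1
            country_score=("country", lambda s: 5 if (s == "World").any() else 1),
        )
        .reset_index()
    )


def summarize_with_loop(df):
    """Same summary built with an explicit loop over the rows."""
    products = {}   # vendor name -> set of distinct products seen so far
    has_world = {}  # vendor name -> True once a "World" row has been seen
    for name, product, country in zip(df["name"], df["product"], df["country"]):
        products.setdefault(name, set()).add(product)
        has_world[name] = has_world.get(name, False) or country == "World"
    rows = [
        {
            "name": name,
            "num_products": len(products[name]),
            "country_score": 5 if has_world[name] else 1,
        }
        for name in sorted(products)  # groupby also sorts by key by default
    ]
    return pd.DataFrame(rows, columns=["name", "num_products", "country_score"])


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
summary = summarize_with_groupby(df)

# With no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = summary.to_csv(index=False)
print(csv_text)
```

The loop version does not need the rows to be sorted by vendor, because it keeps one accumulator per name rather than a single running counter. In practice the groupby version is shorter and is usually the better fit for this kind of per-group summary.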