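Python Pandas CSV management

Time: 2019-01-03 18:48:14

Tags: python pandas csv

A quick pandas question, working in Jupyter/IPython. I wrote the code below as part of some automation work I am trying out for a friend's business. If I want to split the first column in two using "-" as the delimiter, the way I would in Excel, how can I do that in pandas from IPython? For example, a description of "Red Bull-225825" should become "Red Bull", and a new column called "XYZ" should be created to the left of Description holding the value 225825. Rows without a "-" should get a null value in the new column.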

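import pandas as pd

# Skip the first two lines of the file; read_csv takes the header from the next line.
df = pd.read_csv("3.csv", skiprows=range(0, 2))
# The original ran `df.columns = df.iloc[1]` before df was defined, which raises
# a NameError in a fresh session, so it is left commented out here.
# df.columns = df.iloc[1]
# Keep the three columns of interest, drop rows with missing values, and save.
df[['Description', 'Total Qty', 'Total Sales']].dropna().to_csv("new1.csv", index=False)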

Description needs splitting

Thanks.

3 answers:

Answer 0 (score: 0)

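import pandas as pd

# One-row sample frame standing in for the CSV data
d = {'Description': ['Red Bull-225825'], 'TotalQty': [61], 'TotalSales': [90.89]}
df = pd.DataFrame(data=d)
# Split on "-"; expand=True returns the pieces as separate columns
df[['Description', 'XYZ']] = df['Description'].str.split('-', expand=True)
# Reorder so that XYZ sits to the left of Description
df = df[['XYZ', 'Description', 'TotalQty', 'TotalSales']]
df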
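
The sample above has a single row that always contains a dash. As a minimal sketch of how the same `str.split` idea might behave on mixed data, the example below adds a hypothetical row without a dash ('Lotto' is made up for illustration) and passes `n=1` so that only the first "-" is used as the split point. Rows without a dash should then end up with a missing value in XYZ.

```python
import pandas as pd

# Hypothetical sample data: one row with a dash, one without
df = pd.DataFrame({
    'Description': ['Red Bull-225825', 'Lotto'],
    'TotalQty': [61, 3],
})

# n=1 splits on the first "-" only, so at most two columns come back;
# rows without a dash are padded with a missing value in column 1
parts = df['Description'].str.split('-', n=1, expand=True)
df['Description'] = parts[0]
df['XYZ'] = parts[1]

# Put XYZ to the left of Description
df = df[['XYZ', 'Description', 'TotalQty']]
print(df)
```

One caveat: if no row at all contains a dash, `expand=True` returns a single column and `parts[1]` would raise a KeyError, so real data may need a guard for that case.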


Answer 1 (score: 0)

Here is my take:

import pandas as pd
from io import StringIO

TESTDATA = StringIO("""Description,TotalQty,TotalSales
ACME, 11, 1
Evil Corp, 10, 2
Google-Alphabet, 100, 0""")

df = pd.read_csv(TESTDATA, sep=",")

def splitfun(row):
    if '-' in row['Description']:
        # maxsplit=1 keeps the unpacking safe if a description contains several dashes
        val1, val2 = row['Description'].split('-', 1)
        return pd.Series({'Description': val1, 'AfterDash': val2})
    else:
        return pd.Series({'Description': row['Description'], 'AfterDash': None})

df[['Description','AfterDash']]=df.apply(splitfun, axis=1)

print(df)

  Description  TotalQty  TotalSales AfterDash
0        ACME        11           1      None
1   Evil Corp        10           2      None
2      Google       100           0  Alphabet

Answer 2 (score: 0)

datadict = {'Desc': ['Sale', 'Red Bull-968313', 'Lotto', 'ABC-11123'],
            'Total Qty': [1,2,3,4],
            'Total Sale': [5,6,7,8]
            }

import pandas as pd
df = pd.DataFrame.from_dict(datadict)
print (df)
#              Desc  Total Qty  Total Sale
#0             Sale          1           5
#1  Red Bull-968313          2           6
#2            Lotto          3           7
#3        ABC-11123          4           8

df['Desc Number'] = df['Desc'].str.split('-')
df['Desc'] = [i[0] for i in df['Desc Number']]
df['Desc Number'] = [i[1] if len(i)>1 else None for i in df['Desc Number']]
df = df[['Desc Number', 'Desc', 'Total Qty', 'Total Sale']]
print (df)

#  Desc Number      Desc  Total Qty  Total Sale
#0        None      Sale          1           5
#1      968313  Red Bull          2           6
#2        None     Lotto          3           7
#3       11123       ABC          4           8

This answer accounts for the None/null values you need.
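
To connect this back to the CSV workflow in the question, here is a rough sketch of the full round trip: read, split, write. It assumes the question's column names and uses an in-memory `StringIO` with made-up rows in place of "3.csv" and "new1.csv", so the data and the variable names `raw`, `out` and `csv_text` are illustrative only.

```python
import io
import pandas as pd

# Made-up stand-in for the input file
raw = io.StringIO(
    "Description,Total Qty,Total Sales\n"
    "Sale,1,5\n"
    "Red Bull-968313,2,6\n"
)
df = pd.read_csv(raw)

# Split on the first "-" only; rows without one get a missing value
parts = df['Description'].str.split('-', n=1, expand=True)
df.insert(0, 'XYZ', parts[1])      # new column to the left of Description
df['Description'] = parts[0]

# Write to an in-memory buffer instead of a file on disk
out = io.StringIO()
df.to_csv(out, index=False)
csv_text = out.getvalue()
print(csv_text)
```

With the default `na_rep`, missing XYZ values are written as empty fields, so the "Sale" row starts with a bare comma in the output.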