Singling out individual entries in a CSV file using Python

Time: 2014-08-20 12:28:41

Tags: python csv pandas

I have been looking for the answer to what seems like a very simple problem but cannot find one, so I hope someone can help.

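I am trying to use Python to compare a field in each row of a CSV file against a master list of codes and add up the number of times each code appears, much like using the countif() and vlookup() functions in Excel.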
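The countif()-style tally described above could be sketched roughly as follows. This is only an illustration: the master code list, the choice of column 2 as the code column, and the `count_codes` helper are all assumptions, since the question does not say which field holds the codes.

```python
import csv
import io
from collections import Counter

# Hypothetical master code list; the real codes are not given in the question.
MASTER_CODES = ["727013", "572011", "999999"]


def count_codes(csv_file, column, master_codes):
    """Count how often each master code appears in one column (0-based)."""
    seen = Counter(
        row[column] for row in csv.reader(csv_file) if len(row) > column
    )
    # Report every master code, including codes that never occur (count 0).
    return {code: seen[code] for code in master_codes}


# Two shortened rows in the shape of the sample data set.
SAMPLE = (
    "31/07/14 17:44,Standard P,727013,,,1002821\n"
    "31/07/14 17:44,Standard P,727013,,,1002821\n"
)

counts = count_codes(io.StringIO(SAMPLE), 2, MASTER_CODES)
print(counts)
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by a file object opened with `open(path, newline="")`.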

The idea is that when the relevant file arrives in my directory, I can run this script instead of opening Excel and doing the work by hand.

The trouble is that I cannot seem to extract a field on its own to use with if statements and the like.

The code below shows the directions I have been trying.

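First I tried the csv module. It successfully retrieved a row from the file, but I could not get a specific field with the .split(",")[n] method, because it returned single letters rather than whole fields (I do not understand why).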
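The csv attempt itself is not shown, so the cause of the single letters can only be guessed at. Two common ways to end up with single characters are indexing the line string directly instead of the list returned by split, and passing a string (rather than an open file) to csv.reader, which then treats every character as its own line. A small sketch of both, using a shortened sample line:

```python
import csv

line = "31/07/14 17:44,Standard P,727013"

# Indexing the list returned by split() gives a whole field.
fields = line.split(",")
print(fields[1])

# Indexing the string itself gives a single character.
print(line[1])

# Passing a string to csv.reader iterates over its characters,
# so each "row" holds one character instead of one line of fields.
per_character_rows = list(csv.reader("ab"))
print(per_character_rows)
```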

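Even if I could get the field back, some of the data fields in the CSV contain commas, so there is effectively no fixed field number and this method is problematic. (I tried converting the file to a plain-text .txt, which worked but is not "workable".)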
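The embedded-comma problem is what csv.reader is designed to handle: a quoted field such as "surname,  name" stays in one piece, so the field numbers remain fixed. A minimal comparison on one line of the sample data (the `SAMPLE_LINE` name is just for illustration):

```python
import csv
import io

SAMPLE_LINE = (
    "31/07/14 17:44,Standard P,727013,,,1002821,some info in here,"
    "a thing here,35,4.93,172.55,0,another thing here,some stuff here,"
    'a place here,"surname,  name",0000-6677-009899-09,572011,knockout\n'
)

# Naive splitting also breaks the quoted name field in two.
naive = SAMPLE_LINE.strip().split(",")

# csv.reader respects the quotes and returns a list of whole fields.
parsed = next(csv.reader(io.StringIO(SAMPLE_LINE)))

print(len(naive), len(parsed))
print(parsed[15])  # the name field, comma included
print(parsed[16])  # the field after it keeps a stable position
```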

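Next I tried pandas, but I could not single out rows, only whole columns, so the second print statement prints the second instance of the whole data set's columns rather than row 2.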

What I really want is to be able to treat, say, row 5, column 4 as a proper string or number.
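With pandas, a single cell can be addressed by position through `DataFrame.iloc`. The sketch below assumes the file has no header row (hence `header=None`) and reads everything as text; the `cell` helper with 1-based numbering is hypothetical and only there to match the "row 5, column 4" wording.

```python
import io

import pandas as pd

ROW = (
    "31/07/14 17:44,Standard P,727013,,,1002821,some info in here,"
    "a thing here,35,4.93,172.55,0,another thing here,some stuff here,"
    'a place here,"surname,  name",0000-6677-009899-09,572011,knockout\n'
)
SAMPLE = ROW * 5  # five identical rows, as in the sample data set

# header=None: treat every line as data, not as column names.
# dtype=str: keep codes such as 0000-6677-009899-09 as text.
df = pd.read_csv(io.StringIO(SAMPLE), header=None, dtype=str)


def cell(frame, row, col):
    """Return one cell using 1-based row and column numbers."""
    return frame.iloc[row - 1, col - 1]


code = cell(df, 5, 3)
name = cell(df, 5, 16)
print(code)
print(name)

# The value is an ordinary string, so it works in an if statement.
if code == "727013":
    print("match")
```

In the sample data, column 4 is empty, so `cell(df, 5, 4)` comes back as NaN; passing `keep_default_na=False` to `read_csv` would return an empty string instead.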

Any help is much appreciated.

import glob
import os

import pandas

os.system('clear')
identifier = []
data_line1 = []
data_line2 = []
data = []
io = []

path = '/home/data'
# Assign the name of the newest file in the directory to `newest`.
newest = max(glob.iglob('/home/data/*.csv'), key=os.path.getmtime)

for i in range(0, 5):
    with open(newest, 'rtU') as file:
        # Method 1: take line i of the file as one raw string.
        line1 = list(file)[i]
        data_line1.append(line1)

        # Method 2: read columns 4, 5 and 16 of the whole file with pandas.
        line2 = pandas.read_csv(newest, sep=",", usecols=(4, 5, 16), header=1)
        data_line2.append(line2)

print("###################################################################################################")
print(data_line1[4])

print("###################################################################################################")
print(data_line2[1])

Sample data set (sorry that all the rows are identical; I cannot use the real data because it is not mine):

31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout

The output of the script above shows a row (for method 1) or the instance of the data set corresponding to the list index (for method 2):

###################################################################################################
31/07/14 17:44,Standard P,727013,,,1002821,some info in here,a thing here,35,4.93,172.55,0,another thing here,some stuff here,a place here,"surname,  name",0000-6677-009899-09,572011,knockout

###################################################################################################
   Unnamed: 4  1002821  0000-6677-009899-09
0         NaN  1002821  0000-6677-009899-09
1         NaN  1002821  0000-6677-009899-09
2         NaN  1002821  0000-6677-009899-09
3         NaN  1002821  0000-6677-009899-09
4         NaN  1002821  0000-6677-009899-09
5         NaN  1002821  0000-6677-009899-09
6         NaN  1002821  0000-6677-009899-09
7         NaN  1002821  0000-6677-009899-09
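The method 2 output has only 8 rows and uses data values as column names because `header=1` tells pandas to treat the second line of the file as the header and to discard the line before it. A short sketch of the difference on ten copies of the sample row, assuming the real file likewise has no header row:

```python
import io

import pandas as pd

ROW = (
    "31/07/14 17:44,Standard P,727013,,,1002821,some info in here,"
    "a thing here,35,4.93,172.55,0,another thing here,some stuff here,"
    'a place here,"surname,  name",0000-6677-009899-09,572011,knockout\n'
)
SAMPLE = ROW * 10  # ten identical rows, as in the sample data set

# header=1: line 2 becomes the column names and line 1 is dropped.
with_header = pd.read_csv(io.StringIO(SAMPLE), usecols=[4, 5, 16], header=1)

# header=None: all ten lines are kept as data, columns are numbered.
no_header = pd.read_csv(io.StringIO(SAMPLE), usecols=[4, 5, 16], header=None)

print(len(with_header))
print(len(no_header))

# Row 2 (index 1), original column 5, as a single value.
print(no_header.loc[1, 5])
```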

I am desperate to find a way to fit these two pieces of the puzzle together...

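The real file in question has a lot of entries, so if anyone could point me in the right direction on what I am doing wrong, that would be great :)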

P.S. I am new to Python and am doing things like this to help me learn, so apologies if the answer is really simple...

0 answers:

No answers