import pandas as pd
from pandas import Series, DataFrame
dframe_final.to_csv('C:/Program Files/source/csv_data/2015/Merged files/jjj_all_merged.csv')
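I have this piece of code, and I need to add a new column named "New_name" at the end of this CSV file and fill it according to different criteria:
For example, if cell1 is "a", cell2 is "b", cell3 is "1", and cell4 is "2" or "5", enter "OK"; otherwise, enter "NOT OK" or leave it blank.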
Column 1 Column 2 Column 3 Column 4 "New_name"
a b 1 2 "OK"
a b 1 5 "OK"
c d e f "NOT OK"
Please help!!! :)
Answer 0: (score: 1)
IIUC:
mask = (df['Column 1'] == 'a') & (df['Column 2'] == 'b') & (df['Column 3'] == '1') & (df['Column 4'].isin(['2','5']))
df['new_value'] = np.where(mask,'OK','NOT OK')
Output:
Column 1 Column 2 Column 3 Column 4 "New_name" new_value
0 a b 1 2 "OK" OK
1 a b 1 5 "OK" OK
2 c d e f "NOT OK" NOT OK
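The mask approach above can be tried on its own with a small frame that mirrors the table in the question. This is only a sketch under assumptions: the sample data is rebuilt by hand, every cell is assumed to be a string (which is what pandas produces when a column mixes digits and letters), and the result is written to a column called New_name, as the question asks, rather than new_value.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table; all values are strings (assumption).
df = pd.DataFrame({
    "Column 1": ["a", "a", "c"],
    "Column 2": ["b", "b", "d"],
    "Column 3": ["1", "1", "e"],
    "Column 4": ["2", "5", "f"],
})

# Each comparison yields a boolean Series; & combines them row by row.
mask = (
    (df["Column 1"] == "a")
    & (df["Column 2"] == "b")
    & (df["Column 3"] == "1")
    & df["Column 4"].isin(["2", "5"])
)

# np.where picks "OK" where the mask is True and "NOT OK" elsewhere.
df["New_name"] = np.where(mask, "OK", "NOT OK")
print(df)
```

If the real columns hold integers instead of strings, the comparisons would need 1, 2 and 5 without quotes; otherwise every row would come out as "NOT OK".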
Answer 1: (score: 0)
In my case, the `a` in the options means 'append'. Then I import the CSV file as a `pd.DataFrame`.
import pandas as pd
import csv
df = pd.read_csv('csv_file.csv')
I don't know whether this is the best way.
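As one possible alternative to append mode, which adds rows rather than columns, the whole round trip can be sketched as: read the CSV, add the column, and write the file back. Everything specific here is an assumption for illustration: the file name csv_file.csv is reused from above but created in a temporary directory, the sample rows are copied from the question, and dtype=str is chosen so that every cell is compared as text.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Create a small hypothetical CSV in a temporary directory so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "csv_file.csv")
with open(csv_path, "w", newline="") as fh:
    fh.write("Column 1,Column 2,Column 3,Column 4\n")
    fh.write("a,b,1,2\n")
    fh.write("a,b,1,5\n")
    fh.write("c,d,e,f\n")

# dtype=str keeps every cell as text, so comparing against '1', '2', '5' is consistent.
df = pd.read_csv(csv_path, dtype=str)

mask = (
    (df["Column 1"] == "a")
    & (df["Column 2"] == "b")
    & (df["Column 3"] == "1")
    & df["Column 4"].isin(["2", "5"])
)
df["New_name"] = np.where(mask, "OK", "NOT OK")

# Overwrite the file; index=False avoids writing the row index as an extra column.
df.to_csv(csv_path, index=False)

# Read the file back to confirm the new column was saved.
result = pd.read_csv(csv_path, dtype=str)
print(result)
```

Rewriting the file is used here because a new column touches every existing line, so it cannot simply be appended to the end of the file.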
# The first argument is the column where you want to find id
# I'm unsure about what you want to subtract; subtracting the entry from
# the count columns corresponds to setting the entry to 0
some_function <- function(col, id, df) {
idx <- which(colnames(df) == col);
df[df[, idx] == id, idx + 2] <- 0;
return(df);
}
some_function("test", "two", df);
# test hyp testcount hypcount
#1 one two 3 3
#2 two one 0 3
#3 three onetwo 5 6
#4 one one 3 3
#5 onetwo two 6 3
some_function("hyp", "two", df)
# test hyp testcount hypcount
#1 one two 3 0
#2 two one 3 3
#3 three onetwo 5 6
#4 one one 3 3
#5 onetwo two 6 0
You will need both of the above.