How do I loop over only specific columns of a text file with pandas?

Time: 2017-07-28 11:12:17

Tags: python pandas

I want to run this loop:

for col in result.columns:
    result[col] = result[col].str.strip("{} ")

only on the columns "1H.L" and "1H_2.L", because the other columns are not strings.

My code is:

import pandas as pd

result = {}
text = 'fe'
filename = 'fe_yellow.xpk'

if text == 'ee':
    df = pd.read_csv('peaks_ee.xpk', sep=" ",skiprows=5)

    shift1= df["1H.P"]
    shift2= df["1H_2.P"]

    if filename == 'ee_pinkH1.xpk':
        mask = ((shift1>5.1) & (shift1<6)) & ((shift2>7) & (shift2<8.25))
    elif filename == 'ee_pinkH2.xpk':
       mask = ((shift1>3.25)&(shift1<5))&((shift2>7)&(shift2<8.5))

result = df[mask]
result = result[["1H.L","1H.P","1H_2.L","1H_2.P"]]

for col in result.columns:
    result[col] = result[col].str.strip("{} ")
result.drop_duplicates(keep='first', inplace=True)

tclust_atom=open("tclust_ppm.txt","w+")
result.to_string(tclust_atom, header=False)

The file I am reading:

label dataset sw sf
1H 1H_2
NOESY_F1eF2e.nv
4807.69238281 4803.07373047
600.402832031 600.402832031
1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B 1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9
0 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
1 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
2 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
3 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
4 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
5 {2.H1'} 5.90291 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
6 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
7 {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
8 {1.H1'} 5.82020 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0
9 {1.H8} 8.13712 0.05000 0.10000 ++ {0.0} {} {2.H8} 7.61004 0.05000 0.10000 ++ {0.0} {} 0.0 100.0000 0 {} 0 0 0

I would like my output to look like this:

1.H1' 5.82020 0.3
2.H8 7.61004 0.3
1.H8 8.13712 0.3
2.H1' 5.90291 0.3

The first column comes from the columns "1H.L" and "1H_2.L", the second from "1H.P" and "1H_2.P", and the third column is just a constant value that I want to write on every row. How can I do this?

2 answers:

Answer 0: (score: 1)

Why not just check the column name directly:

for col in result.columns:
    # "a" | "b" raises TypeError for strings; test membership instead
    if col in ("1H.L", "1H_2.L"):
        result[col] = result[col].str.strip("{} ")
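
If hard-coding the column names is undesirable, one possible alternative is to pick the string columns by dtype. This is only a minimal sketch: the sample frame is made-up data that mirrors the question's column layout, and `string_cols` is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample frame mirroring the question's column layout.
result = pd.DataFrame({
    "1H.L": ["{1.H1'}", "{2.H8}"],
    "1H.P": [5.82020, 7.61004],
    "1H_2.L": ["{2.H8}", "{1.H1'}"],
    "1H_2.P": [7.61004, 5.82020],
})

# Keep only the columns whose dtype holds strings; the float columns are skipped.
string_cols = [
    col for col in result.columns
    if pd.api.types.is_string_dtype(result[col])
]

# Strip braces and spaces from the string columns only.
for col in string_cols:
    result[col] = result[col].str.strip("{} ")
```

The numeric columns are never touched, so the `.str` accessor is never called on non-string data.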

Answer 1: (score: 1)

You can simply loop over a list of the column names you want, i.e.

import pandas as pd

result = pd.DataFrame({"1H.L":['{Nice}','{SO}'],"1H_2.L":['{Nice}','{SO}'],"2H.L":['Nice','SO']})

for col in ['1H.L','1H_2.L']:
    result[col] = result[col].str.strip("{} ")

Output:

   1H.L 1H_2.L  2H.L
0  Nice   Nice  Nice
1    SO     SO    SO
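
For the three-column output the question asks for, a possible approach is to stack the two label/shift pairs on top of each other, strip the braces, drop duplicates, and add the constant column. The following is only a sketch under assumptions: the rows are a made-up subset of the question's file, the names `first`, `second`, `peaks`, `label`, `shift` and `tolerance` are hypothetical, and 0.3 is simply the constant shown in the desired output.

```python
import pandas as pd

# Hypothetical rows resembling the filtered frame from the question.
result = pd.DataFrame({
    "1H.L": ["{1.H1'}", "{2.H8}", "{1.H8}", "{2.H8}"],
    "1H.P": [5.82020, 7.61004, 8.13712, 7.61004],
    "1H_2.L": ["{2.H8}", "{1.H1'}", "{1.H1'}", "{2.H1'}"],
    "1H_2.P": [7.61004, 5.82020, 5.82020, 5.90291],
})

# Give both label/shift pairs the same column names so they can be stacked.
first = result[["1H.L", "1H.P"]].rename(
    columns={"1H.L": "label", "1H.P": "shift"})
second = result[["1H_2.L", "1H_2.P"]].rename(
    columns={"1H_2.L": "label", "1H_2.P": "shift"})

# Stack the pairs, clean the labels, and keep each peak once.
peaks = pd.concat([first, second], ignore_index=True)
peaks["label"] = peaks["label"].str.strip("{} ")
peaks = peaks.drop_duplicates(keep="first").reset_index(drop=True)

# Format the shifts with five decimals and add the assumed constant column.
peaks["shift"] = peaks["shift"].map("{:.5f}".format)
peaks["tolerance"] = 0.3

# Plain-text rendering without header or index, ready to write to a file.
text = peaks.to_string(header=False, index=False)
```

The `text` string can then be written to `tclust_ppm.txt` with an ordinary `open(..., "w")` inside a `with` block, which also makes sure the file is closed.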