I have a column in a data frame that I read into R and am trying to extract as numbers.
AB_lst <- read.csv("tableOut.csv", stringsAsFactors = FALSE)
AB_mass <- AB_lst$StructCalc
AB_mass_numeric <- as.numeric(AB_mass)
I expect AB_mass_numeric to be a numeric vector, but whenever I run the code above I get
Warning message: NAs introduced by coercion
When I run head(AB_mass), the output is:
"370.104704 ..." "365.173393 ..." "312.062840 ..." "266.151261 ..." "372.120355 ..." "210.088660 ..."
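Why does this happen, and how can I fix it so that I end up with a numeric vector holding these values? I think the problem is related to the "...", but I'm not sure. An example of AB_lst is shown below.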
X CAS.RN Name Formula Mass
1 2 28458-24-4 (+)-Averufanin; Avermutin C20 H18 O7 370.353 g_mol
2 3 23402-09-7 (+)-Brevianamide A C21 H23 N3 O3 365.426 g_mol
3 4 1162-65-8 (-)-Aflatoxin-B1; Aflatoxin B; Aflatoxin FB1 C17 H12 O6 312.274 g_mol
4 5 26057-70-5 (-)-Avenaciolide C15 H22 O4 266.333 g_mol
5 6 5803-62-3 (-)-Averantin C20 H20 O7 372.369 g_mol
6 7 20421-31-2 (-)-Canadensolide C11 H14 O4 210.226 g_mol
Sources
1 [F] Aspergillus versicolor
2 [F] Penicillium brevicompactum, P. viridicatum
3 [F] Aspergillus flavus, A. parasiticus, P.puberulum, P.sp., Asp.sulphureus, P. ostianus; "MunissiMUF2
4 [F] Aspergillus avenaceusIsolation extraction with (EtOAc, 3, filt.) chromatogr. with (Sil-G, ) crystallizat. with (Et2O-Hex)
5 [L] [F] Aspergillus versicolor; Solorina crocea
6 [F] Penicillium canadense, Aspergillus tamariiIsolation chromatogr. with (Sil-G, Benz-EtOAc) ion exchange with (XAD-2, MeOH)
C.NMR
1 SIM (187.0 S C2 +-4.3 96*) (181.9 S C8 +-1.5 192*) (164.9 S C18 +-1.4 31*) (164.1 S C10 +-1.443*) (162.8 S C9 +-0.5 11*) (161.0 S C6 +-1.9 26*) (135.9 S C7 +-9.8 217*) (135.9 S C4 +-9.8 217*) (119.7 S C5+-0.6 9*) (118.9 S C1 +-9.8 141*) (109.2 S C3 +-1.2 43*) (108.7 D C12 +-2.3 18*) (108.6 D C15 +-0.6 34*)(108.4 D C14 +-2.0 31*) (74.8 D C11 +-1.9 6*) (74.8 D C22 +-0.2 5*) (32.4 T C26 +-1.2 51*) (28.9 T C24 +-1.08*) (23.3 T C25 +-0.1 5*) (20.9 Q C27 +-1.9 52*)
2 SIM (203.4 S C11 +-1.6 Int) (173.7 S C10 +-0.0 1*) (170.3 S C3 +-1.4 Int) (160.5 S C15 +-0.2 8*)(134.7 D C27 +-3.3 98*) (124.6 D C23 +-0.1 9*) (124.5 D C26 +-6.4 338*) (120.2 S C13 +-0.4 8*) (111.8 D C24+-0.2 9*) (69.8 S C5 +-0.0 1*) (69.0 S C2 +-1.6 Int) (65.2 S C1 +-1.6 Int) (55.5 D C7 +-1.7 Int) (44.2 T C19 +-0.01*) (31.2 S C9 +-4.0 Int) (31.1 T C14 +-0.0 1*) (29.7 T C20 +-0.0 1*) (28.7 T C12 +-1.7 Int) (25.4 T C25 +-0.0 1*)(12.5 Q C22 +-9.8 Int) (12.5 Q C21 +-9.8 Int)
3 SIM-EXP (117.0 - 117.0, 1) (176.5 - 176.9, 2) (161.0 - 161.4, 3) (152.5 - 152.2, 4) (103.7 - 104.8, 5)(107.5 - 107.4, 7) (165.3 - 164.5, 8) (113.2 - 113.0, 10) (154.7 - 155.6, 11) (47.8 - 47.7, 12) (90.6 - 90.3, 13)(200.6 - 200.7, 14) (29.0 - 28.9, 15) (144.8 - 145.1, 18) (102.3 - 102.5, 19) (35.0 - 34.9, 20) (56.4 - 55.8, 23) ;SIM-EXP (117.0 - 117.0, 1) (176.9 - 176.5, 2) (161.4 - 161.0, 3) (152.2 - 152.5, 4) (104.9 - 103.7, 5) (107.4 -107.5, 7) (164.5 - 165.3, 8) (113.0 - 113.2, 10) (155.6 - 154.7, 11) (47.7 - 47.8, 12) (90.3 - 90.6, 13) (200.7 -200.6, 14) (28.9 - 29.0, 15) (145.1 - 144.8, 18) (102.5 - 102.3, 19) (34.9 - 35.0, 20) (55.9 - 56.4, 23)
4 SIM (173.4 S C5 +-3.0 Int) (168.3 S C4 +-0.7 Int) (134.0 S C6 +-2.4 Int) (121.8 T C11 +-0.1 2*) (72.9D C8 +-0.6 Int) (71.8 D C1 +-3.8 Int) (40.2 D C2 +-2.1 Int) (30.9 T C15 +-3.4 4300*) (29.2 T C18 +-3.0 4236*)(28.8 T C17 +-3.4 27891*) (28.5 T C16 +-3.1 1809*) (27.1 T C13 +-3.8 3*) (24.9 T C12 +-1.1 Int) (23.9 T C14 +-3.7 5242*) (14.7 Q C19 +-4.0 6903*)
5 SIM (187.0 S C2 +-4.3 96*) (181.9 S C7 +-1.5 192*) (164.9 S C16 +-1.4 31*) (164.1 S C10 +-1.443*) (162.8 S C9 +-0.5 11*) (161.0 S C6 +-1.9 26*) (135.9 S C5 +-9.8 217*) (135.9 S C4 +-9.8 217*) (121.7 S C8+-0.0 1*) (118.9 S C1 +-9.8 141*) (109.2 S C3 +-1.2 43*) (108.7 D C11 +-2.3 18*) (108.6 D C13 +-0.6 34*)(108.4 D C12 +-2.0 31*) (67.8 D C17 +-0.0 1*) (31.9 T C23 +-3.5 8*) (31.0 T C26 +-2.6 430*) (25.7 T C24 +-0.35*) (23.9 T C25 +-3.7 5242*) (14.7 Q C27 +-4.0 6903*)
6 SIM (171.5 S C2 +-1.0 Int) (170.0 S C5 +-1.5 Int) (133.8 S C6 +-0.7 Int) (124.0 T C11 +-0.0 Int) (79.3D C8 +-1.1 Int) (74.0 D C3 +-1.4 Int) (47.3 D C1 +-7.9 Int) (30.2 T C12 +-0.3 Int) (27.1 T C13 +-0.1 4*) (23.1 TC14 +-4.4 148*) (14.7 Q C15 +-4.0 6903*)
C.NMR.Struct
1 simulated ...
2 simulated ...
3 simulated ...; experimental ...
4 simulated ...
5 simulated ...
6 simulated ...
H.NMR
1
2
3 CDCl3: (2.56, H4) (3.34, H5) (6.38, H9) (6.75, J=7.0, H13) (4.72, J=7.0, 3.0, H14) (5.42, J=3.0,3.0,H15) (6.40, H16) (3.93, H17)
4
5 [3513]
6
MS.Spectra UV.A UV.B
1
2
3 (312, 100%, M+) (284) (269) (256) (241) (227) (199) (185) (171)
4
5
6
UV.N
1
2
3 MeOH: (220, 25600) (265, 13400) (362, 21800) (EtOH): (223, 25600) (265, 13400) (362, 21800)
4 MeOH: (210, 10000)
5
6 MeOH: (210, 1OOOO)
UV
1
2
3 220 265 362 ...; 223 265 362 ...; light
4 210 ...
5
6 210 ...
IR.Spectra
1
2
3 KBr (1754) (1701) (1615) (1595) (1429) (1356) (1229) (1130) (977) (824) ...
4
5
6
Toxicity Solubility
1
2
3 LD50 = (1, peros) hepatotoxic good in MeOH, Chl, hardly in Hex
4 good in MeOH, Et2O, hardly in W
5
6 good in EtOAc, Chl, hardly in W, base
Activity
1
2
3 (B.subt., 15) (S.aureus, ) (Mycob.sp., ) (Fungi, 10) (Nocardia sp., 20)
4 (B.subt., 200) (Phyt.fungi, 1)(antibiotic)
5 (bacteria, +) (fungi, -)
6 (Phyt.fungi, ) (Fungi, )
Appearance MeltingP TLC
1 -271
2 (175)-(180)
3 fluorescence emission 425 nm; white, yellow, cryst. (268-269)
4 (+)-form; also (-)-form, (+-)- form found white, cryst. (54-6)
5 (233-4) (0.48, EtOAc_cHex 1:1)
6 white, cryst. (46-7.5)
StructCalc Group
1 370.104704 ...
2 365.173393 ...
3 312.062840 ... aflatoxin, neutral
4 266.151261 ... dilactone deriv., neutral
5 372.120355 ...
6 210.088660 ... dilactone deriv., neutral
Remarks
1 *C,H also (+-)-form found
2
3 *C (see H),H,I,M (see I),U EXP = 2nd val in CDCl3: C-OMe_COO were exchanged
4
5 *H,M
6 also (+-)-form
References
1 Thomson II, 487; Horak, R. et al., J. Chem. Soc., Perkin Trans. 1 (1985) 345
2 Williams, R. M. et al., J. Am. Chem. Soc., 111(8), 3064-5 1989
3 Cole_Cox, 15; Nature,192,1096,1961; 198,1056,1963; Endeavour 22,75,1963; JACS,85,1706, 1963;87, 882, 1965; Forsch., 31, 118, 1974; Exp., 23,187,1967; J. Bact.,93,59,1967; Appl. Micr.,14,403,1966; Z. Allg.Mikr.,12,593,1972; Bioch.J., 114,289,1969; Bact. Rev.,41, 822,1977;30,460,1966; CA,89,36786;CR Ser. D,285,201, 1978; AAC,16,277,1979
4 JCS,5385,1963; Nature,203,1382,1964; JACS,91,7208,1969;95, 7923,1973;97,3870,1975;JOC,38,2489,1973; CC,538,1973; Aust. J. Chem.,18,373,1965
5 Thomson II, 483; Townsend, Craig A., Tetrahedron Lett. 1986, 27(8), 887-8; Turner II, 187,188, 191
6 TL,727,1968,3233,1978; Tsuboi, S. et al., J. Org. Chem., 51 (1986) 4944
CA REG
1 DA:A-915 28458-24-4; 73346-80-2
2 DA:B-138; 120:186930; 110:189072 23402-09-7
3 DA:A-096; 108:218732k; 114:97857t 1162-65-8
4 DA:A-904 26057-70-5; 16993-42-3; 20223-76-1
5 DA:A-905; 105:133600d 5803-62-3
6 DA:C-013 20421-31-2
ChemClass
1 no charge; oxygen heterocycle; carbocycle; aromatic; alicycle; large ring; fused rings; 6ring;
2 no charge; nitrogen heterocycle; carbocycle; aromatic; alicycle; large ring; fused rings;
3 no charge; oxygen heterocycle; carbocycle; aromatic; alicycle; large ring; fused rings; 5ring;
4 no charge; oxygen heterocycle; alicycle; large ring; fused rings; 5ring; 8ring; ester; lactone;
5 no charge; carbocycle; aromatic; large ring; fused rings; 6ring; 10ring; 14ring; ketone;
6 no charge; oxygen heterocycle; alicycle; large ring; fused rings; 5ring; 8ring; ester; lactone;
Opt.Rot X.1
1
2 aD25: (+413 EtOH)
3 (-480, DMF) (-559, Chl); aD25 (-562, c=0.115, CHCL3)
4 aD25:(-41.6, Chl) (-41, EtOH)
5 aD22 (-178, c 0.37, EtOH)
6 aD:(-141, Chl)
Thanks in advance.
Answer 0 (score: 4)
If you remove those trailing periods from the character vector StructCalc, the conversion should succeed:
StructCalc <- as.numeric( gsub("[ ][.]+", "", StructCalc) )
If you removed all of the periods instead, you would lose the decimal point as well.
> sc <- scan(what="",sep=",")
1: 370.104704 ...
2: 365.173393 ...
3: 312.062840 ...
4: 266.151261 ...
5: 372.120355 ...
6: 210.088660 ...
7:
Read 6 items
> sub("[ ][.]+","",sc)
[1] "370.104704" "365.173393" "312.062840" "266.151261" "372.120355" "210.088660"
> as.numeric(sub("[ ][.]+","",sc))
[1] 370.1047 365.1734 312.0628 266.1513 372.1204 210.0887
> print( as.numeric(sub("[ ][.]+","",sc)), digits=16)
[1] 370.104704 365.173393 312.062840 266.151261 372.120355 210.088660
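To make the point about the decimal point concrete, here is a minimal sketch that compares the two patterns on three of the sample values from the question. The names `stripped_all` and `stripped_trailing` are introduced only for this illustration.

```r
# Sample values copied from head(AB_mass) in the question
sc <- c("370.104704 ...", "365.173393 ...", "312.062840 ...")

# Removing every period also removes the decimal point
stripped_all <- gsub("[.]", "", sc)

# Removing only a space followed by a run of periods keeps the decimal point
stripped_trailing <- sub("[ ][.]+", "", sc)

# Converts without a warning, but the values are a million times too large
as.numeric(stripped_all)

# The intended masses
as.numeric(stripped_trailing)
```

Both conversions run without a warning, so the first mistake is easy to miss unless you look at the magnitudes.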
Answer 1 (score: 2)
It looks like the three dots are the problem. You can clean them up with:
a = "1 ..."
as.numeric(a)
# NA, with the warning "NAs introduced by coercion"
b = gsub("[ ]*[.]+$", "", a)  # strip only the trailing dots, so a decimal point survives
as.numeric(b)
# works
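Applied to the whole column, the cleanup might look like the sketch below. It anchors the pattern at the end of the string so that only the trailing dots are removed, then checks whether anything still fails to convert. The small data frame is a stand-in built from the six values shown in the question, since tableOut.csv itself is not available here, and `cleaned` is a name chosen for illustration.

```r
# Stand-in for the StructCalc column of AB_lst (values taken from the question)
AB_lst <- data.frame(
  StructCalc = c("370.104704 ...", "365.173393 ...", "312.062840 ...",
                 "266.151261 ...", "372.120355 ...", "210.088660 ..."),
  stringsAsFactors = FALSE
)

# Strip optional spaces plus a run of periods, but only at the end of the string
cleaned <- sub("[ ]*[.]+$", "", AB_lst$StructCalc)

# Convert; any entry that is still not a number would trigger the coercion warning
AB_mass_numeric <- as.numeric(cleaned)

# Show the original text of any non-empty entries that failed to convert
AB_lst$StructCalc[is.na(AB_mass_numeric) & nzchar(cleaned)]
```

If the last line returns an empty character vector, every non-empty entry was converted. Anything it does list is worth inspecting by hand before choosing a broader pattern.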