I'm new to this, so apologies if this is a silly question. I have this data listing countries by GDP (from the World Factbook). Here it is compiled into a single string:
'\n1\tEuropean Union\t$15,970,000,000,000\n2\tUnited States\t$15,940,000,000,000
\n3\tChina\t$12,610,000,000,000\n4\tIndia\t$4,761,000,000,000
\n5\tJapan\t$4,704,000,000,000\n6\tGermany\t$3,250,000,000,000
\n7\tRussia\t$2,555,000,000,000\n8\tBrazil\t$2,394,000,000,000
\n9\tUnited Kingdom\t$2,375,000,000,000\n10\tFrance\t$2,291,000,000,000
\n11\tItaly\t$1,863,000,000,000\n12\tMexico\t$1,788,000,000,000
\n13\tKorea, South\t$1,640,000,000,000\n14\tCanada\t$1,513,000,000,000
\n15\tSpain\t$1,434,000,000,000\n16\tIndonesia\t$1,237,000,000,000
\n17\tTurkey\t$1,142,000,000,000\n18\tIran\t$1,016,000,000,000
\n19\tAustralia\t$986,700,000,000\n20\tSaudi Arabia\t$921,700,000,000
\n21\tTaiwan\t$918,300,000,000\n22\tPoland\t$814,100,000,000
\n23\tArgentina\t$755,300,000,000\n24\tNetherlands\t$718,600,000,000
\n25\tThailand\t$662,600,000,000\n26\tSouth Africa\t$592,000,000,000
\n27\tEgypt\t$548,800,000,000\n28\tPakistan\t$523,900,000,000
\n29\tColombia\t$511,100,000,000\n30\tMalaysia\t$506,700,000,000
\n31\tNigeria\t$455,500,000,000\n32\tPhilippines\t$431,300,000,000
\n33\tBelgium\t$427,200,000,000\n34\tVenezuela\t$408,500,000,000
\n35\tSweden\t$399,400,000,000\n36\tHong Kong\t$375,500,000,000
\n37\tSwitzerland\t$369,400,000,000\n38\tAustria\t$364,900,000,000
\n39\tUkraine\t$340,700,000,000\n'
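My script walks through this string, and in a for loop I want it, each time there is an n, to compare the integer after that n (x+1) with a counter. I ask it to check whether that number is the same as the counter, and if it is, I print "here's a line". Here is my script: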
DataCounter = 1
for x in data:
    if x == "n":
        if x+1 == DataCounter:
            print("new line")
            print(DataCounter)
            DataCounter = DataCounter + 1
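I know it isn't perfect, and I know my example only works for 1 through 9, but it would do the job for me (though it may be ambiguous). I'm running into trouble because it compares DataCounter, which is an int, with x+1, which is a string. What should I do? Here is the error: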
Traceback (most recent call last):
  File "<pyshell#85>", line 3, in <module>
    if x+1 == DataCounter:
TypeError: Can't convert 'int' object to str implicitly
Answer 0 (score: 0)
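\t and \n are each a single character (a tab and a newline). You can't treat them as two characters, so a loop over the string never sees a standalone "n" that came from \n.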
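To make the point concrete, here is a minimal sketch of what the loop in the question could look like once it compares against the real newline character. The `sample` string is a hypothetical two-row excerpt in the same layout as the question's data, and the names `counter` and `matches` are illustrative only. Like the original, it only handles single-digit ranks.

```python
# Hypothetical two-row sample in the same layout as the question's data.
sample = "\n1\tEuropean Union\t$15,970,000,000,000\n2\tUnited States\t$15,940,000,000,000\n"

# "\n" is one character (a newline), not a backslash followed by "n".
newline_length = len("\n")

# Iterating over a string yields one character at a time,
# so the comparison has to be against "\n", not "n".
newline_count = sum(1 for x in sample if x == "\n")

# x + 1 fails because x is a str. To inspect the *next* character,
# index into the string and convert it with int() before comparing.
counter = 1
matches = []
for i, x in enumerate(sample):
    if x == "\n" and i + 1 < len(sample) and sample[i + 1].isdigit():
        if int(sample[i + 1]) == counter:
            matches.append(counter)
            counter += 1
```

This keeps the character-by-character approach only to show where the original went wrong; splitting the string into rows is usually less fragile.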
You can convert the whole thing in one go into a list of (index, country, GDP) triples, like this:
data2 = [(int(a), b, int(c.strip('$').replace(',', '')))
for a, b, c in [s.split("\t")
for s in data.strip("\n").split("\n")]]
After that, it is much easier to loop over the list and test for whatever values you want.
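As a rough illustration of that kind of loop, the sketch below builds the triples from a hypothetical three-row sample and then runs two checks: whether the ranks count up from 1 without gaps (the counter comparison from the question), and which countries exceed an arbitrary threshold. The names `in_order` and `big` and the 15 trillion cutoff are assumptions made for the example.

```python
# Hypothetical three-row sample in the same layout as the question's data.
data = ("\n1\tEuropean Union\t$15,970,000,000,000"
        "\n2\tUnited States\t$15,940,000,000,000"
        "\n3\tChina\t$12,610,000,000,000\n")

# Same conversion as in the answer: (index, country, GDP) triples.
data2 = [(int(a), b, int(c.strip('$').replace(',', '')))
         for a, b, c in [s.split("\t")
                         for s in data.strip("\n").split("\n")]]

# Check that the ranks run 1, 2, 3, ... with no gaps.
in_order = all(rank == expected
               for expected, (rank, _, _) in enumerate(data2, start=1))

# Pick out countries above an arbitrary example threshold (15 trillion).
big = [name for _, name, gdp in data2 if gdp > 15_000_000_000_000]
```

Because each row is already a tuple of proper types, the int-versus-str comparison problem from the question does not come up.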
Answer 1 (score: 0)
import re
import decimal

def money(s, reg=re.compile(r'[^0-9.+-]+')):
    # Drop everything except digits, '.', '+' and '-' (e.g. '$' and ','), then parse.
    return decimal.Decimal(reg.sub('', s))

data = data.strip().split('\n')    # break the text into a string per country
data = [row.split('\t') for row in data]    # break each country into three strings
data = [[int(n), name, money(val)] for n, name, val in data]    # convert data types appropriately
data now looks like
[
    [1, 'European Union', Decimal('15970000000000')],
    [2, 'United States', Decimal('15940000000000')],
    [3, 'China', Decimal('12610000000000')],
    [4, 'India', Decimal('4761000000000')],
    [5, 'Japan', Decimal('4704000000000')],
    # ...
    [38, 'Austria', Decimal('364900000000')],
    [39, 'Ukraine', Decimal('340700000000')]
]
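Once the rows are in this shape, ordinary Python tools work on them directly. The following is a small sketch, assuming a hypothetical two-row sample called `raw`; the names `rows`, `gdp_by_name` and `total` are illustrative and not part of the answer above.

```python
import re
import decimal

def money(s, reg=re.compile(r'[^0-9.+-]+')):
    # Drop everything except digits, '.', '+' and '-', then parse as Decimal.
    return decimal.Decimal(reg.sub('', s))

# Hypothetical two-row sample in the same layout as the question's data.
raw = "\n1\tEuropean Union\t$15,970,000,000,000\n2\tUnited States\t$15,940,000,000,000\n"

rows = [row.split('\t') for row in raw.strip().split('\n')]
rows = [[int(n), name, money(val)] for n, name, val in rows]

# Look up a GDP by country name.
gdp_by_name = {name: gdp for _, name, gdp in rows}

# Decimal values add up exactly, without floating-point rounding.
total = sum(gdp for _, _, gdp in rows)
```

Using Decimal rather than float is a design choice that suits money amounts; for whole-dollar figures like these, plain int would work as well.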