I have a CSV file formatted as follows (only the relevant rows are shown):
Global equity - 45%/45.1%
Private Investments - 25%/21%
Hedge Funds - 17.5%/18.1%
Bonds & cash - 12.5%/15.3%
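I wrote a regular expression to find every occurrence of the numbers (i.e. 45%/45.1%, etc.), and I am trying to write it so that it keeps only the number after the slash. This is what I have: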
import csv
import re

with open('sheet.csv', newline='') as f:
    rdr = csv.DictReader(f, delimiter=',')
    row1 = next(rdr)
    assets = str(row1['Asset Allocation '])
    finnum = re.sub(r'(\/[0-9]+.)', '#This is where I want to replace with just the numbers after the slash', assets)
    print(finnum)
Desired output:
Global equity - 45.1%
Private Investments - 21%
etc...
Is this possible if I don't know the index of the number I want?
Answer 0 (score: 2)
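You can try this regex (r'\d+(?:\.\d+)?%/') to strip out the unwanted part.
import re
string = 'Global equity - 45%/45.1%'
# The optional (?:\.\d+)? also covers a decimal before the slash, e.g. 17.5%/
re.sub(r'\d+(?:\.\d+)?%/', '', string)  # 'Global equity - 45.1%'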
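To check a pattern like this against every sample row, including the ones with a decimal before the slash (17.5%/18.1%), here is a minimal sketch. The rows are copied from the question; the helper name keep_after_slash is hypothetical, and the pattern is a decimal-aware variant rather than necessarily the exact one above.

```python
import re

# Rows copied from the question.
rows = [
    'Global equity - 45%/45.1%',
    'Private Investments - 25%/21%',
    'Hedge Funds - 17.5%/18.1%',
    'Bonds & cash - 12.5%/15.3%',
]

# Digits, an optional decimal part, then the literal "%/".
before_slash = re.compile(r'\d+(?:\.\d+)?%/')


def keep_after_slash(line):
    """Drop the percentage before the slash (hypothetical helper)."""
    return before_slash.sub('', line)


cleaned = [keep_after_slash(row) for row in rows]
for line in cleaned:
    print(line)
```

A plain \d+ stops at the decimal point, so on 17.5%/18.1% it would only remove "5%/"; the optional decimal group avoids that.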
Answer 1 (score: 2)
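If you are looking for that pattern specifically, you can use a group-based replacement and concatenate the pieces:
import re
replace = lambda s: s.group(1) + ' ' + s.group(3)
# [\d.]+ rather than \d+, so that decimals such as 17.5 are matched too
re.sub(r'(.*) ([\d.]+%/)([\d.]+%)', replace, 'Hedge Funds - 17.5%/18.1%')  # 'Hedge Funds - 18.1%'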
Otherwise, just remove the part you don't want:
val = 'Hedge Funds - 17.5%/18.1%'
re.sub(r'[\d.]+%/', '', val)  # 'Hedge Funds - 18.1%'
Or, if you would rather not use a regex:
val = 'Hedge Funds - 17.5%/18.1%'
replaced = val[0:val.find(' - ')] + ' - ' + val[val.find('%/') + 2:]
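The same non-regex idea can also be written with str.partition and str.rpartition, which avoids the index arithmetic. This is only a sketch: the helper name keep_after_slash is hypothetical, and it assumes the label itself contains no ' - ' and that the value you want follows the last slash.

```python
def keep_after_slash(val):
    """Keep the label and the value after the slash (hypothetical helper)."""
    label, sep, values = val.partition(' - ')
    if not sep:
        # No ' - ' separator found: leave the line untouched.
        return val
    # rpartition splits on the last '/'; with no '/', `after` is all of `values`.
    _, _, after = values.rpartition('/')
    return label + sep + after


print(keep_after_slash('Hedge Funds - 17.5%/18.1%'))
```

Unlike the find-based slice, this leaves lines without a separator or a slash unchanged instead of silently cutting them at index -1.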
Answer 2 (score: 2)
If you don't want to replace anything and instead need to use the values in other parts of your code, you can:
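import re
cleanup = re.compile(r"(^.+?)-\s.+?\/(.+?)$", re.MULTILINE)
# file_name is the path to your CSV file
with open(file_name, 'r') as f:
    text = f.read()
for match in cleanup.finditer(text):
    # group(1) is the label, group(2) the value after the slash
    print(match.group(1), match.group(2))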
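If the goal is to use the values elsewhere, it may help to collect them into a dict of floats. The following is a rough sketch, not the answer's own code: the inline text stands in for the file contents, the names pattern and allocations are made up for illustration, and the pattern assumes every row looks like "label - X%/Y%".

```python
import re

# Sample text standing in for the file contents (rows from the question).
text = """Global equity - 45%/45.1%
Private Investments - 25%/21%
Hedge Funds - 17.5%/18.1%
Bonds & cash - 12.5%/15.3%"""

# Label, " - ", a percentage we skip, "/", then the number we keep.
pattern = re.compile(r'^(.+?)\s+-\s+[\d.]+%/([\d.]+)%$', re.MULTILINE)

allocations = {}
for match in pattern.finditer(text):
    # Map each label to the number after the slash, as a float.
    allocations[match.group(1)] = float(match.group(2))

print(allocations)
```

Rows that do not fit the assumed shape are simply skipped by finditer, so it is worth comparing len(allocations) with the number of rows you expect.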
Answer 3 (score: 1)
You can also group what comes before the first number and what comes after the slash (/):
import re
s = 'Hedge Funds - 17.5%/18.1%'
print(re.sub(r'(.*-) .*/(.*)', r'\g<1> \g<2>', s))
Output:
Hedge Funds - 18.1%
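To tie this back to the code in the question, here is a minimal sketch that fills in the replacement argument with a backreference and runs it over a CSV column. The layout of the real sheet.csv is unknown, so the in-memory sample below is an assumption: a single column whose header is 'Asset Allocation ' (with the trailing space used in the question). The names sample, pair and results are hypothetical.

```python
import csv
import io
import re

# Stand-in for sheet.csv; the real file's other columns are unknown.
# The header keeps the trailing space used in the question's code.
sample = io.StringIO(
    'Asset Allocation \n'
    'Global equity - 45%/45.1%\n'
    'Private Investments - 25%/21%\n'
    'Hedge Funds - 17.5%/18.1%\n'
    'Bonds & cash - 12.5%/15.3%\n'
)

# Group 1 captures the percentage after the slash; \1 puts it back.
pair = re.compile(r'[\d.]+%/([\d.]+%)')

results = [pair.sub(r'\1', row['Asset Allocation '])
           for row in csv.DictReader(sample)]
for line in results:
    print(line)
```

Because the replacement refers to the captured group rather than a position in the string, you do not need to know the index of the number you want.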