I have the following DataFrame:
IP Service Status CPU Memory
0 10.58.1.73 service: StorageService null cpu: 22% memory: 11%
0 10.58.1.99 service: StorageService null cpu: 25% memory: 37%
0 10.58.1.114 service: StorageService null cpu: 39% memory: 2%
0 10.58.1.82 service: StorageService null cpu: 50% memory: 96%
0 10.58.1.53 service: StorageService null cpu: 29% memory: 36%
0 10.58.1.1 service: StorageService null cpu: 54% memory: 6%
0 10.58.1.15 service: StorageService null cpu: 28% memory: 30%
0 10.58.1.4 service: StorageService null cpu: 5% memory: 48%
0 10.58.1.69 service: StorageService null cpu: 21% memory: 57%
0 10.58.1.5 service: StorageService null cpu: 4% memory: 2%
0 10.58.1.136 service: StorageService null cpu: 98% memory: 74%
0 10.58.1.43 service: StorageService null cpu: 36% memory: 23%
0 10.58.1.6 service: StorageService null cpu: 61% memory: 25%
0 10.58.1.137 service: StorageService null cpu: 76% memory: 66%
0 10.58.1.83 service: StorageService null cpu: 92% memory: 35%
0 10.58.1.39 service: StorageService null cpu: 35% memory: 17%
I need to extract the number in the CPU column as a string. I tried this command:
cpu = df2.CPU.str.extract(r'([\d]+))', expand=False)
But I think my regex is off. What is the best way to solve this?
Answer 0 (score: 2)
Since every value shares the cpu: prefix, a simple replace, followed by slicing off the trailing %, does the job:
df2.CPU.str.replace('cpu: ', '').str[:-1]
Or, even simpler, plain slicing:
df2.CPU.str[5:-1]
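As a rough sketch, both variants can be tried on a small sample. The three-row frame below is a hypothetical excerpt of the question's data, and the code assumes every value follows the "cpu: NN%" layout exactly; slicing in particular would silently give wrong results if the prefix length ever changed.

```python
import pandas as pd

# Hypothetical three-row excerpt of the question's CPU column.
df2 = pd.DataFrame({"CPU": ["cpu: 22%", "cpu: 5%", "cpu: 98%"]})

# Variant 1: remove the literal prefix, then drop the trailing '%'.
by_replace = df2.CPU.str.replace("cpu: ", "", regex=False).str[:-1]

# Variant 2: slice off the 5-character prefix and the trailing '%'.
by_slice = df2.CPU.str[5:-1]

print(by_replace.tolist())
print(by_slice.tolist())
```

Both results are Series of strings, which is what the question asks for.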
Answer 1 (score: 1)
The error message tells you exactly where the subtle mistake in your regex is: there is an extra closing parenthesis:
re.error: unbalanced parenthesis at position 7
df.CPU.str.extract(r'([\d]+))', expand=False)
^
You meant to write:
df.CPU.str.extract(r'([\d]+)', expand=False)
which works fine.
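The same diagnosis can be reproduced without pandas. This is only an illustrative sketch using the standard-library re module; the sample string "cpu: 22%" is taken from the question's data, and the variable names are made up for the example.

```python
import re

bad = r"([\d]+))"   # pattern from the question, with the stray ')'
good = r"([\d]+)"   # corrected pattern

# Compiling the broken pattern raises re.error; .pos is the offending index.
try:
    re.compile(bad)
    error_pos = None
except re.error as exc:
    error_pos = exc.pos

# The corrected pattern captures the run of digits.
match = re.search(good, "cpu: 22%")
digits = match.group(1)

print(error_pos, digits)
```

Since the character class holds a single item, r'(\d+)' is an equivalent and slightly shorter way to write the corrected pattern.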
Answer 2 (score: 0)
You can get around regex entirely here. I suggest:
df2.CPU.str.split(' ').str[1]
This splits each string at the space character and selects the second element, which is the percentage (including the % sign).
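A small sketch of the split approach, again on a hypothetical three-row excerpt of the question's data. The rstrip and astype steps are optional additions, not part of the answer above; they are shown in case the bare digits or an integer column is wanted.

```python
import pandas as pd

# Hypothetical three-row excerpt of the question's CPU column.
df2 = pd.DataFrame({"CPU": ["cpu: 22%", "cpu: 5%", "cpu: 98%"]})

# Split on the space and keep the second piece; the '%' is still attached.
with_percent = df2.CPU.str.split(" ").str[1]

# Optional: strip the trailing '%' to keep only the digits (still strings).
digits_only = with_percent.str.rstrip("%")

# Optional: convert to integers if numeric comparisons are needed later.
as_int = digits_only.astype(int)

print(with_percent.tolist())
print(digits_only.tolist())
print(as_int.tolist())
```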