Converting decimal numbers to Roman numerals

Time: 2017-06-12 20:10:48

Tags: python regex pandas dictionary replace

d_hsp={"1":"I","2":"II","3":"III","4":"IV","5":"V","6":"VI","7":"VII","8":"VIII",
       "9":"IX","10":"X","11":"XI","12":"XII","13":"XIII","14":"XIV","15":"XV",
       "16":"XVI","17":"XVII","18":"XVIII","19":"XIX","20":"XX","21":"XXI",
       "22":"XXII","23":"XXIII","24":"XXIV","25":"XXV"}
HSP_OLD['tryl'] = HSP_OLD['tryl'].replace(d_hsp, regex=True)

HSP_OLD is a DataFrame and tryl is one of its columns. Here are some example values from tryl:

SAF/HSP: Secondary diagnosis E code 1

SAF/HSP: Secondary diagnosis E code 11

I am using a dictionary replacement. It works for 1-10, but 11 becomes "II" and 12 becomes "III".

2 answers:

Answer 0: (score: 3)

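Sorry, I had not noticed that you are not just updating a field but actually want to replace a number at the end of the string. Even so, converting the number to a Roman numeral properly is better than mapping every value that might occur (what happens to your code if a number is greater than 25?). So, here is one way to do it: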

ROMAN_MAP = [(1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'), (100, 'C'), (90, 'XC'),
             (50, 'L'), (40, 'XL'), (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I')]

def romanize(data):
    if not data or not isinstance(data, str):  # we know how to work with strings only
        return data
    data = data.rstrip()  # remove potential extra whitespace at the end
    space_pos = data.rfind(" ")  # find the last space before the number
    if space_pos != -1:
        try:
            number = int(data[space_pos + 1:])  # get the number at the end
            roman_number = ""
            for i, r in ROMAN_MAP:  # loop-reduce substitution based on the ROMAN_MAP
                while number >= i:
                    roman_number += r
                    number -= i
            return data[:space_pos + 1] + roman_number  # put everything back together
        except (TypeError, ValueError):
            pass  # couldn't extract a number
    return data

Now, if we create the DataFrame as:

import pandas as pd

HSP_OLD = pd.DataFrame({"tryl": ["SAF/HSP: Secondary diagnosis E code 1",
                                 None,
                                 "SAF/HSP: Secondary diagnosis E code 11",
                                 "Something else without a number at the end"]})

We can easily apply our function to the whole column:

HSP_OLD['tryl'] = HSP_OLD['tryl'].apply(romanize)

The result is:

                                         tryl
0       SAF/HSP: Secondary diagnosis E code I
1                                        None
2      SAF/HSP: Secondary diagnosis E code XI
3  Something else without a number at the end

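Of course, you can adapt the romanize() function to your needs, for example to search for any number in the string and convert it to a Roman numeral. This is just an example of how to quickly find a number at the end of a string.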
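One possible adaptation along those lines is sketched below. It is not part of the original answer: the names ROMAN_PAIRS, to_roman and romanize_all are hypothetical, and the sketch assumes that only standalone runs of digits (delimited by word boundaries) should be converted and that 0 should be left alone, since Roman numerals have no zero.

```python
import re

# Same value/symbol pairs as ROMAN_MAP above, repeated so this sketch runs on its own.
ROMAN_PAIRS = [(1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'), (100, 'C'), (90, 'XC'),
               (50, 'L'), (40, 'XL'), (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I')]

def to_roman(number):
    """Convert a positive integer to a Roman numeral by greedy subtraction."""
    result = ""
    for value, symbol in ROMAN_PAIRS:
        while number >= value:
            result += symbol
            number -= value
    return result

def romanize_all(text):
    """Replace every standalone run of digits in text with its Roman numeral."""
    if not isinstance(text, str):  # leave None / NaN untouched, like romanize()
        return text

    def convert(match):
        number = int(match.group())
        # Roman numerals have no zero, so a literal 0 is returned unchanged.
        return to_roman(number) if number > 0 else match.group()

    # \b...\b keeps digits embedded in a word (e.g. "E11x") from being converted.
    return re.sub(r"\b\d+\b", convert, text)

print(romanize_all("code 4 and code 19"))  # code IV and code XIX
```

Under the same assumptions, it could be applied to the column just like romanize(), with HSP_OLD['tryl'].apply(romanize_all).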

Answer 1: (score: 2)

You need to preserve the order of the items and start the search with the longest substrings.

You can use an ordered mapping here, such as collections.OrderedDict. To initialize it, use a list of tuples. You can reverse the list when initializing, but you can also do that later.
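A minimal sketch of why the order matters, using plain string replacement rather than pandas so that it runs on its own. The shortened mapping pairs and the helper replace_in_order are hypothetical, and the sketch assumes the substitutions are applied one after another in the mapping's order.

```python
from collections import OrderedDict

# Hypothetical shortened mapping, listed in ascending order as in the question.
pairs = [("1", "I"), ("2", "II"), ("10", "X"), ("11", "XI"), ("12", "XII"), ("21", "XXI")]

def replace_in_order(text, mapping):
    """Apply each key -> value substitution in the mapping's iteration order."""
    for old, new in mapping.items():
        text = text.replace(old, new)
    return text

ascending = OrderedDict(pairs)
# Reversing an ascending list puts the two-digit keys before the one-digit keys,
# so "11" is tried before "1".
longest_first = OrderedDict(reversed(pairs))

print(replace_in_order("E code 11", ascending))      # E code II
print(replace_in_order("E code 11", longest_first))  # E code XI
```

With the ascending order, "1" is replaced first, so each digit of "11" turns into "I". With the reversed order, "11" is consumed before "1" is ever tried. Plain substring replacement still rewrites digits that sit inside larger numbers outside the mapping, so this approach only suits inputs known to stay within the mapped range.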
