Converting logical data dictionary names to physical (abbreviated) names

Time: 2015-09-21 03:17:30

Tags: python dictionary

I need to convert a logical data dictionary into a physical (abbreviated) data dictionary. I have given four use cases below.

I need help with the following pseudocode/requirements.

So far, I have worked out this snippet for matching the single largest possible expression:

# empty dict declaration
refDict = {}

# to catch and report on any 'not-found' dictionary words to replace
noMatchFound = {}

# read the dictionary from a comma-delimited file
# with open('dictionary.csv') as inputDict:
#     for line in inputDict:
#         busTerm, busAbbr = line.split(',')
#         refDict[busTerm] = busAbbr.replace("\n","")

# sample data dictionary entries
refDict = {
         'user': 'USR',
         'call': 'CALL', 
         'detail': 'DTL', 
         'record': 'REC', 
         'call detail record': 'CDR',
         'count': 'CNT'}


input_string1="user call detail record"
# output should be "USR_CDR"
# noMatchFound - will be empty - since all are matched and replaced

input_string2="user test call detail record"
# output should be "USR_TEST_CDR"
# noMatchFound - should have an entry "TEST" with a reference to "user test call detail record"

input_string3="user call count detail record"
# output should be "USR_CALL_CNT_DTL_REC"
# noMatchFound - will be empty - since all are matched and replaced

input_string4="user call detail record count"
# output should be "USR_CDR_CNT"
# noMatchFound - will be empty - since all are matched and replaced

1 answer:

Answer 0 (score: 0)

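The idea is to call replace() starting with the longest string in the dictionary, then try every other possible replacement from the dictionary, going from longer to shorter.

This produces the expected output for the four sample inputs: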

refDict = {
         'user': 'USR',
         'call': 'CALL',
         'detail': 'DTL',
         'record': 'REC',
         'call detail record': 'CDR',
         'count': 'CNT'}

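# longest keys first, so 'call detail record' is tried before 'call', 'detail' and 'record'
sorted_ref = sorted(refDict.items(), key=lambda x: len(x[0]), reverse=True)

def do_work(input_string):
    noMatchFound = {}
    rval = input_string
    # replace every dictionary term with its abbreviation, longest term first
    for key, value in sorted_ref:
        rval = rval.replace(key, value)
    # abbreviations are upper case, so any word still in lower case was not matched
    not_founds = [x for x in rval.split() if x.islower()]
    for not_found in not_founds:
        noMatchFound[not_found] = input_string
        rval = rval.replace(not_found, not_found.upper())
    # split() also collapses repeated spaces before joining with underscores
    rval = '_'.join(rval.split())
    return rval, noMatchFound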

inputs = ["user call detail record", "user test call detail record",
          "user call count detail record","user call detail record count"]

for inp in inputs:
    print(inp)
    output, noMatchFound = do_work(inp)
    print(output)
    print(noMatchFound)
    print('---')

Output:

user call detail record
USR_CDR
{}
---
user test call detail record
USR_TEST_CDR
{'test': 'user test call detail record'}
---
user call count detail record
USR_CALL_CNT_DTL_REC
{}
---
user call detail record count
USR_CDR_CNT
{}
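
One caveat: replace() works on substrings, so a word that merely contains a dictionary term (for example "recall", which contains "call") would be partly rewritten. If that matters for your data, a possible variant is to match whole words instead. The sketch below is not the approach above but a greedy, left-to-right longest match over word sequences; the names abbreviate and REF_DICT are made up for illustration, and it returns the unmatched words as a list rather than a dict.

```python
# Same sample entries as in the question (REF_DICT is an illustrative name).
REF_DICT = {
    'user': 'USR',
    'call': 'CALL',
    'detail': 'DTL',
    'record': 'REC',
    'call detail record': 'CDR',
    'count': 'CNT',
}


def abbreviate(phrase, ref_dict):
    """Greedy longest match on whole words; returns (physical_name, unmatched_words)."""
    words = phrase.split()
    # Split each dictionary key into a tuple of words so matching is word-aligned.
    keys = {tuple(k.split()): v for k, v in ref_dict.items()}
    max_len = max((len(k) for k in keys), default=0)
    out, missing = [], []
    i = 0
    while i < len(words):
        # Try the widest window first, shrinking down to a single word.
        for size in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + size])
            if chunk in keys:
                out.append(keys[chunk])
                i += size
                break
        else:
            # No dictionary entry starts at this word: keep it, upper-cased.
            missing.append(words[i])
            out.append(words[i].upper())
            i += 1
    return '_'.join(out), missing


for text in ["user test call detail record", "recall user"]:
    print(abbreviate(text, REF_DICT))
```

Because the phrase is split into words first, repeated spaces in the input do not affect matching. Note that greedy left-to-right matching can choose differently from a global longest-first replacement when multi-word terms overlap, so check it against your real dictionary before relying on it.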