Question

I have a dictionary that contains keys such as

"Car"
"Card Payment"

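I have a string description, for example "Card payment to tesco", and I want to find the item in the dictionary whose key corresponds to that string.

I have tried this: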

var category = dictionary.SingleOrDefault(p => description.ToLowerInvariant().Contains(p.Key)).Value;

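Currently this matches both "Car" and "Card Payment" in the dictionary, and my code blows up on SingleOrDefault.

How can I achieve what I want? I considered prefixing and suffixing the keys with spaces, but then I would have to do the same to the description. I think that would work, but it feels a bit dirty. Is there a better way? I'm not opposed to changing the Dictionary to some other type, as long as performance isn't hurt too much.

Required result for the example above: only "Card Payment" is returned.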

Answer 1

You can try LINQ's Where, followed by OrderByDescending and Take, to find the key that matches best (the longest matching key).

var category = dictionary
               .Where(p => description.ToLowerInvariant().Contains(p.Key.ToLowerInvariant()))
               .OrderByDescending(x => x.Key.Length)
               .Take(1);

c# online

I would use a List<string> to hold your keys, because nothing here requires a key-and-value collection.

List<string> keys = new List<string>();
keys.Add("Car");
keys.Add("Card Payment");

string description = "Card payment to tesco";

var category = keys
        .Where(p => description.ToLowerInvariant().Contains(p.ToLowerInvariant()))
        .OrderByDescending(x => x.Length)
        .Take(1)
        .FirstOrDefault();

Note

Ordering by key length in descending order ensures that the longest matching key, which is the most specific match, comes first.
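
If the values are still needed, the same idea can be applied to the dictionary itself. The following is only a sketch: the category values ("Transport", "Shopping") are made up for illustration, and IndexOf with StringComparison.OrdinalIgnoreCase is swapped in for the ToLowerInvariant calls so that no lower-cased copies of the strings are created.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; the category values are invented for this sketch.
var dictionary = new Dictionary<string, string>
{
    ["Car"] = "Transport",
    ["Card Payment"] = "Shopping"
};

string description = "Card payment to tesco";

// Keep only the keys that occur in the description (case-insensitive),
// then take the longest one. FirstOrDefault returns a default pair
// (Key == null) when nothing matches, instead of throwing.
var match = dictionary
    .Where(p => description.IndexOf(p.Key, StringComparison.OrdinalIgnoreCase) >= 0)
    .OrderByDescending(p => p.Key.Length)
    .FirstOrDefault();

Console.WriteLine(match.Key ?? "(no match)"); // prints "Card Payment"
```

This is still a substring test, so a description that contains only "Card" would match the key "Car".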

Answer 2

Here I use a List<string> of keys and a regular expression to find the desired key.
Try it out.

System.Text.RegularExpressions

See it here on .NET Fiddle
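
The code for this answer is not included in the text above, so the following is only a guess at what a regex-based lookup might look like, not the answerer's actual code. It assumes the intent is to match keys on word boundaries; the variable names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var keys = new List<string> { "Car", "Card Payment" };
string description = "Card payment to tesco";

// \b anchors the key at word boundaries, so "Car" no longer matches inside "Card".
// Regex.Escape protects against keys that contain regex metacharacters.
var category = keys.FirstOrDefault(k =>
    Regex.IsMatch(description, @"\b" + Regex.Escape(k) + @"\b", RegexOptions.IgnoreCase));

Console.WriteLine(category); // prints "Card Payment"
```

With word boundaries, the key "Car" is rejected outright, so no ordering by length is needed for this example.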

Answer 3

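You are misusing the dictionary. By scanning the keys you get no performance benefit from a dictionary. Worse, a simple list would be faster in this case. A dictionary provides constant-time access (O(1)) only if you look up a value by its key.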

if (dictionary.TryGetValue(key, out var value)) { ...

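To take advantage of this, you will need a more subtle approach. The main difficulty is that a key can sometimes consist of more than one word. I therefore suggest a two-level approach: the first level stores single-word keys, and the second level stores the composite keys together with their values.

Example: key-value pairs to store: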

["car"]: categoryA
["card payment"]: categoryB
["payment"]: categoryC

We build the dictionary as

// TValue stands for whatever type the category values have.
var dictionary = new Dictionary<string, List<KeyValuePair<string, TValue>>> {
    ["car"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("car", categoryA)
    },
    ["card"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("card payment", categoryB)
    },
    ["payment"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("card payment", categoryB),
        new KeyValuePair<string, TValue>("payment", categoryC)
    }
};

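Of course, in practice we would use an algorithm to do this; the point here is to show the structure. As you can see, the third entry, for the primary key "payment", contains two entries: one for "card payment" and one for "payment".

The algorithm for adding a value goes like this:

1. Split the key into single words.
2. For each word, create a dictionary entry that uses the word as the primary key, and store the key-value pair in a list as the dictionary value. The second key is the original key, which may consist of several words.

As you can imagine, step 2 requires you to test whether an entry with the same primary key already exists. If it does, add the new entry to the existing list. Otherwise, create a new list with a single entry and insert it into the dictionary.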
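
The two steps above might be sketched as follows. The helper name AddEntry, the variable name index, and the use of plain strings as category values are assumptions made for illustration; keys are lower-cased on insertion so that lookups can be case-insensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var index = new Dictionary<string, List<KeyValuePair<string, string>>>();

AddEntry(index, "car", "categoryA");
AddEntry(index, "card payment", "categoryB");
AddEntry(index, "payment", "categoryC");

// Show which composite keys ended up under each single-word primary key.
foreach (var entry in index)
    Console.WriteLine($"{entry.Key}: {string.Join(", ", entry.Value.Select(p => p.Key))}");

// Hypothetical helper: registers one key-value pair under every word of its key.
void AddEntry<TValue>(
    Dictionary<string, List<KeyValuePair<string, TValue>>> lookup, string key, TValue value)
{
    string normalizedKey = key.ToLowerInvariant();

    // Step 1: split the key into single words.
    foreach (var word in normalizedKey.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
    {
        // Step 2: reuse the list for this word if it exists, otherwise create it.
        if (!lookup.TryGetValue(word, out var list))
        {
            list = new List<KeyValuePair<string, TValue>>();
            lookup[word] = list;
        }
        list.Add(new KeyValuePair<string, TValue>(normalizedKey, value));
    }
}
```

After these three calls, the list stored under "payment" holds two pairs and the lists under "car" and "card" hold one each, which matches the structure shown earlier.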

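Retrieve an entry like this:

1. Split the search key (the description) into single words.
2. For each word, retrieve the existing dictionary entries with a true, and therefore fast, dictionary lookup (!) into a List<List<KeyValuePair<string, TValue>>>.
3. Use SelectMany to flatten the list of lists into a single List<KeyValuePair<string, TValue>>.
4. Sort by key length in descending order and test whether the description contains the key. The first entry found is the result.

You can also combine steps 2 and 3 and add the list entries of each dictionary entry directly to the main list.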
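
A minimal sketch of the retrieval, using the combined variant of steps 2 and 3. The function name Find and the string category values are assumptions for illustration, and the index is written out by hand to mirror the example structure shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Two-level index from the example, with strings as placeholder category values.
var index = new Dictionary<string, List<KeyValuePair<string, string>>>
{
    ["car"] = new List<KeyValuePair<string, string>> {
        new KeyValuePair<string, string>("car", "categoryA") },
    ["card"] = new List<KeyValuePair<string, string>> {
        new KeyValuePair<string, string>("card payment", "categoryB") },
    ["payment"] = new List<KeyValuePair<string, string>> {
        new KeyValuePair<string, string>("card payment", "categoryB"),
        new KeyValuePair<string, string>("payment", "categoryC") }
};

var hit = Find(index, "Card payment to tesco");
Console.WriteLine(hit.HasValue ? hit.Value.Value : "(no match)"); // prints "categoryB"

// Hypothetical helper: returns the longest composite key found in the description, or null.
KeyValuePair<string, TValue>? Find<TValue>(
    Dictionary<string, List<KeyValuePair<string, TValue>>> lookup, string description)
{
    string text = description.ToLowerInvariant();
    var candidates = new List<KeyValuePair<string, TValue>>();

    // Step 1: split the description into single words.
    foreach (var word in text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Distinct())
    {
        // Steps 2 and 3 combined: a true dictionary lookup per word,
        // with the hits appended directly to one flat list.
        if (lookup.TryGetValue(word, out var list))
            candidates.AddRange(list);
    }

    // Step 4: the longest key that the description contains wins.
    foreach (var pair in candidates.OrderByDescending(p => p.Key.Length))
    {
        if (text.Contains(pair.Key))
            return pair;
    }
    return null;
}
```

Because only whole words of the description are looked up, the key "car" never becomes a candidate for this description; the word "card" leads only to "card payment".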
