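Given an ISO 639-2/T language code whose scope is "individual", how can I programmatically look up the matching macrolanguage code, if one exists?
For example, how do I get from "nob" (Norwegian Bokmål, scope individual) to "nor" (Norwegian, scope macrolanguage)?
In general, a single country may have several individual languages that do not belong to the same macrolanguage, so simply grouping by country would produce false positives.
java.util.Locale knows the ISO 639 three-letter language codes and recognizes both codes in the example above, but it has no concept of scope or of macrolanguages.
A heuristic would also help in my case, as long as it produces no false positives.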
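Answer 0 (score: 1)
You can build your own table that maps each individual language to its macrolanguage.
Here is the selection I put together some time ago: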
public static final Map<String, String> macroLanguages = new HashMap<>();
static {
    macroLanguages.put("aao", "ara"); // https://iso639-3.sil.org/code/ara
    macroLanguages.put("abh", "ara");
    macroLanguages.put("abv", "ara");
    macroLanguages.put("acm", "ara");
    macroLanguages.put("acq", "ara");
    macroLanguages.put("acw", "ara");
    macroLanguages.put("acx", "ara");
    macroLanguages.put("acy", "ara");
    macroLanguages.put("adf", "ara");
    macroLanguages.put("aeb", "ara");
    macroLanguages.put("aec", "ara");
    macroLanguages.put("afb", "ara");
    macroLanguages.put("ajp", "ara");
    macroLanguages.put("apc", "ara");
    macroLanguages.put("apd", "ara");
    macroLanguages.put("arb", "ara");
    macroLanguages.put("arq", "ara");
    macroLanguages.put("ars", "ara");
    macroLanguages.put("ary", "ara");
    macroLanguages.put("arz", "ara");
    macroLanguages.put("auz", "ara");
    macroLanguages.put("avl", "ara");
    macroLanguages.put("ayh", "ara");
    macroLanguages.put("ayl", "ara");
    macroLanguages.put("ayn", "ara");
    macroLanguages.put("ayp", "ara");
    macroLanguages.put("bbz", "ara");
    macroLanguages.put("pga", "ara");
    macroLanguages.put("shu", "ara");
    macroLanguages.put("ssh", "ara");
    macroLanguages.put("ekk", "est"); // https://iso639-3.sil.org/code/est
    macroLanguages.put("vro", "est");
    macroLanguages.put("bos", "hbs"); // https://iso639-3.sil.org/code/hbs
    macroLanguages.put("hrv", "hbs");
    macroLanguages.put("srp", "hbs");
    macroLanguages.put("cnr", "hbs");
    macroLanguages.put("ltg", "lav"); // https://iso639-3.sil.org/code/lav
    macroLanguages.put("lvs", "lav");
    macroLanguages.put("nno", "nor"); // https://iso639-3.sil.org/code/nor
    macroLanguages.put("nob", "nor");
    macroLanguages.put("aae", "sqi"); // https://iso639-3.sil.org/code/sqi
    macroLanguages.put("aat", "sqi");
    macroLanguages.put("aln", "sqi");
    macroLanguages.put("als", "sqi");
    macroLanguages.put("ydd", "yid"); // https://iso639-3.sil.org/code/yid
    macroLanguages.put("yih", "yid");
    macroLanguages.put("ccx", "zha"); // https://iso639-3.sil.org/code/zha
    macroLanguages.put("ccy", "zha");
    macroLanguages.put("zch", "zha");
    macroLanguages.put("zeh", "zha");
    macroLanguages.put("zgb", "zha");
    macroLanguages.put("zgm", "zha");
    macroLanguages.put("zgn", "zha");
    macroLanguages.put("zhd", "zha");
    macroLanguages.put("zhn", "zha");
    macroLanguages.put("zlj", "zha");
    macroLanguages.put("zln", "zha");
    macroLanguages.put("zlq", "zha");
    macroLanguages.put("zqe", "zha");
    macroLanguages.put("zyb", "zha");
    macroLanguages.put("zyg", "zha");
    macroLanguages.put("zyj", "zha");
    macroLanguages.put("zyn", "zha");
    macroLanguages.put("zzj", "zha");
    macroLanguages.put("cdo", "zho"); // https://iso639-3.sil.org/code/zho
    macroLanguages.put("cjy", "zho");
    macroLanguages.put("cmn", "zho");
    macroLanguages.put("cpx", "zho");
    macroLanguages.put("czh", "zho");
    macroLanguages.put("czo", "zho");
    macroLanguages.put("gan", "zho");
    macroLanguages.put("hak", "zho");
    macroLanguages.put("hsn", "zho");
    macroLanguages.put("lzh", "zho");
    macroLanguages.put("mnp", "zho");
    macroLanguages.put("nan", "zho");
    macroLanguages.put("wuu", "zho");
    macroLanguages.put("yue", "zho");
    macroLanguages.put("cnp", "zho");
    macroLanguages.put("csp", "zho");
    macroLanguages.put("pes", "fas"); // https://iso639-3.sil.org/code/fas
    macroLanguages.put("prs", "fas");
}
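A lookup on top of such a table could look roughly like the sketch below. The class name `MacroLanguageLookup` and the method `toMacroLanguage` are made up for illustration, only a handful of entries from the list above are repeated to keep it short, and `Map.of` assumes Java 9 or later.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class MacroLanguageLookup {

    // Excerpt of the individual -> macrolanguage table above (illustration only).
    private static final Map<String, String> MACRO_LANGUAGES = Map.of(
            "nob", "nor",
            "nno", "nor",
            "cmn", "zho",
            "yue", "zho",
            "ekk", "est");

    /**
     * Returns the macrolanguage code for an individual language code,
     * or an empty Optional if the code is not in the table.
     */
    public static Optional<String> toMacroLanguage(String iso639Code) {
        if (iso639Code == null) {
            return Optional.empty();
        }
        // Normalize case so that "NOB" and "nob" are treated the same.
        String key = iso639Code.toLowerCase(Locale.ROOT);
        return Optional.ofNullable(MACRO_LANGUAGES.get(key));
    }

    public static void main(String[] args) {
        System.out.println(toMacroLanguage("nob").orElse("(none)")); // prints "nor"
        System.out.println(toMacroLanguage("deu").orElse("(none)")); // prints "(none)"
    }
}
```

Because a code only resolves when it is explicitly listed, a miss returns an empty result instead of a guess, so this approach should not produce false positives; the price is that the table has to be kept up to date by hand.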