with open('currencies.txt') as f:
    content = f.read()
print(content)
str1 = content
print('str1: ', str1)
>>> str1: Albanian Lek (L) | ALL | 3526\nAlgerian Dinar (\u062f.\u062c) | DZD | 3537\nArgentine Peso ($) | ARS | 2821\nArmenian Dram (\u058f) | AMD | 3527\nAustralian Dollar ($) | AUD | 2782\nAzerbaijani Manat (\u20bc) | AZN | 3528\nBahraini Dinar (.\u062f.\u0628) | BHD | 3531\nBangladeshi Taka (\u09f3) | BDT | 3530\nBelarusian Ruble (Br) | BYN | 3533\nBermudan Dollar ($) | BMD | 3532\nBolivian Boliviano (Bs.) | BOB | 2832\nBosnia-Herzegovina Convertible Mark (KM) | BAM | 3529\nBrazilian Real (R$) | BRL | 2783\nBulgarian Lev (\u043b\u0432) | BGN | 2814\nCambodian Riel (\u17db) | KHR | 3549\nCanadian Dollar ($) | CAD | 2784\nChilean Peso ($) | CLP | 2786\nChinese Yuan (\u00a5) | CNY | 2787\nColombian Peso ($) | COP | 2820\nCosta Rican Col\u00f3n (\u20a1) | CRC | 3534\nCroatian Kuna (kn) | HRK | 2815\nCuban Peso ($) | CUP | 3535\nCzech Koruna (K\u010d) | CZK | 2788\nDanish Krone (kr) | DKK | 2789\nDominican Peso ($) | DOP | 3536\nEgyptian Pound (\u00a3) | EGP | 3538\nEuro (\u20ac) | EUR | 2790\nGeorgian Lari (\u20be) | GEL | 3539\nGhanaian Cedi (\u20b5) | GHS | 3540\nGuatemalan Quetzal (Q) | GTQ | 3541\nHonduran Lempira (L) | HNL | 3542\nHong Kong Dollar ($) | HKD | 2792\nHungarian Forint (Ft) | HUF | 2793\nIcelandic Kr\u00f3na (kr) | ISK | 2818\nIndian Rupee (\u20b9) | INR | 2796\nIndonesian Rupiah (Rp) | IDR | 2794\nIranian Rial (\ufdfc) | IRR | 3544\nIraqi Dinar (\u0639.\u062f) | IQD | 3543\nIsraeli New Shekel (\u20aa) | ILS | 2795\nJamaican Dollar ($) | JMD | 3545\nJapanese Yen (\u00a5) | JPY | 2797\nJordanian Dinar (\u062f.\u0627) | JOD | 3546\nKazakhstani Tenge (\u20b8) | KZT | 3551\nKenyan Shilling (Sh) | KES | 3547\nKuwaiti Dinar (\u062f.\u0643) | KWD | 3550\nKyrgystani Som (\u0441) | KGS | 3548\nLebanese Pound (\u0644.\u0644) | LBP | 3552\nMacedonian Denar (\u0434\u0435\u043d) | MKD | 3556\nMalaysian Ringgit (RM) | MYR | 2800\nMauritian Rupee (\u20a8) | MUR | 2816\nMexican Peso ($) | MXN | 2799\nMoldovan Leu (L) | MDL | 3555\nMongolian Tugrik (\u20ae) | MNT 
| 3558\nMoroccan Dirham (\u062f.\u0645.) | MAD | 3554\nMyanma Kyat (Ks) | MMK | 3557\nNamibian Dollar ($) | NAD | 3559\nNepalese Rupee (\u20a8) | NPR | 3561\nNew Taiwan Dollar ($) | TWD | 2811\nNew Zealand Dollar ($) | NZD | 2802\nNicaraguan C\u00f3rdoba (C$) | NIO | 3560\nNigerian Naira (\u20a6) | NGN | 2819\nNorwegian Krone (kr) | NOK | 2801\nOmani Rial (\u0631.\u0639.) | OMR | 3562\nPakistani Rupee (\u20a8) | PKR | 2804\nPanamanian Balboa (B/.) | PAB | 3563\nPeruvian Sol (S/.) | PEN | 2822\nPhilippine Peso (\u20b1) | PHP | 2803\nPolish Z\u0142oty (z\u0142) | PLN | 2805\nPound Sterling (\u00a3) | GBP | 2791\nQatari Rial (\u0631.\u0642) | QAR | 3564\nRomanian Leu (lei) | RON | 2817\nRussian Ruble (\u20bd) | RUB | 2806\nSaudi Riyal (\u0631.\u0633) | SAR | 3566\nSerbian Dinar (\u0434\u0438\u043d.) | RSD | 3565\nSingapore Dollar ($) | SGD | 2808\nSouth African Rand (Rs) | ZAR | 2812\nSouth Korean Won (\u20a9) | KRW | 2798\nSouth Sudanese Pound (\u00a3) | SSP | 3567\nSovereign Bolivar (Bs.) | VES | 3573\nSri Lankan Rupee (Rs) | LKR | 3553\nSwedish Krona (\tkr) | SEK | 2807\nSwiss Franc (Fr) | CHF | 2785\nThai Baht (\u0e3f) | THB | 2809\nTrinidad and Tobago Dollar ($) | TTD | 3569\nTunisian Dinar (\u062f.\u062a) | TND | 3568\nTurkish Lira (\u20ba) | TRY | 2810\nUgandan Shilling (Sh) | UGX | 3570\nUkrainian Hryvnia (\u20b4) | UAH | 2824\nUnited Arab Emirates Dirham (\u062f.\u0625) | AED | 2813\nUruguayan Peso ($) | UYU | 3571\nUzbekistan Som (so'm) | UZS | 3572\nVietnamese Dong (\u20ab) | VND | 2823\n \nAlong with these four precious metals:\n \nPrecious Metal | Currency Code | CoinMarketCap ID\n---------|---------------|-------------\nGold Troy Ounce | XAU | 3575\nSilver Troy Ounce | XAG | 3574\nPlatinum Ounce | XPT | 3577\nPalladium Ounce
print(type(str1))
>>> <class 'str'>
strexample = "Alba 3526 (\u062f.\u062c) | Peso ($) | ARS | 28 (\u058f) ani Manat (\u20bc)"
print(strexample)
>>> Alba 3526 (د.ج) | Peso ($) | ARS | 28 (֏) ani Manat (₼)
How can I extract all the Unicode characters from the string? I want to see the correct currency symbols, the way strexample displays them correctly.
Contents of currencies.txt (a single line):
Albanian Lek (L) | ALL | 3526\nAlgerian Dinar (\u062f.\u062c) | DZD | 3537\nArgentine Peso ($) | ARS | 2821\nArmenian Dram (\u058f) | AMD | 3527\nAustralian Dollar ($) | AUD | 2782\nAzerbaijani Manat (\u20bc) | AZN | 3528\nBahraini Dinar (.\u062f.\u0628) | BHD | 3531\nBangladeshi Taka (\u09f3) | BDT | 3530\nBelarusian Ruble (Br) | BYN | 3533\nBermudan Dollar ($) | BMD | 3532\nBolivian Boliviano (Bs.) | BOB | 2832\nBosnia-Herzegovina Convertible Mark (KM) | BAM | 3529\nBrazilian Real (R$) | BRL | 2783\nBulgarian Lev (\u043b\u0432) | BGN | 2814\nCambodian Riel (\u17db) | KHR | 3549\nCanadian Dollar ($) | CAD | 2784\nChilean Peso ($) | CLP | 2786\nChinese Yuan (\u00a5) | CNY | 2787\nColombian Peso ($) | COP | 2820\nCosta Rican Col\u00f3n (\u20a1) | CRC | 3534\nCroatian Kuna (kn) | HRK | 2815\nCuban Peso ($) | CUP | 3535\nCzech Koruna (K\u010d) | CZK | 2788\nDanish Krone (kr) | DKK | 2789\nDominican Peso ($) | DOP | 3536\nEgyptian Pound (\u00a3) | EGP | 3538\nEuro (\u20ac) | EUR | 2790\nGeorgian Lari (\u20be) | GEL | 3539\nGhanaian Cedi (\u20b5) | GHS | 3540\nGuatemalan Quetzal (Q) | GTQ | 3541\nHonduran Lempira (L) | HNL | 3542\nHong Kong Dollar ($) | HKD | 2792\nHungarian Forint (Ft) | HUF | 2793\nIcelandic Kr\u00f3na (kr) | ISK | 2818\nIndian Rupee (\u20b9) | INR | 2796\nIndonesian Rupiah (Rp) | IDR | 2794\nIranian Rial (\ufdfc) | IRR | 3544\nIraqi Dinar (\u0639.\u062f) | IQD | 3543\nIsraeli New Shekel (\u20aa) | ILS | 2795\nJamaican Dollar ($) | JMD | 3545\nJapanese Yen (\u00a5) | JPY | 2797\nJordanian Dinar (\u062f.\u0627) | JOD | 3546\nKazakhstani Tenge (\u20b8) | KZT | 3551\nKenyan Shilling (Sh) | KES | 3547\nKuwaiti Dinar (\u062f.\u0643) | KWD | 3550\nKyrgystani Som (\u0441) | KGS | 3548\nLebanese Pound (\u0644.\u0644) | LBP | 3552\nMacedonian Denar (\u0434\u0435\u043d) | MKD | 3556\nMalaysian Ringgit (RM) | MYR | 2800\nMauritian Rupee (\u20a8) | MUR | 2816\nMexican Peso ($) | MXN | 2799\nMoldovan Leu (L) | MDL | 3555\nMongolian Tugrik (\u20ae) | MNT | 
3558\nMoroccan Dirham (\u062f.\u0645.) | MAD | 3554\nMyanma Kyat (Ks) | MMK | 3557\nNamibian Dollar ($) | NAD | 3559\nNepalese Rupee (\u20a8) | NPR | 3561\nNew Taiwan Dollar ($) | TWD | 2811\nNew Zealand Dollar ($) | NZD | 2802\nNicaraguan C\u00f3rdoba (C$) | NIO | 3560\nNigerian Naira (\u20a6) | NGN | 2819\nNorwegian Krone (kr) | NOK | 2801\nOmani Rial (\u0631.\u0639.) | OMR | 3562\nPakistani Rupee (\u20a8) | PKR | 2804\nPanamanian Balboa (B/.) | PAB | 3563\nPeruvian Sol (S/.) | PEN | 2822\nPhilippine Peso (\u20b1) | PHP | 2803\nPolish Z\u0142oty (z\u0142) | PLN | 2805\nPound Sterling (\u00a3) | GBP | 2791\nQatari Rial (\u0631.\u0642) | QAR | 3564\nRomanian Leu (lei) | RON | 2817\nRussian Ruble (\u20bd) | RUB | 2806\nSaudi Riyal (\u0631.\u0633) | SAR | 3566\nSerbian Dinar (\u0434\u0438\u043d.) | RSD | 3565\nSingapore Dollar ($) | SGD | 2808\nSouth African Rand (Rs) | ZAR | 2812\nSouth Korean Won (\u20a9) | KRW | 2798\nSouth Sudanese Pound (\u00a3) | SSP | 3567\nSovereign Bolivar (Bs.) | VES | 3573\nSri Lankan Rupee (Rs) | LKR | 3553\nSwedish Krona (\tkr) | SEK | 2807\nSwiss Franc (Fr) | CHF | 2785\nThai Baht (\u0e3f) | THB | 2809\nTrinidad and Tobago Dollar ($) | TTD | 3569\nTunisian Dinar (\u062f.\u062a) | TND | 3568\nTurkish Lira (\u20ba) | TRY | 2810\nUgandan Shilling (Sh) | UGX | 3570\nUkrainian Hryvnia (\u20b4) | UAH | 2824\nUnited Arab Emirates Dirham (\u062f.\u0625) | AED | 2813\nUruguayan Peso ($) | UYU | 3571\nUzbekistan Som (so'm) | UZS | 3572\nVietnamese Dong (\u20ab) | VND | 2823\n \nAlong with these four precious metals:\n \nPrecious Metal | Currency Code | CoinMarketCap ID\n---------|---------------|-------------\nGold Troy Ounce | XAU | 3575\nSilver Troy Ounce | XAG | 3574\nPlatinum Ounce | XPT | 3577\nPalladium Ounce
So why does print(str1) show exactly the same text as the contents of currencies.txt?
Answer 0: (score: -1)
Your file appears to contain literal Unicode escape sequences, which suggests that the method used to generate the file was wrong.
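To make the difference concrete, here is a minimal sketch (the variable names are only illustrative). Python converts `\u20ac` into a single character only when it parses a string literal in source code, which is why strexample prints correctly. Text read from a file is never parsed that way, so the backslash, the `u`, and the four hex digits remain six separate characters. A raw string is used below to imitate what `f.read()` returns.

```python
# A normal string literal: Python resolves the escape while parsing the source.
parsed = "Euro (\u20ac)"

# A raw string keeps the backslash, imitating text read from the file.
from_file = r"Euro (\u20ac)"

print(len(parsed), parsed)        # the escape has become one character
print(len(from_file), from_file)  # the escape is still six characters
```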
You can fix this by encoding the file's contents to bytes and then decoding them with the unicode-escape codec.
>>> # repr of the data
>>> content
'Alba 3526 (\\u062f.\\u062c) | Peso ($) | ARS | 28 (\\u058f) ani Manat (\\u20bc)\n'
>>> # str of the data
>>> print(content)
Alba 3526 (\u062f.\u062c) | Peso ($) | ARS | 28 (\u058f) ani Manat (\u20bc)
>>> print(content.encode('latin-1').decode('unicode-escape'))
Alba 3526 (د.ج) | Peso ($) | ARS | 28 (֏) ani Manat (₼)
The latin-1 encoding converts the text to bytes, and the unicode-escape decoding unescapes the Unicode literals so that Python interprets them correctly.
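Two caveats about the encode/decode round trip are worth keeping in mind. `encode('latin-1')` raises `UnicodeEncodeError` if the text already contains a character outside Latin-1, and `unicode-escape` also converts other escapes such as the literal `\n` and `\t` in the file. If only the `\uXXXX` sequences should be touched, a regular expression is one possible alternative. The sketch below is an assumption of mine rather than part of the approach above; `unescape_unicode` is a hypothetical helper name, and it does not combine surrogate pairs.

```python
import re


def unescape_unicode(text: str) -> str:
    """Replace literal \\uXXXX sequences with the characters they name.

    Hypothetical helper: other escapes such as a literal \\n are left alone.
    """
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",              # backslash, 'u', four hex digits
        lambda m: chr(int(m.group(1), 16)),  # hex code point -> character
        text,
    )


# One escaped symbol plus one symbol that is already a real character,
# which encode('latin-1') would reject.
sample = r"Euro (\u20ac) | Manat (₼)"
print(unescape_unicode(sample))
```

With the file from the question, this would be called as `unescape_unicode(content)` after reading it.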