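TIdHTTP Get method shows strange characters when loading an XML file (Russian characters are not displayed)

Time: 2011-02-21 10:11:41

Tags: c++ xml c++builder

I am using TIdHTTP to download an XML file (currency rates) from the URL http://nbt.tj/?c=4&id=28&lg=ru&d=21-02-2011&export=xmlout, and the Russian text comes out as strange characters.

Here is my code: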

UnicodeString s =serv->Get("http://nbt.tj/?c=4&id=28&lg=ru&d=13-10-2009&export=xmlout");
cxMemo1->Text=s;

I tried setting the TIdHTTP charset property to windows-1251, but the result is exactly the same. This is the output:

<?xml version="1.0" encoding="windows-1251" ?>
<ValCurs Date="13/10/2009" name="Êîòèðîâêè âàëþò óñòàíàâëèâàåìûå åæåäíåâíî">
 <Valute ID="036">
   <CharCode>AUD</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Àâñòðàëèéñêèé äîëëàð</Name> 
   <Value>3,9651</Value> 
  </Valute>
 <Valute ID="944">
   <CharCode>AZN</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Àçåðáàéäæàíñêèé ìàíàò</Name> 
   <Value>5,4526</Value> 
  </Valute>
 <Valute ID="826">
   <CharCode>GBP</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Àíãëèéñêèé ôóíò ñòåðëèíãîâ</Name> 
   <Value>6,9160</Value> 
  </Valute>
 <Valute ID="051">
   <CharCode>AMD</CharCode> 
   <Nominal>100</Nominal> 
   <Name>Àðìÿíñêèõ äðàìîâ</Name> 
   <Value>1,1353</Value> 
  </Valute>
 <Valute ID="971">
   <CharCode>AFN</CharCode> 
   <Nominal>10</Nominal> 
   <Name>Àôãàíñêèõ àôãàíè</Name> 
   <Value>0,8816</Value> 
  </Valute>
 <Valute ID="974">
   <CharCode>BYR</CharCode> 
   <Nominal>100</Nominal> 
   <Name>Áåëîðóññêèõ ðóáëåé</Name> 
   <Value>0,1598</Value> 
  </Valute>
 <Valute ID="981">
   <CharCode>GEL</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ãðóçèíñêèé ëàðè</Name> 
   <Value>2,6111</Value> 
  </Valute>
 <Valute ID="208">
   <CharCode>DKK</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Äàòñêàÿ êðîíà</Name> 
   <Value>0,8680</Value> 
  </Valute>
 <Valute ID="784">
   <CharCode>AED</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Äèðõàì  ÎÀÝ</Name> 
   <Value>1,1926</Value> 
  </Valute>
 <Valute ID="840">
   <CharCode>USD</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Äîëëàð ÑØÀ</Name> 
   <Value>4,3806</Value> 
  </Valute>
 <Valute ID="978">
   <CharCode>EUR</CharCode> 
   <Nominal>1</Nominal> 
   <Name>ÅÂÐÎ</Name> 
   <Value>6,4705</Value> 
  </Valute>
 <Valute ID="356">
   <CharCode>INR</CharCode> 
   <Nominal>10</Nominal> 
   <Name>Èíäèéñêèõ ðóïèé</Name> 
   <Value>0,9428</Value> 
  </Valute>
 <Valute ID="364">
   <CharCode>IRR</CharCode> 
   <Nominal>1000</Nominal> 
   <Name>Èðàíñêèõ ðèàëîâ</Name> 
   <Value>0,4413</Value> 
  </Valute>
 <Valute ID="352">
   <CharCode>ISK</CharCode> 
   <Nominal>10</Nominal> 
   <Name>Èñëàíäñêèõ êðîí</Name> 
   <Value>0,3504</Value> 
  </Valute>
 <Valute ID="398">
   <CharCode>KZT</CharCode> 
   <Nominal>10</Nominal> 
   <Name>Êàçàõñêèõ òåíãå</Name> 
   <Value>0,2906</Value> 
  </Valute>
 <Valute ID="124">
   <CharCode>CAD</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Êàíàäñêèé äîëëàð</Name> 
   <Value>4,2337</Value> 
  </Valute>
 <Valute ID="417">
   <CharCode>KGS</CharCode> 
   <Nominal>10</Nominal> 
   <Name>Êèðãèçñêèõ ñîìîâ</Name> 
   <Value>1,0032</Value> 
  </Valute>
 <Valute ID="156">
   <CharCode>CNY</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Êèòàéñêèé þàíü</Name> 
   <Value>0,6420</Value> 
  </Valute>
 <Valute ID="414">
   <CharCode>KWD</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Êóâåéòñêèé äèíàð</Name> 
   <Value>15,2794</Value> 
  </Valute>
 <Valute ID="428">
   <CharCode>LVL</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ëàòâèéñêèé ëàò</Name> 
   <Value>9,1054</Value> 
  </Valute>
 <Valute ID="440">
   <CharCode>LTL</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ëèòîâñêèé ëèò</Name> 
   <Value>1,8712</Value> 
  </Valute>
 <Valute ID="458">
   <CharCode>MYR</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ìàëàéçèéñêèé ðèíããèò</Name> 
   <Value>1,2881</Value> 
  </Valute>
 <Valute ID="498">
   <CharCode>MDL</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ìîëäàâñêèé ëåé</Name> 
   <Value>0,3932</Value> 
  </Valute>
 <Valute ID="949">
   <CharCode>TRY</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Íîâàÿ òóðåöêàÿ ëèðà</Name> 
   <Value>2,9934</Value> 
  </Valute>
 <Valute ID="578">
   <CharCode>NOK</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Íîðâåæñêàÿ êðîíà</Name> 
   <Value>0,7749</Value> 
  </Valute>
 <Valute ID="586">
   <CharCode>PKR</CharCode> 
   <Nominal>10</Nominal> 
   <Name>Ïàêèñòàíñêèõ ðóïèé</Name> 
   <Value>0,5260</Value> 
  </Valute>
 <Valute ID="985">
   <CharCode>PLN</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ïîëüñêèé çëîòûé</Name> 
   <Value>1,5182</Value> 
  </Valute>
 <Valute ID="682">
   <CharCode>SAR</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ðèàë Ñàóäîâñêîé Àðàâèè</Name> 
   <Value>1,1681</Value> 
  </Valute>
 <Valute ID="810">
   <CharCode>RUB</CharCode> 
   <Nominal>10</Nominal> 
   <Name>Ðîññèéñêèõ ðóáëåé</Name> 
   <Value>1,4814</Value> 
  </Valute>
 <Valute ID="960">
   <CharCode>XDR</CharCode> 
   <Nominal>1</Nominal> 
   <Name>ÑÄÐ</Name> 
   <Value>6,9556</Value> 
  </Valute>
 <Valute ID="702">
   <CharCode>SGD</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ñèíãàïóðñêèé äîëëàð</Name> 
   <Value>3,1326</Value> 
  </Valute>
 <Valute ID="764">
   <CharCode>THB</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Òàèëàíäñêèé áàò</Name> 
   <Value>0,1314</Value> 
  </Valute>
 <Valute ID="795">
   <CharCode>TMM</CharCode> 
   <Nominal>1000</Nominal> 
   <Name>Òóðêìåíñêèõ ìàíàòîâ</Name> 
   <Value>0,3074</Value> 
  </Valute>
 <Valute ID="860">
   <CharCode>UZS</CharCode> 
   <Nominal>100</Nominal> 
   <Name>Óçáåêñêèõ ñóìîâ</Name> 
   <Value>0,2918</Value> 
  </Valute>
 <Valute ID="980">
   <CharCode>UAH</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Óêðàèíñêàÿ ãðèâíà</Name> 
   <Value>0,5313</Value> 
  </Valute>
 <Valute ID="752">
   <CharCode>SEK</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Øâåäñêàÿ êðîíà</Name> 
   <Value>0,6266</Value> 
  </Valute>
 <Valute ID="756">
   <CharCode>CHF</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Øâåéöàðñêèé ôðàíê</Name> 
   <Value>4,2555</Value> 
  </Valute>
 <Valute ID="233">
   <CharCode>EEK</CharCode> 
   <Nominal>1</Nominal> 
   <Name>Ýñòîíñêàÿ êðîíà</Name> 
   <Value>0,4130</Value> 
  </Valute>
 <Valute ID="392">
   <CharCode>JPY</CharCode> 
   <Nominal>10</Nominal> 
   <Name>ßïîíñêèõ èåí</Name> 
   <Value>0,4850</Value> 
  </Valute>
</ValCurs>

What do I need to do? Any suggestions?

3 answers:

Answer 0 (score: 1)

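You are using the version of TIdHTTP::Get() that returns a UTF-16 encoded UnicodeString. That version of Get() decodes the raw bytes of the received content to UTF-16 using the content's charset (TIdHTTP recognizes various XML-based Content-Type values and, when it detects one, extracts the charset directly from the XML prolog, regardless of what charset the HTTP server reports). What you are seeing in the TMemo is the decoded Unicode characters, not the original encoded Ansi octets.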
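To make the difference between raw octets and decoded characters concrete, here is a minimal sketch in standard C++ (not Indy or VCL code). The helper names `decodeCp1251Byte`, `decodeCp1251` and `widenBytes` are hypothetical, and the mapping covers only ASCII and the Russian letters, which is an assumption made to keep the example short. The posted output is consistent with each windows-1251 byte having been widened unchanged rather than decoded.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: decode one windows-1251 byte to a UTF-16 code unit.
// Only ASCII and the Russian letters are mapped; any other byte becomes
// U+FFFD, which simplifies the real code page.
char16_t decodeCp1251Byte(std::uint8_t b) {
    if (b < 0x80) return static_cast<char16_t>(b);                      // ASCII
    if (b >= 0xC0) return static_cast<char16_t>(0x0410 + (b - 0xC0));   // U+0410..U+044F
    if (b == 0xA8) return u'\u0401';                                    // capital IO
    if (b == 0xB8) return u'\u0451';                                    // small io
    return u'\uFFFD';                                                   // not covered here
}

// Decode a whole buffer of raw windows-1251 octets.
std::u16string decodeCp1251(const std::string& raw) {
    std::u16string out;
    out.reserve(raw.size());
    for (unsigned char b : raw) out.push_back(decodeCp1251Byte(b));
    return out;
}

// Widen each byte unchanged (a Latin-1 style conversion). Applied to
// windows-1251 data this yields accented Latin letters instead of Cyrillic.
std::u16string widenBytes(const std::string& raw) {
    std::u16string out;
    out.reserve(raw.size());
    for (unsigned char b : raw) out.push_back(static_cast<char16_t>(b));
    return out;
}
```

Under these assumptions, the bytes CA EE F2 decode to the first three letters of the Russian word that opens the name attribute, while widening them gives the "Êîò" seen in the question.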

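In general, XML should not be handled this way. The correct byte encoding matters to XML. You should use the version of Get() that downloads the data into a TStream instead of a UnicodeString. You can then use the original, undecoded bytes as needed, for example by passing them to a real XML parser.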
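One reason the raw bytes matter is that the XML declaration states the encoding, so it can be read before anything is decoded. The following is only a sketch in standard C++, not how TIdHTTP or any particular parser does it. The function name `prologEncoding` is hypothetical, and the code assumes an ASCII-compatible encoding so that the declaration itself can be read as ASCII.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: return the value of the encoding pseudo-attribute
// from an XML declaration at the very start of `raw`, or "" if absent.
// Assumes the declaration is readable as ASCII (true for windows-1251, UTF-8).
std::string prologEncoding(const std::string& raw) {
    if (raw.compare(0, 5, "<?xml") != 0) return "";     // no XML declaration
    const std::size_t end = raw.find("?>");
    if (end == std::string::npos) return "";            // declaration not closed
    const std::string decl = raw.substr(0, end);
    const std::size_t key = decl.find("encoding");
    if (key == std::string::npos) return "";            // encoding not declared
    const std::size_t open = decl.find_first_of("\"'", key);
    if (open == std::string::npos) return "";
    const std::size_t close = decl.find(decl[open], open + 1);  // matching quote
    if (close == std::string::npos) return "";
    return decl.substr(open + 1, close - open - 1);
}
```

For the document in the question, the function would return windows-1251, which tells the caller how to decode the remaining bytes.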

Answer 1 (score: 0)

Try "cp1251" or "1251" as the charset property. Or try this function:

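// Repairs text whose windows-1251 bytes were widened one-to-one into UTF-16:
// code units $C0..$FF are shifted to U+0410..U+044F.
// Assumes String is UnicodeString (Delphi 2009+); $A8 and $B8 are not handled.
function RussianToUnicode(S: String): String;
var Wrd:Word;
  pW,pR:PWord;
  len:Integer;
begin
  len:=Length(S);
  SetLength(Result,len);
  if len=0 then Exit; // S[1] is invalid for an empty string
  pW:=@S[1];
  pR:=@Result[1];
  while len<>0 do begin
    Wrd:=pW^;
    case Wrd of
      $C0..$FF:pR^:=Wrd+$0350;
      else pR^:=Wrd;
    end;
    inc(pW);
    inc(pR);
    dec(len);
  end;
end;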

Use it like this:

text:= RussianToUnicode(IdHTTP.Get('url'));

Answer 2 (score: 0)

Thanks to @remy-lebeau-teamb, my problem is solved!

 TMemoryStream *XML = new TMemoryStream;
 serv->Get("http://nbt.tj/?c=4&id=28&lg=ru&d=13-10-2009&export=xmlout", XML);
 XML->Position = 0;
// use XML as needed...
 AnsiString s="";
 ReadStringFromStream(XML,s);
 cxMemo1->Text=s;
 delete XML;