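So I have this code that dumps the entire HTML source of a page into consecutive cells of a column. The problem is that the web page I pull the HTML source from contains Polish letters such as "ą", "ś", etc. Is there a way to paste the source with these Polish letters intact? Right now I get garbled squares with question marks and the like. Any tips?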

Sub audycje()
    Dim strona As Object
    Dim adres As String
    Dim wb As Workbook
    Dim a As Object
    Dim split_var As Variant
    Dim i As Long
    Set wb = ThisWorkbook
    adres = InputBox("Podaj adres strony")
    If adres = "" Then
        MsgBox "Nie podano strony do zaladowania"
        Exit Sub
    End If
    Set strona = CreateObject("htmlfile")   'Create HTMLFile object
    With CreateObject("msxml2.xmlhttp")     'Get the web page content
        .Open "GET", adres, False
        .send                               'Send the request before reading responseText
        strona.Body.Innerhtml = .responseText
    End With
    'Split the source into lines and write one line per cell in column B
    split_var = Split(strona.Body.Innerhtml, Chr(10))
    Application.ScreenUpdating = False
    For i = 0 To UBound(split_var, 1)
        Cells(2 + i, 2).Value2 = split_var(i)
    Next i
    Application.ScreenUpdating = True
End Sub

3 answers:

Answer 0:




Function StripAccent(ByVal thestring As String) As String
' Replaces accented characters with regular characters
  Dim A As String * 1
  Dim B As String * 1
  Dim i As Integer
  Const AccChars = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåáçèéêëéìíîïðñòóôõöøùúûüýÿ"
  Const RegChars = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaaceeeeeiiiidnoooooouuuuyy"
  ' Swap each accented character for the one at the same position in RegChars
  For i = 1 To Len(AccChars)
    A = Mid(AccChars, i, 1)
    B = Mid(RegChars, i, 1)
    thestring = Replace(thestring, A, B)
  Next i
  ' Trim leading/trailing spaces and collapse repeated spaces
  thestring = Application.WorksheetFunction.Trim(thestring)
  StripAccent = thestring
End Function

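Option 2: Another option is to import the document as "Unicode text". This should preserve the Polish characters.

To test this, I copied a Polish paragraph from a web page and pasted it into an Excel worksheet cell using Paste Special >> Unicode Text, and the Polish characters were preserved.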
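
In the same spirit of keeping the text as Unicode, the macro could also decode the raw response bytes itself rather than relying on responseText. The following is only a sketch, not code from the question or any of the answers: the names BytesToText and audycjeUnicode are made up for illustration, and it assumes the page is served as UTF-8 (older Polish pages may instead use "iso-8859-2" or "windows-1250").

```vba
' Hypothetical helper (not from the original post): decodes a byte array
' to a VBA string using an explicitly named charset via ADODB.Stream.
Function BytesToText(ByVal bytes As Variant, ByVal charsetName As String) As String
    Const adTypeBinary As Long = 1
    Const adTypeText As Long = 2
    With CreateObject("ADODB.Stream")
        .Type = adTypeBinary
        .Open
        .Write bytes
        .Position = 0            ' Type can only be changed at position 0
        .Type = adTypeText
        .Charset = charsetName   ' assumption: caller knows the page encoding
        BytesToText = .ReadText
        .Close
    End With
End Function

' Hypothetical usage: fetch a page and write each source line to column B.
Sub audycjeUnicode()
    Dim adres As String
    Dim html As String
    Dim srcLines As Variant
    Dim i As Long

    adres = InputBox("Podaj adres strony")
    If adres = "" Then Exit Sub

    With CreateObject("msxml2.xmlhttp")
        .Open "GET", adres, False
        .send
        ' Read the undecoded bytes and decode them ourselves (assumed UTF-8)
        html = BytesToText(.responseBody, "utf-8")
    End With

    srcLines = Split(html, Chr(10))
    For i = 0 To UBound(srcLines)
        Cells(2 + i, 2).Value2 = srcLines(i)
    Next i
End Sub
```

If the output still looks wrong, the charset guess is the first thing to check: the page's Content-Type header or its meta charset tag usually states the actual encoding.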

Answer 1:

For the encoding problem, add the following at the start (the function is available in Office 2013 and later):

Mystring = WorksheetFunction.EncodeURL(Mystring)

See my original post on "Extract content of div from Google Translate with VBA".

If your Office version is older than 2013, or you need to distribute to users who may have older versions, use the approach from "How can I URL encode a string in Excel VBA?"


Dim Mystring As String
For i = 0 To UBound(split_var, 1)
    Mystring = split_var(i)
    Mystring = WorksheetFunction.EncodeURL(Mystring)
    Cells(2 + i, 2).Value2 = Mystring
Next i

Answer 2:

After a month of searching, I finally found it! The code below solves the problem :)



