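When scraping an HTML table, if a cell (td) in the table has multiple attributes (for example, see the HTML snippet), how do you separate the two and/or select only one of them?
HTML snippet: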
    package com.sagproductions.gamertaggenerator;

    import android.support.v7.app.AppCompatActivity;
    import android.os.Bundle;
    import android.view.View;
    import android.widget.Button;
    import android.widget.EditText;
    import android.widget.TextView;
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Scanner;
    import java.lang.String;
    import java.util.ArrayList;
    import java.util.Random;

    public class MainActivity extends AppCompatActivity {
        WordGenerator WG;
        private Button mSingleWordGenerator;
        private Button mTwoWordGenerator;
        private Button mThreeWordGenerator;
        private EditText editText;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
            mSingleWordGenerator = (Button) findViewById(R.id.button);
            mSingleWordGenerator.setOnClickListener(new View.OnClickListener() {
                @Override
                public void onClick(View v) {
                    generateWord();
                    editText = (EditText) findViewById(R.id.editText);
                    editText.setText(WG.getRandomWord(), TextView.BufferType.EDITABLE);
                }
            });
            mTwoWordGenerator = (Button) findViewById(R.id.button2);
            mTwoWordGenerator.setOnClickListener(new View.OnClickListener() {
                @Override
                public void onClick(View v) {
                }
            });
            mThreeWordGenerator = (Button) findViewById(R.id.button3);
            mThreeWordGenerator.setOnClickListener(new View.OnClickListener() {
                @Override
                public void onClick(View v) {
                }
            });
        }

        public void generateWord() {
            final File file = new File("words.txt");
            ArrayList<String> words = new ArrayList<String>();
            Random rand = new Random();
            try (final Scanner scanner = new Scanner(file)) {
                while (scanner.hasNextLine()) {
                    String line = scanner.nextLine();
                    words.add(line);
                }
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            }
            WG.setRandomWord(words.get(rand.nextInt(words.size())));
        }
    }
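The code I am trying:
Any suggestions on how to a) select just one of the names, or b) split the cell into two cells would be greatly appreciated.
Thanks.
Answer 0: (score: 2)
If you want both the full name and the short name, you can try this: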
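    # `row` is assumed to be a <tr> Tag from the asker's own loop.
    # Note: find() returns None if a cell has no matching <a>, so .text would raise.
    for td in row.find_all('td'):
        full_name = td.find('a', {'class': 'full-name'}).text
        short_name = td.find('a', {'class': 'short-name'}).text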
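To make the idea above easier to try, here is a minimal self-contained sketch. The original HTML snippet is not available, so the markup below is hypothetical: it assumes each name cell holds two `<a>` tags with the classes `full-name` and `short-name` used in this answer. The helper name `split_names` is also made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the structure (two <a> tags in one cell) is assumed
# from the class names used in the answer, not taken from the real page.
HTML = """
<table>
  <tr class="player-overview">
    <td><a class="full-name">Jane Doe</a><a class="short-name">J. Doe</a></td>
    <td>42</td>
  </tr>
</table>
"""

def split_names(html):
    """Return (full_name, short_name) pairs, one per cell that holds both links."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for td in soup.find_all("td"):
        full = td.find("a", class_="full-name")
        short = td.find("a", class_="short-name")
        # find() returns None when a cell lacks the link, so skip such cells
        if full is None or short is None:
            continue
        pairs.append((full.get_text(strip=True), short.get_text(strip=True)))
    return pairs

names = split_names(HTML)
print(names)
```

Each tuple keeps the two names apart, so you can write them to separate columns or keep only one of them.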
Answer 1: (score: 1)
Try matching the tr elements with a regular expression:
    import re

    # `the_soup` is assumed to be the BeautifulSoup object for the page.
    players = the_soup.find_all('tr', {'class': re.compile("player-overview")})
    for p in players:
        name = p.find('a', {'class': 'full-name'}).get_text()
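A runnable sketch of this row-filtering approach might look like the following. The HTML is invented for illustration (the real page is not shown), and `full_names` is a hypothetical helper. It assumes the player rows carry a class containing `player-overview`, possibly alongside other classes, while header rows do not.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: row classes and names are made up for this example.
HTML = """
<table>
  <tr class="header"><th>Player</th></tr>
  <tr class="player-overview odd">
    <td><a class="full-name">Jane Doe</a><a class="short-name">J. Doe</a></td>
  </tr>
  <tr class="player-overview even">
    <td><a class="full-name">John Roe</a><a class="short-name">J. Roe</a></td>
  </tr>
</table>
"""

def full_names(html):
    """Collect the full name from every row whose class matches the pattern."""
    soup = BeautifulSoup(html, "html.parser")
    # The compiled pattern is searched against the row's class values,
    # so rows with extra classes such as "odd" or "even" still match.
    rows = soup.find_all("tr", class_=re.compile("player-overview"))
    result = []
    for row in rows:
        link = row.find("a", class_="full-name")
        if link is not None:  # skip rows without a full-name link
            result.append(link.get_text(strip=True))
    return result

print(full_names(HTML))
```

Filtering on the row first keeps header and spacer rows out of the loop, and selecting only the `full-name` link is one way to pick a single name from the cell.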