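So I am trying to automatically categorize sports from an event title.
It works, but I think there should be a better and more reliable way to do this. For certain sports, such as FIFA, the type comes out wrong (FIFA and NCAA get mixed up), and the same thing happens with MMA.
Here is my code:
$strTitle = strtolower($title);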
if(str_contains($strTitle, 'mlb') || str_contains($strTitle, 'baseball')) {
$category = 'Baseball';
$type = 'MLB';
} elseif (str_contains($strTitle, 'nba') || str_contains($strTitle, 'fiba') || str_contains($strTitle, 'basketball') || str_contains($strTitle, 'wnba')) {
$category = 'Basketball';
$type = (str_contains($strTitle, 'nba')) ? 'NBA':
(str_contains($strTitle, 'fiba')) ? 'FIBA':
(str_contains($strTitle, 'wnba')) ? 'WNBA':'Basketball';
} elseif (str_contains($strTitle, 'nhl') || str_contains($strTitle, 'hockey')) {
$category = 'Hockey';
$type = 'NHL';
} elseif (str_contains($strTitle, 'nascar') || str_contains($strTitle, 'formula one') || str_contains($strTitle, 'gp2') || str_contains($strTitle, 'gp3') || str_contains($strTitle, 'motogp') || str_contains($strTitle, 'moto2') || str_contains($strTitle, 'moto3') || str_contains($strTitle, 'f1')) {
$category = 'Motor Sport';
$type = (str_contains($strTitle, 'nascar')) ? 'NASCAR':
(str_contains($strTitle, 'gp2')) ? 'GP2':
(str_contains($strTitle, 'gp3')) ? 'GP3':
(str_contains($strTitle, 'motogp')) ? 'MotoGP':
(str_contains($strTitle, 'moto2')) ? 'Moto2':
(str_contains($strTitle, 'moto3')) ? 'Moto3':
(str_contains($strTitle, 'f1') || str_contains($strTitle, 'formula one')) ? 'F1':'Motor Sport';
} elseif (str_contains($strTitle, 'nfl') || str_contains($strTitle, 'afl') || str_contains($strTitle, 'welsh premier league') || str_contains($strTitle, 'fox college') || str_contains($strTitle, 'football') || str_contains($strTitle, 'serie') || str_contains($strTitle, 'soccer') || str_contains($strTitle, 'fifa') || str_contains($strTitle, 'ncaa')) {
$category = 'Football';
$type = (str_contains($strTitle, 'nfl')) ? 'NFL':
(str_contains($strTitle, 'fifa')) ? 'FIFA':
(str_contains($strTitle, 'afl')) ? 'AFL':
(str_contains($strTitle, 'welsh premier league')) ? 'Welsh Premier League':
(str_contains($strTitle, 'ncaa')) ? 'NCAA':'Football';
} elseif (str_contains($strTitle, 'tennis')) {
$category = 'Tennis';
$type = 'Tennis';
} elseif (str_contains($strTitle, 'golf')) {
$category = 'Golf';
$type = 'Golf';
} elseif (str_contains($strTitle, 'rugby') || str_contains($strTitle, 'nrl')) {
$category = 'Rugby';
$type = (str_contains($strTitle, 'nrl')) ? 'NRL' : 'Rugby';
} elseif (str_contains($strTitle, 'sailing') || str_contains($strTitle, 'america\'s cup')) {
$category = 'Water Sport';
$type = 'Sailing';
} elseif (str_contains($strTitle, 'boxing') || str_contains($strTitle, 'fight night') || str_contains($strTitle, 'fighting') || str_contains($strTitle, 'wwe') || str_contains($strTitle, 'smackdown') || str_contains($strTitle, 'raw') || str_contains($strTitle, 'wwe main event') || str_contains($strTitle, 'mma') || str_contains($strTitle, 'strikeforce') || str_contains($strTitle, 'tna')) {
$category = 'Boxing';
$type = (str_contains($strTitle, 'ufc')) ? 'UFC' :
(str_contains($strTitle, 'smackdown')) ? 'WWE Smackdown' :
(str_contains($strTitle, 'raw')) ? 'WWE RAW' :
(str_contains($strTitle, 'wwe main event')) ? 'WWE Main Event':
(str_contains($strTitle, 'wwe')) ? 'WWE':
(str_contains($strTitle, 'mma')) ? 'MMA':
(str_contains($strTitle, 'tna')) ? 'TNA':
(str_contains($strTitle, 'strikeforce')) ? 'Strikeforce':
(str_contains($strTitle, 'fight night')) ? 'Fight Night':
(str_contains($strTitle, 'fighting')) ? 'Fighting':'Boxing';
} elseif (str_contains($strTitle, 'cricket') || str_contains($strTitle, 'icc') || str_contains($strTitle, 'mcc') || str_contains($strTitle, 'odi') || str_contains($strTitle, 'ipl') || str_contains($strTitle, 't20') || str_contains($strTitle, 'twenty20')) {
$category = 'Cricket';
$type = (str_contains($strTitle, 'icc')) ? 'ICC' :
(str_contains($strTitle, 'mcc')) ? 'MCC' :
(str_contains($strTitle, 'odi')) ? 'ODI':
(str_contains($strTitle, 'ipl')) ? 'IPL':
(str_contains($strTitle, 't20')) ? 'T20':
(str_contains($strTitle, 'twenty20')) ? 'Twenty20':'Cricket';
}
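Note: str_contains() is the Laravel helper function I am using.
Note 2: this is not the complete code, and it does not cover all sports, only the ones I need at the moment.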
Answer 0 (score: 2)
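While this is far from an ideal solution, I put together the code below. It should produce the same results, probably with a similar performance impact (I don't know, really, just guessing), and it is much more readable.
Before that, seriously: look at all your ternary conditions. Does that look like a good idea?!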
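To make the concern about chained ternaries concrete, here is a minimal sketch. It relies on one fact about the language: PHP 7 and earlier group an unparenthesized chain of ternaries from left to right, and PHP 8 refuses to compile such a chain at all. The three flags are hypothetical stand-ins for the str_contains() checks, and the chain is a shortened version of the Football one. This may be what is behind the wrong types described in the question, but that is a guess.

```php
<?php
// Hypothetical flags standing in for the str_contains() checks on a title
// that mentions only "fifa".
$isNfl  = false;
$isFifa = true;
$isNcaa = false;

// Explicit parentheses reproducing how PHP 7 and earlier group
//   $isNfl ? 'NFL' : $isFifa ? 'FIFA' : $isNcaa ? 'NCAA' : 'Football'
// (left to right). PHP 8 rejects the unparenthesized form outright.
$leftToRight = (($isNfl ? 'NFL' : $isFifa) ? 'FIFA' : $isNcaa) ? 'NCAA' : 'Football';

// The grouping that was presumably intended.
$intended = $isNfl ? 'NFL' : ($isFifa ? 'FIFA' : ($isNcaa ? 'NCAA' : 'Football'));

var_dump($leftToRight); // string(4) "NCAA"
var_dump($intended);    // string(4) "FIFA"
```

Once 'FIFA' has been selected, it is itself used as the (truthy) condition of the next ternary, so the chain keeps going and lands on a later label.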
str_contains() uses PHP's strpos(), which is case-sensitive. You need to keep this in mind, or simply lowercase the whole string before searching/comparing.
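A small sketch of the case-sensitivity point, using only built-in functions; the sample title is made up. Note the comparison against false: a match at position 0 would otherwise be treated as "not found".

```php
<?php
$title = 'NBA Finals: Game 7'; // made-up sample title

// strpos() is case-sensitive, so the lowercase keyword is not found.
var_dump(strpos($title, 'nba') !== false);             // bool(false)

// Option 1: lowercase the title once, then search.
var_dump(strpos(strtolower($title), 'nba') !== false); // bool(true)

// Option 2: stripos() ignores case.
var_dump(stripos($title, 'nba') !== false);            // bool(true)
```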
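Likewise, strpos() does not care whether it finds the string inside another word/string. So, for example, if the title contains "WNBA", the keyword "NBA" will match first and the check will end there, giving you unexpected results. You can work around this by listing keywords from the longest, most specific ones first down to the shortest, most ambiguous ones.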
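The ordering problem can be sketched as follows. The helper name firstMatch() and the sample title are assumptions for illustration, and sorting by key length is only a rough stand-in for "most specific first"; a hand-ordered list works just as well.

```php
<?php
// Hypothetical helper: returns the label of the first keyword found in
// $title, or null when nothing matches.
function firstMatch(string $title, array $keywords): ?string
{
    foreach ($keywords as $needle => $label) {
        if (strpos($title, $needle) !== false) {
            return $label;
        }
    }
    return null;
}

$keywords = ['nba' => 'NBA', 'fiba' => 'FIBA', 'wnba' => 'WNBA'];
$title = strtolower('WNBA Finals: Game 3');

// 'nba' is listed first and is a substring of 'wnba', so it wins.
var_dump(firstMatch($title, $keywords)); // string(3) "NBA"

// Sort the keys longest first so 'wnba' is tried before 'nba'.
uksort($keywords, function ($a, $b) {
    return strlen($b) <=> strlen($a);
});

var_dump(firstMatch($title, $keywords)); // string(4) "WNBA"
```

Another option is a word-boundary check such as preg_match('/\bnba\b/', $title), which does not match inside "wnba" regardless of order.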
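Unless you are using a huge number of keywords (and I mean a crapload), performance here will not be too bad. Still, because of the ordering, you may end up going through 10-20 sets of keywords without finding a match. I do not have a really good, immediate solution for that other than using text search software (e.g. Sphinx, Lucene, Solr, something DB-based, etc.), but it is something to keep in mind.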
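Short of a full search engine, one lighter alternative (not part of the solution in this answer, just a sketch) is to split the title into words and look each one up in a flat keyword index, so the cost depends on the number of words in the title rather than the number of keyword sets. The function name classifyByWords() and the index layout are assumptions, and multi-word keywords such as "formula one" would need separate handling.

```php
<?php
// Hypothetical flat index: keyword => [category, type].
$index = [
    'mlb'  => ['Baseball', 'MLB'],
    'nba'  => ['Basketball', 'NBA'],
    'wnba' => ['Basketball', 'WNBA'],
    'fiba' => ['Basketball', 'FIBA'],
];

// Returns [category, type] for the first title word found in the index,
// or null when no word matches.
function classifyByWords(string $title, array $index): ?array
{
    // Split on anything that is not a lowercase letter or a digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        if (isset($index[$word])) {
            return $index[$word];
        }
    }
    return null;
}

var_dump(classifyByWords('WNBA Finals: Game 3', $index));
var_dump(classifyByWords('Chess Olympiad, round 4', $index));
```

Because whole words are compared, "wnba" can no longer be mistaken for "nba", whatever the order of the index.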
// Define your sports and their keywords / human-readable values.
// I use an array of objects (I like objects). This could be
// a JSON or XML feed, generated through an API or your DB.
// Doesn't matter. Just give the data you need to check against
// a structure, not just hardcoded into conditionals.
$sports = [
    (object) [
        'category' => 'Baseball',
        'keywords' => [
            'baseball' => 'Baseball',
            'mlb'      => 'MLB',
        ],
    ],
    (object) [
        'category' => 'Basketball',
        'keywords' => [
            'basketball' => 'Basketball',
            // 'wnba' has to come before 'nba', because 'nba' is a substring of it.
            'wnba'       => 'WNBA',
            'nba'        => 'NBA',
            'fiba'       => 'FIBA',
        ],
    ],
    (object) [
        'category' => 'Motor Sport',
        'keywords' => [
            'nascar' => 'NASCAR',
            'gp2'    => 'GP2',
            'gp3'    => 'GP3',
            'motogp' => 'MotoGP',
            'moto2'  => 'Moto2',
            'moto3'  => 'Moto3',
            'f1'     => 'Formula 1',
        ],
    ],
];

$title = strtolower("Rookie player Cryode injured in bizarre FIBA accident.");

$sport_category = null;
$sport_type = null;

// Step 1: Loop each sport.
foreach ($sports as $sport)
{
    // Step 2: Loop this sport's keywords, in the order they are listed.
    foreach ($sport->keywords as $key_search => $key_type)
    {
        if (str_contains($title, $key_search))
        {
            // Step 3: We've found the matching keyword.
            // Define the info we need from it...
            $sport_category = $sport->category;
            $sport_type = $key_type;

            // ... then break BOTH loops.
            break 2;
        }
    }
}

// Step 4: Check for no matches here by seeing
// if the category or type is still null.
// Or, initially set vars to default values.
var_dump($sport_category, $sport_type);
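The "default values" option from Step 4 could look roughly like this when the loops are wrapped in a function. The name classifySport() and the 'Other' defaults are assumptions for illustration, and strpos() is used instead of the Laravel helper so the sketch runs on its own.

```php
<?php
// Hypothetical wrapper around the two loops; returns [category, type],
// falling back to $default when no keyword matches.
function classifySport(string $title, array $sports, array $default = ['Other', 'Other']): array
{
    $title = strtolower($title);
    foreach ($sports as $sport) {
        foreach ($sport->keywords as $needle => $type) {
            if (strpos($title, $needle) !== false) {
                // Returning leaves both loops at once, like "break 2".
                return [$sport->category, $type];
            }
        }
    }
    return $default;
}

// Shortened keyword list, same structure as in the answer.
$sports = [
    (object) ['category' => 'Baseball', 'keywords' => ['baseball' => 'Baseball', 'mlb' => 'MLB']],
    (object) ['category' => 'Basketball', 'keywords' => ['wnba' => 'WNBA', 'nba' => 'NBA', 'fiba' => 'FIBA']],
];

[$category, $type] = classifySport('Rookie player Cryode injured in bizarre FIBA accident.', $sports);
var_dump($category, $type); // string(10) "Basketball", string(4) "FIBA"

[$category, $type] = classifySport('Chess Olympiad, round 4', $sports);
var_dump($category, $type); // string(5) "Other", string(5) "Other"
```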