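Determining rookie year in the Lahman database

Time: 2011-05-15 23:04:16

Tags: mysql

I am using the MySQL version of the Lahman Baseball Database, and I am having trouble determining the year in which a player lost rookie status. The MLB rule for losing rookie status is: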

  

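A player shall be considered a rookie unless, during a previous season or seasons, he has (a) exceeded 130 at-bats or 50 innings pitched in the Major Leagues; or (b) accumulated more than 45 days on the active roster of a Major League club or clubs during the period of the 25-player limit (excluding time in military service and time on the disabled list).

Is there a query that can do this for batters and pitchers, or is this something that has to be done programmatically?

2 answers:

Answer 0: (score: 1)

This can be done in SQL. How best to do it depends on what is optimal for your setup. Most likely, a query along these lines would work (pseudocode):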

SELECT Master.*
FROM Master
LEFT JOIN Batting ON Master.playerID = Batting.playerID
LEFT JOIN Pitching ON Master.playerID = Pitching.playerID
WHERE Batting.AB > 130
   OR Pitching.IPouts > (50 * 3)   -- IPouts counts outs, three per inning
   OR Master.DaysActive > 45       -- hypothetical column; see the note below

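The last part of the WHERE clause is shaky, because I could not find anything like it in the data from the database provider. I do see games played, but that is not the same thing. The "Appearances" table might get you close, but that is about all you can do.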
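The pseudocode above compares each season's row against the thresholds, whereas the rule counts totals accumulated over previous seasons. Below is a minimal sketch of a cumulative version, run through Python's built-in sqlite3 so it is self-contained. The table and column names (Batting, Pitching, playerID, yearID, AB, IPouts) follow the Lahman readme; every row is invented for illustration, and the helper `first_year_over` is a hypothetical name. It uses a self-join rather than window functions, on the assumption that the same SQL would also run on MySQL versions that lack them. Roster days are not covered, since the data has no such field.

```python
import sqlite3

# Toy stand-ins for Lahman's Batting and Pitching tables; every row is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Batting  (playerID TEXT, yearID INTEGER, stint INTEGER, AB INTEGER);
CREATE TABLE Pitching (playerID TEXT, yearID INTEGER, stint INTEGER, IPouts INTEGER);
INSERT INTO Batting VALUES
  ('hitter01', 2001, 1,  60),
  ('hitter01', 2002, 1,  50),
  ('hitter01', 2002, 2,  40),   -- second stint in the same season
  ('hitter01', 2003, 1, 500),
  ('cupcof01', 2001, 1,  12);   -- never exceeds 130 AB
INSERT INTO Pitching VALUES
  ('pitchr01', 2001, 1,  90),
  ('pitchr01', 2002, 1,  60),   -- career total is exactly 150 outs here
  ('pitchr01', 2003, 1, 600);
""")

# For each player-season, sum the stat over that season and all earlier ones,
# then keep the earliest season whose career total exceeds the limit.
FIRST_YEAR_OVER = """
SELECT t.playerID, MIN(t.yearID) AS thresholdYear
FROM (
    SELECT cur.playerID, cur.yearID, SUM(prev.{col}) AS career
    FROM (SELECT DISTINCT playerID, yearID FROM {table}) AS cur
    JOIN {table} AS prev
      ON prev.playerID = cur.playerID AND prev.yearID <= cur.yearID
    GROUP BY cur.playerID, cur.yearID
) AS t
WHERE t.career > {limit}
GROUP BY t.playerID
"""

def first_year_over(conn, table, col, limit):
    """Map playerID -> first season in which the career total exceeds limit.

    table and col are interpolated directly, so pass trusted identifiers only.
    """
    sql = FIRST_YEAR_OVER.format(table=table, col=col, limit=int(limit))
    return dict(conn.execute(sql).fetchall())

batters = first_year_over(conn, "Batting", "AB", 130)
pitchers = first_year_over(conn, "Pitching", "IPouts", 50 * 3)  # 50 innings = 150 outs
print(batters)
print(pitchers)
```

The season returned is the one in which the career total first passes the threshold, so under the rule as quoted it would be the player's last rookie-eligible season. Players who never pass a threshold simply do not appear in the result.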

This is the data description I based the pseudocode on:

http://baseball1.com/files/database/readme58.txt

I did find someone else doing something similar to what you are doing (including working out who is a rookie). Here is his site (with code):

http://baseballsimulator.com/blog/category/database/

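Answer 1: (score: 1)

With the Lahman database you can identify rookies by at-bats (> 130) and innings pitched (> 50), but it contains no service-time data for the 25-man roster (non-September) limit period.

You would need Retrosheet {http://www.retrosheet.org/game.htm} data to do that.

The query below gives you all rookies by at-bats and innings pitched; rookies who lose their status through service time alone are the exception. There are only a few of those, because teams rarely keep a rookie on the MLB roster without playing him: he loses development time by not playing, and his service time accrues faster, which costs the team part of his controlled years. If you are comfortable with that limitation, these tables will work.

You can use the result as an xref table joined to batters or pitchers to flag their rookie year. Alternatively, you could add a RookieYr column to the batting and pitching tables, but I advise against it: keeping a separate xref table means less customization when you add new seasons to your Lahman database.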

/************************************ Create MLB Rookie Xref Table **********************************************
-- Sort Out Batters who accumulate 130 AB
-- Sort Out Pitchers who accumulate 50 IP
-- Define Rookie Year, Drop off years previous and years after
-- Can be updated annually using "playerID not in (select distinct playerID from Xref_RookieYr)"
-- Using the Sean Lahman Database 
-- Authored By Paul DeVos {www.linkedin.com/in/devosp/}
*****************************************************************************************************************/

/****** Query uses T-SQL and was run in MS SQL 2012 - you may need to tweak it for other platforms or versions. ******/

--Step 1 - Run this for hitter accumulated ABs and when Rookie Year (130 Career At Bats)
Select
    concat(m.nameFirst, ' ', m.nameLast) as Name,
    b.PlayerID,
    b.yearID,
    m.debut,
    sum(b.ab) over (partition by b.playerID order by b.playerID, b.yearID) as CumulativeAB,
    null as CumulativeIP,  -- Placeholder for the rookie pitchers insert
    case when sum(b.ab) over (partition by b.playerID order by b.playerID, b.yearID) > 130 then b.yearID end as RookieYR
into #temp_rookie_year
from
    [master] m
    inner join Batting b
    on m.playerID=b.playerID
-- Selects Position Players 
where b.playerID not in (select distinct f.playerID from Fielding f where f.pos = 'P')


--Step 2 - Run this to get accumulated IP and Rookie Year (50 Career IP)
Insert into #temp_rookie_year
    (
        Name, PlayerID, YearID, Debut, CumulativeAB, CumulativeIP, RookieYR
    )
Select
    concat(m.nameFirst, ' ', m.nameLast) as Name,
    p.PlayerID,
    p.yearID,
    m.debut,
    null as CumulativeAB,
    sum(p.IPouts) over (partition by p.playerID order by p.playerID, p.yearID) as CumulativeIP,  -- held in outs (IPouts), three per inning
    case when sum(p.IPouts) over (partition by p.playerID order by p.playerID, p.yearID) > 150 then p.yearID end as RookieYR
from [master] m
    inner join pitching p
    on m.playerID=p.playerID
--Chooses Pitchers
where p.playerID in (select distinct f.playerID from Fielding f where f.pos = 'P')


--Step 3 Run this - sorts out the rookie year into Rookie Xref Table
select Name, PlayerID, min(RookieYr) as RookieYear
into #Xref_RookieYr
from #temp_rookie_year
--where name = 'Hank Aaron'
group by Name, PlayerID
order by RookieYear desc

--Step 4 - Run this IF you want to remove players who never lost rookie status (cup of coffee players, etc. - anyone under 130 AB or 50 IP)
select * from #Xref_RookieYr
order by playerID

-- At this point the table still has NULLs

Delete from #Xref_RookieYr where RookieYear is null


select * from #Xref_RookieYr
order by playerID

-- The table no longer has NULLs

/*****************************************************************************************************************
To keep a permanent table, drop the "#" in front of the table name (and name it whatever you want), e.g. Xref_Rookie_2013.
If you leave the "#", the temp table is dropped when you close the session.
*****************************************************************************************************************/
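Once an xref table like the one above exists, using it as a lookup might look like the following sketch. It again uses Python's built-in sqlite3 with invented rows. The Xref_RookieYr columns mirror the Step 3 output (Name, PlayerID, RookieYear), while the `isRookieYear` flag is a hypothetical name added for illustration.

```python
import sqlite3

# Toy Batting table plus a hand-filled xref table; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Batting (playerID TEXT, yearID INTEGER, AB INTEGER);
CREATE TABLE Xref_RookieYr (Name TEXT, PlayerID TEXT, RookieYear INTEGER);
INSERT INTO Batting VALUES
  ('hitter01', 2001,  60),
  ('hitter01', 2002,  90),
  ('hitter01', 2003, 500),
  ('cupcof01', 2001,  12);   -- no xref row: never lost rookie status
INSERT INTO Xref_RookieYr VALUES ('Sample Hitter', 'hitter01', 2002);
""")

# LEFT JOIN keeps players with no xref row; their flag falls through to 0
# because comparing yearID with a NULL RookieYear is never true.
rows = conn.execute("""
SELECT b.playerID, b.yearID, b.AB,
       CASE WHEN b.yearID = x.RookieYear THEN 1 ELSE 0 END AS isRookieYear
FROM Batting AS b
LEFT JOIN Xref_RookieYr AS x ON x.PlayerID = b.playerID
ORDER BY b.playerID, b.yearID
""").fetchall()

for row in rows:
    print(row)
```

Keeping the flag in a join rather than in a stored column leaves the Batting table unmodified, in line with the advice above to avoid customizing the Lahman tables.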