Really looking for some advice. I have a dataset of school attendance data with these columns: student_id, school_id, term_start_date, attendance_marks.
A sample row of this data would be:
1234, 1002, 2016-09-01, '/\##/L/\##BB/\/\/\/\/\'
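The attendance mark string holds two sessions per day, with each mark being one of various different codes. In other words, I get a start date and then have to work out, day by day, which attendance marks belong to which day. Horrible, I know, but that's how I get the data, I'm afraid.
I have written a script object that loops through this string and outputs a row for each day to load into a data warehouse.
I should say before posting the code that this script works and does what I need it to do, but it is slow. I'm hoping I can draw on the collective experience here to see whether there is a more efficient way of achieving this.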
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    int studentID = Row.STUDDENTID;
    int baseID = Row.SCHOOLID;
    DateTime dateKey = Row.STARTDATE;
    string marks = Row.MARKS.ToString();

    // TODO
    // This is a bodge at the moment to make sure the mark string is always divisible by 2
    // in the production release I'll have to handle this as an exception
    if (marks.Length % 2 != 0)
    {
        marks = marks + '@';
    }

    char[] c = marks.ToCharArray();

    for (int i = 0; i < c.Length; i += 2)
    {
        Output0Buffer.AddRow();
        Output0Buffer.baseID = baseID;
        Output0Buffer.studentID = studentID;
        Output0Buffer.dateKey = dateKey.AddDays(i / 2).Year * 10000 + dateKey.AddDays(i / 2).Month * 100 + dateKey.AddDays(i / 2).Day;
        Output0Buffer.markAM = c[i].ToString();
        Output0Buffer.markPM = c[i + 1].ToString();

        if (i == c.Length - 1)
        {
            base.FinishOutputs();
        }
    }
}
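Is there a better way of doing this? Have I massively over-engineered it, or do I just need to live with how long it takes to run? Thanks in advance.
Answer 0 (score: 1)
This could probably be written in Oracle PL/SQL, though I don't work with Oracle, and I wouldn't recommend doing transformations on a live system as part of your extract anyway.
If you load your data as-is into a SQL Server staging environment, you can use a script built on a derived tally table in your OLEDB Source component to split the Attendance string into rows of data, one character per row. This works across multiple students, schools and days (i.e. it is a proper set-based solution), so it can be used as a single fetch that returns the entire dataset: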
-- Create test data
declare @d table(StudentID int,SchoolID int, AttDate date, Attendance nvarchar(500));
insert into @d values
(1234, 1002, '20160901', '/\##/L/\##BB/\/\/\/\')
,(1235, 1002, '20160901', '/\##/L/\##/\BB/\/\/\')
,(1234, 1002, '20160902', '/\##/L/\##BB/\/\/\/\/\')
,(1235, 1002, '20160902', '/\##/L/\/\BB/\/\##/\/\');
-- Create a list of 10 rows
with n(n) as (select n from (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n(n))
-- Cross join 6 times to reach 1,000,000 numbers if required. Filtered to longest Attendance string in the SELECT TOP()
,t(t) as (select top (select max(len(Attendance)) from @d) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4,n n5,n n6)
select d.StudentID
,d.SchoolID
,d.AttDate
,d.Attendance
,t.t as AttCharNum
,substring(d.Attendance,t.t,1) as AttChar -- Use SUBSTRING to retrieve just the one character
from @d d
join t -- Join to numbers only where they are actually required
on t.t <= len(d.Attendance)
order by d.StudentID
,d.AttDate
,t.t;
Output:
+-----------+----------+------------+------------------------+------------+---------+
| StudentID | SchoolID | AttDate | Attendance | AttCharNum | AttChar |
+-----------+----------+------------+------------------------+------------+---------+
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 1 | / |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 2 | \ |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 3 | # |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 4 | # |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 5 | / |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 6 | L |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 7 | / |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 8 | \ |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 9 | # |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 10 | # |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 11 | B |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 12 | B |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 13 | / |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 14 | \ |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 15 | / |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 16 | \ |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 17 | / |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 18 | \ |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 19 | / |
| 1234 | 1002 | 2016-09-01 | /\##/L/\##BB/\/\/\/\ | 20 | \ |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 1 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 2 | \ |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 3 | # |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 4 | # |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 5 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 6 | L |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 7 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 8 | \ |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 9 | # |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 10 | # |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 11 | B |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 12 | B |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 13 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 14 | \ |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 15 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 16 | \ |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 17 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 18 | \ |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 19 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 20 | \ |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 21 | / |
| 1234 | 1002 | 2016-09-02 | /\##/L/\##BB/\/\/\/\/\ | 22 | \ |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 1 | / |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 2 | \ |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 3 | # |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 4 | # |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 5 | / |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 6 | L |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 7 | / |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 8 | \ |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 9 | # |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 10 | # |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 11 | / |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 12 | \ |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 13 | B |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 14 | B |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 15 | / |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 16 | \ |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 17 | / |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 18 | \ |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 19 | / |
| 1235 | 1002 | 2016-09-01 | /\##/L/\##/\BB/\/\/\ | 20 | \ |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 1 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 2 | \ |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 3 | # |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 4 | # |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 5 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 6 | L |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 7 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 8 | \ |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 9 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 10 | \ |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 11 | B |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 12 | B |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 13 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 14 | \ |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 15 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 16 | \ |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 17 | # |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 18 | # |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 19 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 20 | \ |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 21 | / |
| 1235 | 1002 | 2016-09-02 | /\##/L/\/\BB/\/\##/\/\ | 22 | \ |
+-----------+----------+------------+------------------------+------------+---------+
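The question asks for one row per day with an AM and a PM mark rather than one row per character. The same tally-table idea can be adapted for that, as in the rough sketch below. It is not part of the original answer and rests on some assumptions: that AttDate holds the term start date, that each pair of characters maps to consecutive calendar days (as the script in the question does), and that the output names AttDay, DateKey, MarkAM and MarkPM are only illustrative.

```sql
-- Sketch only: one output row per day instead of one row per character.
-- Assumption: AttDate is the term start date and character pairs map to consecutive days.
declare @d table(StudentID int, SchoolID int, AttDate date, Attendance nvarchar(500));
insert into @d values
 (1234, 1002, '20160901', '/\##/L/\##BB/\/\/\/\');

-- Same derived tally table, but limited to the number of days (pairs) in the longest string
with n(n) as (select n from (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n(n))
    ,t(t) as (select top (select (max(len(Attendance)) + 1) / 2 from @d)
                     row_number() over (order by (select null))
              from n n1,n n2,n n3,n n4,n n5,n n6)
select d.StudentID
      ,d.SchoolID
      ,dateadd(day, t.t - 1, d.AttDate) as AttDay
      -- yyyymmdd integer key, matching the dateKey calculation in the question
      ,convert(int, convert(char(8), dateadd(day, t.t - 1, d.AttDate), 112)) as DateKey
      ,substring(d.Attendance, 2 * t.t - 1, 1) as MarkAM
      ,substring(d.Attendance, 2 * t.t, 1) as MarkPM -- empty string when the mark string has odd length
from @d d
join t -- one tally row per day actually present in the string
  on t.t <= (len(d.Attendance) + 1) / 2
order by d.StudentID
        ,AttDay;
```

For the single 20-character test row this should return 10 rows, one per day starting at 2016-09-01. An odd-length string yields an empty MarkPM on its last day, so the padding workaround in the question's script would need an equivalent rule here.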