According to a formula

Time: 2018-04-04 12:06:03

Tags: algorithm loops for-loop foreach stata

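I have two tables:

The first table contains the variables name, date, time and intraday price, that is, the intraday price of each name at a specific date and time. The second table has name, date and daily price, where the daily price summarizes the intraday prices for each name and date. I am trying to write a program that does the following:

It finds the observations that match by name and date in both tables, and then:

If the first and the last intraday price both fall outside 0.962 to 1.0398 times the previous day's daily price, it deletes all data for that name and date from table 1.

The statement is:

If the first and also the last (intraday price for a given name and date) are not in [0.962 * (yesterday's daily price), 1.0398 * (yesterday's daily price)], then delete.

For example, consider the following two tables: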

* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 name long date str8 time double intraday_price
"A" 17659 "11:32:41"    3
"A" 17659 "12:32:41"    2
"A" 17659 "13:32:41"    1
"A" 17660 "11:32:41" 3.95
"A" 17660 "12:32:41"    3
"A" 17660 "13:32:41"    6
"A" 17660 "14:32:41" 4.01
"B" 17659 "11:32:41"  3.1
"B" 17659 "12:32:41"    1
"B" 17659 "13:32:41"    4
"B" 17659 "14:32:41"  2.9
"B" 17660 "11:32:41"    6
"B" 17660 "12:32:41"    1
"B" 17661 "11:32:41"    5
"B" 17661 "12:32:41"    7
"C" 17659 "11:32:41"    3
"C" 17659 "12:32:41"    2
"C" 17660 "11:32:41"  6.1
"C" 17660 "12:32:41"    3
"C" 17660 "13:32:41"    2
"C" 17661 "11:32:41"    8
"C" 17661 "12:32:41"    2
"C" 17661 "13:32:41"    3
"C" 17661 "14:32:41"    2
end
format %d date

Table 2 is:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 name long date double daily_price
"A" 17657 3
"B" 17657 6
"C" 17657 5
"A" 17658 5
"A" 17659 4
"B" 17658 3
"B" 17659 4
"B" 17660 3
"C" 17658 7
"C" 17659 6
"C" 17660 5
end
format %d date

Please note that the formula uses yesterday's daily price.
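To make the rule concrete, here is a small sketch of the check for one name and date, using A on 17660 (8 May 2008), whose previous daily price in table 2 is 4. The local macro names are only for illustration and are not part of the data.

```stata
* Worked check for name "A" on 17660 (8 May 2008).
* The local macro names are illustrative only.
local prev = 4                        // A's daily price on 17659 in table 2
local lo   = 0.962  * `prev'          // lower bound: 3.848
local hi   = 1.0398 * `prev'          // upper bound: 4.1592

* First (3.95) and last (4.01) intraday prices of A on 17660
display inrange(3.95, `lo', `hi')     // 1: inside the band
display inrange(4.01, `lo', `hi')     // 1: inside the band
```

Both prices are inside the band, so the rows for A on that date stay. For C on the same date the previous daily price is 6 (band 5.772 to 6.2388): the first price 6.1 is inside and the last price 2 is outside, and the expected result still keeps those rows, so a name and date is deleted only when both prices are outside the band.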

The result is:

+------+----------+----------+----------------+
| name |   date   |   time   | intraday price |
+------+----------+----------+----------------+
| B    | 7-May-08 | 11:32:41 |            3.1 |
| B    | 7-May-08 | 12:32:41 |              1 |
| B    | 7-May-08 | 13:32:41 |              4 |
| B    | 7-May-08 | 14:32:41 |            2.9 |
| A    | 8-May-08 | 11:32:41 |           3.95 |
| A    | 8-May-08 | 12:32:41 |              3 |
| A    | 8-May-08 | 13:32:41 |              6 |
| A    | 8-May-08 | 14:32:41 |           4.01 |
| C    | 8-May-08 | 11:32:41 |            6.1 |
| C    | 8-May-08 | 12:32:41 |              3 |
| C    | 8-May-08 | 13:32:41 |              2 |
+------+----------+----------+----------------+
Can you tell me how to do this?

1 answer:

Answer 0 (score: 2)

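Your question is not very clear, and I am not sure whether this is what you want. You also have a lot of missing data (name-date pairs in table 2 that do not match name-date pairs in table 1). Let me know if this does what you want.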

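Basically, we save both tables as temporary files. For table 2, we first create an observation for the day after the last day in the data, because we want a "last day's price" variable for that day as well. Then we create the "last day's price" variable (we could technically use time-series operators to do this, but this is a bit simpler). Then we merge table 2 onto table 1. I drop any observations that have no intraday price, because I assume those are not relevant to you, and then use bysort to create an indicator of whether each observation should be dropped. I commented out the line that actually drops them, so you can look at the data first and make sure it does what you really want.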
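For reference, the time-series-operator route mentioned above is not spelled out in this answer. A minimal sketch of what it might look like on table 2 follows, assuming the dates are consecutive calendar days; `name_id` is a hypothetical helper variable introduced here because `xtset` needs a numeric panel identifier. It relies on `tsappend, add(1)` extending each name by one day, which is worth confirming with `list` on your own data.

```stata
* Sketch only: build lastday_price with the L. operator instead of [_n-1].
* name_id is a hypothetical helper variable, not part of the original answer.
clear
input str4 name long date double daily_price
"A" 17657 3
"B" 17657 6
"C" 17657 5
"A" 17658 5
"A" 17659 4
"B" 17658 3
"B" 17659 4
"B" 17660 3
"C" 17658 7
"C" 17659 6
"C" 17660 5
end
format %td date

encode name, generate(name_id)        // xtset needs a numeric panel variable
xtset name_id date                    // one daily series per name
tsappend, add(1)                      // one extra day at the end of each panel
bysort name_id (date): replace name = name[1]   // refill name on the new rows
generate lastday_price = L.daily_price          // previous calendar day's price

* A's price on 17659 is 4, so its lag on 17660 should be 4
assert lastday_price == 4 if name == "A" & date == 17660
```

One difference to keep in mind: `L.` looks up the previous calendar day and returns missing when that day is absent, whereas `[_n-1]` takes the previous observation whatever its date. The rest of this answer uses the explicit `[_n-1]` approach.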

First, input your data:

    clear
    tempfile table1 table2

//  Input data
    input str4 name long date str8 time double intraday_price
    "A" 17659 "11:32:41"    3
    "A" 17659 "12:32:41"    2
    "A" 17659 "13:32:41"    1
    "A" 17660 "11:32:41" 3.95
    "A" 17660 "12:32:41"    3
    "A" 17660 "13:32:41"    6
    "A" 17660 "14:32:41" 4.01
    "B" 17659 "11:32:41"  3.1
    "B" 17659 "12:32:41"    1
    "B" 17659 "13:32:41"    4
    "B" 17659 "14:32:41"  2.9
    "B" 17660 "11:32:41"    6
    "B" 17660 "12:32:41"    1
    "B" 17661 "11:32:41"    5
    "B" 17661 "12:32:41"    7
    "C" 17659 "11:32:41"    3
    "C" 17659 "12:32:41"    2
    "C" 17660 "11:32:41"  6.1
    "C" 17660 "12:32:41"    3
    "C" 17660 "13:32:41"    2
    "C" 17661 "11:32:41"    8
    "C" 17661 "12:32:41"    2
    "C" 17661 "13:32:41"    3
    "C" 17661 "14:32:41"    2
    end
    format %d date

    save `table1'

    clear
    input str4 name long date double daily_price
    "A" 17657 3
    "B" 17657 6
    "C" 17657 5
    "A" 17658 5
    "A" 17659 4
    "B" 17658 3
    "B" 17659 4
    "B" 17660 3
    "C" 17658 7
    "C" 17659 6
    "C" 17660 5
    end
    format %d date

Now, make the changes:

//  Create a new observation to create a "lastday_price" for the day AFTER the last day in the data
    levelsof name, local(names)
    foreach name of local names {
        set obs  `=_N+1'
        replace name = "`name'" if missing(name)
    }
    sort name date

//  Generate lastday_price
    bysort name (date): gen lastday_price = daily_price[_n-1]
    bysort name (date): replace date = date[_n-1] + 1 if missing(date)
    save `table2'

//  Merge table 2 onto table 1 by name and date
    use `table1', clear
    merge m:1 name date using `table2'
        drop if _merge == 2     // Only daily prices, no intra_day price

//  Generate indicator for whether or not to drop
    bysort name date (time): gen drop = 1 if    ///
        !inrange(intraday_price[1],0.962*lastday_price,1.0398*lastday_price) &  ///
        !inrange(intraday_price[_N],0.962*lastday_price,1.0398*lastday_price) & ///
        !missing(lastday_price)

*drop if drop == 1
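
Once the flags look right, the commented-out line is the only step left. As a minimal sketch of that final step on a cut-down sample, the block below uses only the rows for C on 17660 and 17661 from table 1, with `lastday_price` typed in by hand from table 2 (6 and 5) instead of being merged; that shortcut is an assumption made to keep the example self-contained.

```stata
* Sketch only: flag, inspect and drop on a small sample.
* lastday_price is entered by hand here rather than merged from table 2.
clear
input str4 name long date str8 time double intraday_price double lastday_price
"C" 17660 "11:32:41" 6.1 6
"C" 17660 "12:32:41"   3 6
"C" 17660 "13:32:41"   2 6
"C" 17661 "11:32:41"   8 5
"C" 17661 "12:32:41"   2 5
"C" 17661 "13:32:41"   3 5
"C" 17661 "14:32:41"   2 5
end
format %td date

* Same indicator as above: flag a name-date only if BOTH the first and
* the last intraday price are outside the band around lastday_price
bysort name date (time): gen drop = 1 if    ///
    !inrange(intraday_price[1],0.962*lastday_price,1.0398*lastday_price) &  ///
    !inrange(intraday_price[_N],0.962*lastday_price,1.0398*lastday_price) & ///
    !missing(lastday_price)

list, sepby(name date)      // inspect the flags before deleting anything
drop if drop == 1           // remove the flagged name-date groups

* Only C on 17660 remains: its first price (6.1) is inside the band
assert date == 17660
```

C on 17660 survives because only its last price is outside the band, which matches the result table in the question.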