The correct way to change data on a slice of a pandas DataFrame

Time: 2018-12-31 05:24:02

Tags: python pandas

I have a pandas DataFrame of EOD stock data that looks like this:

         Date       High        Low       Open      Close      Volume  Adj Close Symbol  Pct_Change
0  1999-11-18  35.765381  28.612303  32.546494  31.473534  62546300.0  27.369196      A           0
1  1999-11-19  30.758226  28.478184  30.713520  28.880543  15234100.0  25.114351      A           0
2  1999-11-22  31.473534  28.657009  29.551144  31.473534   6577800.0  27.369196      A           0
3  1999-11-23  31.205294  28.612303  30.400572  28.612303   5975600.0  24.881086      A           0
4  1999-11-24  29.998211  28.612303  28.701717  29.372318   4843200.0  25.541994      A           0

I want to add a Pct_Change column that computes the percentage change of each closing price in the Adj Close column. I could do something like this:

df.Pct_Change = df['Adj Close'].pct_change()

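This solution comes close, but there would be some overlap between stocks, which I don't want: the first row of each stock would be compared against the last row of the previous stock.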
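The overlap can be seen on a tiny frame. The following is a minimal sketch with made-up prices (the values and the variable name naive are illustrative assumptions, not taken from the real data):

```python
import pandas as pd

# Hypothetical two-symbol frame; prices are made up for illustration
df = pd.DataFrame({
    "Symbol": ["AAPL", "AAPL", "XOM", "XOM"],
    "Adj Close": [1.0, 1.1, 10.0, 11.0],
})

# pct_change on the whole column ignores the Symbol boundary
naive = df["Adj Close"].pct_change()

# Row 2 is XOM's first row, yet it is compared against AAPL's last price
print(naive)
```

Here the value at row 2 is computed as 10.0 / 1.1 - 1, a change between two unrelated stocks, where a NaN would be the sensible result.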

Here is the solution I am trying, but it does not set the data in the original df, so everything in the Pct_Change column ends up still being 0:

# first set everything equal to 0
df['Pct_Change'] = 0

for stock in df.Symbol.unique():
    subset = df.loc[df.Symbol == stock]
    subset.Pct_Change = subset['Adj Close'].pct_change()

Edit: I could not get these solutions to work, so below is a minimal dataset to work with, which may help with testing.

import pandas as pd
from pandas import Timestamp

df = pd.DataFrame({'Date': {0: Timestamp('1988-01-04 00:00:00'),
  1: Timestamp('1988-01-05 00:00:00'),
  2: Timestamp('1988-01-06 00:00:00'),
  3: Timestamp('1988-01-07 00:00:00'),
  4: Timestamp('1988-01-08 00:00:00'),
  5: Timestamp('1988-01-04 00:00:00'),
  6: Timestamp('1988-01-05 00:00:00'),
  7: Timestamp('1988-01-06 00:00:00'),
  8: Timestamp('1988-01-07 00:00:00'),
  9: Timestamp('1988-01-08 00:00:00')},
 'High': {0: 1.5982142686843872,
  1: 1.6517857313156128,
  2: 1.6071428060531616,
  3: 1.5982142686843872,
  4: 1.6160714626312256,
  5: 10.15625,
  6: 10.34375,
  7: 10.25,
  8: 10.375,
  9: 10.28125},
 'Low': {0: 1.5089285373687744,
  1: 1.5803571939468384,
  2: 1.5625,
  3: 1.5178571939468384,
  4: 1.4107142686843872,
  5: 9.6875,
  6: 10.09375,
  7: 10.09375,
  8: 10.0,
  9: 9.15625},
 'Open': {0: 1.5267857313156128,
  1: 1.6428571939468384,
  2: 1.6071428060531616,
  3: 1.5535714626312256,
  4: 1.5892857313156128,
  5: 9.71875,
  6: 10.1875,
  7: 10.21875,
  8: 10.0625,
  9: 10.21875},
 'Close': {0: 1.5982142686843872,
  1: 1.59375,
  2: 1.5625,
  3: 1.5892857313156128,
  4: 1.4285714626312256,
  5: 10.125,
  6: 10.1875,
  7: 10.09375,
  8: 10.28125,
  9: 9.5},
 'Volume': {0: 82600000.0,
  1: 77280000.0,
  2: 67200000.0,
  3: 53200000.0,
  4: 121520000.0,
  5: 5674400.0,
  6: 8926800.0,
  7: 4974800.0,
  8: 7011200.0,
  9: 7753200.0},
 'Adj Close': {0: 0.08685751259326935,
  1: 0.08661489188671112,
  2: 0.08491652458906174,
  3: 0.0863722488284111,
  4: 0.07763802260160446,
  5: 0.9220728874206543,
  6: 0.9277651309967041,
  7: 0.9192269444465637,
  8: 0.9363031387329102,
  9: 0.8651551008224487},
 'Symbol': {0: 'AAPL',
  1: 'AAPL',
  2: 'AAPL',
  3: 'AAPL',
  4: 'AAPL',
  5: 'XOM',
  6: 'XOM',
  7: 'XOM',
  8: 'XOM',
  9: 'XOM'}})

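2 Answers:

Answer 0 (score: 3)

Use groupby.pct_change (this answer's copy of the data names the column Adj_Close; with the question's data, use 'Adj Close' instead):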

df['Pct_Change'] = df.groupby('Symbol', sort=False)['Adj_Close'].pct_change()

print(df)
         Date       High        Low       Open      Close      Volume  \
0  1999-11-18  35.765381  28.612303  32.546494  31.473534  62546300.0   
1  1999-11-19  30.758226  28.478184  30.713520  28.880543  15234100.0   
2  1999-11-22  31.473534  28.657009  29.551144  31.473534   6577800.0   
3  1999-11-23  31.205294  28.612303  30.400572  28.612303   5975600.0   
4  1999-11-24  29.998211  28.612303  28.701717  29.372318   4843200.0   

   Adj_Close Symbol  Pct_Change  
0  27.369196      A         NaN  
1  25.114351      A   -0.082386  
2  27.369196      A    0.089783  
3  24.881086      A   -0.090909  
4  25.541994      A    0.026563  
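To check that the grouped version resets at each symbol boundary, here is a small sketch using the first three Adj Close values per symbol from the question's minimal dataset. The column name 'Adj Close' (with a space) follows the question rather than this answer's output.

```python
import pandas as pd

# First three rows per symbol from the question's minimal dataset
df = pd.DataFrame({
    "Symbol": ["AAPL", "AAPL", "AAPL", "XOM", "XOM", "XOM"],
    "Adj Close": [
        0.08685751259326935, 0.08661489188671112, 0.08491652458906174,
        0.9220728874206543, 0.9277651309967041, 0.9192269444465637,
    ],
})

# pct_change is computed within each Symbol group and aligned back to df's index
df["Pct_Change"] = df.groupby("Symbol", sort=False)["Adj Close"].pct_change()

print(df)
```

The first row of each symbol (index 0 and index 3) comes out as NaN, so no change is computed across the AAPL/XOM boundary.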

Answer 1 (score: 1)

This can be done with groupby, as @Sandeep pointed out. However, using your solution:

df['Pct_Change'] = 0.0  # float placeholder so the column can hold fractional changes

for stock in df.Symbol.unique():
    subset = df.loc[df.Symbol == stock]
    # Write through a row mask on df itself; assigning to the whole column
    # would overwrite the other symbols' rows with NaN.
    df.loc[df.Symbol == stock, 'Pct_Change'] = subset['Adj_Close'].pct_change()

Note that you are assigning subset['Adj_Close'].pct_change() to the new subset DataFrame rather than to the original one, so the original DataFrame is not changed.
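The difference between writing to a subset and writing to the original can be sketched as follows. The prices are made up, the column name 'Adj Close' follows the question, and the explicit .copy() and the name unchanged are additions for illustration; the .copy() makes it unambiguous that subset is an independent object.

```python
import pandas as pd

# Hypothetical frame with made-up prices
df = pd.DataFrame({
    "Symbol": ["AAPL", "AAPL", "XOM", "XOM"],
    "Adj Close": [1.0, 1.1, 10.0, 11.0],
})
df["Pct_Change"] = 0.0

# Attempt 1: assign on a per-symbol copy; df itself is never touched
for stock in df["Symbol"].unique():
    subset = df.loc[df["Symbol"] == stock].copy()
    subset["Pct_Change"] = subset["Adj Close"].pct_change()

unchanged = df["Pct_Change"].tolist()  # still all zeros
print(unchanged)

# Attempt 2: assign through df.loc with a row mask, which writes into df
for stock in df["Symbol"].unique():
    mask = df["Symbol"] == stock
    df.loc[mask, "Pct_Change"] = df.loc[mask, "Adj Close"].pct_change()

print(df)
```

After the second loop, each symbol's first row holds NaN and the remaining rows hold the within-symbol change, which matches what the groupby approach produces.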