我被困住了,需要一些帮助。我有以下数据框:
+-----+---+---+-----+
| | A | B | C |
+-----+---+---+-----+
| 288 | 1 | 4 | 9 |
+-----+---+---+-----+
| 245 | 2 | 3 | 7 |
+-----+---+---+-----+
| 543 | 3 | 6 | 8 |
+-----+---+---+-----+
| 867 | 1 | 9 | 1 |
+-----+---+---+-----+
| 345 | 2 | 7 | 6 |
+-----+---+---+-----+
| 122 | 3 | 8 | 3 |
+-----+---+---+-----+
| 233 | 1 | 1 | NaN |
+-----+---+---+-----+
| 346 | 2 | 6 | NaN |
+-----+---+---+-----+
| 765 | 3 | 3 | NaN |
+-----+---+---+-----+
A列具有重复值,如图所示。我想要做的是每当我看到A列中的重复值时,我想要添加一个新的列,其中B列中的相应值为C列,如下所示:
$client = new Google_Client();
$storageService = new Google_Service_Storage($client );
$bucket = $this->bucketName;
$file_name = md5(uniqid(rand(), true)) . '.' . $ext;
$file_content = urldecode($myFile);
try {
$postbody = array(
'name' => $file_name,
'data' => file_get_contents($file_content),
'uploadType' => "resumable",
'predefinedAcl' => 'publicRead',
);
$options = array('Content-Encoding' => 'gzip');
$gsso = new Google_Service_Storage_StorageObject();
$gsso->setName( $file_name );
$gsso->setContentEncoding( "gzip" );
$storageService->objects->insert( $bucket, $gsso, $postbody, $options );
} catch ( Exception $e){
$result['error'] = json_decode($e->getMessage());
}
感谢。
答案 0 :(得分:0)
假设val
是重复值之一,
slice = df.loc[df.A == val, 'B'].shift(-1)
将创建一个单列数据框,其值重新编入其新位置。
由于重新分配的索引值都不应该是冗余的,因此您可以使用pandas.concat
将不同的切片拼接在一起,而不必担心会丢失数据。然后将它们作为新列添加:
df['C'] = pd.concat([df.loc[df['A'] == x, 'B'].shift(-1) for x in [1, 2, 3]])
分配列后,索引值将使所有内容对齐:
A B C
0 1 4 9.0
1 2 3 7.0
2 3 6 8.0
3 1 9 1.0
4 2 7 6.0
5 3 8 3.0
6 1 1 NaN
7 2 6 NaN
8 3 3 NaN
答案 1 :(得分:0)
反转数据帧顺序,groupby将其转换为shift函数,然后将其反转:
df = df[::-1]
df['C'] = df.groupby(df.columns[0]).transform('shift')
df = df[::-1]
df
A B C
0 1 4 9.0
1 2 3 7.0
2 3 6 8.0
3 1 9 1.0
4 2 7 6.0
5 3 8 3.0
6 1 1 NaN
7 2 6 NaN
8 3 3 NaN