Question

Input data


df1:

from datatable import dt

C1 = ['a', 'a', 'b', 'c']
C2 = ['b', 'c', 'a', 'a']

df1 = dt.Frame(C1=C1, C2=C2)

Output:

   | C1  C2
-- + --  --
 0 | a   b 
 1 | a   c 
 2 | b   a 
 3 | c   a

df2:

C1 = ['a', 'b', 'a', 'c']
C2 = ['b', 'a', 'c', 'a']

df2 = dt.Frame(C1=C1, C2=C2)

Output (the desired order):

   | C1  C2
-- + --  --
 0 | a   b 
 1 | b   a 
 2 | a   c 
 3 | c   a

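Problem description:

I tried to make this as easy to understand as possible. If anything is unclear, I'm happy to explain. The sample data contains the unique values 'a', 'b', 'c' in columns C1 and C2. Each combination of values in C1 and C2 occurs only once (e.g. C1 = 'a' and C2 = 'b' in the first row of df1). For most combinations there is a "pair", meaning the reversed combination (for the example above: C1 = 'b' and C2 = 'a' in the third row). How can I order the data frame so that all "pairs" are next to each other? The desired output is shown in df2. I would prefer to use datatable rather than pandas. However, if someone has a solution in pandas, that would help me just as well.

I hope this question meets the SO guidelines. If not, I'm happy to improve it. Thanks a lot.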
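To make the notion of a "pair" concrete, here is a minimal sketch in plain Python (not datatable) that checks whether every row with a reversed counterpart sits directly next to it. The helper name `pairs_adjacent` is hypothetical and only serves this illustration; it assumes, as stated above, that each combination occurs once.

```python
def pairs_adjacent(rows):
    """Return True if every row whose reversed combination exists sits next to it."""
    # Assumption: each (C1, C2) combination occurs only once, so it can act as a dict key.
    position = {row: i for i, row in enumerate(rows)}
    for (c1, c2), i in position.items():
        j = position.get((c2, c1))  # index of the reversed combination, if any
        if j is not None and j != i and abs(i - j) != 1:
            return False
    return True


df1_rows = [("a", "b"), ("a", "c"), ("b", "a"), ("c", "a")]
df2_rows = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a")]

print(pairs_adjacent(df1_rows))  # False: ('a', 'b') and ('b', 'a') are two rows apart
print(pairs_adjacent(df2_rows))  # True: each pair is adjacent
```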

Edit: It seems my sample data was too simplified. Here is a less simplified dataset:

Converting the datatable object to a pandas object:

df = df.to_pandas()

Answer 1

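After converting to pandas, we can apply numpy.sort to each row and then sort_values:

import numpy as np
import pandas as pd

df1 = df1.to_pandas()

# Sort the values within each row so ('a', 'b') and ('b', 'a') share one key,
# sort by that key, then use the resulting index to reorder the original rows
out = df1.iloc[pd.DataFrame(np.sort(df1.values, 1)).sort_values([0, 1]).index]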
Out[54]: 
  C1 C2
0  a  b
2  b  a
1  a  c
3  c  a

Answer 2

Try this:

import pandas as pd

C1 = ['a', 'a', 'b', 'c']
C2 = ['b', 'c', 'a', 'a']
Values = [5, 10, 15, 20]

df = pd.DataFrame({'C1': C1, 'C2': C2, 'Values': Values})
srt = df.apply(lambda x: ','.join(sorted(x[['C1', 'C2']].values)),axis=1)
df.loc[srt.argsort(),:]

Answer 3

This is what you're looking for:

>>> from datatable import dt, f, sort, ifelse
>>> df1 = dt.Frame(C1=['a', 'a', 'b', 'c'], 
                   C2=['b', 'c', 'a', 'a'], 
                   Values=[5, 10, 15, 20])
>>> df1[:, :, sort(ifelse(f.C1<f.C2, f.C1, f.C2), 
                   ifelse(f.C1<f.C2, f.C2, f.C1))]
   | C1  C2  Values
-- + --  --  ------
 0 | a   b        5
 1 | b   a       15
 2 | a   c       10
 3 | c   a       20

[4 rows x 3 columns]

Here we sort the frame by 2 computed columns: the first is the minimum of C1 and C2, the second is the maximum of C1 and C2.
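The same min/max idea can be sketched without any library, which may help to see why it works: both orientations of a pair map to the same key, so a stable sort places them side by side. This is only an illustration in plain Python, not datatable's API; `pair_key` is a hypothetical helper name.

```python
rows = [("a", "b", 5), ("a", "c", 10), ("b", "a", 15), ("c", "a", 20)]


def pair_key(row):
    """Build an order-independent key: (min of C1 and C2, max of C1 and C2)."""
    c1, c2, _ = row
    return (min(c1, c2), max(c1, c2))


# sorted() is stable, so rows sharing a key keep their original relative order
ordered = sorted(rows, key=pair_key)
print(ordered)
# [('a', 'b', 5), ('b', 'a', 15), ('a', 'c', 10), ('c', 'a', 20)]
```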

Python datatable (or pandas): Difficult sorting of a data frame based on two columns

3 answers: