Question

I have an SQL table with columns:

$q = "UPDATE `gestion`.`utilisateurs` SET `np` = '$np', `adr` ='$adr', `numTel` = '$ntel', `cin` = '$cin', `pass` = '$pass', `idEquipe` = '$idEquipe', `adrElectronique` = '$mail', `role` = '$role' WHERE `utilisateurs`.`id` = '$id'";

I need to update the SQL table's columns using a dataframe.

I have a dataframe ABC with these values:

    0    1
0  789 | NA     
1  123 | NA
2  456 | NA

The resulting output in the SQL table should be:

    0          1
0  123   |  pass1
1  456   |  pass2
2  789   |  pass3

Suggestions that use plain SQL are fine too.

Answer 1

Setting up the data

First, create the dataframe in a reproducible way:

import datetime as dt
import pandas as pd

# provided data
data = [('2019-08-23', '10'), ('2019-06-23', '18'),('2019-07-21', '05'),
    ('2019-09-09', '09'), ('2019-09-19', '04'), ('2019-08-27', '22'),
    ('2019-05-03', '02'), ('2019-06-27', '07'), ('2019-05-25', '19'),
    ('2019-04-27', '02'), ('2019-01-19', '02'), ('2019-05-28', '10'),
    ('2019-02-22', '09'), ('2019-01-25', '06'), ('2019-10-22', '17'),
    ('2019-11-02', '13'), ('2019-10-29', '17'), ('2019-03-11', '18'),
    ('2019-03-11', '19'), ('2019-10-19', '19'), ('2019-02-17', '12'),
    ('2019-10-21', '01'), ('2019-09-01', '08'), ('2019-01-15', '09'),
    ('2019-11-15', '08'), ('2019-10-10', '18'), ('2019-03-31', '01'),
    ('2019-08-17', '01'), ('2019-05-27', '07'), ('2019-02-24', '20'),
    ('2019-11-03', '21'), ('2019-06-28', '21'), ('2019-01-06', '00'),
    ('2019-03-30', '23'), ('2019-06-27', '04'), ('2019-03-08', '19'),
    ('2019-01-30', '09'), ('2019-11-15', '02'), ('2019-06-04', '09'),
    ('2019-05-03', '14'), ('2019-07-01', '08'), ('2019-09-20', '19'),
    ('2019-05-15', '12'), ('2019-05-17', '02'), ('2019-09-21', '20'),
    ('2019-02-14', '14')]

# create df
df = pd.DataFrame.from_records(data, columns=('date', 'amount'))

You appear to be working with object dtypes; with proper dtypes this operation becomes much easier:

# convert dtypes
df['date'] = pd.to_datetime(df['date'], errors='coerce')
df['amount'] = df['amount'].astype('int')
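
As a small illustration of what the conversion does, here is a sketch on a made-up two-row sample (the `raw` frame and its malformed value are hypothetical, not part of the question's data). With `errors='coerce'`, a value that cannot be parsed becomes `NaT` instead of raising, so it is worth counting the missing dates afterwards:

```python
import pandas as pd

# Hypothetical sample with one malformed date, to show what errors='coerce' does
raw = pd.DataFrame({'date': ['2019-08-23', 'not a date'],
                    'amount': ['10', '05']})

raw['date'] = pd.to_datetime(raw['date'], errors='coerce')  # unparseable values become NaT
raw['amount'] = raw['amount'].astype('int')                 # the string '05' becomes the integer 5

n_bad = int(raw['date'].isna().sum())  # number of rows whose date failed to parse
print(raw.dtypes)
print(n_bad)
```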

To make the results easier to check, I sorted the data:

df = df.sort_values(['date', 'amount']).reset_index(drop=True)
df.head()

    date    amount
0   2019-01-06  0
1   2019-01-15  9
2   2019-01-19  2
3   2019-01-25  6
4   2019-01-30  9

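Getting the data

Set of dataframes

Simplest case

If your solution needs to return five separate dataframes, and your data always falls within a single year, the simplest approach is probably a list comprehension over the months of interest: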

# list comprehension
df_list = [df[df['date'].dt.month == mo] for mo in range(3, 8)]

# returning individual dfs
mar_df, apr_df, may_df, jun_df, jul_df = iter(df_list)
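
The single-year condition matters because filtering on the month number alone ignores the year. A minimal sketch, using a hypothetical three-row frame `s` that spans two years:

```python
import pandas as pd

# Hypothetical data spanning two years (not the question's data)
s = pd.DataFrame({'date': pd.to_datetime(['2018-03-10', '2019-03-08', '2019-04-27']),
                  'amount': [1, 19, 2]})

# Filtering on month alone picks up March of both years
march_any_year = s[s['date'].dt.month == 3]

# Adding a year condition restricts the result to March 2019
march_2019 = s[(s['date'].dt.year == 2019) & (s['date'].dt.month == 3)]

print(len(march_any_year), len(march_2019))
```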

Realistic case

Beyond that simple case, you will need pd.date_range.

# getting range of dates
>>> boundary_dates = pd.date_range(end=dt.datetime(2019, 8, 1), freq='MS', periods=6)
>>> boundary_dates
DatetimeIndex(['2019-03-01', '2019-04-01', '2019-05-01', '2019-06-01', '2019-07-01', '2019-08-01'],
              dtype='datetime64[ns]', freq='MS')

This gives you six dates, which define five pairs of boundaries. You can use zip to build the list of boundary pairs:

>>> [[l_bound, u_bound] for l_bound, u_bound in zip(boundary_dates, boundary_dates[1:])]
[[Timestamp('2019-03-01 00:00:00', freq='MS'), Timestamp('2019-04-01 00:00:00', freq='MS')],
 [Timestamp('2019-04-01 00:00:00', freq='MS'), Timestamp('2019-05-01 00:00:00', freq='MS')],
 [Timestamp('2019-05-01 00:00:00', freq='MS'), Timestamp('2019-06-01 00:00:00', freq='MS')],
 [Timestamp('2019-06-01 00:00:00', freq='MS'), Timestamp('2019-07-01 00:00:00', freq='MS')],
 [Timestamp('2019-07-01 00:00:00', freq='MS'), Timestamp('2019-08-01 00:00:00', freq='MS')]]

To take advantage of pd.Series.between, which includes both endpoints by default, subtract dt.timedelta(days=1) from each upper bound.

boundaries = [[l_bound, u_bound - dt.timedelta(days=1)] for
    l_bound, u_bound in zip(boundary_dates, boundary_dates[1:])]

df_list = [df[df['date'].between(b[0], b[1])] for b in boundaries]
mar_df, apr_df, may_df, jun_df, jul_df = iter(df_list)
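
To see why the upper bound is pulled back by one day, here is a small sketch on four hand-picked dates (the `dates` series is made up for illustration). Without the subtraction, the first day of the next month would be counted in both months:

```python
import datetime as dt
import pandas as pd

# Hypothetical dates around the April boundaries
dates = pd.Series(pd.to_datetime(['2019-03-31', '2019-04-01', '2019-04-30', '2019-05-01']))
lower, upper = pd.Timestamp('2019-04-01'), pd.Timestamp('2019-05-01')

# Both endpoints are included by default, so 2019-05-01 leaks into April
with_upper = dates.between(lower, upper)

# Pulling the upper bound back by one day keeps only April dates
april_only = dates.between(lower, upper - dt.timedelta(days=1))

print(with_upper.tolist())
print(april_only.tolist())
```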

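Since you want something dynamic, you should not have to name each dataframe by hand every time. Returning a dictionary instead assigns each dataframe to a key (built with strftime) so that it is easy to pull out: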

df_dict = {b[0].strftime('%b_%y_df'):
        df[df['date'].between(b[0], b[1])] for b in boundaries}

Since each value holds a dataframe, you can still get at the individual dataframes easily with df_dict.values().

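Creating a function

These steps can be wrapped in a function that lets you choose the year and month to look back from, and how many months to return: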

def monthly_dfs(df, year, month, n=5):
    """return a number of dataframes for the n months preceding a given month"""
    # generate list of boundaries for months of interest
    before_dt = dt.datetime(year, month, 1)
    boundary_dates = pd.date_range(end=before_dt, freq='MS', periods=n+1)
    # get boundary pairs
    boundaries = [[l_bound, u_bound - dt.timedelta(days=1)] for 
        l_bound, u_bound in zip(boundary_dates, boundary_dates[1:])]
    # return df within each boundary pair with key according to month start
    return {b[0].strftime('%b_%y_df'): 
        df[df['date'].between(b[0], b[1])] for b in boundaries}

df_dict = monthly_dfs(df, 2019, 8)
mar_df, apr_df, may_df, jun_df, jul_df = df_dict.values()
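
Because the boundaries come from pd.date_range, a window that crosses a year boundary needs no special handling. A quick sketch of the keys that the same strftime pattern would produce for the five months before February 2019 (this only rebuilds the keys; it does not call monthly_dfs, and the month names assume the default English locale):

```python
import datetime as dt
import pandas as pd

# Same boundary construction as in the function, ending at 2019-02-01
boundary_dates = pd.date_range(end=dt.datetime(2019, 2, 1), freq='MS', periods=6)

# One key per month start, excluding the final boundary
keys = [d.strftime('%b_%y_df') for d in boundary_dates[:-1]]
print(keys)
```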

Output

Reformatted a little, here is df_dict:

{
    'Mar_19_df':
           date        amount
        9  2019-03-08      19
        10 2019-03-11      18
        11 2019-03-11      19
        12 2019-03-30      23
        13 2019-03-31       1,
    'Apr_19_df':
           date        amount
        14 2019-04-27       2,
    'May_19_df':
           date        amount
        15 2019-05-03       2
        16 2019-05-03      14
        17 2019-05-15      12
        18 2019-05-17       2
        19 2019-05-25      19
        20 2019-05-27       7
        21 2019-05-28      10,
    'Jun_19_df':
           date        amount
        22 2019-06-04       9
        23 2019-06-23      18
        24 2019-06-27       4
        25 2019-06-27       7
        26 2019-06-28      21,
    'Jul_19_df':
           date        amount
        27 2019-07-01       8
        28 2019-07-21       5
}

Each dataframe can be accessed through the key that was created for it, for example:

>>> df_dict['Mar_19_df']
    date    amount
9   2019-03-08  19
10  2019-03-11  18
11  2019-03-11  19
12  2019-03-30  23
13  2019-03-31  1
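
One caveat about the one-day subtraction used with between: it works for the data here because every date sits at midnight. If the column carried times of day, rows late on the last day of a month would be dropped. A possible alternative, sketched on a hypothetical frame `ts`, is a half-open mask that includes the lower bound and excludes the upper one:

```python
import pandas as pd

# Hypothetical timestamps that carry a time of day
ts = pd.DataFrame({'date': pd.to_datetime(['2019-03-31 15:30',
                                           '2019-04-01 00:00',
                                           '2019-04-30 23:59'])})

lower, upper = pd.Timestamp('2019-04-01'), pd.Timestamp('2019-05-01')

# between() with the upper bound pulled back one day stops at 2019-04-30 00:00
closed = ts[ts['date'].between(lower, upper - pd.Timedelta(days=1))]

# A half-open mask [lower, upper) keeps everything up to the end of April
half_open = ts[(ts['date'] >= lower) & (ts['date'] < upper)]

print(len(closed), len(half_open))
```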

Answer 2

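The solution is to first build lists of months and years, because month 3 of 2019 looks back on months 1 and 2 of 2019 and on months 10, 11 and 12 of 2018, and then to filter the months by string matching.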

year = 2019
month = 3
month_list=[]
year_list=[]
for i in range(5):
    if month-i-2<0:
        month_list.append((month-i-2)%12)
        year_list.append(year-1)
    else:
         month_list.append((month-i-2))
         year_list.append(year)

month_list =  ["%02d" % (x+1) for x in month_list]
month_names = ['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec']
print(month_list)
dataframe_collection = {}

for i in range(5):
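    ## filtering year (assumes 'date' still holds 'YYYY-MM-DD' strings)
    df_temp = df[df['date'].str.contains(str(year_list[i]))]
    ## filtering month, applied to the rows already filtered by year
    df_temp = df_temp[df_temp['date'].str.contains('-' + month_list[i] + '-')]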

    dataframe_collection[month_names[int(month_list[i])-1]]=df_temp

for i in dataframe_collection:
    print(i)
    print(dataframe_collection[i])
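
The month and year bookkeeping above can also be left to pandas. This is a sketch of an alternative, not the approach used in this answer: a monthly pd.Period steps back across the year boundary when an integer is subtracted from it. The four-row `df` here is hypothetical and keeps the dates as strings, as this answer assumes:

```python
import pandas as pd

year, month, n = 2019, 3, 5

# The month of interest as a monthly Period; subtracting integers steps back whole months
target = pd.Period(year=year, month=month, freq='M')
previous = [target - i for i in range(1, n + 1)]
print([str(p) for p in previous])

# Hypothetical string dates
df = pd.DataFrame({'date': ['2018-12-24', '2019-01-15', '2019-02-14', '2019-03-08']})

# Compare year and month together by converting each date to its monthly Period
months = pd.to_datetime(df['date']).dt.to_period('M')
dataframe_collection = {str(p): df[months == p] for p in previous}
```

Keying by strings such as '2018-12' rather than by month name alone also keeps the same month of different years apart.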

Answer 3

You didn't post any code, so I can only point you in a direction:

Read the table into a pandas dataframe df_dbtable, join the two dataframes on column 0, and create a new dataframe df_new from the joined columns. Then truncate the SQL table and insert the new dataframe.

Have fun.
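
A minimal sketch of that direction, under loud assumptions: the question does not name the database, table or columns, so the in-memory SQLite database, the table `users` and the columns `id` and `pass` are all hypothetical stand-ins. The sketch reads the table, left-joins it onto the ids in ABC, empties the table and writes the joined result back:

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory table standing in for the real SQL table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (id INTEGER, pass TEXT)')
conn.executemany('INSERT INTO users VALUES (?, ?)',
                 [(123, 'pass1'), (456, 'pass2'), (789, 'pass3')])

# Dataframe ABC from the question: ids known, second column missing
abc = pd.DataFrame({'id': [789, 123, 456], 'pass': [None, None, None]})

# Read the table, then join the two dataframes on the id column
df_dbtable = pd.read_sql_query('SELECT id, pass FROM users', conn)
df_new = abc[['id']].merge(df_dbtable, on='id', how='left').sort_values('id')

# Empty the table and insert the joined result
conn.execute('DELETE FROM users')  # SQLite has no TRUNCATE statement
df_new.to_sql('users', conn, if_exists='append', index=False)
conn.commit()

rows = conn.execute('SELECT id, pass FROM users ORDER BY id').fetchall()
print(rows)
```

With another database you would pass a connection for that database instead, and a real TRUNCATE could replace the DELETE where it is supported.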

How do I update an SQL table with two columns by comparing it against a df with two columns in Python?

3 answers:
