Question

I have the following code that creates a dataframe based on user input:

import pandas as pd
from pandas import DataFrame

publications = pd.read_csv(
    "C:/Users/nkambhal/data/pubmed_search_results_180730.csv", sep="|"
)

# Replace missing titles with empty strings so the string methods below work
publications['title'] = publications['title'].fillna('')

search_term = input('Enter the term you are looking for: ')

# Quick preview of the matching rows (the result is not stored)
publications[['title', 'publication_ID']][publications['title'].str.contains(search_term, regex=False)]

# Case-insensitive, literal match of the search term against the titles
title_mask = publications['title'].str.lower().str.contains(search_term.lower(), regex=False)
new = publications.loc[title_mask, ['title', 'publication_ID']]

Now I want to use the publication IDs from the new dataframe in a SQL query.

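In the WHERE clause, I want the IDs from the new dataframe to appear. So if the dataframe contains the publication_ids (5, 6, 4), I want those added to the query.

How can I add the appropriate publication_ids to the SQL query, run it from Python, and save the result to a CSV file?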

Answer 1

To put data into a string, you can use Python's str.format method; the Python documentation covers it in more detail.

Your query string should look like this:

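query_string = """
SELECT
    author_profile.*,
    pub_lst.*
FROM
    pub_lst
JOIN
    author_profile
        ON pub_lst.author_id = author_profile.author_id
WHERE
    pub_lst.publication_id IN ({});
"""
# Join the IDs as plain integers; str(tuple(...)) would emit "(5,)" for a single ID
id_list = ", ".join(str(int(i)) for i in new["publication_ID"])
print(query_string.format(id_list))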
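Formatting values straight into the SQL text works for integer IDs that come from your own dataframe, but most database drivers also accept placeholders and fill in the values themselves. Here is a minimal sketch of that idea. The helper name build_in_query is made up for this example, and the placeholder symbol depends on the driver: sqlite3 uses ?, PyMySQL uses %s.

```python
def build_in_query(ids, placeholder="?"):
    """Return (sql, params) with one placeholder per publication ID.

    build_in_query is a hypothetical helper, not part of any library.
    """
    params = [int(i) for i in ids]  # plain ints, also works for NumPy integers
    if not params:
        raise ValueError("need at least one publication ID")
    marks = ", ".join([placeholder] * len(params))
    sql = (
        "SELECT author_profile.*, pub_lst.* "
        "FROM pub_lst "
        "JOIN author_profile ON pub_lst.author_id = author_profile.author_id "
        f"WHERE pub_lst.publication_id IN ({marks})"
    )
    return sql, params


sql, params = build_in_query([5, 6, 4])
print(sql)
print(params)
```

With the dataframe from the question you would call build_in_query(new["publication_ID"]) and pass both return values to the cursor's execute method.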

To run the query, you need a Python module for whichever database you are connecting to, such as PyMySQL for connecting to a MySQL database. https://pypi.org/project/PyMySQL/
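As a rough illustration of the run step, here is a sketch that uses the standard-library sqlite3 module as a stand-in, so it runs without a database server. The table contents and the name and title columns are invented for the example. With PyMySQL the flow is similar, but you would connect with pymysql.connect(...) and use %s instead of ? as the placeholder.

```python
import sqlite3

# In-memory database with made-up rows, standing in for the real server
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author_profile (author_id INTEGER, name TEXT);
    CREATE TABLE pub_lst (publication_id INTEGER, author_id INTEGER, title TEXT);
    INSERT INTO author_profile VALUES (1, 'A. Author'), (2, 'B. Author');
    INSERT INTO pub_lst VALUES (4, 1, 'first'), (5, 2, 'second'), (9, 1, 'third');
""")

ids = [5, 6, 4]  # would come from new["publication_ID"] in the question
marks = ", ".join(["?"] * len(ids))  # one placeholder per ID

query = f"""
SELECT author_profile.name, pub_lst.publication_id, pub_lst.title
FROM pub_lst
JOIN author_profile ON pub_lst.author_id = author_profile.author_id
WHERE pub_lst.publication_id IN ({marks})
ORDER BY pub_lst.publication_id
"""

# The driver substitutes the IDs for the placeholders
rows = conn.execute(query, ids).fetchall()
print(rows)
conn.close()
```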

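That said, you can use an ORM such as peewee or SQLAlchemy to make your life easier when working with SQL databases. pandas and SQLAlchemy work well together, but peewee is easier to get started with.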
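If you are already in pandas, one option is to let it run the query and hand you a dataframe. The sketch below assumes pandas is installed and, to stay self-contained, uses a plain sqlite3 connection with invented rows rather than a SQLAlchemy engine. pandas.read_sql_query also accepts a SQLAlchemy connectable in the same position.

```python
import sqlite3

import pandas as pd

# In-memory database with made-up rows, standing in for the real server
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pub_lst (publication_id INTEGER, title TEXT);
    INSERT INTO pub_lst VALUES (4, 'first'), (5, 'second'), (9, 'third');
""")

ids = [5, 6, 4]  # would come from new["publication_ID"] in the question
marks = ", ".join(["?"] * len(ids))

# pandas runs the query and returns the result as a DataFrame
result = pd.read_sql_query(
    f"SELECT publication_id, title FROM pub_lst "
    f"WHERE publication_id IN ({marks}) ORDER BY publication_id",
    conn,
    params=ids,
)
conn.close()

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = result.to_csv(index=False)
print(csv_text)
```

To write a file instead, pass a path, for example result.to_csv("results.csv", index=False), where results.csv is just a placeholder name.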

To create a CSV file you can use, in ascending order of difficulty, Python's built-in csv module, pandas, or peewee or SQLAlchemy.
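For the built-in csv module route, a minimal sketch could look like the following. It again uses sqlite3 with invented rows as a stand-in database, and it writes into an in-memory buffer so that it runs anywhere. Swap the buffer for open("results.csv", "w", newline="") to write a real file (results.csv is a placeholder name).

```python
import csv
import io
import sqlite3

# In-memory database with made-up rows, standing in for the real server
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pub_lst (publication_id INTEGER, title TEXT);
    INSERT INTO pub_lst VALUES (4, 'first'), (5, 'second'), (9, 'third');
""")

cur = conn.execute(
    "SELECT publication_id, title FROM pub_lst "
    "WHERE publication_id IN (?, ?, ?) ORDER BY publication_id",
    [5, 6, 4],
)

# Column names come from the cursor, so the header matches the query
header = [col[0] for col in cur.description]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(cur)  # a cursor is iterable and yields one tuple per row
conn.close()

print(buffer.getvalue())
```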

Altering and running a SQL query with Python
