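How do I merge these two data frames?

Asked: 2020-10-14 22:50:31

Tags: python pandas dataframe

I'm stuck here. I built two data frames, each from its own dict of lists, so I have two tables with two columns each. The first table has the product name and brand columns, and the second has the product name and shipping columns. I'm trying to do a one-to-one join so that I end up with all three columns in a single table. It gives me an error: KeyError: 'shipping'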

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import pandas as pd
from collections import defaultdict
import re

url='https://www.newegg.com/PS4-Video-Games/SubCategory/ID-3141'

with uReq(url) as uClient:
    page = uClient.read()

# parsing
page_soup = soup(page, "html.parser")

# grabs products
containers= page_soup.findAll("div",{"class":"item-container"})

# file
filename = "products.csv"

d = defaultdict(list)
d1 = defaultdict(list)

# fill dict
for container in containers:
    brand = container.div.div.a.img["title"]

    title = container.findAll("a", {"class":"item-title"})
    product_name = title[0].text

    shipping_container = container.findAll("li", {"class":"price-ship"})
    shipping = shipping_container[0].text.strip()
    
    d['brand'].append(brand)
    d['product'].append(product_name)
    d1['product'].append(product_name)
    d1['shipping'].append(shipping)
    
# create dataframe    
df = pd.DataFrame(d)
df1 = pd.DataFrame(d1)



# clean shipping column
df['shipping'] = df['shipping'].apply(lambda x: 0 if x == 'Free Shipping' else x)
df['shipping'] = df['shipping'].apply(lambda x: 0 if x == 'Special Shipping' else x) # probably should be handled in a special way
df['shipping'] = df['shipping'].apply(lambda x: x if x == 0 else re.sub("[^0-9]", "", x))
df['shipping'] = df['shipping'].astype(float)

# save dataframe to csv file
df.to_csv('dataframe.csv', index=False)
df1.to_csv('dataframe1.csv', index=False)
# choose rows where shipping is less than 5.99
#print(df[df['shipping'] > 200])
    
#merge two data sets 
df3 = pd.merge(df,df1)
print(df3)

1 answer:

Answer 0 (score: 0)

Merge on the shared product column. Depending on your preference, you can use either a left join or an inner join:

df3 = df.merge(df1, on="product", how="left")  # or how="inner"
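For context, the KeyError in the question is raised before the merge is ever reached: the cleaning step reads df['shipping'], but df only holds brand and product, while shipping lives in df1. The following is a minimal sketch of one way to arrange the steps, not the asker's final code. It uses made-up rows in place of the scraped data, and `parse_shipping` is a hypothetical helper introduced only for illustration. It also keeps the decimal point when extracting the price, since stripping every non-digit would turn "$4.99 Shipping" into 499.

```python
import re

import pandas as pd

# Toy stand-ins for the scraped data (hypothetical values, not real listings).
df = pd.DataFrame({
    "brand": ["Sony", "EA", "Ubisoft"],
    "product": ["Game A", "Game B", "Game C"],
})
df1 = pd.DataFrame({
    "product": ["Game A", "Game B", "Game C"],
    "shipping": ["Free Shipping", "$4.99 Shipping", "Special Shipping"],
})


def parse_shipping(text):
    """Convert a shipping label to a float cost (hypothetical helper)."""
    if text in ("Free Shipping", "Special Shipping"):
        return 0.0
    # Keep the decimal point so "$4.99 Shipping" becomes 4.99, not 499.
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else float("nan")


# Clean the column on the frame that actually has it (df1, not df).
df1["shipping"] = df1["shipping"].apply(parse_shipping)

# Join on the shared key; validate raises MergeError if keys are not one-to-one.
df3 = df.merge(df1, on="product", how="left", validate="one_to_one")
print(df3)
```

With real scraped data the same product name may appear more than once, in which case validate="one_to_one" raises a MergeError. Dropping that argument, or deduplicating first, is a judgment call that depends on what the duplicates mean.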