Question

I am working with a JSON file, from which I get the following DataFrame by running this code:

Now I need to split the information in the "topic" column into two separate columns:

This is the expected result:

import pandas as pd

# df is the DataFrame loaded from the JSON file.
# Pull every board name and every totalCount out of the raw JSON string.
topics = df.set_index('username').popular_board_data.str.extractall(r'name":"([^,]*)')
total = df.set_index('username').popular_board_data.str.extractall(r'totalCount\":([^,}]*)')

# Pair each topic with its total, one row per (username, pair).
data = []
for username in df.username.unique():
    for topic in zip(topics[0][username], total[0][username]):
        data.append([username, topic])

df_topic = pd.DataFrame(data, columns='username,topic'.split(','))

    username        topic
0     lukl    (Hardware", 80)
1     lukl    (Marketplace", 31)
2     lukl    (Atari 5200", 27)
3     lukl    (Atari 8-Bit Computers", 9)
4     lukl    (Modern Gaming", 3)

However, I am trying to do this with the following code:

    username        topic          _topic       _total
0     lukl    (Hardware", 80)      Hardware     80
1     lukl    (Marketplace", 31)   Marketplace  31
2     lukl    (Atari 5200", 27)    Atari 5200   27
3     lukl    (Atari 8", 9)        Atari 8      9
4     lukl    (Modern", 3)         Modern       3

But I get this error:

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

Answer 1

I think the column contains tuples, so simply use the DataFrame constructor:

df_topic[['_topic', '_total']]=pd.DataFrame(df_topic['topic'].values.tolist(), 
                                index=df_topic.index)
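
As a self-contained sketch of the constructor approach, assuming each cell of topic really is a two-element tuple: the sample rows below are made up for illustration, and join is used here in place of the multi-column assignment.

```python
import pandas as pd

# Hypothetical sample data: each "topic" cell is a (name, total) tuple.
df_topic = pd.DataFrame({
    "username": ["lukl", "lukl", "lukl"],
    "topic": [("Hardware", "80"), ("Marketplace", "31"), ("Atari 5200", "27")],
})

# Expand the tuples into two columns, keeping the original row index.
parts = pd.DataFrame(
    df_topic["topic"].tolist(),
    index=df_topic.index,
    columns=["_topic", "_total"],
)

# Attach the new columns and turn the totals into integers.
df_topic = df_topic.join(parts)
df_topic["_total"] = df_topic["_total"].astype(int)
print(df_topic)
```

Because the tuples are unpacked positionally, a topic name that itself contains a comma or digits is left intact.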

A better solution, using the data from the previous answer, with concat and DataFrame.reset_index:

df = [{"username": "last",
    "popular_board_data": "{\"boards\":[{\"postCount\":\"75\",\"topicCount\":\"5\",\"name\":\"Hardware\",\"url\",\"totalCount\":80},{\"postCount\":\"20\",\"topicCount\":\"11\",\"name\":\"Marketplace\",\"url\",\"totalCount\":31},{\"postCount\":\"26\",\"topicCount\":\"1\",\"name\":\"Atari 5200\",\"url\",\"totalCount\":27},{\"postCount\":\"9\",\"topicCount\":0,\"name\":\"Atari 8\",\"url\"\"totalCount\":9}"
    },
    {"username": "truk",
     "popular_board_data": "{\"boards\":[{\"postCount\":\"351\",\"topicCount\":\"11\",\"name\":\"Atari 2600\",\"url\",\"totalCount\":362},{\"postCount\":\"333\",\"topicCount\":\"22\",\"name\":\"Hardware\",\"url\",\"totalCount\":355},{\"postCount\":\"194\",\"topicCount\":\"8\",\"name\":\"Marketplace\",\"url\",\"totalCount\":202}"
    }]
df = pd.DataFrame(df)

# a closing " was added to the pattern so that it is not captured in the output
topics = df.set_index('username').popular_board_data.str.extractall(r'name":"([^,]*)"')
total = df.set_index('username').popular_board_data.str.extractall(r'totalCount\":([^,}]*)')

df1 = pd.concat([topics[0], total[0]], axis=1, keys=['_topic', '_total'])
df1 = df1.reset_index(level=1, drop=True).reset_index()
print (df1)
  username       _topic _total
0     last     Hardware     80
1     last  Marketplace     31
2     last   Atari 5200     27
3     last      Atari 8      9
4     truk   Atari 2600    362
5     truk     Hardware    355
6     truk  Marketplace    202
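
If the popular_board_data strings are complete, valid JSON, parsing them may be more robust than regular expressions. This is only a sketch of that alternative, not part of the answer above: the shortened sample strings and the df_boards name are assumptions made for illustration.

```python
import json

import pandas as pd

# Hypothetical, shortened sample: every popular_board_data value is valid JSON.
raw = pd.DataFrame({
    "username": ["last", "truk"],
    "popular_board_data": [
        '{"boards":[{"name":"Hardware","totalCount":80},'
        '{"name":"Atari 5200","totalCount":27}]}',
        '{"boards":[{"name":"Atari 2600","totalCount":362}]}',
    ],
})

# Build one row per board by decoding each JSON string.
rows = [
    {"username": user, "_topic": board["name"], "_total": board["totalCount"]}
    for user, payload in zip(raw["username"], raw["popular_board_data"])
    for board in json.loads(payload)["boards"]
]
df_boards = pd.DataFrame(rows)
print(df_boards)
```

With this approach the totals arrive as integers and board names containing commas or quotes need no special handling.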

Answer 2

I treat the topic as a string (converting it to a string if it is not one already):

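# Sample data in which each topic is already a string such as '(Hardware, 80)'
df = pd.DataFrame(data={"username":['luk1','luk1','luk1'],
                  'topic':[ '(Hardware, 80)','(Marketplace, 31)', '(Atari 5200, 27)']})
# Text before the first comma, minus the leading "("
df['_topic'] = df['topic'].apply(lambda x:str(x).split(",")[0][1:])
# Text after the first comma, minus the trailing ")" and surrounding whitespace
df['_total'] = df['topic'].apply(lambda x:str(x).split(",")[1][:-1].strip())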

Answer 3

You can use the following regular expressions:

df['_topic'] = df['topic'].str.extract(r'([a-zA-Z]+)')
df['_total'] = df['topic'].str.extract(r'(\d+)')

  username                        topic       _topic _total
0     lukl              (Hardware", 80)     Hardware     80
1     lukl           (Marketplace", 31)  Marketplace     31
2     lukl            (Atari 5200", 27)        Atari   5200
3     lukl  (Atari 8-Bit Computers", 9)        Atari      8
4     lukl          (Modern Gaming", 3)       Modern      3
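
The output above shows the two simple patterns misreading names that contain spaces or digits, such as "Atari 5200". A possible refinement, sketched here on the assumption that topic holds strings shaped exactly like the printed values, is a single anchored pattern with named groups. The pattern and the df_str name are illustrative and not part of the original answer.

```python
import pandas as pd

# Hypothetical sample: topic values are strings shaped like the printed output.
df_str = pd.DataFrame({
    "username": ["lukl"] * 3,
    "topic": ['(Hardware", 80)', '(Atari 5200", 27)', '(Atari 8-Bit Computers", 9)'],
})

# _topic: everything after "(" up to an optional quote and the final comma.
# _total: the digits just before the closing ")".
pattern = r'^\((?P<_topic>.*?)"?,\s*(?P<_total>\d+)\)$'

# Named groups become the column names of the extracted DataFrame.
parts = df_str["topic"].str.extract(pattern)
df_str = df_str.join(parts)
df_str["_total"] = df_str["_total"].astype(int)
print(df_str)
```

Anchoring the total to the end of the string keeps digits inside the name from being taken as the count.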
