The columns of my DataFrame contain lists of strings:
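import pandas as pd

data = [{'column A': '3 item X; 4 item Y; item E of size 7', 'column B': 'item I of size 10; item X has 5 specificities; characteristic W'},
{'column A': '13 item X; item F of size 0; 9 item Y', 'column B': 'item J of size 11; item Y has 8 specificities'}]
df = pd.DataFrame(data)
I want to extract the numeric information from the strings that contain integers in each row.
For example, I need to create a new column named 'Size item E' that takes the value 7 in the first row, because 'column A' of that row contains 'item E of size 7'.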
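If a value in the list of strings contains no number, I just want to encode it as 1 or 0, depending on whether it is present in the original list.
Here is a summary of my desired output:
Here is what I have written so far, applying only one rule: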
item E of size 7
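This returned the following DataFrame:
As you can see, I cannot apply my feature extraction row by row; it updates the entire pandas Series. Is there a way to update the new column's value for each row step by step?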
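Answer 0 (score: 0)
There is no need for complicated functions: pandas has very good string-manipulation methods. Check the code below to get the desired output.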
import pandas as pd

data = [{'column A': '3 item X; 4 item Y; item E of size 7', 'column B': 'item I of size 10; item X has 5 specificities; characteristic W'},
{'column A': '13 item X; item F of size 0; 9 item Y', 'column B': 'item J of size 11; item Y has 8 specificities'}]
df = pd.DataFrame(data)
# Join the two columns with ';' so each rule searches a single string per row
df['All Columns joined'] = df[['column A', 'column B']].apply(lambda x: ';'.join(x), axis=1)
# Create an empty DataFrame to hold the extracted features
df_new = pd.DataFrame([])
# str.extract returns the first capture group for each row (NaN when the pattern does not match)
df_new['Nb item X'] = df['All Columns joined'].str.extract(r'([0-9]+) item X', expand=False)
df_new['Nb item Y'] = df['All Columns joined'].str.extract(r'([0-9]+) item Y', expand=False)
df_new['Nb specificities item X'] = df['All Columns joined'].str.extract(r'item X has ([0-9]+) specificities', expand=False)
df_new['Nb specificities item Y'] = df['All Columns joined'].str.extract(r'item Y has ([0-9]+) specificities', expand=False)
df_new['Size item E'] = df['All Columns joined'].str.extract(r'item E of size ([0-9]+)', expand=False)
df_new['Size item F'] = df['All Columns joined'].str.extract(r'item F of size ([0-9]+)', expand=False)
df_new['Size item I'] = df['All Columns joined'].str.extract(r'item I of size ([0-9]+)', expand=False)
df_new['Size item J'] = df['All Columns joined'].str.extract(r'item J of size ([0-9]+)', expand=False)
# Item without a number: encode as 1 if present in the row, 0 otherwise
df_new['characteristic W'] = df['All Columns joined'].str.extract(r'(characteristic W)', expand=False).notnull().astype(int)
df_new
   Nb item X  Nb item Y  Nb specificities item X  Nb specificities item Y  Size item E  Size item F  Size item I  Size item J  characteristic W
0          3          4                        5                      NaN            7          NaN           10          NaN                 1
1         13          9                      NaN                        8          NaN            0          NaN           11                 0
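Note that str.extract returns the matched digits as strings, so the numeric columns above are not yet numbers. A possible more compact variant, offered only as a sketch: it keeps the same regular expressions in a lookup table and converts the results to a nullable integer type. The names joined, numeric_rules, flag_rules and features are hypothetical and do not appear in the original answer.

```python
import pandas as pd

data = [
    {'column A': '3 item X; 4 item Y; item E of size 7',
     'column B': 'item I of size 10; item X has 5 specificities; characteristic W'},
    {'column A': '13 item X; item F of size 0; 9 item Y',
     'column B': 'item J of size 11; item Y has 8 specificities'},
]
df = pd.DataFrame(data)

# Join both columns row by row so each rule searches one string per row.
joined = df['column A'].str.cat(df['column B'], sep=';')

# Hypothetical rule table: output column -> regex with exactly one capture group.
numeric_rules = {
    'Nb item X': r'(\d+) item X',
    'Nb item Y': r'(\d+) item Y',
    'Nb specificities item X': r'item X has (\d+) specificities',
    'Nb specificities item Y': r'item Y has (\d+) specificities',
    'Size item E': r'item E of size (\d+)',
    'Size item F': r'item F of size (\d+)',
    'Size item I': r'item I of size (\d+)',
    'Size item J': r'item J of size (\d+)',
}
# Hypothetical presence rules: output column -> literal text to look for.
flag_rules = {'characteristic W': 'characteristic W'}

features = pd.DataFrame(index=df.index)
for name, pattern in numeric_rules.items():
    extracted = joined.str.extract(pattern, expand=False)
    # Convert the digit strings to numbers; 'Int64' keeps missing values as <NA>.
    features[name] = pd.to_numeric(extracted).astype('Int64')
for name, literal in flag_rules.items():
    # 1 if the literal text occurs in the row, 0 otherwise.
    features[name] = joined.str.contains(literal, regex=False).astype(int)

print(features)
```

Every rule is evaluated per row by the vectorized string methods, so no explicit loop over rows is needed; adding a feature only means adding one entry to a rule table.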