I have a DataFrame in pandas that looks like this:
But I want to convert it into a table like this:
Is it possible to do this by applying a row-wise function in pandas (or some other function)? The DataFrame is:
df = pd.DataFrame([[4, 9],[4,9],[[1,2],[3,4]]], columns=['A', 'B'])
df
        A       B
0       4       9
1       4       9
2  [1, 2]  [3, 4]
Answer 0: (score: 3)
Use a list comprehension with chain to flatten the values:
from itertools import chain
out = list(chain.from_iterable(item if isinstance(item[0],list)
else [item] for item in df[['A','B']].values))
df1 = pd.DataFrame(out, columns=['A','B'])
Or an equivalent loop:
out = []
for x in df[['A','B']].values:
    # a row whose first cell is a list contributes one output row per cell
    if isinstance(x[0], list):
        for y in x:
            out.append(y)
    else:
        out.append(x)
df1 = pd.DataFrame(out, columns=['A','B'])
print(df1)
   A  B
0  4  9
1  4  9
2  1  2
3  3  4
Answer 1: (score: 1)
Use a list comprehension inside concat:
df = pd.DataFrame([[4, 9],[4,9],[[1,2],[3,4]],], columns=['A', 'B'])
print (pd.concat([df.loc[:1], *[pd.DataFrame(list(i),columns=df.columns) for i in df.loc[2:].to_numpy()]],
ignore_index=True))
   A  B
0  4  9
1  4  9
2  1  2
3  3  4
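The slices df.loc[:1] and df.loc[2:] above hard-code which rows hold lists. A minimal sketch of the same concat idea that detects list rows instead, assuming (as in the question) that a row is either all scalars or all lists; the names pieces and result are illustrative only:

```python
import pandas as pd

df = pd.DataFrame([[4, 9], [4, 9], [[1, 2], [3, 4]]], columns=['A', 'B'])

pieces = []
for _, row in df.iterrows():
    if isinstance(row.iloc[0], list):
        # Assumption: every cell in this row is a list; each list becomes one output row
        pieces.append(pd.DataFrame(list(row), columns=df.columns))
    else:
        # Scalar row: keep it as a one-row frame
        pieces.append(row.to_frame().T)

result = pd.concat(pieces, ignore_index=True)
print(result)
```

This trades the speed of the slicing version for not having to know the row positions in advance.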
Answer 2: (score: 1)
You can do it like this:
#main piece - the rest is actually 'fixing' the multiindex piece to fit your purpose:
df=df.stack().explode().to_frame()
df["id"]=df.groupby(level=[0,1]).cumcount()
df.index=pd.MultiIndex.from_tuples(zip(df.index.get_level_values(0)+df['id'], df.index.get_level_values(1)))
df=df.drop(columns="id").unstack()
df.columns=map(lambda x: x[1], df.columns)
Output:
>>> df
   A  B
0  4  9
1  4  9
2  1  3
3  2  4
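Note that the last two rows here are 1 3 and 2 4, while the chain-based answer gives 1 2 and 3 4. The two results come from reading the list cells differently. A small sketch on plain lists, with hypothetical variable names, may make the difference clearer:

```python
# Cells from the last row of the question's DataFrame
a_cell, b_cell = [1, 2], [3, 4]

# Reading 1: each list is itself a new (A, B) row
rows_from_lists = [a_cell, b_cell]

# Reading 2: each list stays in its own column and is expanded downwards,
# so the new rows pair up elements by position
rows_from_columns = [list(pair) for pair in zip(a_cell, b_cell)]

print(rows_from_lists)
print(rows_from_columns)
```

Which reading is wanted depends on what the lists mean in the original data, so it is worth checking before picking an answer.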
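Answer 3: (score: 0)
Use DataFrame.apply, Series.explode, DataFrame.mask and DataFrame.where:
# applymap is deprecated since pandas 2.1 in favor of DataFrame.map
types = df.applymap(type).eq(list)
arr = df.where(types).apply(pd.Series.explode).dropna().T.to_numpy()
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
pd.concat([df.mask(types).dropna(), pd.DataFrame(arr, columns=df.columns)], ignore_index=True)
   A  B
0  4  9
1  4  9
2  1  2
3  3  4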
Answer 4: (score: 0)
Use a simple for loop with if checks:
alist = df['A'].tolist()
blist = df['B'].tolist()
alist1=[]
blist1=[]
for k,r in zip(alist,blist):
    # a list in column A becomes one new (A, B) row
    if isinstance(k,list):
        alist1.append(k[0])
        blist1.append(k[1])
    # a list in column B becomes another new row; scalar rows are copied as-is
    if isinstance(r,list):
        alist1.append(r[0])
        blist1.append(r[1])
    else:
        alist1.append(k)
        blist1.append(r)
df = pd.DataFrame({'A': alist1, 'B': blist1})
Answer 5: (score: 0)
An alternative to the other solutions proposed so far, using DataFrame.melt, DataFrame.explode and DataFrame.pivot:
import pandas as pd
df = pd.DataFrame([[4, 9],[4,9],[[1,2],[3,4]]], columns=['A', 'B'])
# Create index column
df.reset_index(inplace=True)
tmp = df.melt(id_vars='index', var_name='columns').explode('value')
# Define indexes
idx = sum([list(range(len(tmp)//tmp['columns'].nunique())) for _ in range(tmp['columns'].nunique())], [])
tmp['index'] = idx
result_df = tmp.pivot(index='index', columns='columns', values='value')
result_df
columns  A  B
index
0        4  9
1        4  9
2        1  3
3        2  4
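Answer 6: (score: 0)
There is one catch in this question: nothing guarantees that the lists in the same row always have the same length. If that assumption holds, the following answer works:
df.apply(pd.Series.explode)
   A  B
0  4  9
1  4  9
2  1  3
2  2  4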
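The equal-length assumption can be checked before exploding. The following is only a sketch: the helper list_len and the names lengths, same_length and result are made up for illustration, and scalars are counted as length 1. It also resets the duplicated index (2, 2) that explode leaves behind:

```python
import pandas as pd

df = pd.DataFrame([[4, 9], [4, 9], [[1, 2], [3, 4]]], columns=['A', 'B'])

def list_len(value):
    # Hypothetical helper: lists count by length, anything else counts as 1
    return len(value) if isinstance(value, list) else 1

# Per-cell lengths, then check that every row has a single distinct length
lengths = df.apply(lambda col: col.map(list_len))
same_length = bool(lengths.nunique(axis=1).eq(1).all())

if same_length:
    # Explode each column independently, then renumber the rows
    result = df.apply(pd.Series.explode).reset_index(drop=True)
else:
    raise ValueError("lists in the same row have different lengths")

print(result)
```

If the check fails, one of the row-by-row answers above is probably a safer starting point than a column-wise explode.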