I have a CSV file like this:
Time [s],Channel 0-Analog, Time [s],Reset-Digital, Time [s],Channel 1-Digital, Time [s],Channel 2-Digital, Time [s],Channel 3-Digital
-0.002204166666667, 2048.000000000000000, -0.002204166666667, 1, -0.002204166666667, 0, -0.002204166666667, 1, -0.002204166666667, 1
-0.002204000000000, 2048.000000000000000, -0.001124000000000, 0, -0.001504666666667, 1, -0.001448500000000, 0, -0.000199666666667, 0
-0.002203833333333, 2048.000000000000000, -0.000000000000000, 1, 0.000301666666667, 0, 0.000841666666667, 1, 0.000056333333333, 1
-0.002203666666667, 2048.000000000000000, 0.000550833333333, 0, 0.000932000000000, 1, 0.003178666666667, 0, 0.002361000000000, 0
-0.002203500000000, 2048.000000000000000, 0.003259333333333, 1, 0.002538166666667, 0, 0.005142333333333, 1, 0.004062000000000, 1
-0.002203333333333, 2048.000000000000000, 0.005602833333333, 0, ...
I want a single data frame in which the time axis appears only once.
The idea is to create two data frames and merge them on the column Time [s]. So I wrote this:
import pandas as pd
df1 = pd.read_csv('untitled.csv',usecols=[2,3])
df2 = pd.read_csv('untitled.csv',usecols=[4,5])
merged = pd.merge(df1,df2,on=r'Time [s]')
But it did not work: KeyError: 'Time [s]'
/**************************************************************************/
I found that pandas appends a number to duplicated column names, so I changed my code:
df1 = pd.read_csv('untitled.csv',usecols=[2,3])
df2 = pd.read_csv('untitled.csv',usecols=[4,5])
# strip the '.1', '.2', '.3' suffixes and stray spaces from both ends of each name
df1.columns = df1.columns.str.strip('.123 ')
df2.columns = df2.columns.str.strip('.123 ')
merged = pd.merge(df1,df2,on=r'Time [s]',how='outer')
# set_index returns a new frame, so keep the result
merged = merged.set_index(r'Time [s]')
But now the problem is that the index is only sorted within groups: first come all rows where both columns have a number, then the rows where only the first column is not NaN, then the rows where only the second column is not NaN. What I need is all rows in one time-ordered sequence.
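For anyone who wants to reproduce the two behaviours described in the question, here is a minimal sketch. The inline CSV is a made-up, shortened sample (not the original file, and without the spaces after the commas), and the names `csv_text` and `sorted_merged` are only illustrative. It shows the numeric suffixes pandas adds to duplicated headers, and that sorting the index after the outer merge gives a time-ordered result regardless of the order the merge itself returns, which may differ between pandas versions.

```python
import io
import pandas as pd

# Hypothetical, shortened sample in the same layout as the question's file.
csv_text = """Time [s],Channel 0-Analog,Time [s],Reset-Digital,Time [s],Channel 1-Digital
-0.003,2048,-0.002,1,-0.002,0
-0.002,2048,-0.001,0,0.000,1
-0.001,2048,0.001,1,0.001,0
"""

df = pd.read_csv(io.StringIO(csv_text))
# Duplicated headers get a numeric suffix: 'Time [s]', 'Time [s].1', 'Time [s].2'
print(list(df.columns))

# Select each (time, signal) pair by position and give the time column a common name.
df1 = df.iloc[:, [2, 3]].rename(columns={'Time [s].1': 'Time [s]'})
df2 = df.iloc[:, [4, 5]].rename(columns={'Time [s].2': 'Time [s]'})

merged = pd.merge(df1, df2, on='Time [s]', how='outer')

# Sort explicitly instead of relying on the row order produced by the merge.
sorted_merged = merged.set_index('Time [s]').sort_index()
print(sorted_merged)
```

Renaming by exact column name is used here instead of `str.strip('.123 ')` so that a signal name ending in a digit cannot be altered by accident.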
Answer 0 (score: 0)
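I would suggest a simpler approach using pd.melt:
use the column names starting with Time as the keys and the column names containing Channel as the values;
use df.drop("variable", axis=1) to get rid of the extra column created by melt.
Code example: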
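import pandas as pd
df = pd.read_csv('untitled.csv')
keys = [col for col in df.columns if col.startswith('Time')]
# note: 'Reset-Digital' does not start with 'Channel', so it is not selected here
values = [col for col in df.columns if col.startswith('Channel')]
melted = pd.melt(df, id_vars=values, value_vars=keys, value_name='Time')
melted = melted.drop("variable", axis=1)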
Note: my answer was inspired by this :-)
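To make the shape of the melted result concrete, here is a small sketch on made-up inline data (the sample values and the names `csv_text` and `long_df` are assumptions for illustration, not taken from the original file). It applies the same pd.melt call to two Time/Channel pairs and drops the helper column.

```python
import io
import pandas as pd

# Hypothetical two-pair sample; headers have no spaces after the commas.
csv_text = """Time [s],Channel 1-Digital,Time [s],Channel 2-Digital
0.0,1,0.0,0
0.1,0,0.2,1
"""

df = pd.read_csv(io.StringIO(csv_text))
keys = [col for col in df.columns if col.startswith('Time')]
values = [col for col in df.columns if col.startswith('Channel')]

# Each Time column is stacked into one 'Time' column; the Channel columns
# are repeated once per stacked Time column.
long_df = pd.melt(df, id_vars=values, value_vars=keys, value_name='Time')
long_df = long_df.drop('variable', axis=1)
print(long_df)
```

Every original row appears once per Time column, so the result has len(df) * len(keys) rows, and each row carries the values of all Channel columns from its source row.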
Answer 1 (score: 0)
This solution works if all column names are unique and each Time column is the column immediately before its signal column:
#get all columns with Digital text
d = df.columns[df.columns.str.contains('Digital')]
print (d)
Index(['Reset-Digital', 'Channel 1-Digital', 'Channel 2-Digital',
'Channel 3-Digital'],
dtype='object')
#get the column just before each signal column (its Time column)
#newer versions of pandas append .1, .2, ... to duplicated Time column names
td = df.columns[df.columns.get_indexer(d) - 1]
print(td)
Index(['Time [s].1', 'Time [s].2', 'Time [s].3', 'Time [s].4'], dtype='object')
#zip time and signal column and concat data
df = pd.concat([df.set_index(x[0])[x[1]] for x in zip(td, d)], axis=1)
print (df)
Reset-Digital Channel 1-Digital Channel 2-Digital \
-0.002204 1.0 0.0 1.0
-0.001505 NaN 1.0 NaN
-0.001448 NaN NaN 0.0
-0.001124 0.0 NaN NaN
-0.000200 NaN NaN NaN
-0.000000 1.0 NaN NaN
0.000056 NaN NaN NaN
0.000302 NaN 0.0 NaN
0.000551 0.0 NaN NaN
0.000842 NaN NaN 1.0
0.000932 NaN 1.0 NaN
0.002361 NaN NaN NaN
0.002538 NaN 0.0 NaN
0.003179 NaN NaN 0.0
0.003259 1.0 NaN NaN
0.004062 NaN NaN NaN
0.005142 NaN NaN 1.0
Channel 3-Digital
-0.002204 1.0
-0.001505 NaN
-0.001448 NaN
-0.001124 NaN
-0.000200 0.0
-0.000000 NaN
0.000056 1.0
0.000302 NaN
0.000551 NaN
0.000842 NaN
0.000932 NaN
0.002361 0.0
0.002538 NaN
0.003179 NaN
0.003259 NaN
0.004062 1.0
0.005142 NaN
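The same pairing technique as a self-contained sketch that can be run without the original file. The inline data is a made-up, shortened sample, and `csv_text`, `signals`, `times` and `aligned` are illustrative names. An explicit sort_index() is added so the result does not depend on the order in which concat builds the combined index; that step is an addition to the code above, not part of it.

```python
import io
import pandas as pd

# Hypothetical sample: each signal column is preceded by its own Time column.
csv_text = """Time [s],Reset-Digital,Time [s],Channel 1-Digital
-0.002,1,-0.002,0
-0.001,0,0.000,1
0.001,1,0.003,0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Signal columns, and the column located just before each of them.
signals = df.columns[df.columns.str.contains('Digital')]
times = df.columns[df.columns.get_indexer(signals) - 1]

# Index every signal by its own time stamps, then align all of them side by side.
aligned = pd.concat(
    [df.set_index(t)[s] for t, s in zip(times, signals)],
    axis=1,
).sort_index()
aligned.index.name = 'Time [s]'
print(aligned)
```

This sketch assumes the time stamps within each Time column are unique; with repeated time stamps in one column the alignment step may fail and the data would need de-duplicating first.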