I'm new to Python and have been struggling with this for a while. I have a file that looks like this:
name seq
1 a1 bbb
2 a2 bbc
3 b1 fff
4 b2 fff
5 c1 aaa
6 c2 acg
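Here name is the name of the string and seq is the string itself. I want a new column, or a new dataframe, that gives the number of differences between each non-overlapping pair of rows. For example, I want the number of differences between the sequences for the names [a1-a2], then [b1-b2], and finally [c1-c2].
So I need something like this: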
name seq diff
1 a1 bbb NA
2 a2 bbc 1
3 b1 fff NA
4 b2 fff 0
5 c1 aaa NA
6 c2 acg 2
Any help is greatly appreciated.
Answer 0 (score: 5)
It looks like you want the Jaccard distance between the strings. Here is one way to do it, using groupby and scipy.spatial.distance.jaccard:
from scipy.spatial.distance import jaccard
g = df.groupby(df.name.str[0])
df['diff'] = [sim for _, seqs in g.seq for sim in
              [float('nan'), jaccard(*map(list, seqs))]]
print(df)
name seq diff
1 a1 bbb NaN
2 a2 bbc 1.0
3 b1 fff NaN
4 b2 fff 0.0
5 c1 aaa NaN
6 c2 acg 2.0
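SciPy documents `scipy.spatial.distance.jaccard` as a dissimilarity between boolean arrays, expressed as a fraction rather than a count. If a raw count of differing positions is what is wanted, one option is to compare the characters directly. The sketch below is an illustration rather than part of the answer: `count_mismatches` is a hypothetical helper, and it assumes equal-length strings, exactly two rows per group, and rows already ordered by the first letter of name.

```python
import pandas as pd


def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Sample data from the question (index 1..6 as shown there).
df = pd.DataFrame(
    {"name": ["a1", "a2", "b1", "b2", "c1", "c2"],
     "seq": ["bbb", "bbc", "fff", "fff", "aaa", "acg"]},
    index=range(1, 7),
)

# Group rows by the first letter of `name` (assumption: each pair shares it).
g = df.groupby(df["name"].str[0])["seq"]

# For each group, emit NaN for the first row and the count for the second.
df["diff"] = [v for _, seqs in g
              for v in (float("nan"), count_mismatches(*seqs))]
print(df)
```

Because `groupby` sorts the group keys, this relies on the rows already being in that order; otherwise the computed values would not line up with the rows.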
Answer 1 (score: 4)
An alternative using the Levenshtein distance:
import Levenshtein
s = df['name'].str[0]
out = df.assign(Diff=s.drop_duplicates(keep='last').map(df.groupby(s)['seq']
    .apply(lambda x: Levenshtein.distance(x.iloc[0], x.iloc[-1]))))
name seq Diff
1 a1 bbb NaN
2 a2 bbc 1.0
3 b1 fff NaN
4 b2 fff 0.0
5 c1 aaa NaN
6 c2 acg 2.0
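Levenshtein distance counts insertions and deletions as well as substitutions, so it can be smaller than a position-by-position count even for strings of equal length. The pure-Python sketch below is only meant to show that difference. The function `levenshtein` here is a hypothetical stand-in written for illustration, not the API of the Levenshtein package.

```python
def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def positional_diffs(a, b):
    """Count of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


for a, b in [("bbb", "bbc"), ("fff", "fff"), ("aaa", "acg"), ("abc", "bca")]:
    print(a, b, levenshtein(a, b), positional_diffs(a, b))
```

For the three pairs in the question the two measures agree. The extra pair "abc" and "bca" differs at all three positions but is only two edits apart, so the choice of measure matters if sequences can be shifted.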
Answer 2 (score: 1)
As a first step, I recreated the data with:
#!/usr/bin/env python3
import pandas as pd
# Setup
data = {'name': {1: 'a1', 2: 'a2', 3: 'b1', 4: 'b2', 5: 'c1', 6: 'c2'}, 'seq': {1: 'bbb', 2: 'bbc', 3: 'fff', 4: 'fff', 5: 'aaa', 6: 'acg'}}
df = pd.DataFrame(data)
Solution
You can iterate over the dataframe and compare the seq value from the previous iteration with the current one. To compare two strings (stored in the seq column of the dataframe), you can apply a simple comprehension, as in the following function:
def diff_letters(a,b):
    return sum(a[i] != b[i] for i in range(len(a)))
Iterating over the dataframe rows and applying diff_letters to each pair of seq values then gives the number of differences for each pair.
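One way the row iteration could look is sketched below. This loop is an illustration written for this purpose, not the answer's own code. It assumes the rows come in consecutive pairs and that both strings in a pair have the same length.

```python
import pandas as pd


def diff_letters(a, b):
    """Count positions where a and b differ (assumes len(b) >= len(a))."""
    return sum(a[i] != b[i] for i in range(len(a)))


data = {'name': {1: 'a1', 2: 'a2', 3: 'b1', 4: 'b2', 5: 'c1', 6: 'c2'},
        'seq': {1: 'bbb', 2: 'bbc', 3: 'fff', 4: 'fff', 5: 'aaa', 6: 'acg'}}
df = pd.DataFrame(data)

diffs = []
prev_seq = None
for pos, (_, row) in enumerate(df.iterrows()):
    if pos % 2 == 0:
        # First row of a pair: nothing to compare against yet.
        diffs.append(float('nan'))
    else:
        # Second row of a pair: compare with the previous row's sequence.
        diffs.append(diff_letters(prev_seq, row['seq']))
    prev_seq = row['seq']

df['diff'] = diffs
print(df)
```

Collecting the values in a list and assigning the column once avoids writing into the dataframe while iterating over it.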
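Answer 3 (score: 0)
Try this:
import numpy as np
import pandas as pd
data = {'name': ['a1', 'a2','b1','b2','c1','c2'],
        'seq': ['bbb', 'bbc','fff','fff','aaa','acg']
        }
df = pd.DataFrame(data, columns=['name','seq'])
df['diff'] = np.nan
i = 0
# Walk through the rows two at a time
while i < len(df) - 1:
    item = df.at[i, 'seq']
    # The first row of each pair has no value
    df.at[i, 'diff'] = np.nan
    diffCntr = 0
    # Count characters of the second string that do not occur anywhere
    # in the first (this is not a position-by-position comparison)
    for j in df.at[i+1, 'seq']:
        if item.find(j) < 0:
            diffCntr += 1
    df.at[i+1, 'diff'] = diffCntr
    i += 2
df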
The result looks like this:
name seq diff
0 a1 bbb NaN
1 a2 bbc 1.0
2 b1 fff NaN
3 b2 fff 0.0
4 c1 aaa NaN
5 c2 acg 2.0
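The loop above counts characters of the second string that do not occur anywhere in the first. That agrees with a position-by-position comparison for this sample, but not in general. The sketch below compares the two counts; `missing_chars` and `positional_diffs` are hypothetical helper names, and the pair "ab"/"ba" is an extra example that is not in the question.

```python
def missing_chars(first, second):
    """Characters of `second` that do not occur anywhere in `first`."""
    return sum(first.find(ch) < 0 for ch in second)


def positional_diffs(first, second):
    """Positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(first, second))


pairs = [("bbb", "bbc"), ("fff", "fff"), ("aaa", "acg"), ("ab", "ba")]
for first, second in pairs:
    print(first, second,
          missing_chars(first, second),
          positional_diffs(first, second))
```

For "ab" and "ba" the membership-based count is 0 while the positional count is 2, so the positional version is the safer choice when the order of characters matters.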