I am trying to read a tab-delimited txt file with pandas. The file looks like this:
14.38 14.21 0.8951 5.386 3.312 2.462 4.956 1
14.69 14.49 0.8799 5.563 3.259 3.586 5.219 1
14.11 14.12 0.8911 5.422 3.302 2.723 5 1
Some rows contain extra tabs. If I use read_csv or read_fwf and specify sep='\t', I get a result like this:
d
0 15.26\t14.84\t0.871\t5.763\t3.312\t2.221\t5.22\t1
1 14.88\t14.57\t0.8811\t5.554\t3.333\t1.018\t4.9
Do you have any suggestions on which parameters I can specify to fix this? Thanks.
Solution:
Use pd.read_csv(filename, delim_whitespace=True)
Answer 0 (score: 0)
If I run this code:
import pandas as pd
parsed_csv_txt = pd.read_csv("tabbed.txt", sep="\t")
print(parsed_csv_txt)
on this file:
a      b     c    d    e
14.69  2452  982  234  12
14.11  5435  234  12
16.63  1     12   66
I get:
       a     b      c      d   e
0  14.69  2452  982.0  234.0  12
1  14.11  5435  234.0    NaN  12
2  16.63     1    NaN   12.0  66
Is there anything wrong with this output? With sep="\t", every tab ends a field, so a doubled tab produces an empty field that pandas reads as NaN.
If you want a different output, such as:
       a     b    c    d     e
0  14.69  2452  982  234  12.0
1  14.11  5435  234   12   NaN
2  16.63     1   12   66   NaN
use this code:
import pandas as pd
parsed_csv_txt = pd.read_csv("tabbed.txt", delim_whitespace=True)
print(parsed_csv_txt)
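Note
For a longer discussion of variable amounts of whitespace between values, see this thread: Can pandas handle variable-length whitespace as column delimiters
Answer 1 (score: 0)
pandas read_csv is very versatile; you can use it with delim_whitespace=True to handle a variable amount of whitespace between values.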
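To see why the choice of delimiter matters, here is a small sketch using only plain Python string splitting. The row below is a made-up example with a doubled tab, not taken from the asker's file. Splitting on a single tab keeps the empty field, while splitting on any run of whitespace collapses it.

```python
# Hypothetical row in which two tabs appear between "234" and "12".
line = "14.11\t5435\t234\t\t12"

# Splitting on exactly one tab yields an empty string for the doubled tab.
by_tab = line.split("\t")

# Splitting with no argument treats any run of whitespace as one separator.
by_whitespace = line.split()

print(by_tab)         # ['14.11', '5435', '234', '', '12']
print(by_whitespace)  # ['14.11', '5435', '234', '12']
```

The same distinction is roughly what separates sep="\t" from whitespace-based splitting in read_csv.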
df = pd.read_csv(filename, delim_whitespace=True)
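As a rough, self-contained illustration, the sketch below reads the same in-memory text twice, once with sep="\t" and once with sep=r"\s+". The sample data is made up for the example, and io.StringIO stands in for a real file. Note that recent pandas releases (2.2 and later) deprecate delim_whitespace=True and recommend sep=r"\s+" as the equivalent, so the sketch uses that spelling.

```python
import io

import pandas as pd

# Hypothetical sample: some rows contain a doubled tab between values.
raw = (
    "a\tb\tc\td\te\n"
    "14.69\t2452\t982\t234\t12\n"
    "14.11\t5435\t234\t\t12\n"
    "16.63\t1\t\t12\t66\n"
)

# Every tab ends a field, so doubled tabs become NaN in the middle of a row.
strict = pd.read_csv(io.StringIO(raw), sep="\t")

# Any run of whitespace is one separator, so short rows are padded with NaN at the end.
loose = pd.read_csv(io.StringIO(raw), sep=r"\s+")

print(strict)
print(loose)
```

Which result is "right" depends on the data: if an empty field is meaningful, sep="\t" preserves its position; if the extra tabs are only padding, whitespace splitting is the better fit.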