This is my text:
Title
1 First section
1.1 Introduction1
Hello. My name is John. I am an under graduate student. I live in the U.S. I am majoring in computer science. Blah blah blah.
1.2 Another Intro
My last name is Doe. Blah blah blah blah. Another random sentence.
2 next section name
2.1 Random Section name
Blah blah blah blah. Another random sentence. Another random sentence.
Another random sentence.
2.2 Requirements
The requirements include:
1. blah blah
2. blah blah blah
3. another random sentence
3 Third section
Blah blah blah. Blah blah blah blah.
4 End
I want to create a data frame that looks like this:
Section Name | String
1 First section |
1.1 Introduction1 | Hello. My name is John. I am an under graduate student. I live in the U.S. I am majoring in computer science. Blah blah blah.
1.2 Another Intro | My last name is Doe. Blah blah blah blah. Another random sentence.
2 next section name |
2.1 Random Section name | Blah blah blah blah. Another random sentence. Another random sentence.
2.2 Requirements | The requirements include:
                 | 1. blah blah
                 | 2. blah blah blah
                 | 3. another random sentence
3 Third Section | Blah blah blah. Blah blah blah blah.
4 End |
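So basically, I want to create a data frame with two columns: one holding the section number and name, and one holding everything that follows that section heading up to the next section number.
Answer 0 (score: 0)
The following solution is not fail-safe against every formatting variation or "odd" string. It also relies on a few workarounds to make your text easier to parse. You will probably need to adjust the regular expressions to your input. As for speed, the approach below can certainly be improved. Still, it should at least give you one way to tackle the problem.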
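import numpy as np
import matplotlib.pyplot as plt
# scipy.interpolate.spline is gone from recent SciPy releases;
# make_interp_spline is the replacement
from scipy.interpolate import make_interp_spline

x = np.array([0, 1, 2])
y = np.array([5, 4.31, 4.01])
plt.plot(x, y)  # original data points

# Evaluate a quadratic (k=2) interpolating spline on a dense grid
xnew = np.linspace(x.min(), x.max(), 300)
smooth = make_interp_spline(x, y, k=2)(xnew)
plt.plot(xnew, smooth)
plt.show()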
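For the parsing itself, here is a minimal sketch of the regex-based idea described in this answer. It assumes that a section heading is a line starting with a dotted number ("1", "2.1") followed by whitespace and a name, and that numbered list items always have a dot directly after the number ("1. blah"). The names HEADING, SAMPLE, parse_sections and to_dataframe are made up for illustration and do not come from the question.

```python
import re

# Heading: a dotted section number ("1", "2.1"), whitespace, then a name.
# List items such as "1. blah" do not match, because the number is
# followed by a dot rather than whitespace.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")

SAMPLE = """Title
1 First section
1.1 Introduction1
Hello. My name is John.
1.2 Another Intro
My last name is Doe.
2.2 Requirements
The requirements include:
1. blah blah
2. blah blah blah
3 Third section
Blah blah blah.
4 End
"""


def parse_sections(text):
    """Return a list of (heading, body) tuples, one per section heading."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if HEADING.match(line):
            # Start a new section with an empty body.
            rows.append([line, []])
        elif rows:
            # Any other line belongs to the most recent heading.
            rows[-1][1].append(line)
        # Lines before the first heading (e.g. the title) are ignored.
    return [(heading, "\n".join(body)) for heading, body in rows]


def to_dataframe(rows):
    """Build the two-column data frame (requires pandas)."""
    import pandas as pd  # imported lazily; the parser needs only the stdlib
    return pd.DataFrame(rows, columns=["Section Name", "String"])


rows = parse_sections(SAMPLE)
for heading, body in rows:
    print(repr(heading), "->", repr(body))
```

Calling to_dataframe(parse_sections(text)) should then give one row per heading. Note that a body line that itself starts with a bare number and a space would be misread as a heading, so the pattern may need adjusting for real input.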