I have a JSON file containing the metadata for 900 articles, and I want to extract the URLs from it. My file looks like this:
[
{
"title": "The histologic phenotypes of …",
"authors": [
{
"name": "JE Armes"
}
],
"publisher": "Wiley Online Library",
"article_url": "https://onlinelibrary.wiley.com/doi/abs/10.1002/(SICI)1097-0142(19981201)83:11%3C2335::AID-CNCR13%3E3.0.CO;2-N",
"cites": 261,
"use": true
},
{
"title": "Comparative epidemiology of pemphigus in ...",
"authors": [
{
"name": "S Bastuji-Garin"
},
{
"name": "R Souissi"
}
],
"year": 1995,
"publisher": "search.ebscohost.com",
"article_url": "http://search.ebscohost.com/login.aspx?direct=true&profile=ehost&scope=site&authtype=crawler&jrnl=0022202X&AN=12612836&h=B9CC58JNdE8SYy4M4RyVS%2FrPdlkoZF%2FM5hifWcv%2FwFvGxUCbEaBxwQghRKlK2vLtwY2WrNNl%2B3z%2BiQawA%2BocoA%3D%3D&crl=c",
"use": true
},
.........
I want to use objectpath to inspect the file and build a JSON tree so that I can extract the URLs. This is the code I am trying to run:
1. import json
2. import objectpath
3. with open("Data_sample.json") as datafile: data = json.load(datafile)
4. jsonnn_tree = objectpath.Tree(data['name of data'])
5. result_tuple = tuple(jsonnn_tree.execute('$..article_url'))
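However, in step 4, where the tree is created, I have to supply the name of the data key, and I don't think my file has one. How should I replace this line?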
Answer 0: (score: 0)
You can use a list comprehension to get all of the article URLs.
import json
with open("Data_sample.json") as fh:
    articles = json.load(fh)
article_urls = [article['article_url'] for article in articles]
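A minimal sketch of the same idea that also tolerates records with no article_url key. That case is an assumption: every record in the sample shown has the key, but with 900 entries some may not. The inline sample data and the helper name extract_urls are hypothetical and only stand in for Data_sample.json.

```python
import json

# Hypothetical inline sample standing in for Data_sample.json
SAMPLE = """
[
  {"title": "A", "article_url": "https://example.org/a", "use": true},
  {"title": "B", "use": true},
  {"title": "C", "article_url": "http://example.org/c", "use": false}
]
"""

def extract_urls(articles):
    """Return the article_url of every record that has a non-empty one, in file order."""
    return [a["article_url"] for a in articles if a.get("article_url")]

articles = json.loads(SAMPLE)
article_urls = extract_urls(articles)
print(article_urls)
```

With the real file, json.load(fh) would replace json.loads(SAMPLE); the rest stays the same.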
Answer 1: (score: 0)
You can instantiate the tree directly from the loaded data (here your_data is the list returned by json.load):
import objectpath as op
tobj = op.Tree(your_data)
Then run the query:
results = tobj.execute("$.article_url")
Finally, collect the results into a list:
results = [x for x in results]
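Answer 2: (score: 0)
Try dropping the key lookup and passing the loaded data directly. The top level of your file is a list, so there is no key to index by: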
jsonnn_tree = objectpath.Tree(data)
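To illustrate roughly what a recursive-descent query such as $..article_url selects once the whole loaded list is passed in, here is a standard-library sketch. It is not objectpath's implementation, only an approximation of the idea; find_key and the sample data are hypothetical names introduced for illustration.

```python
def find_key(node, key):
    """Yield every value stored under `key` at any depth of nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the key may also occur deeper inside the value
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)

# Hypothetical data shaped like the file in the question: a top-level list of records
data = [
    {"title": "A", "authors": [{"name": "X"}], "article_url": "https://example.org/a"},
    {"title": "B", "authors": [{"name": "Y"}, {"name": "Z"}], "article_url": "http://example.org/b"},
]

result_tuple = tuple(find_key(data, "article_url"))
print(result_tuple)
```

Because the walk starts at the top-level list, no data key name is needed, which is the same reason objectpath.Tree(data) works without indexing.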