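Split a string in an object property and create a new dataset

Time: 2018-09-08 18:34:22

Tags: javascript arrays object d3.js

I have a dataset in which the last field of each row is a string in the form of a sentence. My goal is to break each sentence into words and create a new dataset in which every word sits on its own row, with the other fields repeated.

This is the format of the old dataset: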

0: Object { creator: "molly", number: 3, doc: "The cat in the hat ate the rat", … }
1: Object { creator: "may", number: 4, doc: "the crass rat", … }
2: Object { creator: "may", number: 4, doc: "The mouse in the pouch at the cat", … }
3: Object { creator: "may", number: 4, doc: "the fish hog", … }
4: Object { creator: "may", number: 4, doc: "the dog warm", … }

This is the format I want:

0: Object { creator: "molly", number: 3, doc: "The", … }
1: Object { creator: "molly", number: 3, doc: "cat", … }
2: Object { creator: "molly", number: 3, doc: "in", … }
3: Object { creator: "molly", number: 3, doc: "the", … }
4: Object { creator: "molly", number: 3, doc: "hat", … }
5: Object { creator: "molly", number: 3, doc: "ate", … }
6: Object { creator: "molly", number: 3, doc: "the", … }
7: Object { creator: "molly", number: 3, doc: "rat", … }
8: Object { creator: "may", number: 4, doc: "the", … }
9: Object { creator: "may", number: 4, doc: "crass", … }
10: Object { creator: "may", number: 4, doc: "rat", … }

I am using D3. The following code gets me partway toward a new dataset in which each word is on its own row:

doc.csv:

date,number,creator,,doc
6/16/2000,3,molly,3,The cat in the hat ate the rat
2/25/2002,4,may,2,The mouse in the pouch at the cat
12/5/2004,3,molly,4,the lovely fish
7/6/2006,1,milly,1,the pog dog
9/7/2003,4,may,4,the fish hog
12/10/2001,4,may,3,the crass rat
6/15/2005,2,maggie,3,the ass rat
6/9/2004,1,milly,4,the fish blue
10/5/2005,1,milly,3,the rat true
10/7/2003,4,may,1,the dog warm
1/19/2009,4,may,2,the cat norm
10/30/2007,1,milly,4,the fish wish
8/13/2009,4,may,2,cat bat ticks
9/30/2004,3,molly,1,dog nog mog
1/17/2006,4,may,3,rat tittily too
12/18/2009,3,molly,1,dog coppily poo
11/2/2007,2,maggie,3,rat pitpat poo
4/17/2007,1,milly,4,fish too!

html:

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>Interactive scatterplot</title>
        <link rel="stylesheet" type="text/css" href="style.css">
        <script type="text/javascript" src="d3.v4.js"></script>

    </head>

    <body>

<script type="text/javascript" src="split.js"></script>

<textarea id="txtName" name="txt-Name" placeholder="Search for something.." rows="1"></textarea>


    </body>
</html>

Code:

var parseDate = d3.timeParse("%m/%d/%Y");

    var hoot = function(d) {return d.doc.split(" ").forEach(function (item) {
        var data2 = {creator: d.creator, date: parseDate(d.date),item: item}
        console.log(data2)
    });}



    d3.csv("doc.csv")
      .row(function(d) {return {creator: d.creator,date: parseDate(d.date),number: Number(d.number),doc: d.doc, split: (hoot(d))};})
      .get(function(error, data) {

    });

Happily, when I console.log data2, I get something close to my end goal: one object per word, each carrying the creator and date of its row.

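I have two problems:

1) The variable data2 is not available after the function has run. I tried to make it a global variable by placing var data2 = []; at the beginning of the script, but that does not work.

2) The variable data2 does not take the form of a single array. I tried putting square brackets around the variable line (i.e. var data2 = [{creator: d.creator, date: parseDate(d.date),item: item}]), but this produces many arrays rather than one.

Thanks in advance for your time.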

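1 Answer:

Answer 0: (score: 3)

Here data2 is a local variable inside the forEach callback. So even if you make it global, assigning to it on each iteration means you only keep the value from the last iteration. Instead, make data2 an array and push a value onto it during each iteration. It could look something like this: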

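var parseDate = d3.timeParse("%m/%d/%Y");
var data2 = []; // one flat array that collects one object per word

// Push one object per word of d.doc onto data2.
var hoot = function(d) {
    d.doc.split(" ").forEach(function (item) {
        data2.push({creator: d.creator, date: parseDate(d.date), item: item});
    });
};

d3.csv("doc.csv")
  .row(function(d) {
      hoot(d); // called for its side effect on data2; forEach returns nothing
      return {creator: d.creator, date: parseDate(d.date), number: Number(d.number), doc: d.doc};
  })
  .get(function(error, data) {
      if (error) throw error;
      // The CSV loads asynchronously, so data2 is only complete inside this callback.
      console.log(data2);
  });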

Now check the console output; hopefully you get the expected result.
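
If you would rather not rely on a global array, the same push-per-word idea can be wrapped in a function that returns a new array. The following is only a minimal sketch: splitDocIntoWords is a hypothetical helper name (it is not part of D3), it assumes each row already has creator, number and doc fields as in the question, and it keeps the word in doc to match the desired format shown in the question.

```javascript
// Hypothetical helper (not part of D3): expand each row into one row per word.
function splitDocIntoWords(rows) {
    return rows.reduce(function (out, row) {
        // Split on runs of whitespace and drop empty strings left by stray spaces.
        var parts = row.doc.split(/\s+/).filter(function (w) { return w.length > 0; });
        parts.forEach(function (word) {
            // Copy the fields that stay the same; only doc changes per word.
            out.push({ creator: row.creator, number: row.number, doc: word });
        });
        return out;
    }, []);
}

// Two sample rows taken from the question's data.
var rows = [
    { creator: "molly", number: 3, doc: "The cat in the hat ate the rat" },
    { creator: "may", number: 4, doc: "the crass rat" }
];

var words = splitDocIntoWords(rows);
console.log(words.length); // 11
console.log(words[8]);     // { creator: 'may', number: 4, doc: 'the' }
```

With the D3 v4 code above, this could presumably be called inside the .get callback as var data2 = splitDocIntoWords(data);, since that is the first point at which the parsed rows exist. Add date to the copied fields if you need it on every word row.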