Need to split a dictionary

Time: 2020-04-26 18:30:26

Tags: python pandas dataframe

I have the following data in a CSV (a comma-separated file whose first row is the column headers):

ID,ENV,dictionary_column
35702,name1,"{'Employee': 1.56, 'IC': 1.18}"
35700,nam22,"{'Quota': 3.06, 'ICS': 0.37}"
11765,quotation,"{'02 WSS': 12235, '44 HR Part': 485, '333 CNTL':1}"
22345,gamechanger,"{'02 Employee's': 5.1923513, '04 Participant': 0.167899}"
22345,supporter,"{'0': '31', 'Table': '5', 'NewAssignee': '1', 'Result': '5'}"

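The dictionary_column column contains multiple key-value pairs. I need to split them apart and join each pair with the remaining columns.

Desired output (CSV or DataFrame):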

ID      ENV        dictionary_key       dictionary_value
35702   name1      Employee abc         1.56
35702   name1      IC                   1.18
35700   nam22      Quotation            3.06
35700   nam22      IC newer             0.37
35700   nam22      newmeansnew          0.001
11765   quotation  02 WSS               12235
11765   quotation  44 HR Part           485
11765   quotation  333 CNTL             1
........ .......   ...                  ... (likewise)

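(Ignore the spacing in the output; it was added only for formatting and readability.)

An example of the dictionary_column values:

{'0': '31', 'Table': '5', 'NewAssignee': '1', 'Result': '5'}

This is the troublesome part.

I tried a few things with the ast functions, and also tried converting the dict to JSON with json.normalize, but with 10,000 rows none of these approaches gave the correct result.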

3 answers:

Answer 0 (score: 1)

You can use:

import json
import pandas as pd

with open("the.csv") as f:
    next(f)
    lines = [x.strip() for x in f]

vals = []
for line in lines:
    parts = line.split("   ")  # assumes fields are separated by 3 spaces or \t; adjust if needed
    for k, v in json.loads(parts[2].replace("'", "\"")).items():  # json.loads() expects double-quoted strings, not single-quoted
        vals.append([parts[0], parts[1], k, v])

df = pd.DataFrame(vals, columns=["ID", "ENV", "dictionary_key", "dictionary_value"])

      ID    ENV dictionary_key  dictionary_value
0  35702  name1   Employee abc             1.560
1  35702  name1             IC             1.180
2  35700  nam22      Quotation             3.060
3  35700  nam22       IC newer             0.370
4  35700  nam22    newmeansnew             0.001
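
The quote replacement is needed because json.loads() only accepts double-quoted strings. As a rough sketch of an alternative, and assuming the dictionary text is a valid Python literal, ast.literal_eval can parse the single-quoted form directly. The helper name parse_dict_text is hypothetical and not part of the original answer:

```python
import ast


def parse_dict_text(text):
    """Parse a Python-style dict literal such as "{'IC': 1.18}" into a dict.

    Returns None when the text is not a valid dict literal, for example
    when a key contains an unescaped apostrophe.
    """
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None
    return value if isinstance(value, dict) else None


# A well-formed value from the question parses into a normal dict.
good = parse_dict_text("{'Employee': 1.56, 'IC': 1.18}")

# The "02 Employee's" key has a stray apostrophe, so this returns None.
bad = parse_dict_text("{'02 Employee's': 5.1923513, '04 Participant': 0.167899}")
```

Note that neither approach can read a row such as the "02 Employee's" one in the question without cleaning it up first, so returning None here is only one possible way to flag such rows.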


Answer 1 (score: 1)

A solution that produces the desired output:

import json
import pandas as pd

with open("the.csv") as f:
    next(f)
    lines = [x.strip() for x in f]
 
valLst = []
for line in lines:
    parts = line.split(",")  # the file is comma-separated; the dict text contains commas too

    # Rejoin every piece from the opening double quote onward to rebuild the dict text
    flag = False
    nextParts = ""
    for part in parts:
        if part.startswith('"'):
            flag = True
        if flag:
            nextParts = nextParts + ',' + part

    nextParts = nextParts.strip(',')
    nextParts = nextParts.strip('"')
    nextParts = nextParts.replace("'", '"')  # json.loads() expects double-quoted strings, not single-quoted

    for k, v in json.loads(nextParts).items():
        valLst.append([parts[0], parts[1], k, v])


df = pd.DataFrame(valLst, columns=["ID", "ENV", "dictionary_key", "dictionary_value"])
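
The manual splitting and rejoining on commas can probably be avoided, because the standard csv module already understands double-quoted fields that contain commas. The following is a minimal sketch under that assumption, using only the standard library. The function name split_dictionary_rows and the idea of collecting unparseable rows in a separate list are illustrative choices, not part of the original answer:

```python
import ast
import csv
import io


def split_dictionary_rows(csv_text):
    """Return (rows, skipped) for CSV text whose third column is a dict literal.

    rows    -- one [ID, ENV, key, value] list per key-value pair
    skipped -- records whose third column could not be parsed as a dict
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    rows, skipped = [], []
    for record in reader:
        if not record:  # blank line
            continue
        if len(record) != 3:
            skipped.append(record)
            continue
        row_id, env, dict_text = record
        try:
            parsed = ast.literal_eval(dict_text)
        except (ValueError, SyntaxError):
            parsed = None
        if not isinstance(parsed, dict):
            skipped.append(record)
            continue
        for key, value in parsed.items():
            rows.append([row_id, env, key, value])
    return rows, skipped


sample = """ID,ENV,dictionary_column
35702,name1,"{'Employee': 1.56, 'IC': 1.18}"
22345,gamechanger,"{'02 Employee's': 5.1923513, '04 Participant': 0.167899}"
"""
rows, skipped = split_dictionary_rows(sample)
```

The rows list has the same shape as valLst, so it can be passed to pd.DataFrame with the same four column names. Rows such as the "02 Employee's" one end up in skipped rather than raising, which makes it easier to see how many of the 10,000 rows need manual cleanup.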

Answer 2 (score: 0)

A more complete solution:

public void OnImageChanged(ARTrackedImagesChangedEventArgs args)
{
    foreach (var trackedImage in args.added)
    {
        // Instantiating prefab
        cubePrefab = Instantiate(cubeTv, trackedImage.transform);
        Debug.Log("Cube prefb name "+cubePrefab.name);
        var data = allDataContent.FirstOrDefault(i => i.imgFileName == trackedImage.referenceImage.name);
        videoName = data.vidFileName;
        Debug.Log("Video File Name = " + videoName);
        vp = cubePrefab.GetComponent<VideoPlayer>();
        vp.source = VideoSource.Url;
        vp.url = Application.persistentDataPath + "/" + videoName;

        vp.playOnAwake = true;
        vp.isLooping = true;
        vp.renderMode = UnityEngine.Video.VideoRenderMode.MaterialOverride;
        vp.targetMaterialRenderer = GetComponent<Renderer>();
        vp.targetMaterialProperty = "_MainTex";
        vp.Play();

        holdList.Add(cubePrefab, trackedImage.referenceImage.name);
    }

    foreach (var trackedImage in args.updated)
    {
        if (trackedImage.trackingState == TrackingState.Tracking)
        {
            var data = allDataContent.FirstOrDefault(i => i.imgFileName == trackedImage.referenceImage.name);
            videoName = data.vidFileName;
            if (File.Exists(Application.persistentDataPath + "/" + videoName))
            {
                foreach (var v in holdList.Values)
                {
                    go = holdList.FirstOrDefault(x => x.Value == trackedImage.referenceImage.name).Key;
                    if (v == trackedImage.referenceImage.name)
                    {
                        go.SetActive(true);
                    }
                    else if(v!= trackedImage.referenceImage.name)
                    {
                        List<GameObject> go = new List<GameObject>();
                        var exceptOne = holdList.Where(i => i.Value != trackedImage.referenceImage.name).ToList();
                        foreach (var item in exceptOne)
                        {
                            go.Add(item.Key);
                        }
                        foreach (var gameObjectone in go)
                        {
                            gameObjectone.SetActive(false);
                        }
                    }
                }
            }
        }
        else
        {
            // go.SetActive(false);
        }
    }

    foreach (var trackedImage in args.removed)
    {
        // go.SetActive(false);
    }
}