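In short, I'm trying to merge two sets of data. I'm open to using grep/bash or Python.
Read the directory /mediaid
Read the .json files' filenames
If a .json filename matches a row in the .csv, copy the contents of that json file into the row (if not, just skip it)
INPUT DATA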
File1.csv
testentry, 1234
testentry1, 6789
INPUT DATA (the filename is the MEDIAID to check against)
1234.json
[
{"id":"1", "text":"Nice man!"},
{"id":"2", "text":"Good job"}
]
6789.json
[
{"id":"1", "text":"Test1"},
{"id":"2", "text":"Test2"}
]
DESIRED OUTPUT data.csv
testentry, 1234, Nice man!, Good job
testentry1, 6789, Test1, Test2
I've been trying with grep, but I can't work out how to check the json filename and pass its data into the row.
#!/usr/bin/env bash
indir="$HOME/indir"
outdir="$HOME/outdir"
cd "$indir" || exit
mkdir -p "$outdir" || exit
for f in *.csv; do
    [[ -f $f ]] || continue
    lines=()
    while IFS=, read -ra cols; do
        if (( ${#cols[@]} != 2 )); then
            echo "Sorry buddy, you'll have to use a real CSV parser to handle: $f" >&2
            exit 1
        fi
        # Does the basename match the contents of the first column?
        if [[ ${cols[0]} == "${f%.*}" ]]; then
            echo "Match found in $f"
        fi
        lines+=("${cols[0]},${cols[1]}")
    done <"$f"
    # something with JQ to read the json filename, and pass its data into the row
    printf '%s\n' "${lines[@]}" > "$outdir/$f" || exit
done
A failed, but slightly better, attempt in Python:
import csv
import json
path_to_json = 'somedir/'
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
print json_files #
with open(json_files) as lookuplist:
    # IT NEEDS to match the mediaID from the json FILENAME
    with open('file1.csv', "r") as csvinput:
        with open('VlookupOut','w') as output:
            reader = csv.reader(lookuplist)
            reader2 = csv.reader(csvinput)
            writer = csv.writer(output)
            d = {}
            for xl in reader2:
                d[xl[2]] = xl[3:]
            for i in reader:
                if i[4] in d:
                    i.append(d[i[4]])
                writer.writerow(i)
Answer 0 (score: 1)
This gives your desired output:
for file in /mediaid/*; do
    while read -r entry fileid; do
        jsonfile="$fileid.json"
        if [[ -f "$jsonfile" ]]; then
            text=$(jq -r 'map(.text) | join(", ")' "$jsonfile")
            echo "$entry $fileid, $text"
        fi
    done < "$file"
done > output.csv
It uses jq to parse the JSON files.
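Since the question also allows Python, the same lookup can be sketched with only the standard library. This is a rough sketch, not a drop-in replacement: the function name merge_rows and its three parameters are made up for illustration, and it assumes each CSV row has the form "entry, mediaid" and each JSON file holds a list of objects with a "text" key, as in the sample data. Like the loop above, it drops rows that have no matching JSON file.

```python
import csv
import json
from pathlib import Path


def merge_rows(csv_path, json_dir, out_path):
    """Append the "text" values from <mediaid>.json to each matching CSV row.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    json_dir = Path(json_dir)
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        # skipinitialspace drops the blank after each comma in "entry, 1234"
        for row in csv.reader(src, skipinitialspace=True):
            if len(row) < 2:
                continue  # blank or malformed line
            entry, media_id = row[0].strip(), row[1].strip()
            json_file = json_dir / f"{media_id}.json"
            if not json_file.is_file():
                continue  # no JSON file named after this media ID: skip the row
            items = json.loads(json_file.read_text(encoding="utf-8"))
            texts = [item["text"] for item in items]
            # Written by hand instead of csv.writer so the output keeps the
            # ", " separator and unquoted text shown in the question.
            dst.write(", ".join([entry, media_id, *texts]) + "\n")
```

For the layout in the question this would be called as merge_rows("File1.csv", "/mediaid", "output.csv"). Because the line is joined by hand, a text value that itself contains a comma would not be quoted; switching to csv.writer would handle that, at the cost of the exact spacing in the desired output.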