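I know similar questions have been asked before. I have gone through some of the answers, but none of them worked. Here is the problem: I am writing a mapper and a reducer for a MapReduce program, and I get the following error: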
Traceback (most recent call last):
  File "/usr/local/hadoop/./reducer.py", line 10, in <module>
    desc, count = line.split('\t', 1)
ValueError: need more than 1 value to unpack
I cannot debug the error because I do not know what is causing it. Please find the code for my mapper and reducer below.
Mapper code:
#!/usr/bin/env python
import sys
for line in sys.stdin:
    line = line.strip('')
    bYear = line.split(',')
    for birthYear in bYear:
        print '%s\t%s' % (bYear[6],1)
Reducer code:
#!/usr/bin/env python
import sys
current_desc = None
current_count = 0
desc = None
for line in sys.stdin:
    line = line.strip()
    desc, count = line.split('\t', 1)  # <--- This is where I'm getting an error.
    try:
        count = int(count)
    except ValueError:
        continue
    if current_desc == desc:
        current_count += count
    else:
        if current_desc:
            # write result to STDOUT
            print '%s\t%s' % (current_desc, current_count)
        current_count = count
        current_desc = desc
if current_desc == desc:
    print '%s\t%s' % (current_desc, current_count)
Please help.
Answer 0 (score: 0)
It seems there is no '\t' character in that particular line, so line.split('\t', 1) returns only one element, which cannot be unpacked into desc, count.
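One way to guard against such lines is to check the result of the split before unpacking it. The following is only a sketch of that idea, not the asker's actual fix: the names parse_record, reduce_counts and main are hypothetical, and it assumes that lines without a tab or with a non-numeric count should simply be skipped.

```python
import sys


def parse_record(line):
    """Return (desc, count) for a 'desc<TAB>count' line, or None if malformed."""
    # Remove only the line terminator so a leading tab is not stripped away.
    parts = line.rstrip('\r\n').split('\t', 1)
    if len(parts) != 2:
        return None  # no tab on this line, e.g. an empty line
    desc, count = parts
    try:
        return desc, int(count)
    except ValueError:
        return None  # the count was not a number


def reduce_counts(lines):
    """Yield (desc, total) for each run of consecutive lines sharing a key."""
    current_desc = None
    current_count = 0
    for line in lines:
        record = parse_record(line)
        if record is None:
            continue  # assumption: skip malformed input instead of crashing
        desc, count = record
        if desc == current_desc:
            current_count += count
        else:
            if current_desc is not None:
                yield current_desc, current_count
            current_desc, current_count = desc, count
    # Flush the last key, if any valid record was seen.
    if current_desc is not None:
        yield current_desc, current_count


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hypothetical entry point: call main() at the bottom of reducer.py.
    for desc, total in reduce_counts(stdin):
        stdout.write('%s\t%d\n' % (desc, total))
```

It may also be worth checking the mapper. line.strip('') removes nothing, because an empty argument names no characters to strip. If the seventh field happens to be the last one on the input line, it would keep its trailing newline, and the mapper's output record would be broken across two lines, neither of which contains a usable tab once the reducer calls strip(). That is only one possible source of the bad line; the input data would be needed to confirm it.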