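I am new to Scala and am struggling with this use case: I have a list of names, and I need to check whether any of these names is present in a specific column of my DataFrame.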
My DataFrame has two columns ("no." and "description").
I have a list of names, say:
{"jack","daniel"}
Now I need to go through the description column of the DataFrame and see whether any word from the list appears in it.
I tried using this code:
df_cleansing.filter("description".isin(listOfLines:_*))
but it shows a compilation error. Any leads would be a great help.
Answer 0: (score: 2)
Defining a udf function as shown below should work for you:
import org.apache.spark.sql.functions._
// udf function for checking if any of the words in the
// list is contained in the value of description column
def containsUdf = udf((strCol: String) => listOfLines.exists(strCol.contains))
//calling the udf function
df_cleansing.filter(containsUdf(col("description")))
which should give you:
+-----+----------------------+
|no. |description |
+-----+----------------------+
|12342|my name is jack |
|2345 |daniel is my neighbour|
+-----+----------------------+
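The check inside the UDF above is plain Scala, so it can be tried without a Spark session. What follows is only a minimal sketch under stated assumptions: the object name ContainsAnyDemo and the helper containsAny are hypothetical, listOfLines is assumed to hold the two names from the question, and the Option wrapper is an added guard for null descriptions rather than part of the original answer.

```scala
object ContainsAnyDemo {
  // Assumed contents of the list from the question.
  val listOfLines: List[String] = List("jack", "daniel")

  // Same test the UDF applies to each description value: true if any word
  // in the list occurs as a substring. Option(...) makes a null description
  // evaluate to false instead of throwing (an assumption about the desired
  // behaviour, not something the original answer handles).
  def containsAny(description: String, words: Seq[String]): Boolean =
    Option(description).exists(d => words.exists(w => d.contains(w)))

  // The first two values come from the output table above; the last two
  // are made up for illustration.
  val samples: Seq[String] =
    Seq("my name is jack", "daniel is my neighbour", "hello world", null)

  // Keep only the descriptions that mention at least one listed name.
  val matched: Seq[String] = samples.filter(containsAny(_, listOfLines))

  def main(args: Array[String]): Unit =
    matched.foreach(println)
}
```

Running main prints the two descriptions that mention jack or daniel. Note that this is substring matching, so "jack" would also match a description containing "jackson"; isin, by contrast, compares the whole column value against the list.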