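I have a large dataset of roughly 7 million edges. After searching extensively for methods and tools to visualize this data, I came across pygraphistry.
For now I am only trying to visualize the edges and connections, not apply any modeling. However, the script below shows no error and has produced no output after more than 6 hours.
My working environment is Python 3.x (Anaconda) on 64-bit Windows.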
import pandas
import graphistry
# Register with the Graphistry API key (the "GRAPHISTRY_API_KEY" value; placeholder shown below).
graphistry.register(key='key_from_team')
column_names = ['node1', 'node2', 'StartTime', 'EndTime']
logs = pandas.read_csv('Edges.dat', header=None, names=column_names)
logs[:4] # Show the first four rows of the loaded dataframe
'''
logs['StartTime'] = pandas.to_datetime(logs['StartTime'], unit='s')
logs['EndTime'] = pandas.to_datetime(logs['EndTime'], unit='s')
logs[:4]
'''
plotter = graphistry.bind(source='node1', destination='node2')
plotter.plot(logs)
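
One way to narrow this down would be to try the same call on a small random subset of the edges first, to see whether the problem is the data size or the setup. The following is only a minimal sketch: `sample_edges` is a hypothetical helper (not part of pygraphistry), the 100000 in the last comment is an arbitrary size, and the demo DataFrame is synthetic data that reuses the column names from the snippet above.

```python
import pandas

def sample_edges(edges, max_edges, seed=0):
    """Return at most max_edges randomly chosen rows of an edge DataFrame."""
    if len(edges) <= max_edges:
        # Already small enough: return the frame unchanged.
        return edges
    # Fixed seed so the same subset is drawn on every run.
    return edges.sample(n=max_edges, random_state=seed)

# Synthetic stand-in for the real edge list (same column names as above).
demo = pandas.DataFrame({
    'node1': list(range(1000)),
    'node2': list(range(1, 1001)),
    'StartTime': [0] * 1000,
    'EndTime': [1] * 1000,
})

small = sample_edges(demo, 100)
print(len(small), list(small.columns))

# With the real data, the subset would be passed to the same call as before,
# e.g. plotter.plot(sample_edges(logs, 100000))  # 100000 is an arbitrary choice
```

If the subset renders quickly, the 6-hour wait is more likely related to the size of the full edge list than to the registration or the bindings.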