How to remove duplicates from a SQL-generated table using Python

Time: 2018-10-11 17:09:23

Tags: python sql pycharm

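I run two queries in Python (they produce two different datasets with the same headers). I combine them into a single dataset in Python, which looks like this (the combined dataset is not actually displayed anywhere; I just take table_a + table_b and merge them into one):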

>>> import numpy as np
>>> import math
>>> 
>>> a = np.random.randint(0, 1000, (6,)).astype(object)
>>> a[a%2==0] = np.nan
>>> 
>>> fact_exact = np.vectorize(math.factorial, 'O', 'O')
>>> 
>>> fact_exact(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/paul/.local/lib/python3.6/site-packages/numpy/lib/function_base.py", line 1972, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/home/paul/.local/lib/python3.6/site-packages/numpy/lib/function_base.py", line 2048, in _vectorize_call
    outputs = ufunc(*inputs)
ValueError: factorial() only accepts integral values
>>> 
>>> a[a!=a] = 0
>>> fact_exact(a)
array([9819935662418089743352075922310862095706065486822583658822975979153852871637910339598847876493575760863201233608970580391009961465728060140206398380369810186460532083760537973722230477712617437079362600099095591538946730193485520929914465963675497331037894791629662134417383906616748712477435411911352595846133057242505006764835196420336585309344206359125847804414531691517822911373600118902137858177047463867389635205323328678714656377591230065986360526515442653777496908763065282294664208227077490200850296013058820462199153017425546879776071769432946284989651969735166129654123362278827485074178681546981559466233191972688158356430976918192398846419304865350500808417927115875428971873067092978672051108353026958311731456630717915806992149025378731927814021805881859364498816522297657223802150320368577537638698692463078070519911729996949263069045872688620575874758242248117345983373644762881336075203583068807371386560008413979828440302163961903567206206098114957943899603695885783671168564745354608640000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,
       22328783881661914958481873975346502495151470121092663127656427617172486869336444341196216861471796204456103981797935323465763492125980526669772652700063306391000092324747490987759008282321662774044560021923711172537165034028116470777032463317525690139861312277154265627409161865934581816407380706408159413469087649804140238680046340298380454769197056000000000000000000000000000000000000000000000000000,
       1,
       61249584099358401539774988285121649211647782880181065019552657036267338153088195303988201779967275642784589505913349592976251572958797164520286603082616258499126414850388770750032832244874744865500684599339365169094265281656246018624169125087086336929008659140773790287427038315506740711640971717627407262119806133914039569804387544893605360482632749642132398074143010093832414811273406748220437584361624445361171146706501836044960640727879585735220969146850637281930634576684379022439144569827759897323120413808197447743317836963898751450642251281351982277623696403714801809091137618510094637754741546381374172490209156669750628265287758243565040756752491082629092890931663069084118249960190350279925210044221389170848672643624902424798289485981643559009642358060100976306359010066013465973059932028926310180595315985960099791957394179039519432507444190747654625992620055591848528852607925564873303749001475451862943569149219508203963665660697011849205174996326078837279628237406181221912723812127044670946612175065696608648876366755523800502033220426264259724448110042998615347327090687044945724644868095726898042638404229137017574884525227292991943592508583104116919096883640157188742952660337139750108570879849335960456768856494175006057451288109527150100807278246132549650716938934121106772599000305859091685578549764454500874996178837114679306052077693402114826710945748516120895211107804543955416170463298478450007640457295281818515443548160000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,
       1, 1], dtype=object)

If I want to remove the duplicate symbols and produce output like the following:

date      symbol    data
10/9/2018   a       0.1
10/9/2018   b       0.2
10/9/2018   c       0.3
10/9/2018   a       0.1
10/9/2018   a       0.1

What should I do?

Thanks!

1 answer:

Answer 0: (score: 0)

Just use DISTINCT to remove the duplicate rows. You didn't post your SQL, but it should look something like this:

select distinct date, symbol, data from my_table

Note that DISTINCT applies to the entire row, not only to the first column.
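As a rough illustration of the DISTINCT approach, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the split of the rows into rows_a and rows_b, and the table name my_table are assumptions made for the example; the asker's actual database and queries were not posted. It also shows a pure-Python alternative for the case where the two result sets have already been fetched and combined.

```python
import sqlite3

# Hypothetical stand-ins for the two query results (same columns: date, symbol, data).
rows_a = [("10/9/2018", "a", 0.1), ("10/9/2018", "b", 0.2), ("10/9/2018", "c", 0.3)]
rows_b = [("10/9/2018", "a", 0.1), ("10/9/2018", "a", 0.1)]

# Assumed setup: an in-memory SQLite table holding the combined rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (date TEXT, symbol TEXT, data REAL)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows_a + rows_b)

# Option 1: let the database drop rows that are identical in every selected column.
distinct_rows = conn.execute(
    "SELECT DISTINCT date, symbol, data FROM my_table ORDER BY symbol"
).fetchall()

# Option 2: deduplicate the combined rows in Python, keeping first-seen order.
deduped_rows = list(dict.fromkeys(rows_a + rows_b))

print(distinct_rows)
print(deduped_rows)
conn.close()
```

Both options compare whole rows, so two rows with the same symbol but different data values would both be kept.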