I have a dataset of about 160,000 entries. They are computer-generated, and errors have crept in over the years.
Say the table has the following columns:
- EntryID (auto int)
- FruitNumber
- JuiceNumber
- CandyNumber
- Date
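The important point is that every combination of FruitNumber, JuiceNumber, and CandyNumber must be unique among rows whose dates are less than 12 months apart.
In other words, each exact combination may occur only once within any 12-month period. I now need to migrate this dataset to a new data model, and for that I need to delete the duplicate records while keeping one record of each. I have tried many queries but could not find a solution.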
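To make the rule concrete, here is a minimal sketch of the duplicate test in Python. The `Entry` type, the `one_year_later` helper, and the choice to treat a gap of exactly 12 months as allowed are assumptions made for illustration; they are not part of the actual schema.

```python
from datetime import date
from typing import NamedTuple


class Entry(NamedTuple):
    # Hypothetical in-memory mirror of one table row.
    entry_id: int
    fruit: int
    juice: int
    candy: int
    day: date


def one_year_later(d: date) -> date:
    """Return the date 12 months after d (Feb 29 maps to Feb 28)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


def is_duplicate(earlier: Entry, later: Entry) -> bool:
    """True if both rows share the combination and are < 12 months apart."""
    same_combo = earlier[1:4] == later[1:4]
    return same_combo and earlier.day <= later.day < one_year_later(earlier.day)


a = Entry(1, 10, 20, 30, date(2015, 1, 15))
b = Entry(2, 10, 20, 30, date(2015, 6, 1))   # same combination, 4.5 months later
c = Entry(3, 10, 20, 30, date(2016, 1, 15))  # same combination, exactly 12 months later
print(is_duplicate(a, b), is_duplicate(a, c))  # prints: True False
```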
Answer 0 (score: 1)
Try using a CTE:
;WITH cte AS
(
    SELECT
          ft.EntryID
        , ft.FruitNumber
        , ft.JuiceNumber
        , ft.CandyNumber
        , ft.Date
        -- Position of the row within its combination, oldest first
        , ROW_NUMBER() OVER (PARTITION BY ft.FruitNumber, ft.JuiceNumber, ft.CandyNumber
                             ORDER BY ft.Date, ft.EntryID) AS RN
        -- One id per distinct combination
        , DENSE_RANK() OVER (ORDER BY ft.FruitNumber, ft.JuiceNumber, ft.CandyNumber)
            AS Partitionid
        -- Number of rows that share the combination
        , COUNT(1) OVER (PARTITION BY ft.FruitNumber, ft.JuiceNumber, ft.CandyNumber)
            AS PartitionCNT
    FROM FooTable ft
)
-- Rows that have another row with the same combination at least 365 days older
SELECT
      t1.*
    , DATEDIFF(DAY, t.Date, t1.Date) AS DATEDiff
FROM
    cte t
    INNER JOIN cte t1
        ON t1.FruitNumber = t.FruitNumber
        AND t1.JuiceNumber = t.JuiceNumber
        AND t1.CandyNumber = t.CandyNumber
        AND DATEDIFF(DAY, t.Date, t1.Date) >= 365
WHERE t.PartitionCNT > 1
Answer 1 (score: 0)
If the errors are only occasional, this will probably work:
select t.*
from (select t.*,
             lag(date) over (partition by FruitNumber, JuiceNumber, CandyNumber
                             order by date) as prev_date
      from t
     ) t
where prev_date is null or prev_date < dateadd(year, -1, date);
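This is not a general solution, although you can run the query multiple times. In particular, it only works when a combination is duplicated at most once within a year.
Unfortunately, a general solution requires a recursive CTE. For example, if you have one record every month, working out that only the "January" records should be kept is tricky.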
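The row-by-row logic that such a recursive CTE would have to express can be sketched outside the database. The Python below is only an illustration: `entries_to_keep`, `one_year_later`, and the tuple layout are hypothetical names, and it assumes that a row dated exactly 12 months after the last kept row may itself be kept. For each combination it walks the rows in date order and keeps a row only when it falls outside the 12-month window opened by the previously kept row.

```python
from collections import defaultdict
from datetime import date


def one_year_later(d: date) -> date:
    """Return the date 12 months after d (Feb 29 maps to Feb 28)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


def entries_to_keep(rows):
    """rows: iterable of (entry_id, fruit, juice, candy, day) tuples.

    Returns the set of entry ids that survive de-duplication.
    """
    groups = defaultdict(list)
    for entry_id, fruit, juice, candy, day in rows:
        groups[(fruit, juice, candy)].append((day, entry_id))

    keep = set()
    for members in groups.values():
        members.sort()          # oldest first, ties broken by entry id
        window_end = None       # end of the window opened by the last kept row
        for day, entry_id in members:
            if window_end is None or day >= window_end:
                keep.add(entry_id)
                window_end = one_year_later(day)
    return keep


# One record per month from January 2015 through February 2016 (ids 1..14).
monthly = [(i + 1, 1, 1, 1, date(2015 + i // 12, i % 12 + 1, 1)) for i in range(14)]
print(sorted(entries_to_keep(monthly)))  # prints: [1, 13]
```

With this monthly data, comparing each row only to its immediate predecessor would keep just the first record, because every later row has a neighbour one month earlier. Tracking the last kept row instead is what retains both "January" records.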