提升数据分析能力
仅供日常学习打卡,欢迎交流
SQL经典题目——连续登陆天数
考察重点:窗口函数+聚合函数
题目类型:
① 计算连续登录n天及以上的次数
② 计算用户连续登录的最长天数
row_number排序后根据日期与自增序号之差判断
求最大连续登录天数/连续登录n天以上的次数
【假设登陆日期可能包含具体时间(yyyy-mm-dd hh:mm:ss),有重复】
(1)数据去重
-- 数据去重
Select id, substr(login_date,10) as dt
From table
Group by 1,2
(2)按id分组,日期正序排序
-- 按id分组,日期正序排序
Select id, dt, row_number()over(partition by id order by dt) as rn
From (
Select id, substr(login_date,10) as dt
From table
Group by 1,2
) as a
Order by 1,2
过程模拟:
Select id, dt, rn, date_sub(dt,rn) as dt_gap
From (
Select id, dt, row_number()over(partition by id order by dt) as rn
From (
Select id, substr(login_date,10) as dt
From table
Group by 1,2
) as a
) as b
Order by 1,2
(4)根据id,dt_gap分组计数,即为连续登录天数
Select id, dt_gap, count(1) as login_cnt
From (
Select id, dt, rn, date_sub(dt,rn) as dt_gap
From (
Select id, dt, row_number()over(partition by id order by dt) as rn
From (
Select id, substr(login_date,10) as dt
From table
Group by 1,2
) as a
) as b
) as c
Order by 1,2
(5)按id分组用max求最大连续登录天数
过程模拟:
Select id, max(login_cnt) as max_login_day
From (
Select id, dt_gap, count(1) as login_cnt
From (
Select id, dt, rn, date_sub(dt,rn) as dt_gap
From (
Select id, dt, row_number()over(partition by id order by dt) as rn
From (
Select id, substr(login_date,10) as dt
From table
--【可加where限制时间范围,例如:
-- where substr(login_date,10) between ‘2024-10-01’and ‘2024-10-31’】
Group by 1,2
) as a
) as b
) as c
) as d
Order by 1
过程模拟:
因篇幅问题不能全部显示,请点此查看更多更全内容