sql

Solving a LinkedIn Data Science SQL Interview Question

Bram Adams

27 Jan 2023 — 1 min read

 https://www.youtube.com/watch?v=d2920ysu2hQ&ab_channel=JayFeng

 01:37

peepoThink (1)

we could use GROUP BY to see how many posting share the same uid
we could then SELECT WHERE count == 1
(2)
use DATE_TRUNC to group by year or
use WHERE to split each into 180 day intervals from now()

 02:13

visualize the output and work backwards, make an output table, split into intermediate tables to process

 05:55

WITH user_job AS (
	SELECT u_id, j_id, COUNT(DISTINCT date_posted) AS num_posted
	FROM jobs
	GROUP BY u_id, j_id
)

 08:06

SELECT
	SUM(CASE WHEN avg_posted > 1 THEN 1 END) AS posted_multiple
	, SUM(CASE WHEN avg_posted = 1 THEN 1 END) AS posted_once
FROM (
	SELECT
		user_id, AVG(num_posted) AS avg_posted
	FROM user_job
	GROUP BY user_id
) AS t

 10:41

HAVING won't work because it will filter values

Post Mortem

WITH - named function
COUNT(DISTINCT field_name) - get count of distinct values in field
GROUP_BY creates sub tables on different keys
CASE WHEN - if
END - end the if
ELSE - else
AVG() - get average of field

Use intermediate tables to get to the final table