February 23, 2018

SQL 学习笔记 (2)

离上次写 SQL 学习笔记 过去了快三年,这次重新学习一下。下面是 SQL 和 MySQL 的一些题目练习。

参考资料:


175. Combine Two Tables

法一. 使用 Left Join

SELECT p.FirstName, p.LastName, a.City, a.State 
FROM Person p LEFT JOIN Address a 
ON p.PersonId = a.PersonId;

还可以再写的精简一点:

SELECT FirstName, LastName, City, State 
FROM Person LEFT JOIN Address
ON Person.PersonId = Address.PersonId

错误解法:使用 WHERE

要注意这种错误解法:

SELECT p.FirstName, p.LastName, a.City, a.State
From Person p, Address a
WHERE p.PersonId = a.PersonId

这道题要求的是 “regardless if there is an address for each of those people”,也就是不论 address 是否存在都返回解,所以这是 left join。而用 where 的结果这里会得到 inner join,只会得到有 address 情况下的解。


176. Second Highest Salary

法一. 使用 LIMIT

SELECT MAX(Salary) AS SecondHighestSalary FROM
(SELECT DISTINCT Salary FROM Employee
ORDER BY Salary DESC LIMIT 1, 1) AS c

总结:

  • LIMIT 后可以跟一个或者两个数字参数,跟两个数字参数的情况,第一个参数是 Offset,第二个参数是 Limit 的个数。具体用法参见 MySQL 手册

With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1):

SELECT * FROM tbl LIMIT 5,10;  # Retrieve rows 6-15

To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last:

SELECT * FROM tbl LIMIT 95,18446744073709551615;
  • 这里用了 derived table:

    A derived table is basically a subquery, except it is always in the FROM clause of a SQL statement.

要注意最后要写 AS c。如果没加 alias,会报错 “Every derived table must have its own alias.” 这个 stackoverflow 上的回答把原因讲的很清楚:

Every derived table must indeed have an alias.

I.e. each query in brackets must be given an alias (AS whatever), which can the be used to refer to it in the rest of the outer query.

举例:

SELECT MAX(age) 
FROM (
-- this part of the query is a derived table:
SELECT age FROM table
) as Age -- must give derived table an alias
  • DINSTICT 经常和 ORDER BY 连用。

法二. NOT IN 或者 NOT EQUAL <>

SELECT MAX(Salary) AS SecondHighestSalary FROM Employee
WHERE Salary NOT IN (SELECT MAX(Salary) FROM Employee)
SELECT MAX(Salary) AS SecondHighestSalary FROM Employee
WHERE Salary <> (SELECT MAX(Salary) FROM Employee)
  • 这是一种最容易想到的方法。首先,我们使用 MAX 函数,找到最高工资:SELECT MAX(Salary) FROM Employee。接着,排除最高的,再选出来就是第二高的。
  • 这也是一个 subquery。subquery 和 derived table 不同:

A subquery is a SELECT statement that is nested within another statement – that’s why it’s called a subquery, because it’s like having a query within another query .

Subqueries are usually used in the WHERE clause as a way to filter out certain rows returned in the result set of the outer query.

  • 这是一个 Uncorrelated Subquery,因为 Subquery 可以脱离主查询语句独立存在。

最简单的分别 subquery 和 derived table 的方法:WHERE 跟 subquery,FROM 跟 derived table 且后面要加 alias。


177. Nth Highest Salary

这道题是 176. Second Highest Salary 的变种。找到第二大的,如何找第 N 大的?

法一. LIMIT

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
SET N = N - 1;
  RETURN (
      # Write your MySQL query statement below.
      SELECT DISTINCT Salary FROM Employee
      ORDER BY Salary DESC LIMIT N, 1
  );
END

总结:

  • LIMIT 从句不能写成 LIMIT N - 1, 1,因为 MySQL 的 LIMIT 后只能加数字参数,不能加变量。
  • MySQL 使用 SET 为变量赋值

法二. Subquery

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
  RETURN (
      # Write your MySQL query statement below.
      SELECT DISTINCT Salary
      FROM Employee Emp1
      WHERE (N-1) = (
          SELECT COUNT(DISTINCT(Emp2.Salary))
          FROM Employee Emp2
          WHERE Emp2.Salary > Emp1.Salary)
  );
END
  • 这是一个 Correlated subquery,subquery 中用到了 outer query 中的 alias Emp1。这篇文章将这种方法的思路讲的很清楚。

178. Rank Scores

法一. Subquery

SELECT s1.Score, 
(
    SELECT COUNT(DISTINCT(s2.Score)) FROM Scores s2
    WHERE s2.Score > s1.Score
) + 1 AS Rank
FROM Scores s1
ORDER By Score DESC

法二. Derived Table

SELECT s1.Score, COUNT(s2.Score) AS Rank 
FROM Scores s1,
(SELECT DISTINCT Score FROM Scores) s2
WHERE s1.Score <= s2.Score
GROUP BY s1.Id
ORDER BY s1.Score DESC

180. Consecutive Numbers

SELECT DISTINCT l1.Num AS ConsecutiveNums 
FROM Logs l1, Logs l2, Logs l3
WHERE l1.Id + 1 = l2.Id AND l2.Id + 1 = l3.Id
AND l1.Num = l2.Num AND l2.Num = l3.Num

这里使用了 Self Join:

A self join is basically when a table is joined to itself.

这篇文章解释的很清楚。


181. Employees Earning More Than Their Managers

法一.INNER JOIN

SELECT e1.Name AS Employee
FROM Employee e1 INNER JOIN Employee e2
ON e1.ManagerId = e2.Id AND e1.Salary > e2.Salary

注意,这里需要用 INNER JOIN,使用 LEFT JOIN 会报错。

法二. SELF JOIN

SELECT e1.Name AS Employee FROM Employee e1, Employee e2
WHERE e1.ManagerId = e2.Id  AND e1.Salary > e2.Salary

这里本质上也是使用了 INNER JOIN。但要注意 SELF JOIN 和 INNER JOIN 没有必然联系。


182. Duplicate Emails

SELECT Email FROM Person
GROUP BY Email
HAVING COUNT(Email) > 1

GROUP BY.. HAVING.. 语句。


183. Customers Who Never Order

SELECT Name AS Customers FROM Customers
LEFT JOIN Orders 
ON Customers.Id = Orders.CustomerId
WHERE Orders.CustomerId IS NULL

184. Department Highest Salary

SELECT d.Name AS Department, e.Name AS Employee, e.Salary
FROM Employee e INNER JOIN Department d
ON e.DepartmentId = d.Id
AND e.Salary IN (
	SELECT Max(e2.Salary) From Employee e2
    WHERE e2.DepartmentId = d.Id
	GROUP BY e2.DepartmentId
)

或者

SELECT d.Name AS Department, e.Name AS Employee, e.Salary
FROM Employee e INNER JOIN Department d
ON e.DepartmentId = d.Id
WHERE (e.DepartmentId, e.Salary) IN (
	SELECT e2.DepartmentId, Max(e2.Salary) From Employee e2
	GROUP BY e2.DepartmentId
)

或者

SELECT d.Name AS Department, e.Name AS Employee, e.Salary
FROM Employee e, Department d
WHERE e.DepartmentId = d.Id
AND (e.DepartmentId, e.Salary) IN (
	SELECT e2.DepartmentId, Max(e2.Salary) From Employee e2
	GROUP BY e2.DepartmentId
)

注意:

The IN operator expects a list of values which match whatever you are comparing against.


196. Delete Duplicate Emails

DELETE p1.* FROM Person p1, Person p2
WHERE p1.Email = p2.Email AND p1.Id > p2.Id

考察 Data Manipulation Language (Insert, Update, Delete)。


197. Rising Temperature

法一. INNER JOIN

SELECT w2.Id FROM Weather w1 INNER JOIN Weather w2
ON TO_DAYS(w1.Date) + 1 = TO_DAYS(w2.Date) 
WHERE w1.Temperature < w2.Temperature

法二. SELF JOIN

SELECT w2.Id FROM Weather w1, Weather w2
WHERE TO_DAYS(w1.Date) + 1 = TO_DAYS(w2.Date) AND w1.Temperature < w2.Temperature
  • 本质上也是 INNER JOIN。但要注意 SELF JOIN 和 INNER JOIN 没有必然联系。
  • 记得 MySQL TO_DAYS() Function。

570. Managers with at Least 5 Direct Reports

法一. Subquery

SELECT Name FROM Employee
WHERE Id IN
(
	SELECT ManagerId FROM Employee
	GROUP BY ManagerId
    HAVING COUNT(*) >= 5
)

法二. Self Join - Inner Join

 SELECT e2.Name FROM Employee e1 INNER JOIN Employee e2
ON e1.ManagerId = e2.Id
GROUP BY e1.ManagerId
HAVING COUNT(e1.ManagerId) >= 5

574. Winning Candidate

SELECT Name FROM Candidate
WHERE id =
(
	SELECT CandidateId FROM Vote
	GROUP BY CandidateId
	ORDER BY COUNT(CandidateId) DESC
	LIMIT 1 
)

简单的 subquery。


577. Employee Bonus

SELECT Employee.name, Bonus.Bonus
FROM Employee LEFT JOIN Bonus
ON Employee.empId = Bonus.empId
WHERE Bonus.bonus IS NULL 
OR Bonus.bonus < 1000

使用 LEFT JOIN。


619. Biggest Single Number

SELECT
(
	SELECT num FROM number
	GROUP BY num
	HAVING COUNT(num) = 1
    ORDER BY num DESC LIMIT 1
) AS num
comments powered by Disqus