Skip to content

Commit 84a3567

Browse files
committed
advance first draft complete
1 parent d6ebf85 commit 84a3567

File tree

2 files changed

+348
-4
lines changed

2 files changed

+348
-4
lines changed

Writerside/recursice_queries.pdf

183 KB
Binary file not shown.

Writerside/topics/ADVANCE.md

Lines changed: 348 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,16 @@
11
# Advance topics
22

3+
__TODO__ : add clear and small examples in relevance to complex queries.
4+
5+
## import demo database inside postgres container
6+
37
We are going to create a docker compose file to move the demo database into the image and
48
then open it
9+
- download demo database from : [https://edu.postgrespro.com/demo-small-en.zip](https://edu.postgrespro.com/demo-small-en.zip)
510
- go to terminal
611
- `vim .env`
712
- paste the following and enter the valid paths:
8-
```env
13+
```text
914
DEMO_DATABASE_PATH="<path to demo file>/demo-small-en-20170815.sql"
1015
DATABASE_BACKUP="<path to db backup"
1116
#POSTGRES_DB="flight"
@@ -15,7 +20,7 @@ then open it
1520
1621
- `vim docker-compose.yml`
1722
- paste the following
18-
```env
23+
```yaml
1924
version: '3'
2025
services:
2126
postgres:
@@ -30,7 +35,7 @@ then open it
3035
volumes:
3136
- ${DEMO_DATABASE_PATH}:/demo-small-en-20170815.sql
3237
- ${DATABASE_BACKUP}:/var/lib/postgresql/data
33-
```
38+
```
3439

3540
- run : `docker composer up -d`
3641
- `docker exec -it postgresintro psql -d flight -U flight`
@@ -39,4 +44,343 @@ then open it
3944

4045
- to change to database demo next time,
4146
- `docker exec -it postgresintro psql -d demo -U flight`
42-
- OR : `\c demo` inside psql
47+
- OR : `\c demo` inside psql
48+
49+
___
50+
51+
## DEMO database
52+
53+
### Schema
54+
- The demo database imported contains statistics on all the flights of an imaginary airline company of one month time-period
55+
- In postgres, schema is a named collection of database objects, including tables, views, functions, i.e, a collection
56+
logical structures of data.
57+
- The main entity in our demo database's schema is a __booking__(mapped to bookings table).
58+
- Each booking includes several passengers with a __separate__ ticket issued for each passenger(tickets).
59+
- we assume each passenger is unique , there is no unique-id issued for each passenger as a person
60+
- Each ticket always contain one or more flight segments(ticker_flights).
61+
- if there are no direct flights with departure to destination, and it's not round trip ticket then a ticket can
62+
have multiple flight segments included.
63+
- it's assumed all tickets in a single booking has the same flight segment. Even though schema does not contain
64+
any such restrictions.
65+
- Each flight('flight') goes from one airport('airports') to another .
66+
- Flights with the same flight number have the same points of departure and destination but different departure
67+
dates.
68+
- At flight check-in, each passenger is issued a boarding pass (boarding_passes), where the seat number is specified.
69+
- The flight/seat combination must be unique
70+
71+
> you can use \d+ command to understand tables within terminal too.
72+
> \d+ bookings
73+
> \d+ aircrafts
74+
75+
### Bookings
76+
- To fly with our airline, passengers book the required tickets in advance (book_date, which must be within one month
77+
before the flight). The booking is identified by its number (book_ref, a six-position combination of letters and digits).
78+
- The total_amount field stores the total price of all tickets included into the booking, for all passengers.
79+
80+
### Tickets
81+
82+
- A ticket has a unique number (ticket_no), which consists of 13 digits.
83+
- Ticker contains passenger_id, passenger_name and their contact_data
84+
85+
### Flight Segments
86+
87+
- A flight segment connects a ticket with a flight, identified by a number
88+
- Each flight segment has its price (amount) and travel class(fare_conditions).
89+
90+
### Flights
91+
92+
- natural composite key of the flights table consists of the flight number (flight_no) and the date of the departure
93+
(scheduled_departure). To make foreign keys that refer to this table a bit shorter, a surrogate key flight_id is used as
94+
the primary key.
95+
96+
- A flight always connects two points: departure_airport and arrival_airport.
97+
- There's no entity like "connecting flight" , in such cases . Ticket contains all flight segments.
98+
- Each flight has time of scheduled_departure and scheduled_arrival, actual_departure and actual_arrival times may differ.
99+
100+
#### Flight status
101+
- Scheduled :The flight can be booked, in advance one month before departure.
102+
- On Time : The flight is open for check-in (twenty-four hours before the scheduled departure) and is not delayed.
103+
- Delayed : open for check-in but delayed
104+
- Departed : airborne
105+
- Arrived
106+
- Cancelled
107+
108+
### Airports
109+
110+
- identified by a three-letter airport_code and has an airport_name.
111+
- a city name is simply an airport attribute, which is required to identify all the airports of the same city.
112+
- The table also includes coordinates (longitude and latitude) and the timezone.
113+
### Boarding Passes
114+
- the boarding pass is identified by the combination of ticket and flight numbers.
115+
- Boarding pass numbers (boarding_no) are assigned sequentially and has seat number (seat_no).
116+
117+
### Aircraft
118+
- To identify an aircraft model, a three-digit aircraft_code
119+
- also includes the name of the air-craft model and the maximum flying distance, in kilometers(range).
120+
121+
### Seats
122+
- define the cabin configuration of each aircraft model.
123+
- Each seat has a number (seat_no) and an assigned travel class (fare_conditions): Economy, Comfort, or Business.
124+
125+
### Flights View
126+
127+
__NOTE__: View is a query stored in postgres database server.
128+
129+
- There is a flights_v view built over the flights table.
130+
- use \d+ flights_v
131+
132+
### The “now” Function
133+
- The demo database contains a snapshot of data
134+
- The snapshot time is saved in the `bookings.now` function.
135+
- use this function in demo queries for cases that would typically require calling the `now` function.
136+
- the return value of this function determines the version of the demo database: `SELECT bookings.now();`
137+
138+
#### Simple Queries
139+
140+
>Problem: Who traveled from Moscow (SVO) to Novosibirsk
141+
(OVB) on seat 1A the day before yesterday, and when was the
142+
ticket booked?
143+
144+
> Solution : use bookings.now()
145+
```SQL
146+
SELECT t.passenger_name,
147+
bb.book_date
148+
FROM bookings b
149+
JOIN tickets t
150+
ON t.book_ref = b.book_ref
151+
JOIN boarding_passes bp
152+
ON bp.ticket_no = t.ticket_no
153+
JOIN flights f
154+
ON f.flight_id = bp.flight_id
155+
WHERE f.departure_airport = 'SVO'
156+
AND f.arrival_airport = 'OVB'
157+
AND f.scheduled_departure::date =
158+
bookings.now()::date - INTERVAL '2 day'
159+
AND bp.seat_no = '1A';
160+
```
161+
__NOTE__: '::' is used to data from one type to another :
162+
163+
164+
> Problem: How many seats remained free on flight PG0404 yesterday?
165+
> Solution: Let's try to use the NOT EXISTS expression for seats without boarding-pass.
166+
```SQL
167+
SELECT count(*)
168+
FROM flights f
169+
JOIN seats s
170+
ON s.aircraft_code = f.aircraft_code
171+
WHERE f.flight_no = 'PG0404'
172+
AND f.scheduled_departure::date =
173+
bookings.now()::date - INTERVAL '1 day'
174+
AND NOT EXISTS (
175+
SELECT NULL
176+
FROM boarding_passes bp
177+
WHERE bp.flight_id = f.flight_id
178+
AND bp.seat_no = s.seat_no
179+
);
180+
```
181+
182+
>Problem: Which flights had the longest delays? Print the list of ten “leaders.”
183+
> Solution: The query needs to include only those flights that have already departed.
184+
```SQL
185+
SELECT f.flight_no,
186+
f.scheduled_departure,
187+
f.actual_departure,
188+
f.actual_departure - f.scheduled_departure
189+
AS delay
190+
FROM flights f
191+
WHERE f.actual_departure IS NOT NULL
192+
ORDER BY f.actual_departure - f.scheduled_departure
193+
DESC
194+
LIMIT 10;
195+
```
196+
197+
#### Aggregate Functions
198+
>Problem: What is the shortest flight duration for each
199+
possible flight from Moscow to St. Petersburg, and how many
200+
times was the flight delayed for more than an hour?
201+
202+
> Solution: flights_v view instead of dealing with table joins, take into account only those flights that have
203+
already arrived.
204+
```SQL
205+
SELECT f.flight_no,
206+
f.scheduled_duration,
207+
min(f.actual_duration),
208+
max(f.actual_duration),
209+
sum(CASE WHEN f.actual_departure > f.scheduled_departure + INTERVAL '1 hour'
210+
THEN 1 ELSE 0
211+
END) delays
212+
FROM flights_v f
213+
WHERE f.departure_city = 'Moscow'
214+
AND f.arrival_city = 'St. Petersburg'
215+
AND f.status = 'Arrived'
216+
GROUP BY f.flight_no,
217+
f.scheduled_duration;
218+
```
219+
220+
> Problem: Find the most disciplined passengers who checked
221+
in first for all their flights. Take into account only those
222+
passengers who took at least two flights.
223+
224+
> Solution : boarding pass numbers are issued in the check-in order.
225+
```SQL
226+
SELECT t.passenger_name,
227+
t.ticket_no
228+
FROM tickets t
229+
JOIN boarding_passes bp
230+
ON bp.ticket_no = t.ticket_no
231+
GROUP BY t.passenger_name,
232+
t.ticket_no
233+
HAVING max(bp.boarding_no) = 1
234+
AND count(*) > 1;
235+
```
236+
237+
> Problem: How many passengers can be included into a single booking?
238+
> Solution: count the number of passengers in each booking and then find the number of bookings for each
239+
number of passengers.
240+
```SQL
241+
SELECT tt.cnt,
242+
count(*)
243+
FROM (
244+
SELECT t.book_ref,
245+
count(*) cnt
246+
FROM tickets t
247+
GROUP BY t.book_ref
248+
) tt
249+
GROUP BY tt.cnt
250+
ORDER BY tt.cnt;
251+
```
252+
253+
#### Window Functions
254+
>Problem: For each ticket, display all the included flight segments, together with connection time. Limit the result
255+
to the tickets booked a week ago.
256+
>Solution: Use window functions to avoid accessing the same data twice.
257+
```SQL
258+
SELECT tf.ticket_no,
259+
f.departure_airport,
260+
f.arrival_airport,
261+
f.scheduled_arrival,
262+
lead(f.scheduled_departure) OVER w AS next_departure,
263+
lead(f.scheduled_departure) OVER w - f.scheduled_arrival AS gap
264+
FROM bookings b
265+
JOIN tickets t
266+
ON t.book_ref = b.book_ref
267+
JOIN ticket_flights tf
268+
ON tf.ticket_no = t.ticket_no
269+
JOIN flights f
270+
ON tf.flight_id = f.flight_id
271+
WHERE b.book_date =
272+
bookings.now()::date - INTERVAL '7 day'
273+
WINDOW w AS ( PARTITION BY tf.ticket_no
274+
ORDER BY f.scheduled_departure);
275+
```
276+
277+
#### Arrays
278+
> Problem. Find the round-trip tickets in which the outbound route differs from the inbound one.
279+
> A round-trip ticket where the outbound route differs from the inbound route is typically referred to as a "multi-city"
280+
> or "multi-destination" ticket. With this type of ticket, travelers can fly from one location to another
281+
> (the outbound route), and then return from a different location (the inbound route).
282+
283+
> Solution:
284+
```SQL
285+
SELECT r1.departure_airport,
286+
r1.arrival_airport,
287+
r1.days_of_week dow,
288+
r2.days_of_week dow_back
289+
FROM routes r1
290+
JOIN routes r2
291+
ON r1.arrival_airport = r2.departure_airport
292+
AND r1.departure_airport = r2.arrival_airport
293+
WHERE NOT (r1.days_of_week && r2.days_of_week);
294+
```
295+
296+
#### Recursive Queries
297+
[Guide](https://habr.com/en/companies/postgrespro/articles/490228/)
298+
```SQL
299+
WITH RECURSIVE t(n,factorial) AS (
300+
VALUES (0,1)
301+
UNION ALL
302+
SELECT t.n+1, t.factorial*(t.n+1) FROM t WHERE t.n < 5
303+
)
304+
SELECT * FROM t;
305+
```
306+
307+
>Problem. How can you get from Ust-Kut (UKX) to Neryungri (CNN) with the minimal number of connections? What will
308+
the flight time be?
309+
>Solution: find the shortest path in the graph.
310+
311+
```SQL
312+
WITH RECURSIVE p(
313+
last_arrival,
314+
destination,
315+
hops,
316+
flights,
317+
flight_time,
318+
found
319+
) AS (
320+
SELECT a_from.airport_code,
321+
a_to.airport_code,
322+
array[a_from.airport_code],
323+
array[]::char(6)[],
324+
interval '0',
325+
a_from.airport_code = a_to.airport_code
326+
FROM airports a_from,
327+
airports a_to
328+
WHERE a_from.airport_code = 'UKX'
329+
AND a_to.airport_code = 'CNN'
330+
UNION ALL
331+
SELECT r.arrival_airport,
332+
p.destination,
333+
(p.hops || r.arrival_airport)::char(3)[],
334+
(p.flights || r.flight_no)::char(6)[],
335+
p.flight_time + r.duration,
336+
bool_or(r.arrival_airport = p.destination)
337+
OVER ()
338+
FROM p
339+
JOIN routes r
340+
ON r.departure_airport = p.last_arrival
341+
WHERE NOT r.arrival_airport = ANY(p.hops)
342+
AND NOT p.found
343+
)
344+
SELECT hops,
345+
flights,
346+
flight_time
347+
FROM p
348+
WHERE p.last_arrival = p.destination;
349+
```
350+
351+
> Problem. What is the maximum number of connections that can be required to get from any airport to any other airport?
352+
> Solution: We can take the previous query as the basis for the solution. However, the first iteration must now contain
353+
> all the possible airport pairs, not just a single pair: each airport must be connected to all the other airports.
354+
> For all these pairs of airports we first find the shortest path, and then select the longest of them.
355+
356+
```SQL
357+
WITH RECURSIVE p(
358+
departure,
359+
last_arrival,
360+
destination,
361+
hops,
362+
found
363+
) AS (
364+
SELECT a_from.airport_code,
365+
a_from.airport_code,
366+
a_to.airport_code,
367+
array[a_from.airport_code],
368+
a_from.airport_code = a_to.airport_code
369+
FROM airports a_from,
370+
airports a_to
371+
UNION ALL
372+
SELECT p.departure,
373+
r.arrival_airport,
374+
p.destination,
375+
(p.hops || r.arrival_airport)::char(3)[],
376+
bool_or(r.arrival_airport = p.destination)
377+
OVER (PARTITION BY p.departure, p.destination)
378+
FROM p JOIN routes r
379+
ON r.departure_airport = p.last_arrival
380+
WHERE NOT r.arrival_airport = ANY(p.hops)
381+
AND NOT p.found
382+
)
383+
SELECT max(cardinality(hops)-1)
384+
FROM p
385+
WHERE p.last_arrival = p.destination;
386+
```

0 commit comments

Comments
 (0)