7 분 소요


INTRO 🙌

저번 시간에는 Query Optimization 에 대하여 알아보았다.

이번 시간에는 Transaction Management에 대해 알아보자.


ACID 🍔

    Atomic, consistent, isolation, durable
  • Execute each transaction atomically; execute as a single unit
    • Transaction 중간에 crash –> 진행 작업 모두 초기화
  • Isolation: if two users run transactions concurrently, they should not interfere with each other
    • e.g., moving $20 from checking to saving AND return the balance (the sum of checking + saving)
    • what do we want in this case? sequential execution of the two transactions
  • Durable: if a transaction has been executed, its effect is persisted in the database

1) Atomicity

Responsibility: Transaction Control Manager

image

    Accounts(id, checking, saving)

    // Q1: checking 계좌 금액 삭감
    UPDATE Accounts
    SET checking = checking – 20
    WHERE id = 123

    // Q2: saving 계좌 금액 추가
    UPDATE Accounts
    SET saving = saving + 20
    Where id = 123

우리는 checking 계좌에서 saving 계좌로 돈을 이체하는 시스템을 구현하고자 한다.

만약, Q1, Q2 쿼리들을 단독으로 실행하다가, 중간에 crash가 나거나 0을 나눠서 segmentation fault 등 에러가 발생할 경우, checking 계좌에서 돈만 잃어버리는 현상이 발생할지도 모른다.

이러한 현상을 우리는 “Inconsistent” state라고 일컫는다.

  • “inconsistency” is subjective, depending on the business logic of the app

이러한 inconsistency라는 한계점을 타파하고자 등장한 것이 Transaction이다.

2) Isolation

Responsibility: concurrency control manager

다수 Transaction이 동시적으로 동작할 경우, 충돌이 발생할 수 있다.

    SELECT (checking + saving)
    FROM Accounts
    WHERE id = 123

Q3라는 한 id의 checking과 saving 금액의 합을 배출하는 쿼리가 Transaction 2 (T2)라고 해보자.

T1 실행 중, Q1 실행으로 checking 금액이 20 감소하자마자 Q3가 실행된다면, 총 금액은 20 감소된 합으로 배출될 것이다.

image

상기 이미지처럼, 105가 총합이여야 하는데, Q3는 85를 배출하는 모습이다.

가령, 하나의 비행기 티켓을 여러 사람이 동시에 예약하고, 모두가 예약이 성공되버리는 문제점이 한 예시이다.

image

따라서, 우리는 T1, T2가 상기 순서 둘 중 아무 하나의 순서로 개별적으로 동작하길 바란다.

3) Durable

Responsibility: DBMS and application programmer

만약 log에서 end transaction 표시가 있다면, DB에 무조건 해당 쿼리 작업이 반영되있어야 한다.

4) Consistency

Responsibility: recovery manager

  • depends on who is writing the transaction, not Database System
    • consistency는 시스템이 보장할 수 없고, 프로그래머가 책임져야 한다.

배경지식 🍟

System Failures

  • Each transaction has internal state
  • When system crashes, internal state is lost
    • Don’t know which parts executed and which didn’t
  • Remedy: use a log
    • A file that records every single action of the transaction

Transaction

Transaction: a sequence of SQL statements that you want to execute as a single atomic unit

  • transaction 실행 (무조건 둘 중 하나):
    • 모든 statements 다 실행
    • 아에 아무것도 실행 x
  • transaction 중간에 안 멈추면 consistent 보장
  • transaction 중간에 멈추기 불가능; inconsistent state
  • DB = elements 집합체
    • Usually 1 element = 1 block
    • Can be smaller (=1 record) or larger (=1 relation)
  • 각 transaction reads/writes some elements

Transaction은 하나의 단위로 실행하고자 하는 쿼리문들을 start/end transaction 사이에 위치시킨다.

    Accounts(id, checking, saving)
    START TRANSACTION
    UPDATE Accounts
    SET checking = checking – 20
    WHERE id = 123
    UPDATE Accounts
    SET saving = saving + 20
    Where id = 123
    END TRANSACTION

    SELECT (checking + saving)
    FROM Accounts
    WHERE id = 123

DB로의 최종 commit은 end transaction에서 발생하며, 만약 쿼리문 사이 중간에 에러가 발생하는 경우, 재부팅하자마자 실행된 쿼리문들에 대해 roll back을 진행한다.

  • In ad-hoc SQL
    • each command = 1 transaction
  • In embedded SQL (say inside a Python program)
    • Transaction starts = first SQL command issued
    • Transaction ends =
      • COMMIT
      • ROLLBACK (=abort)

가령, checking 계좌에서 20을 삭감하고 crash 발생한 경우, 재부팅하자마자 checking 계좌에 20을 되돌려 놓는다.

어떻게 roll back을 진행하나?

image

transaction start, Q1 실행 등 log로써 저장되는 모습이다. 이 때, Q1 실행 정보로 id와 같은 값들이 같이 저장되서, 만약 log에 end transaction이 없다면, DB에서 해당 id에 들어가 roll back을 진행한다.

Primitive Operations of Transactions

  • INPUT(X)
    • read element X(DISK) to memory buffer
  • READ(X,t)
    • copy element X(MEMORY) to transaction local variable t
  • WRITE(X,t)
    • copy transaction local variable t to element X(MEMORY)
  • OUTPUT(X)
    • write element X(MEMORY) to disk

        READ(A,t); t := t*2;WRITE(A,t)
        READ(B,t); t := t*2;WRITE(B,t)
      

image

  • ※ WRITE(A, t)와 READ(B, t) 사이에 INPUT(B) 빠짐

Log

  • An append-only file containing log records
  • Note: multiple transactions run concurrently, log records are interleaved
  • After a system crash, use log to:
    • Redo some transaction that didn’t commit
    • Undo other transactions that didn’t commit

ACID: Lock and Crash Recovery 🥩

1) Lock

image

T1을 실행하자마자 수정하는 데이터에 lock을 부여한 후, 쿼리를 실행한다.

이때, T1이 완료 되기 전에 T2가 들어온 경우, T2는 awaiting 상태로 돌입한다.

T1 완료 직후, T2는 잇달아 실행된다.

Lock은 Crash를 잡아내지 못하기 때문에, T1 실행 도중 crash가 발생한 경우에 대한 처리는 불가능하다.

따라서, log에 lock에 대한 언급 역시 해줘야 한다.

그러면, 재부팅 이후, lock이 걸린 데이터 역시 다시 lock을 해제하여, lock끼리도 충돌이 나는 문제점을 해결 가능하다.

전체 DB Table lock 걸면 안 되는 이유

Lock 기다리는 동안 다른 새로운 Transaction이 또 다시 Lock 걸어서 무한으로 기다릴 수 있음.

Read/Write Lock

Throughput 최대화!

  • Read Lock 걸림
    • Read만 가능, Write 불가능
  • Write Lock 걸림
    • R/W 모두 불가능

2) Crash Recovery

image

  • 1) Undo Logging (force, steal)
    • OUTPUT must be done early
    • If <COMMIT T> is seen, T definitely has written all its data to disk (hence, don’t need to undo)
    • 장점
      • Less memory intensive; flush updated data pages as soon as log records are flushed; only then COMMIT
    • 단점
      • higher latency (long time); forcing all dirty buffer pages to be flushed prior to COMMIT
  • 2) Redo Logging (No force, no steal)
    • OUTPUT must be done late
    • If <COMMIT T> is not seen, T definitely has not written any of its data to disk (hence there is not dirty data on disk)
    • Would like more flexibility on when to OUTPUT: undo/redo logging (next)
    • 장점
      • Lower latency; do not need to wait until data pages are flushed to COMMIT
    • 단점
      • More memory intensive; cannot flush data pages unless COMMIT log has been flushed
  • 3) Undo/Redo Logging 혼합 안중요

Recovery manager

Read log from the end; cases:

  • <COMMIT T>: mark T as completed
  • <ABORT T>: mark T as completed
  • <START T>: ignore

① Recovery with Undo Log

<T,X,v>: T has updated element X, and its old value was v

  • Whether T is completed or not; undo all modifications by incompleted transactions

          `<START T>`….`<COMMIT T>`….    = yes
          `<START T>`….`<ABORT T>`…….   = yes
          `<START T>`………………………   = no
    
  • All undo commands are idempotent
    • If we perform them a second time, no harm is done
    • E.g. if there is a system crash during recovery, simply restart recovery from scratch
  • 한계: Cannot stop until we reach the beginning of the log file (impractical)
    • Better idea: use checkpointing

Undo Logging 규칙

image

  • U1: If T modifies X, then <T,X,v> must be written to disk before X is written to disk
  • U2: If T commits, then must be written to disk only after all changes by T are written to disk

OUTPUTs are done early

기말출제★★★★★★★★ (Log 보고 X 값 유추하기)

image

  • <T2,X2,v2> –> X2=v2
  • <T3,X3,v3> –> X3=v3
  • <COMMITT T5 –> ignore
  • <T4,X4,v4> –> X4=v4
  • <T5,X5,v5> –> ignore
  • <T1,X1,v1> –> X1=v1
  • <START T4> –> ignore
  • <START T5> –> ignore

Checkpointing

image

  • Checkpoint the database periodically
    • Stop accepting new transactions
    • Wait until all curent transactions complete
    • Flush log to disk
    • Write a log record, flush
    • Resume transactions
  • 한계: DB freezes during checkpoint
Nonquiescent Checkpointing
  • Write a <START CKPT(T1,…,Tk)> where T1,…,Tk are all active transactions
  • Continue normal operation
  • When all of T1,…,Tk have completed, write <END CKPT>

image

        `<END CKPT>` --> ignore
        ...
        `<START CKPT T4, T5, T6>` --> 종료
  • T4,T5,T6: active transactions
  • <START CKPT T4, T5, T6>에서 Log 탐색 종료
    • CKPT 도중에 crash 발생 –> CKPT 과정 또한 초기화

② Recovery with Redo Log

<T,X,v>= T has updated element X, and its new value is v

  • Whether T is completed or not

      `<START T>`….`<COMMIT T>`….    = yes
      `<START T>`….`<ABORT T>`…….    = yes
      `<START T>`………………………   = no
    
  • Redo all updates of committed transactions

image

redo logging 시작 전에 entire log 한 번 쫙 훑으면서 commit 존재하는 transaction 미리 알아놓음

  • <START T1> –> ignore
  • <T1,X1,v1> –> ignore (<COMMIT T1> 없음)
  • <START T2> –> ignore
  • <T2,X2,v2> –> write v2 to X2 on disk
  • <START T3> –> ignore
  • <T1,X3,v3> –> ignore (<COMMIT T1> 없음)
  • <COMMIT T2> –> ignore
  • <T3,X4,v4> –> ignore (<COMMIT T3> 없음)
  • <T1,X5,v5> –> ignore (<COMMIT T1> 없음)

Redo Logging 규칙

image

  • R1: If T modifies X, then both <T,X,v> and <COMMIT T> must be written to disk before X is written to disk

OUTPUTs are done late

Nonquiescent Checkpointing

  • <START CKPT(T1,…,Tk)> where T1,…,Tk are all active transactions
  • Flush to disk all blocks of committed transactions (dirty blocks), while continuing normal operation
  • When all blocks have been written, write <END CKPT>

image

  • <START CKPT T4, T5, T6> 이전 시점 T4, T5, T6 제외 모든 transactions redo 진행
    • T4,T5,T6: <START CKPT T4, T5, T6> 이전 시점 active uncommitted transactions
  • <START CKPT T4, T5, T6> 일반 redo (committed transactions만 redo)
  • <END CKPT>
    • <START CKPT T4, T5, T6> 이전 commited transactions(T1) 확실히 disk 존재 보장 (dirty bit = 0)

image

③ Undo/Redo Logging

<T,X,u,v>= T has updated element X, its old value was u, and its new value is v

\[Log\ records,\ only\ one\ change\]

image

  • Redo all committed transaction, top-down
  • Undo all uncommitted transactions, bottom-up

Undo/Redo Logging 규칙

image

UR1: If T modifies X, then <T,X,u,v> must be written to disk before X is written to disk

OUTPUT early or late (I.e. before or after <COMMIT T>)


Reference

Database Management Systems by Raghu Ramakrishnan and Johannes Gehrke

댓글남기기