


Deep Video Compression


Team


Professor

김휘용


Mentor

김성훈


Students

2018110650  박민정   mindyeoi@khu.ac.kr

2015104181  서승우   sswoo333@naver.com

2018102242  최승미   2018102242@khu.ac.kr    


Our Test Sequence


The Common Test Conditions (CTC) for the four test sequences we used are as follows.


Chroma Format: RGB 4:4:4
Input bit-depth: 8
Resolution: 768x768
HEVC & VVC QP: 22, 27, 32, 37
CompressAI Quality: 1, 2, 3, 4
Proposed Method Quality: 1, 4, 6, 8


Sensor-generated Sequence


| Input bit-depth | Frame rate | Test sequence name | Frame count |
|---|---|---|---|
| 8 | 50 | CrowdRun | 100 |
| 8 | 50 | DucksTakeOff | 75 |
| 8 | 50 | OldTownCross | 100 |
| 8 | 50 | Parkjoy | 100 |


Computer-generated Sequence (not used)


| Input bit-depth | Frame rate | Test sequence name | Frame count |
|---|---|---|---|
| 8 | 60 | ArenaOfValor | 120 |
| 8 | 24 | GlassHalf | 48 |


Reference Software

HEVC (HM 16.8): Download HM

VVC (VTM 12.1): Download VTM, Download Documents

CompressAI (bmshj2018-hyperprior, mbt2018, cheng2020-anchor): CompressAI GitHub


Our Proposed Codec


Process


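1. Encode the first frame of the input video with NN-based image compression.
2. Decode the compressed bitstream and use the reconstruction as the prediction image.
3. Turn the difference signal between the reconstructed prediction image and the second frame into an image, and compress it with NN-based image compression.
4. Repeat steps 2-3 for the subsequent frames.
5. When decoding, decode the prediction image and the difference signal and add them together.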
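The process above can be sketched as a closed loop in which each reconstructed frame becomes the prediction for the next one. This is a minimal sketch, not the code in `codec_proposed.py`: `toy_image_codec` is a hypothetical stand-in (plain quantization) for the NN-based image codec, the residual is assumed to be "current frame minus prediction", and frames are assumed to be float arrays in [0, 1]. The residual-to-image mapping `clip(x, -0.5, 0.5) + 0.5` follows the formula given for codec_proposed in this README.

```python
import numpy as np


def toy_image_codec(img, levels=255):
    """Hypothetical stand-in for an NN image codec: quantize values in [0, 1]."""
    return np.round(np.clip(img, 0.0, 1.0) * levels) / levels


def residual_to_image(frame, prediction):
    """Map the difference signal into [0, 1] so an image codec can compress it."""
    return np.clip(frame - prediction, -0.5, 0.5) + 0.5


def image_to_residual(residual_img):
    """Inverse of residual_to_image (up to clipping and codec loss)."""
    return residual_img - 0.5


def encode_decode_sequence(frames, codec=toy_image_codec):
    """Return the reconstructed frames for a list of float frames in [0, 1]."""
    # Steps 1-2: intra-code the first frame and use its reconstruction.
    recon = [codec(frames[0])]
    for frame in frames[1:]:
        # Assumption: the previous reconstruction is the prediction image.
        prediction = recon[-1]
        # Step 3: code the difference signal as an image.
        residual_img = residual_to_image(frame, prediction)
        decoded_residual = image_to_residual(codec(residual_img))
        # Step 5: add the prediction and the decoded difference signal.
        recon.append(np.clip(prediction + decoded_residual, 0.0, 1.0))
    return recon
```

Because the encoder predicts from the reconstructed frame rather than the original one, the decoder can form the same prediction and coding errors do not accumulate from frame to frame.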


Diagram


Performance Test


RD-Curve


PSNR (Peak Signal to Noise Ratio)


MS-SSIM (Multi-Scale Structural Similarity)

Structural similarity index


BPP

(total number of bits used for compression) / (total number of pixels)
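For illustration, PSNR and BPP can be computed as below. This is a minimal sketch rather than the repository's MATLAB post-processing scripts: the function names are hypothetical, the peak value of 255 assumes 8-bit samples as in the test conditions, and the pixel count is assumed to be width x height x frame count.

```python
import numpy as np


def psnr(original, reconstructed, max_value=255.0):
    """PSNR in dB between two arrays of the same shape (8-bit peak assumed)."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(max_value ** 2 / mse)


def bits_per_pixel(total_bytes, width, height, frame_count):
    """BPP = total bits used for compression / total number of pixels."""
    return total_bytes * 8 / (width * height * frame_count)
```

For a sequence, `total_bytes` would be the summed size of all compressed frames, and the per-frame PSNR values are typically averaged.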


MSE Curve
MS-SSIM Curve


BD-rate (Bjøntegaard-Delta rate)

PSNR-BPP curve: BD-rate -15.9% (Anchor: bmshj2018-hyperprior)
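A BD-rate figure like the one above is commonly obtained by fitting a cubic polynomial to log-rate as a function of PSNR for each codec and averaging the gap between the two fits over the shared PSNR range. The sketch below follows that common recipe. It is not necessarily the script used to produce the number reported here, and the function name is hypothetical. It needs at least four rate points per curve, which matches the four quality settings listed in the test conditions.

```python
import numpy as np


def bd_rate(anchor_bpp, anchor_psnr, test_bpp, test_psnr):
    """Average bitrate difference in percent (negative means the test codec saves bits)."""
    log_anchor = np.log(np.asarray(anchor_bpp, dtype=float))
    log_test = np.log(np.asarray(test_bpp, dtype=float))

    # Cubic fit of log-rate as a function of PSNR for each curve.
    poly_anchor = np.polyfit(anchor_psnr, log_anchor, 3)
    poly_test = np.polyfit(test_psnr, log_test, 3)

    # Integrate both fits over the PSNR interval the curves share.
    low = max(min(anchor_psnr), min(test_psnr))
    high = min(max(anchor_psnr), max(test_psnr))
    int_anchor = np.polyint(poly_anchor)
    int_test = np.polyint(poly_test)
    area_anchor = np.polyval(int_anchor, high) - np.polyval(int_anchor, low)
    area_test = np.polyval(int_test, high) - np.polyval(int_test, low)

    # Mean log-rate difference, converted back to a percentage.
    avg_diff = (area_test - area_anchor) / (high - low)
    return (np.exp(avg_diff) - 1.0) * 100.0
```

With bmshj2018-hyperprior as the anchor, its (BPP, PSNR) points go in the first two arguments and the proposed method's points in the last two.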


Subjective Quality


Code explanation


The Directory Structure

.
├── README.md
├── source code
|   ├── Our Encoder
|   |   ├── codec_allIntra.py
|   |   ├── codec_proposed.py
|   |   ├── codec_anotherMethod.py
|   |   ├── train_RGB.py
|   |   ├── train_RGB_MS-SSIMloss.py
|   |   └── train_YCbCr.py
|   ├── PostProcessing
|   |   ├── get_BPP.m
|   |   ├── get_PSNR_and_BPP manual.docx
|   |   ├── png_to_rgb.m
|   |   └── PostProcessing.m
|   ├── PreProcessing
|   |   ├── frame_to_png.m
|   |   ├── MakeBat.m
|   |   ├── RGB Crop(ffmpeg) manual.docx
|   |   ├── rgb_to_frame.m
|   |   └── rgb_to_png.m
|   └── 영상 주관적 화질평가
|       └── pYUV manual.docx
├── 면담발표
|   └── ...
├── 면담보고서
|   └── ...
├── 멘토면담보고서
|   └── ...
├── 주간보고서
|   └── ...
├── 기초조사서.docx
├── 중간보고서.hwp
└── 최종보고서.docx


PreProcessing

frame_to_png: Converts a single frame (.rgb) into an image (.png) file

MakeBat: Generates the contents of the batch file for HEVC or VVC

rgb_to_frame: Splits a multi-frame video (.rgb) into single-frame (.rgb) files

rgb_to_png: Splits a multi-frame video (.rgb) into single-image (.png) files
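As an illustration of what rgb_to_frame does, the sketch below splits a raw multi-frame .rgb file into one file per frame. It is a Python approximation, not the MATLAB script itself: the function name and the output file naming are hypothetical, and it assumes headerless 8-bit data with three samples per pixel, so each frame occupies width x height x 3 bytes.

```python
from pathlib import Path


def split_rgb_video(video_path, out_dir, width=768, height=768, channels=3):
    """Split a raw 8-bit .rgb video into per-frame .rgb files and return their paths."""
    frame_bytes = width * height * channels  # one byte per sample (assumed)
    data = Path(video_path).read_bytes()
    if len(data) % frame_bytes != 0:
        raise ValueError("file size is not a multiple of the frame size")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(len(data) // frame_bytes):
        start = index * frame_bytes
        frame_path = out / f"frame_{index:04d}.rgb"  # hypothetical naming scheme
        frame_path.write_bytes(data[start:start + frame_bytes])
        paths.append(frame_path)
    return paths
```

Splitting by byte count works whether the samples are stored planar or interleaved, since only the total frame size matters at this stage.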


PostProcessing

png_to_rgb: Converts multiple images (.png) into a multi-frame video (.rgb)

PostProcessing: Converts video decoded by HEVC or VVC (10-bit, BGR) into 8-bit RGB video and computes PSNR, SSIM, and BPP
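The bit-depth and channel-order conversion performed in PostProcessing could look roughly like this. It is a hedged sketch, not the MATLAB implementation: the function name is hypothetical, the frame is assumed to be already loaded as an integer array of shape (height, width, 3) with channels in B, G, R order, and rounding to the nearest 8-bit value is an assumption (truncation is another possibility).

```python
import numpy as np


def bgr10_to_rgb8(frame_bgr10):
    """Convert a 10-bit BGR frame (H, W, 3) to an 8-bit RGB frame."""
    frame = np.asarray(frame_bgr10, dtype=np.uint32)
    # Reverse the channel axis: B, G, R -> R, G, B.
    rgb10 = frame[..., ::-1]
    # Drop two bits with rounding, then clamp to the 8-bit range.
    rgb8 = np.clip((rgb10 + 2) >> 2, 0, 255)
    return rgb8.astype(np.uint8)
```

The clamp matters for the largest 10-bit values, which would otherwise round up to 256 and wrap around when stored as 8-bit.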


Our Codec

codec_allIntra: All-intra compression codec (every frame is coded independently)

codec_proposed: Intra compression of the first frame, then residual compression of every other frame, with the residual formed as clip(recon-ref, -0.5, 0.5)+0.5

codec_anotherMethod: Intra compression of the first frame, then two residuals for every other frame: Residual1 = clip(recon-ref, 0, 1) and Residual2 = -clip(recon-ref, -1, 0)

train_RGB.py: Training code for residual images in RGB 4:4:4 format with MSE loss

train_RGB_MS-SSIMloss.py: Training code for residual images in RGB 4:4:4 format with MS-SSIM loss

train_YCbCr.py: Training code for residual images in YCbCr 4:4:4 format with MSE loss
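The two-residual variant used by codec_anotherMethod can be illustrated as follows. This is a minimal sketch with hypothetical function names, assuming float frames in [0, 1] and a difference signal defined as "current frame minus prediction". The positive and negative parts of the difference are stored as two separate non-negative images, so no offset of 0.5 is needed and differences larger than 0.5 in magnitude are not clipped.

```python
import numpy as np


def split_residual(frame, prediction):
    """Split the difference signal into two non-negative images in [0, 1]."""
    diff = frame - prediction
    positive = np.clip(diff, 0.0, 1.0)    # Residual1: where the frame is brighter
    negative = -np.clip(diff, -1.0, 0.0)  # Residual2: where the frame is darker
    return positive, negative


def merge_residual(positive, negative):
    """Recombine the two residual images into the signed difference signal."""
    return positive - negative
```

The trade-off is that two images must be compressed per frame instead of one, in exchange for covering the full range of the difference signal.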

Commands

python [codec name] [encode or decode mode] --model [model name] -m [mse] -fr [frame rate] -f [frame count] -q [quality] [sequence name]    


python [train name] -d [data path] --epochs [epochs] -lr [learning rate] -fr [frame rate] -f [frame count] -q [quality] --batch-size [batch size]    


Command examples

python examples/codec_proposed.py encode --model cheng2020-anchor -m mse -q 6 CrowdRun


python examples/train_YCbCr.py -d Data/ --epochs 150 -lr 1e-4 -q 4 --batch-size 16 --cuda --save