DeepLearning 2

[Transformer] 01. Attention is All You Need!

Transformer summary post #01. To begin, I will start with a review of the paper and then write up the Transformer in detail, organized around its key words. >> Paper link: Attention Is All You Need. The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new.. (arxiv.org) 1. Introduction..

[boostcamp] WEEK 03 : DL Basic

I can't believe boostcamp is already in its third week.. There is so much material to go through and study that, by the time I come up for air, a whole week has flown by. Up until last week the lectures took only a moderate amount of time, so I thought, "Ah, boostcamp must be a course where you learn mostly from the assignments!", but this week the sheer volume of lectures swept me away.. haha. Boostcamp is unpredictable every week ^^... I suppose the style changes each time because a different professor and teaching assistant handle the lectures and assignments each week. This week's lectures were titled Deep Learning Basic (a Data Visualization lecture was also opened, but this week I will focus my notes on the DL content), and in the order MLP→CNN→RNN→Transformer→Generative model, they covered the architecture, mathematical models, and code implementation of deep learning models ..