[Category:] Machine Learning

[Generative Artificial Intelligence] Maximum Likelihood Learning

November 8, 2024

This post summarizes the basic concepts of generative models, based on materials from the Stanford CS236 Deep Generative Models course. (See https://deepgenerativemodels.github.io/syllabus.html) The earlier post on Autoregressive Models focused mainly on how to design the model; this one covers how to train it. All we have is n data points, (x1, x2, …, xn). Estimating which probability distribution \( P_\text{data} \) these data points were sampled from is…
[Generative Artificial Intelligence] Autoregressive Models

November 8, 2024

This post summarizes the basic concepts of generative models, based on materials from the Stanford CS236 Deep Generative Models course. (See https://deepgenerativemodels.github.io/syllabus.html) First, an autoregressive model is defined as follows (by ChatGPT): “An autoregressive model is a type of statistical model used to describe certain time-dependent processes. It predicts the next value in a sequence based on the previous values, assuming that each…
Reinforcement Learning Concepts Summary (23.5 – Policy Search, 23.7 – Applications (DQN))

March 7, 2024

The previous post explained how to learn using a value function (also called a utility function) or a Q-function; this post explains how to learn using a policy. Policy Search A policy (\(\pi\)) was described as a function that maps a given state to an action, and the goal of policy search is to parameterize such a policy (\(\pi_\theta\)) and find the optimal one. As a simple example, a policy can be represented as follows. \( \pi(s)…
Reinforcement Learning Concepts Summary (23.2, 23.3 – Passive / Active Reinforcement Learning)

March 3, 2024

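(The terminology and order of explanation in Artificial Intelligence: A Modern Approach seem to differ from other reinforcement learning teaching materials more than I expected. I will follow the book as closely as possible, but for ambiguous points I will also consult other reinforcement learning materials.) Passive Reinforcement Learning Basically, both passive and active reinforcement learning are model-free reinforcement learning methods. That is, with P (the transition model) of the MDP described in the previous post unknown, learning…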
Reinforcement Learning Concepts Summary (23.1 – Learning from Rewards)

February 26, 2024

These notes follow Artificial Intelligence: A Modern Approach (4th edition). Background MDP (Markov decision processes) An MDP is a mathematical formulation for problems in which actions must be decided sequentially, and it consists of a 5-tuple as follows. S : set of states A : set of actions P : state transition probability matrix, in the state at a given time (\( S = s_t \))…
