Getting a Walking Robot to Work with Deep Reinforcement Learning in MATLAB


이게될까 2024. 5. 8. 03:09


https://kr.mathworks.com/videos/deep-reinforcement-learning-for-walking-robots--1551449152203.html

Deep Reinforcement Learning for Walking Robots

Use MATLAB, Simulink, and Reinforcement Learning Toolbox to train control policies for humanoid robots using deep reinforcement learning.


This is the lecture, but it has no subtitles...

Even English subtitles would help me follow along, but I'll do my best anyway.

Previous post (2024.05.08, AI / Reinforcement Learning): matlab Reinforcement Learning - walking Robot Problem


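I came here from the reinforcement learning series.

Control happens in this form.

Reinforcement learning is driven by rewards!

This is the deep reinforcement learning structure I covered in the previous lecture.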

Here, the network learns from the error in Q.
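
To make "the error in Q" a bit more concrete for myself, here is a minimal sketch of a one-step temporal-difference (TD) error for a single transition. Every number and variable name below is made up for illustration; it is not taken from the lecture's scripts. As far as I understand it, the critic is trained to shrink the square of this error, averaged over a mini-batch.

```matlab
% One-step TD error for a single transition (illustrative values only).
gamma  = 0.99;   % discount factor (assumed)
r      = 1.0;    % reward received for the transition
qNow   = 2.0;    % critic's estimate Q(s, a)
qNext  = 3.0;    % target critic's estimate Q(s', a') at the next state
isDone = false;  % true if the episode ended on this transition

% Target: reward plus discounted future value (no bootstrapping at episode end).
target  = r + gamma * qNext * (~isDone);
tdError = target - qNow;     % the "error in Q"
loss    = 0.5 * tdError^2;   % squared error the critic would minimize

fprintf('target = %.2f, tdError = %.2f, loss = %.4f\n', target, tdError, loss);
```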

The reward function is very important!
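
To show why the reward matters so much, here is a rough sketch of what a walking reward could look like. The function name `walkReward`, the four terms, and all the weights are my own assumptions for illustration, not the reward used in the lecture's model. The idea is simply to reward forward speed and penalize drifting sideways, sagging, and using too much torque.

```matlab
% Hypothetical walking reward. The weights are illustrative assumptions.
% vx   : forward velocity of the torso
% y    : lateral drift from the straight walking line
% zErr : deviation of the torso height from its nominal height
% u    : vector of joint torques applied at this step
walkReward = @(vx, y, zErr, u) ...
    1.0  * vx ...
  - 3.0  * y^2 ...
  - 50.0 * zErr^2 ...
  - 0.02 * sum(u.^2);

% A clean forward step versus a sloppy, high-effort one.
rGood = walkReward(1.0, 0.0, 0.0, zeros(6, 1));
rBad  = walkReward(1.0, 0.5, 0.1, ones(6, 1));
fprintf('rGood = %.2f, rBad = %.2f\n', rGood, rBad);
```

Changing the weights changes what the robot "wants": a larger effort penalty, for example, should push it toward smaller torques at the cost of speed.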

This is Deep Learning Toolbox.

You can see that the network is built from fully connected (FC) layers, ReLU layers, and so on.
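
For reference, a network of this kind can also be written as a plain layer array. This is only a sketch: it assumes Deep Learning Toolbox is installed, and the observation count, action count, and layer widths are placeholders I picked, not the sizes from the lecture's network.

```matlab
% Illustrative fully connected network (requires Deep Learning Toolbox).
numObs = 31;   % hypothetical number of observations
numAct = 6;    % hypothetical number of joint torques

layers = [
    featureInputLayer(numObs, 'Name', 'observation')
    fullyConnectedLayer(64, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(64, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(numAct, 'Name', 'fcOut')
    tanhLayer('Name', 'tanh')   % keeps each output in [-1, 1]
];
disp(layers)
```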

You will probably run into a few errors:

Error using rlRepresentation (line 69)
rlRepresentation will be removed in a future release. rlRepresentation cannot be automatically converted to a new
representation object. Use one of the new representation objects rlValueRepresentation, rlQValueRepresentation,
rlDeterministicActorRepresentation, or rlStochasticActorRepresentation instead.

Error in createDDPGNetworks (line 48)
critic = rlRepresentation(criticNetwork,criticOptions, ...

Error in createWalkingAgent2D (line 31)
createDDPGNetworks;

This can be fixed by changing the code as follows.

% Create the critic representation
criticOptions = rlRepresentationOptions('Optimizer','adam','LearnRate',1e-3, ...
                                        'GradientThreshold',1,'L2RegularizationFactor',2e-4);
if useGPU
   criticOptions.UseDevice = 'gpu'; 
end
obsInfo = env.getObservationInfo;
actInfo = env.getActionInfo;
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,criticOptions);

% Create the actor representation
actorOptions = rlRepresentationOptions('Optimizer','adam','LearnRate',1e-4, ...
                                       'GradientThreshold',1,'L2RegularizationFactor',1e-5);
if useGPU
   actorOptions.UseDevice = 'gpu'; 
end
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,actorOptions);

With this change it runs fine.

But then, a GPU issue...

Error using gpuArray
No supported GPU device could be found.

Error in dlarray/gpuArray (line 22)
zdata = gpuArray(matlab.lang.internal.move(zdata));

Error in deep.internal.recording.containerfeval>iDispatch_1in (line 58)
    [outputs{1:numOut}] = fun(matlab.lang.internal.move(primaryArg));

Error in deep.internal.recording.containerfeval>iProcessCell_1in (line 115)
    outCell{i} = iDispatch_1in(allowNetInput, fun, paramFun, numOut, ...

Error in deep.internal.recording.containerfeval>iDispatch_1in (line 60)
    outputs = iProcessCell_1in(allowNetInput, fun, paramFun, numOut, ...

Error in deep.internal.recording.containerfeval (line 35)
    outputs = iDispatch_1in(allowNetInput, fun, paramFun, numOut, ...

Error in deep.internal.networkContainerFixedArgsFun (line 29)
varargout = deep.internal.recording.containerfeval(...

Error in dlupdate (line 99)
    [varargout{1:nargout}] = deep.internal.networkContainerFixedArgsFun(...

Error in rl.internal.model.ILearnableModel/toGPU (line 25)
            this.Learnables = dlupdate(@gpuArray,this.Learnables);

Error in rl.internal.model.createInternalModel (line 30)
        model = toGPU(model);

Error in rlQValueFunction (line 114)
model = rl.internal.model.createInternalModel(model, nameValueArgs.UseDevice, ...

Error in rlQValueRepresentation (line 91)
    Rep = rlQValueFunction(Model,ObservationInfo,ActionInfo,...

Error in createDDPGNetworks (line 50)
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,criticOptions);


It only runs after I turned off everything GPU-related.
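
Instead of switching the GPU settings off by hand, one option might be to let MATLAB decide. The sketch below uses `canUseGPU`, which returns false when no supported GPU is available, to pick the device. The variable name `device` is my own, and the last two commented lines are only a guess at how it would plug into the option objects from the script above.

```matlab
% Pick the training device automatically instead of hard-coding 'gpu'.
useGPU = canUseGPU;   % false on machines without a supported GPU
if useGPU
    device = 'gpu';
else
    device = 'cpu';
end
fprintf('Training device: %s\n', device);

% Assumed usage with the option objects created in createDDPGNetworks:
% criticOptions.UseDevice = device;
% actorOptions.UseDevice  = device;
```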

This is the actorNetwork.

It has a simple structure.

And the critic network!

Here you can see the hyperparameters.
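
As a note to myself, hyperparameters like these are usually collected in an `rlDDPGAgentOptions` object. The sketch below assumes Reinforcement Learning Toolbox is installed, and every value is a placeholder I chose for illustration, not the setting from the lecture's script.

```matlab
% Illustrative DDPG hyperparameters (requires Reinforcement Learning Toolbox).
% All values are placeholder assumptions.
agentOptions = rlDDPGAgentOptions( ...
    'SampleTime', 0.025, ...            % agent step in seconds
    'DiscountFactor', 0.99, ...         % weight on future rewards
    'MiniBatchSize', 128, ...           % transitions per gradient update
    'ExperienceBufferLength', 1e6, ...  % replay buffer capacity
    'TargetSmoothFactor', 1e-3);        % soft-update rate of target networks
disp(agentOptions)
```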

Running it starts the training!

Ugh, I can't figure out how to view the simulation.

My computer has no GPU, so training seems to be very slow..
