Citing A-MFEA-RL
Aritz D. Martinez, Javier Del Ser, Eneko Osaba and Francisco Herrera, Adaptive Multi-factorial Evolutionary Optimization for Multi-task Reinforcement Learning, 2020.
A-MFEA-RL: Adaptive Multi-factorial Evolutionary Optimization for Multi-task Reinforcement Learning
(ABSTRACT) Evolutionary Computation has largely exhibited its potential to replace conventional learning algorithms in a manifold of Machine Learning tasks, especially those related to unsupervised (clustering) and supervised learning. It has not been until recently that the computational efficiency of evolutionary solvers has been put into perspective for training Reinforcement Learning (RL) models. However, most studies framed in this context so far have considered environments and tasks conceived in isolation, without any exchange of knowledge among related tasks. In this manuscript we present A-MFEA-RL, an adaptive version of the well-known MFEA algorithm whose search and inheritance operators are tailored for multitask RL environments. Specifically, our A-MFEA-RL approach includes crossover and inheritance mechanisms for refining the exchange of genetic material that rely on the multi-layered structure of modern Deep Learning based RL models. In order to assess the performance of the proposed evolutionary multitasking approach, we design an extensive experimental setup comprising different multitask RL environments of varying levels of complexity, comparing them to those furnished by alternative non-evolutionary multitask RL approaches. As concluded from the discussion of the obtained results, A-MFEA-RL not only achieves competitive success rates over the tasks being simultaneously solved, but also fosters the exchange of knowledge among tasks that could be intuitively expected to keep a degree of synergistic relationship.
In this framework, a reformulation of the well-known MFEA/MFEA-II algorithms is introduced. The algorithm is designed so that Multifactorial Optimization can be applied to train neural networks, taking advantage of inter-task similarities by mimicking the traditional model-based Transfer Learning procedure. The adaptation rests on three crucial points:
- Design of the unified space towards favoring model-based Transfer Learning: specifically, aspects such as the neural network architecture, the number of neurons of each layer, and the presence of shared layers among models evolved for each task are taken into account.
- Adapted crossover operator: the crossover operator must support the previous aspects by preventing neural models from exchanging irrelevant information.
- Layer-based Transfer Learning: unlike in traditional means to implement Transfer Learning, the number of layers to be transferred between models evolved for different tasks is autonomously decided by A-MFEA-RL during the search.
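The layer-based exchange above can be illustrated with a minimal sketch. The function below swaps the first `n_transfer` layers (assumed shared in the unified space) between two parent networks; layer shapes, the `layer_crossover` name, and the operator itself are simplified assumptions for illustration, not the exact A-MFEA-RL implementation:

```python
import numpy as np

def layer_crossover(parent_a, parent_b, n_transfer):
    """Hypothetical layer-wise crossover sketch: swap the first
    `n_transfer` (shared) layers between two parents, keeping the
    deeper, task-specific layers of each parent intact."""
    child_a = [w.copy() for w in parent_a]
    child_b = [w.copy() for w in parent_b]
    for i in range(n_transfer):
        child_a[i], child_b[i] = child_b[i].copy(), child_a[i].copy()
    return child_a, child_b

rng = np.random.default_rng(0)
# Two toy policies with identical layer shapes, as the unified space requires.
a = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
b = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
# Transfer only the first layer; the output layers stay task-specific.
ca, cb = layer_crossover(a, b, n_transfer=1)
```

In A-MFEA-RL the number of transferred layers is not fixed as here but adapted during the search.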
The code works on top of Ray. The experimentation carried out considers three scenarios: TOY, MT-10/MT-10-R and MT-50/MT-50-R (results included in the Results section below), where the R suffix denotes randomly initialized episodes, as in the next image:
MT-10-R results
Running the experimentation
It is recommended to use the conda environment provided with the code (mujoco36.yml) for ease:

```shell
conda env create -f mujoco36.yml
conda activate mujoco36
```
A-MFEA-RL depends on Metaworld and MuJoCo (license required). To install Metaworld, please follow the instructions in its repository, or run:

```shell
pip install git+https://github.com/rlworkgroup/metaworld.git@master#egg=metaworld
```
The experimentation can be replicated by running the RUN_ALL.sh script. In order to run experiments independently:

```shell
python3 exp.py -exp INT -t INT -p STR
```
- `-exp`: Integer. 0 = TOY, 1 = MT-10/MT-10-R, 2 = MT-50/MT-50-R.
- `-t`: Integer. Number of threads used by Ray.
- `-p`: String. Name of the folder under `summary/` where results are saved.
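As a quick illustration of the flags above, the sketch below mirrors the `exp.py` command-line interface with `argparse`; the actual script may define its arguments differently, so treat this as an assumed, simplified parser:

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the documented exp.py flags."""
    parser = argparse.ArgumentParser(
        description="A-MFEA-RL experiment launcher (sketch)")
    parser.add_argument("-exp", type=int, choices=[0, 1, 2],
                        help="0 = TOY, 1 = MT-10/MT-10-R, 2 = MT-50/MT-50-R")
    parser.add_argument("-t", type=int,
                        help="number of threads used by Ray")
    parser.add_argument("-p", type=str,
                        help="folder under summary/ where results are saved")
    return parser

# Example: MT-10 scenario, 8 Ray threads, results under summary/mt10_demo
args = build_parser().parse_args(["-exp", "1", "-t", "8", "-p", "mt10_demo"])
```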
Results
| Environment name (complexity) | MT-10 A | MT-10 B | MT-10 C | MT-10-R A | MT-10-R B | MT-10-R C | MT-50 A | MT-50 B | MT-50 C | MT-50-R A | MT-50-R B | MT-50-R C |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
assembly (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
basketball (H) | - | - | - | - | - | - | 0 | 0 | 0 | 22 | 33 | 0 |
bin-picking (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 11 |
box-close (H) | - | - | - | - | - | - | 44 | 44 | 0 | 22 | 33 | 0 |
button-press-topdown (M) | 100 | 100 | 100 | 100 | 89 | 91 | 100 | 100 | 100 | 100 | 100 | 97 |
button-press-topdown-wall (H) | - | - | - | - | - | - | 67 | 78 | 100 | 67 | 100 | 100 |
button-press (M) | - | - | - | - | - | - | 44 | 67 | 100 | 44 | 55 | 100 |
button-press-wall (H) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 98 |
coffee-button (H) | - | - | - | - | - | - | 44 | 78 | 100 | 56 | 89 | 100 |
coffee-pull (M) | - | - | - | - | - | - | 78 | 100 | 0 | 100 | 100 | 70 |
coffee-push (M) | - | - | - | - | - | - | 78 | 89 | 100 | 89 | 89 | 40 |
dial-turn (H) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 99 |
disassemble (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
door-close (H) | - | - | - | - | - | - | 78 | 56 | 100 | 78 | 55 | 100 |
door-lock (H) | - | - | - | - | - | - | 89 | 100 | 100 | 89 | 89 | 100 |
door-open (H) | 100 | 33 | 100 | 100 | 100 | 100 | 78 | 67 | 100 | 67 | 67 | 100 |
door-unlock (M) | - | - | - | - | - | - | 78 | 89 | 100 | 89 | 100 | 100 |
drawer-close (H) | 100 | 100 | 100 | 100 | 100 | 100 | 79 | 89 | 100 | 67 | 78 | 100 |
drawer-open (H) | 0 | 33 | 100 | 33 | 0 | 99 | 22 | 33 | 100 | 22 | 44 | 98 |
faucet-close (M) | - | - | - | - | - | - | 100 | 67 | 100 | 78 | 44 | 81 |
faucet-open (M) | - | - | - | - | - | - | 89 | 89 | 100 | 89 | 67 | 91 |
hammer (H) | - | - | - | - | - | - | 33 | 56 | 100 | 11 | 67 | 100 |
hand-insert (M) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 100 |
handle-press-side (H) | - | - | - | - | - | - | 0 | 11 | 100 | 100 | 33 | 40 |
handle-press (H) | - | - | - | - | - | - | 89 | 78 | 60 | 100 | 78 | 35 |
handle-pull-side (H) | - | - | - | - | - | - | 56 | 67 | 0 | 56 | 89 | 0 |
handle-pull (H) | - | - | - | - | - | - | 89 | 100 | 0 | 78 | 100 | 0 |
lever-pull (M) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
peg-insert-side (H) | 67 | 33 | 0 | 56 | 56 | 0 | 0 | 22 | 0 | 44 | 33 | 0 |
peg-unplug-side (H) | - | - | - | - | - | - | 100 | 100 | 0 | 100 | 100 | 0 |
pick-out-of-hole (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
pick-place (H) | 66 | 100 | 0 | 0 | 0 | 0 | 44 | 11 | 0 | 33 | 11 | 0 |
pick-place-wall (H) | - | - | - | - | - | - | 44 | 33 | 0 | 33 | 0 | 10 |
plate-slide-back-side (M) | - | - | - | - | - | - | 100 | 89 | 40 | 78 | 89 | 45 |
plate-slide-back (M) | - | - | - | - | - | - | 67 | 89 | 100 | 89 | 100 | 58 |
plate-slide-side (M) | - | - | - | - | - | - | 100 | 89 | 100 | 55 | 100 | 100 |
plate-slide (M) | - | - | - | - | - | - | 33 | 100 | 100 | 78 | 78 | 77 |
push-back (E) | - | - | - | - | - | - | 89 | 100 | 0 | 89 | 100 | 71 |
push (E) | 100 | 100 | 100 | 78 | 67 | 59 | 44 | 89 | 100 | 78 | 33 | 47 |
push-wall (M) | - | - | - | - | - | - | 56 | 33 | 100 | 55 | 44 | 47 |
reach (E) | 100 | 100 | 100 | 100 | 100 | 91 | 100 | 100 | 100 | 100 | 100 | 98 |
reach-wall (E) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 98 |
shelf-place (H) | - | - | - | - | - | - | 0 | 0 | 0 | 44 | 55 | 0 |
soccer (E) | - | - | - | - | - | - | 67 | 78 | 0 | 55 | 33 | 48 |
stick-pull (H) | - | - | - | - | - | - | 11 | 33 | 0 | 11 | 44 | 79 |
stick-push (H) | - | - | - | - | - | - | 0 | 0 | 0 | 11 | 0 | 100 |
sweep-into (E) | - | - | - | - | - | - | 100 | 78 | 100 | 67 | 89 | 80 |
sweep (E) | - | - | - | - | - | - | 100 | 89 | 100 | 100 | 67 | 74 |
window-close (H) | 33 | 33 | 100 | 100 | 78 | 100 | 67 | 44 | 100 | 89 | 44 | 100 |
window-open (H) | 67 | 100 | 100 | 78 | 89 | 99 | 11 | 67 | 100 | 44 | 78 | 93 |
Average success rate | 73.3 | 73.2 | 80.0 | 74.5 | 67.9 | 73.9 | 57.3 | 62.0 | 60.0 | 61.5 | 62.1 | 59.7 |
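The reported averages can be recomputed directly from the per-environment values in the table; for instance, the MT-10 column-A figures (only the ten environments present in that scenario) reproduce the 73.3 average:

```python
# Per-environment MT-10 column-A success rates, copied from the table above.
mt10_a = {
    "button-press-topdown": 100, "door-open": 100, "drawer-close": 100,
    "drawer-open": 0, "peg-insert-side": 67, "pick-place": 66,
    "push": 100, "reach": 100, "window-close": 33, "window-open": 67,
}
avg = sum(mt10_a.values()) / len(mt10_a)  # 73.3, matching the table's last row
```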