    Citing A-MFEA-RL

    Aritz D. Martinez, Javier Del Ser, Eneko Osaba and Francisco Herrera, Adaptive Multi-factorial Evolutionary Optimization for Multi-task Reinforcement Learning, 2020.

    A-MFEA-RL: Adaptive Multi-factorial Evolutionary Optimization for Multi-task Reinforcement Learning

    (ABSTRACT) Evolutionary Computation has largely exhibited its potential to replace conventional learning algorithms in a manifold of Machine Learning tasks, especially those related to unsupervised (clustering) and supervised learning. Only recently has the computational efficiency of evolutionary solvers been put into perspective for training Reinforcement Learning (RL) models. However, most studies framed in this context so far have considered environments and tasks conceived in isolation, without any exchange of knowledge among related tasks. In this manuscript we present A-MFEA-RL, an adaptive version of the well-known MFEA algorithm whose search and inheritance operators are tailored for multitask RL environments. Specifically, our A-MFEA-RL approach includes crossover and inheritance mechanisms that refine the exchange of genetic material by relying on the multi-layered structure of modern Deep Learning based RL models. In order to assess the performance of the proposed evolutionary multitasking approach, we design an extensive experimental setup comprising different multitask RL environments of varying levels of complexity, comparing our results to those furnished by alternative non-evolutionary multitask RL approaches. As concluded from the discussion of the obtained results, A-MFEA-RL not only achieves competitive success rates over the tasks being simultaneously solved, but also fosters the exchange of knowledge among tasks that could be intuitively expected to keep a degree of synergistic relationship.

    In the framework, a reformulation of the well-known MFEA/MFEA-II algorithms is introduced. The algorithm is designed so that Multifactorial Optimization can be applied to train neural networks, exploiting inter-task similarities by mimicking the traditional model-based Transfer Learning procedure. The adaptation is carried out by means of three crucial points (a minimal sketch of the resulting crossover follows the list):

    1. Design of the unified space towards favoring model-based Transfer Learning: specifically, aspects such as the neural network architecture, the number of neurons in each layer, and the presence of layers shared among the models evolved for each task are taken into account.
    2. Adapted crossover operator: the crossover operator must support the previous aspects by preventing neural models from exchanging irrelevant information.
    3. Layer-based Transfer Learning: unlike traditional implementations of Transfer Learning, the number of layers transferred between models evolved for different tasks is autonomously decided by A-MFEA-RL during the search.
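
    The following minimal sketch (not the repository's actual implementation) illustrates the idea behind these points: individuals are lists of weight matrices living in a unified space, only layers marked as shared between two tasks may exchange genetic material, and the number of layers actually swapped is decided stochastically rather than fixed beforehand. The names layer_crossover, shared and p_swap are illustrative only:

    import numpy as np

    rng = np.random.default_rng(0)

    def layer_crossover(parent_a, parent_b, shared, p_swap=0.5):
        """Swap whole layers between two parents, restricted to shared layers."""
        child_a = [w.copy() for w in parent_a]
        child_b = [w.copy() for w in parent_b]
        for i, is_shared in enumerate(shared):
            # Non-shared layers carry task-specific information and are never exchanged.
            if is_shared and rng.random() < p_swap:
                child_a[i], child_b[i] = child_b[i], child_a[i]
        return child_a, child_b

    # Two 3-layer policies whose first two layers live in the unified space;
    # the task-specific output layers differ in shape and are never swapped.
    net_a = [rng.normal(size=(4, 8)), rng.normal(size=(8, 8)), rng.normal(size=(8, 2))]
    net_b = [rng.normal(size=(4, 8)), rng.normal(size=(8, 8)), rng.normal(size=(8, 3))]
    child_a, child_b = layer_crossover(net_a, net_b, shared=[True, True, False])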

    The code works on top of Metaworld-v1. The experimentation carried out considers three scenarios: TOY, MT-10/MT-10-R and MT-50/MT-50-R (results are included in the Results section below), where the R suffix denotes randomly initialized episodes, as in the next image:

    (Figure: MT-10-R results)

    Running the experimentation

    For convenience, it is recommended to use the conda environment provided with the code (mujoco36.yml):

    conda env create -f mujoco36.yml
    conda activate mujoco36

    A-MFEA-RL depends on Metaworld and MuJoCo (license required). To install Metaworld, please follow the instructions in the official GitHub repository or run:

    pip install git+https://github.com/rlworkgroup/metaworld.git@master#egg=metaworld
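
    To check that the installation works, a minimal interaction loop can be run. The snippet below is a sketch assuming the Metaworld-v1 benchmark API (metaworld.benchmarks); entry points may differ in later Metaworld revisions:

    from metaworld.benchmarks import MT10

    env = MT10.get_train_tasks()          # multi-task environment over the ten MT-10 tasks
    env.set_task(env.sample_tasks(1)[0])  # pick one task to interact with

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())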

    The full experimentation can be replicated by running RUN_ALL.sh. To run experiments independently:

    python3 exp.py -exp INT -t INT -p STR
    • -exp: Integer. 0 = TOY, 1 = MT-10/MT-10-R, 2 = MT-50/MT-50-R.
    • -t: Integer. Number of threads used by Ray.
    • -p: String. Name of the folder under summary/ where results are saved.
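
    For example, a hypothetical MT-10 run using 8 Ray threads and saving results under summary/mt10_test (the thread count and folder name are illustrative):

    python3 exp.py -exp 1 -t 8 -p mt10_test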

    Results

    Success rates (%) per environment and scenario; E/M/H denote easy/medium/hard task complexity, and a dash marks environments not included in that scenario.

    | Environment name (complexity) | MT-10 A | MT-10 B | MT-10 C | MT-10-R A | MT-10-R B | MT-10-R C | MT-50 A | MT-50 B | MT-50 C | MT-50-R A | MT-50-R B | MT-50-R C |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | assembly (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
    | basketball (H) | - | - | - | - | - | - | 0 | 0 | 0 | 22 | 33 | 0 |
    | bin-picking (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 11 |
    | box-close (H) | - | - | - | - | - | - | 44 | 44 | 0 | 22 | 33 | 0 |
    | button-press-topdown (M) | 100 | 100 | 100 | 100 | 89 | 91 | 100 | 100 | 100 | 100 | 100 | 97 |
    | button-press-topdown-wall (H) | - | - | - | - | - | - | 67 | 78 | 100 | 67 | 100 | 100 |
    | button-press (M) | - | - | - | - | - | - | 44 | 67 | 100 | 44 | 55 | 100 |
    | button-press-wall (H) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 98 |
    | coffee-button (H) | - | - | - | - | - | - | 44 | 78 | 100 | 56 | 89 | 100 |
    | coffee-pull (M) | - | - | - | - | - | - | 78 | 100 | 0 | 100 | 100 | 70 |
    | coffee-push (M) | - | - | - | - | - | - | 78 | 89 | 100 | 89 | 89 | 40 |
    | dial-turn (H) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 99 |
    | disassemble (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
    | door-close (H) | - | - | - | - | - | - | 78 | 56 | 100 | 78 | 55 | 100 |
    | door-lock (H) | - | - | - | - | - | - | 89 | 100 | 100 | 89 | 89 | 100 |
    | door-open (H) | 100 | 33 | 100 | 100 | 100 | 100 | 78 | 67 | 100 | 67 | 67 | 100 |
    | door-unlock (M) | - | - | - | - | - | - | 78 | 89 | 100 | 89 | 100 | 100 |
    | drawer-close (H) | 100 | 100 | 100 | 100 | 100 | 100 | 79 | 89 | 100 | 67 | 78 | 100 |
    | drawer-open (H) | 0 | 33 | 100 | 33 | 0 | 99 | 22 | 33 | 100 | 22 | 44 | 98 |
    | faucet-close (M) | - | - | - | - | - | - | 100 | 67 | 100 | 78 | 44 | 81 |
    | faucet-open (M) | - | - | - | - | - | - | 89 | 89 | 100 | 89 | 67 | 91 |
    | hammer (H) | - | - | - | - | - | - | 33 | 56 | 100 | 11 | 67 | 100 |
    | hand-insert (M) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 100 |
    | handle-press-side (H) | - | - | - | - | - | - | 0 | 11 | 100 | 100 | 33 | 40 |
    | handle-press (H) | - | - | - | - | - | - | 89 | 78 | 60 | 100 | 78 | 35 |
    | handle-pull-side (H) | - | - | - | - | - | - | 56 | 67 | 0 | 56 | 89 | 0 |
    | handle-pull (H) | - | - | - | - | - | - | 89 | 100 | 0 | 78 | 100 | 0 |
    | lever-pull (M) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
    | peg-insert-side (H) | 67 | 33 | 0 | 56 | 56 | 0 | 0 | 22 | 0 | 44 | 33 | 0 |
    | peg-unplug-side (H) | - | - | - | - | - | - | 100 | 100 | 0 | 100 | 100 | 0 |
    | pick-out-of-hole (H) | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 |
    | pick-place (H) | 66 | 100 | 0 | 0 | 0 | 0 | 44 | 11 | 0 | 33 | 11 | 0 |
    | pick-place-wall (H) | - | - | - | - | - | - | 44 | 33 | 0 | 33 | 0 | 10 |
    | plate-slide-back-side (M) | - | - | - | - | - | - | 100 | 89 | 40 | 78 | 89 | 45 |
    | plate-slide-back (M) | - | - | - | - | - | - | 67 | 89 | 100 | 89 | 100 | 58 |
    | plate-slide-side (M) | - | - | - | - | - | - | 100 | 89 | 100 | 55 | 100 | 100 |
    | plate-slide (M) | - | - | - | - | - | - | 33 | 100 | 100 | 78 | 78 | 77 |
    | push-back (E) | - | - | - | - | - | - | 89 | 100 | 0 | 89 | 100 | 71 |
    | push (E) | 100 | 100 | 100 | 78 | 67 | 59 | 44 | 89 | 100 | 78 | 33 | 47 |
    | push-wall (M) | - | - | - | - | - | - | 56 | 33 | 100 | 55 | 44 | 47 |
    | reach (E) | 100 | 100 | 100 | 100 | 100 | 91 | 100 | 100 | 100 | 100 | 100 | 98 |
    | reach-wall (E) | - | - | - | - | - | - | 100 | 100 | 100 | 100 | 100 | 98 |
    | shelf-place (H) | - | - | - | - | - | - | 0 | 0 | 0 | 44 | 55 | 0 |
    | soccer (E) | - | - | - | - | - | - | 67 | 78 | 0 | 55 | 33 | 48 |
    | stick-pull (H) | - | - | - | - | - | - | 11 | 33 | 0 | 11 | 44 | 79 |
    | stick-push (H) | - | - | - | - | - | - | 0 | 0 | 0 | 11 | 0 | 100 |
    | sweep-into (E) | - | - | - | - | - | - | 100 | 78 | 100 | 67 | 89 | 80 |
    | sweep (E) | - | - | - | - | - | - | 100 | 89 | 100 | 100 | 67 | 74 |
    | window-close (H) | 33 | 33 | 100 | 100 | 78 | 100 | 67 | 44 | 100 | 89 | 44 | 100 |
    | window-open (H) | 67 | 100 | 100 | 78 | 89 | 99 | 11 | 67 | 100 | 44 | 78 | 93 |
    | Average success rate | 73.3 | 73.2 | 80.0 | 74.5 | 67.9 | 73.9 | 57.3 | 62.0 | 60.0 | 61.5 | 62.1 | 59.7 |