dc.contributor.author | Imtiaz, Muhammad Babar | |
dc.contributor.author | Qiao, Yuansong | |
dc.contributor.author | Lee, Brian | |
dc.date.accessioned | 2023-04-25T10:58:36Z | |
dc.date.available | 2023-04-25T10:58:36Z | |
dc.date.copyright | 2023 | |
dc.date.issued | 2023-01-29 | |
dc.identifier.citation | Imtiaz, M.B., Qiao, Y., Lee, B. (2023). Prehensile and non-prehensile robotic pick-and-place of objects in clutter using deep reinforcement learning. Sensors, 23, 1513. https://doi.org/10.3390/s23031513 | en_US |
dc.identifier.uri | https://research.thea.ie/handle/20.500.12065/4490 | |
dc.description.abstract | In this study, we develop a framework for an intelligent and self-supervised industrial pick-and-place operation in cluttered environments. Our target is to have the agent learn to perform prehensile and non-prehensile robotic manipulations to improve the efficiency and throughput of the pick-and-place task. To achieve this target, we specify the problem as a Markov decision process (MDP) and deploy a deep reinforcement learning (RL) temporal-difference model-free algorithm known as the deep Q-network (DQN). We consider three actions in our MDP: 'grasping' from the prehensile manipulation category, and 'left-slide' and 'right-slide' from the non-prehensile manipulation category. Our DQN is composed of three fully convolutional networks (FCNs) based on the memory-efficient architecture of DenseNet-121, which are trained together without causing any bottleneck situations. Each FCN corresponds to one discrete action and outputs a pixel-wise map of affordances for that action. Rewards are allocated after every forward pass, and backpropagation is carried out for weight tuning in the corresponding FCN. In this manner, non-prehensile manipulations are learnt which can, in turn, lead to successful prehensile manipulations in the near future, and vice versa, thus increasing the efficiency and throughput of the pick-and-place task. The Results section shows performance comparisons of our approach against a baseline deep learning approach and a ResNet architecture-based approach, along with very promising test results at varying clutter densities across a range of complex scenario test cases. | en_US |
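The architecture described in the abstract (three DenseNet-121-based FCNs, one per discrete action, each producing a pixel-wise affordance map from which the greedy action is chosen) can be pictured with a short sketch. This is a minimal illustration assuming a PyTorch implementation; the class names (AffordanceFCN, PickPlaceDQN), input resolution, and action encoding are hypothetical and not taken from the paper.

```python
# Illustrative sketch, not the authors' code: three DenseNet-121 FCN branches,
# one per action ('grasp', 'left-slide', 'right-slide'); each emits a
# pixel-wise affordance (Q-value) map, and the greedy action is the
# (action, pixel) pair with the highest value.
import torch
import torch.nn as nn
from torchvision.models import densenet121


class AffordanceFCN(nn.Module):
    """One fully convolutional branch: DenseNet-121 trunk plus a 1x1 head,
    upsampled back to the input resolution to give a per-pixel score."""

    def __init__(self):
        super().__init__()
        self.trunk = densenet121(weights=None).features  # conv feature extractor
        self.head = nn.Sequential(
            nn.ReLU(inplace=True),
            nn.Conv2d(1024, 1, kernel_size=1),  # 1-channel affordance map
        )

    def forward(self, x):
        h = self.head(self.trunk(x))
        # Upsample the coarse map to match the input resolution.
        return nn.functional.interpolate(
            h, size=x.shape[-2:], mode="bilinear", align_corners=False
        )


class PickPlaceDQN(nn.Module):
    """Three parallel FCN branches, one per discrete action in the MDP."""

    ACTIONS = ("grasp", "left-slide", "right-slide")

    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleList(AffordanceFCN() for _ in self.ACTIONS)

    def forward(self, x):
        # Concatenate per-action maps: (batch, num_actions, H, W).
        return torch.cat([b(x) for b in self.branches], dim=1)


# Usage: greedy action selection over all (action, pixel) pairs.
model = PickPlaceDQN().eval()
obs = torch.randn(1, 3, 224, 224)  # a stand-in for the scene observation
with torch.no_grad():
    q_maps = model(obs)  # shape (1, 3, 224, 224)
flat = q_maps.view(-1).argmax().item()
a, rest = divmod(flat, 224 * 224)
py, px = divmod(rest, 224)
print(f"best action: {PickPlaceDQN.ACTIONS[a]} at pixel ({py}, {px})")
```

In a DQN training loop along the lines the abstract describes, the reward observed after executing the selected primitive would drive a temporal-difference update that backpropagates only through the branch corresponding to the executed action.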
dc.format | PDF | en_US |
dc.language.iso | eng | en_US |
dc.publisher | MDPI | en_US |
dc.relation.ispartof | Sensors | en_US |
dc.rights | Attribution 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/us/ | * |
dc.subject | Prehensile | en_US |
dc.subject | Non-prehensile | en_US |
dc.subject | Robotic manipulation | en_US |
dc.subject | Markov decision process | en_US |
dc.subject | Deep reinforcement learning | en_US |
dc.subject | Deep Q-network | en_US |
dc.subject | Fully convolutional network | en_US |
dc.subject | DenseNet-121 | en_US |
dc.title | Prehensile and non-prehensile robotic pick-and-place of objects in clutter using deep reinforcement learning | en_US |
dc.type | info:eu-repo/semantics/article | en_US |
dc.contributor.affiliation | Technological University of the Shannon: Midlands Midwest | en_US |
dc.contributor.sponsor | Science Foundation Ireland (SFI) | en_US |
dc.description.peerreview | yes | en_US |
dc.identifier.doi | 10.3390/s23031513 | en_US |
dc.identifier.eissn | 1424-8220 | |
dc.identifier.orcid | https://orcid.org/0000-0003-4775-9033 | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-1543-1589 | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-8475-4074 | en_US |
dc.rights.accessrights | info:eu-repo/semantics/openAccess | en_US |
dc.subject.department | Software Research Institute: TUS Midlands | en_US |
dc.type.version | info:eu-repo/semantics/publishedVersion | en_US |
dc.relation.projectid | SFI 16/RC/3918 | en_US |