Mid Sweden University

An Improved DE Algorithm to Optimise the Learning Process of a BERT-based Plagiarism Detection Model
Hakim Sabzevari University, Sabzevar, Iran. ORCID iD: 0000-0001-8661-7578
2022 (English). In: 2022 IEEE Congress on Evolutionary Computation (CEC), Institute of Electrical and Electronics Engineers (IEEE), 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Plagiarism detection is a challenging task, aiming to identify similar items in two documents. In this paper, we present a novel approach to automatic plagiarism detection that combines BERT (bidirectional encoder representations from transformers) word embedding, attention mechanism-based long short-term memory (LSTM) networks, and an improved differential evolution (DE) algorithm for weight initialisation. BERT is used to pre-train deep bidirectional representations in all layers, while the pre-trained BERT model can be fine-tuned with only one extra output layer without significant changes in architecture. Deep learning algorithms often use the random weighting method for initialisation, followed by gradient-based optimisation algorithms such as back-propagation for training, making them susceptible to getting trapped in local optima. To address this, population-based metaheuristic algorithms such as DE can be used. We propose an improved DE algorithm with a clustering-based mutation operator, where first a winning cluster of candidate solutions is identified and a new updating strategy is then applied to include new candidate solutions in the current population. The proposed DE algorithm is used in LSTM, attention mechanism, and feed-forward neural networks to yield the initial seeds for subsequent gradient-based optimisation. We compare our proposed model with conventional and population-based approaches on three datasets (SNLI, MSRP and SemEval2014) and demonstrate it to give superior plagiarism detection performance.
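
The clustering-based mutation and population update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering method, the criterion for the winning cluster, or the update rule, so the choices below (k-means, lowest mean objective value, merge-and-keep-best) are assumptions. The names `sphere`, `kmeans` and `clustering_de`, the parameters `f`, `cr` and `k`, and the toy objective standing in for a network's training loss are likewise hypothetical.

```python
import random


def sphere(x):
    """Toy objective standing in for a network's training loss (assumption)."""
    return sum(v * v for v in x)


def kmeans(points, k, rng, iters=10):
    """Plain k-means; returns the non-empty clusters as lists of point indices."""
    dim = len(points[0])
    centres = [list(p) for p in rng.sample(points, k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
            clusters[dists.index(min(dists))].append(i)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(points[i][d] for i in members) / len(members)
                              for d in range(dim)]
    return [c for c in clusters if c]


def clustering_de(objective, dim, pop_size=20, gens=50, f=0.5, cr=0.9, k=3, seed=0):
    """DE with an assumed clustering-based mutation; returns (best_vector, best_value)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    for _ in range(gens):
        # Standard DE/rand/1/bin step with greedy one-to-one selection.
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)
            trial = [pop[a][d] + f * (pop[b][d] - pop[c][d])
                     if (rng.random() < cr or d == j_rand) else pop[i][d]
                     for d in range(dim)]
            t_fit = objective(trial)
            if t_fit <= fit[i]:
                pop[i], fit[i] = trial, t_fit
        # Clustering-based mutation: the winning cluster is assumed to be the
        # one with the lowest mean objective value; its best member is the base.
        clusters = kmeans(pop, k, rng)
        winner = min(clusters, key=lambda c: sum(fit[i] for i in c) / len(c))
        base = pop[min(winner, key=lambda i: fit[i])]
        new = []
        for _ in range(len(winner)):
            r1, r2 = rng.sample(range(pop_size), 2)
            new.append([base[d] + f * (pop[r1][d] - pop[r2][d]) for d in range(dim)])
        # Assumed update strategy: merge old and new candidates, keep the best.
        scored = list(zip(fit + [objective(x) for x in new], pop + new))
        scored.sort(key=lambda t: t[0])
        fit = [s[0] for s in scored[:pop_size]]
        pop = [s[1] for s in scored[:pop_size]]
    best = min(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]
```

Calling `clustering_de(sphere, dim=5)` returns a candidate vector and its objective value. In the setting the abstract describes, such a vector would be reshaped into the initial weights of the LSTM, attention, or feed-forward layers before gradient-based training takes over, with the training loss replacing `sphere`.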

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022.
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:miun:diva-52366
DOI: 10.1109/CEC55065.2022.9870280
Scopus ID: 2-s2.0-85138753376
OAI: oai:DiVA.org:miun-52366
DiVA, id: diva2:1894869
Conference
IEEE Congress on Evolutionary Computation (CEC2022)
Available from: 2024-09-04. Created: 2024-09-04. Last updated: 2024-09-05. Bibliographically approved.


Authority records

Seyed Jalaleddin, Mousavirad
