DETAIL DOCUMENT
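Combination of the Tomek Links and Random Undersampling Methods for Single Nucleotide Polymorphism Identification Using an Artificial Neural Network on the Soybean Genome (original title: Kombinasi Metode Tomek-Links dan Random Undersampling untuk Identifikasi Single Nucleotide Polymorphism Menggunakan Artificial Neural Network pada Genom Kedelai)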
Institution
Universitas Sumatera Utara
Author
Pulungan, Aflah Mutsanni (STUDENT ID : 171402012)
(LECTURER ID : 0031087905)
(LECTURER ID : 0001078708)
Subject
Next Generation Sequencing 
Datestamp
2022-12-19 03:04:29 
Abstract :
Next Generation Sequencing (NGS) is a sequencing technology that can read Single Nucleotide Polymorphisms (SNPs) in a genome, including the soybean genome used in this study. However, NGS machines have a high error rate, so more SNP candidates arise from read errors than from actual SNPs. The data generated by NGS are therefore imbalanced: negative SNPs outnumber positive SNPs. To address this imbalance, the study uses Tomek Links and Random Undersampling, which aim to remove noisy data and form a new dataset. SNP identification is then performed with an Artificial Neural Network, a method that can classify large amounts of data. The resulting model uses the following Artificial Neural Network hyperparameters: 10 epochs, a Log Softmax activation function, and a batch size of 64. Random Undersampling uses a sampling strategy (balance ratio) of 0.4. The evaluation yields a G-Mean of 93, from which it is concluded that the Random Undersampling and Artificial Neural Network methods used in this study identify SNPs well.
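The resampling and evaluation steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation. It assumes that only the majority-class (negative) member of each Tomek link is removed, that the balance ratio of 0.4 means the number of minority samples divided by the number of majority samples after resampling, and that G-Mean is the geometric mean of sensitivity and specificity, returned here on a 0 to 1 scale. The function names and the label encoding (0 = negative, 1 = positive) are hypothetical.

```python
import math
import random


def tomek_link_majority_indices(X, y, majority=0):
    """Return indices of majority samples that are part of a Tomek link.

    Two samples form a Tomek link when they carry different labels and
    each one is the other's nearest neighbour (Euclidean distance).
    Assumption: only the majority member of a link is a removal candidate.
    """
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return {i for i in range(len(X))
            if y[i] == majority and y[nn[i]] != majority and nn[nn[i]] == i}


def random_undersample(X, y, ratio=0.4, majority=0, seed=0):
    """Randomly drop majority samples until n_minority / n_majority == ratio.

    Assumption: `ratio` follows the minority-over-majority convention.
    """
    rng = random.Random(seed)
    maj = [i for i, label in enumerate(y) if label == majority]
    mino = [i for i, label in enumerate(y) if label != majority]
    n_keep = min(len(maj), round(len(mino) / ratio))
    keep = sorted(mino + rng.sample(maj, n_keep))
    return [X[i] for i in keep], [y[i] for i in keep]


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

A plausible pipeline would drop the indices returned by `tomek_link_majority_indices`, pass the remaining samples to `random_undersample`, train the classifier on the result, and score its predictions on held-out data with `g_mean`. The brute-force neighbour search is quadratic in the number of samples, so genome-scale data would need an indexed nearest-neighbour search instead.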
