The flowchart of ncPro-ML


To fully extract information from the benchmark datasets, eight feature representation schemes were used to transform DNA sequences into numerical vectors. To examine the relationship between the length of the training sequences and predictive performance, sequences of eight different lengths, ranging from 81 to 221 nucleotides, were used in this work. To obtain a prediction model that generalizes better, a feature selection process was used to select the optimal feature subset from the candidate feature list of each feature representation scheme. We then trained separate support vector machine (SVM) models on the optimal subsets and integrated them, using each model's five-fold cross-validation accuracy as its weight.
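The integration step can be illustrated with a minimal sketch. It assumes that each SVM model outputs a promoter probability for the same sequence and that the weights are normalized so they sum to one; the function name, the normalization, the 0.5 decision threshold, and the example numbers are illustrative assumptions, not the ncPro-ML implementation.

```python
def weighted_ensemble(probabilities, cv_accuracies):
    """Combine per-model promoter probabilities into one score.

    probabilities: one probability per model, all for the same sequence.
    cv_accuracies: each model's five-fold cross-validation accuracy,
                   used as that model's weight (normalizing is an assumption).
    """
    if len(probabilities) != len(cv_accuracies):
        raise ValueError("need exactly one accuracy per model")
    total = sum(cv_accuracies)
    if total <= 0:
        raise ValueError("accuracies must sum to a positive value")
    # Accuracy-weighted average of the model outputs.
    return sum(p * a for p, a in zip(probabilities, cv_accuracies)) / total


# Hypothetical outputs of three models for a single sequence.
score = weighted_ensemble([0.9, 0.6, 0.3], [0.90, 0.85, 0.80])
# The 0.5 threshold is an assumption made for this illustration.
label = "promoter" if score >= 0.5 else "non-promoter"
```

With this weighting, a model that performed better in cross-validation pulls the combined score further toward its own prediction.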

Construction of ncPro-ML


BKF: binary and k-mer frequency
DBPF: dinucleotide binary profile and frequency
DPCP: dinucleotide physical-chemical properties
TPCP: trinucleotide physical-chemical properties
triEIIP: electron-ion interaction pseudopotentials of trinucleotide
RFHCP: ring-function-hydrogen-chemical properties
PseDNC: pseudo dinucleotide composition
MMI: multivariate mutual information
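As an illustration of how a scheme such as BKF turns a sequence into numbers, the sketch below computes a binary (one-hot) encoding and a k-mer frequency vector. The exact definitions used by ncPro-ML are not given here, so the base order A, C, G, T, the default k = 2, and the handling of ambiguous bases are assumptions.

```python
from itertools import product

# One-hot code per nucleotide; the A, C, G, T order is an assumption.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def binary_encoding(seq):
    """Concatenate the 4-bit one-hot code of every position."""
    vec = []
    for base in seq.upper():
        # Ambiguous bases (e.g. N) are encoded as all zeros.
        vec.extend(ONE_HOT.get(base, (0, 0, 0, 0)))
    return vec


def kmer_frequency(seq, k=2):
    """Frequency of each of the 4**k possible k-mers in the sequence."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    if windows <= 0:
        return [0.0] * len(kmers)
    for i in range(windows):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
    return [counts[m] / windows for m in kmers]


# A combined vector: position-specific bits followed by composition.
features = binary_encoding("ACGTAC") + kmer_frequency("ACGTAC", k=2)
```

The binary part grows with the sequence length (4 values per nucleotide), while the k-mer part always has 4**k values regardless of length.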

A brief introduction to the input page


ncPro-ML provides a flexible and convenient way to input sequences. Users can either paste sequences directly into the text box or upload a file from the local computer. The file should be plain text in raw or FASTA format. ncPro-ML treats all lines between two adjacent lines beginning with > as a single sequence. Users then need to specify the species.
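The way multi-line FASTA records are joined can be sketched as follows. This is a minimal parser written for illustration, not the server's own code; treating headerless (raw) input as a single sequence with no id, and using the first word of the header as the id, are assumptions.

```python
def parse_sequences(text):
    """Return a list of (id, sequence) pairs from raw or FASTA text."""
    records = []
    current_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the record collected so far.
            if current_id is not None or chunks:
                records.append((current_id, "".join(chunks)))
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks = []
        else:
            # Every line up to the next header belongs to one sequence.
            chunks.append(line.upper())
    if current_id is not None or chunks:
        records.append((current_id, "".join(chunks)))
    return records


example = ">seq1 test record\nACGT\nTTGA\n>seq2\nGGCC\n"
records = parse_sequences(example)
```

Here the two lines under seq1 are joined into one sequence, so the parser returns two records rather than three.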



A brief introduction to the result page





The columns of the result table are described below.
Id: the FASTA id of the input sequence.
Location: n-m, the subsequence spanning positions n to m of the input sequence.
Promoter or Not: whether the sequence is predicted to be a promoter.
Prob: the predicted probability that the sequence is a promoter.
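The n-m notation in the Location column can be illustrated with a small sliding-window sketch. The window length of 81, the step of 1, and the 1-based inclusive positions are assumptions made for this example; the settings actually used by the server may differ.

```python
def scan_windows(sequence, window=81, step=1):
    """Yield-style helper: list of (location, subsequence) pairs.

    Locations are written as "n-m" with 1-based, inclusive positions
    (an assumption about how the result page numbers positions).
    """
    results = []
    for start in range(0, len(sequence) - window + 1, step):
        end = start + window
        results.append((f"{start + 1}-{end}", sequence[start:end]))
    return results


# A short toy sequence with a small window, to keep the output readable.
rows = scan_windows("ACGTAC", window=4, step=1)
```

Each pair corresponds to one row of the result table; a sequence shorter than the window produces no rows.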

Users can also click the + in the Id column to view the corresponding sequence.



Users can also use the filter box at the bottom to display only specific entries.




Dataset: