Gibbs sampling (also called alternating conditional sampling) is a Markov Chain Monte Carlo algorithm for high-dimensional data. GibbsSampler is a motif finding algorithm that finds one common motif and returns a list of bestMotifs containing the closest motif match from each string in dna.
Given: Integers k, t, and N, followed by a collection of strings Dna.
Return: The strings BestMotifs resulting from running GibbsSampler(Dna, k, t, N) with 20 random starts.
We have to code Gibbs Sampler for the purpose of motif discovery. Here,
Dna -- A collection of DNA strings that are of the same length.
"t" -- Is an integer indicating how many times to read the genetic algorithm.
"k" -- An integer indicating the motif length being searched for.
"N" -- The number of iterations before returning the best motif.
GibbsSampler (Dna, k , t , N)
randomly select k−Mers Motifs = ( Motif 1 , . . . , Motift ) in each string from Dna
BestMotifs ←Motifs
for j ←1 to N
i ← Random ( t )
Profile ←profile matrix constructed from all strings in Motifs except for Motifi
Motifi ←Profile − randomly generated k-Mer in the i-th sequence
if Score ( Motifs ) < Score ( BestMotifs )
BestMotifs ←Motifs
return BestMotifs