Module Cgr.Seq

module Seq: Cgr_seq
Manipulating DNA sequences (reading, writing, generating from IID or Markov models, ...)

val read_sequence : ?add_inv:bool ->
?start:int -> ?len:int -> Pervasives.in_channel -> Cgr_types.sequence
Read a nucleotide sequence from the given channel.
add_inv : indicate if we must add the reversed inverted sequence after reading; default is false
start : can be used to start from the given position (use only with files)
len : can be used to specify the number of nucleotides to read
val standard_read_sequence : unit -> Cgr_types.sequence
Common way to read a nucleotide sequence from the input channel, using Global.in_channel, Global.add_rev_inv_seq, Global.length, Global.start_pos.
val standard_of_file : string -> Cgr_types.sequence
Like Seq.standard_read_sequence but reads the sequence in the given file.
val of_file : ?add_inv:bool -> ?start:int -> ?len:int -> string -> Cgr_types.sequence
Like Seq.read_sequence but reads the sequence in the given file.
val of_string : ?on_error:[ `Concat | `Fail | `Longest ] -> string -> Cgr_types.sequence
Return a sequence from the given string.
on_error : can be used to indicate what to do when an invalid character is found; default is `Fail.
val to_string : Cgr_types.sequence -> string
Return the given sequence as a string.
val of_source : Cgr_types.sequence_source -> Cgr_types.sequence
Return the sequence from the given sequence source.
val print : Pervasives.out_channel -> Cgr_types.sequence -> unit
print oc seq prints the given sequence to the given channel.
val nucleotide_opt_of_char : char -> Cgr_types.nucleotide option
nucleotide_opt_of_char c returns None if the given character is a blank, or Some n where n is the nucleotide corresponding to the given character.
Raises Failure if c is not a blank and does not correspond to any nucleotide.
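The three outcomes described above (None for a blank, Some n for a nucleotide, Failure otherwise) can be sketched as follows. This is only an illustration: the nucleotide type below is a hypothetical stand-in for Cgr_types.nucleotide, and the set of characters treated as blanks, as well as the acceptance of lower-case letters, are assumptions rather than documented behaviour.

```ocaml
(* Hypothetical stand-in for Cgr_types.nucleotide; the real type may differ. *)
type nucleotide = A | C | G | T

(* Assumed behaviour: blanks give None, nucleotide letters (either case)
   give Some n, and any other character raises Failure. *)
let nucleotide_opt_of_char (c : char) : nucleotide option =
  match c with
  | ' ' | '\t' | '\n' | '\r' -> None
  | 'A' | 'a' -> Some A
  | 'C' | 'c' -> Some C
  | 'G' | 'g' -> Some G
  | 'T' | 't' -> Some T
  | _ -> failwith (Printf.sprintf "invalid nucleotide character %C" c)
```

Returning an option for blanks lets a reader loop skip whitespace without treating it as an error, while genuinely invalid input still stops the read.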
val read_char : ?len:int -> Pervasives.in_channel -> unit -> char
Buffered character input.
val ends_with : Cgr_types.sequence -> Cgr_types.sequence -> bool
ends_with s1 s2 returns true if sequence s1 ends with s2.
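A minimal sketch of the suffix test, assuming a sequence is an array of nucleotides (the types below are hypothetical stand-ins for the ones in Cgr_types, and the library's own implementation may differ):

```ocaml
(* Hypothetical stand-ins for Cgr_types.nucleotide and Cgr_types.sequence. *)
type nucleotide = A | C | G | T
type sequence = nucleotide array

(* s1 ends with s2 when s2 is no longer than s1 and the last
   [Array.length s2] elements of s1 are equal to s2. *)
let ends_with (s1 : sequence) (s2 : sequence) : bool =
  let n1 = Array.length s1 and n2 = Array.length s2 in
  n2 <= n1 && Array.sub s1 (n1 - n2) n2 = s2
```

With this definition the empty sequence is a suffix of every sequence; whether Seq.ends_with treats the empty case the same way is not stated in the documentation.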
val isolate_sequences : Pervasives.in_channel -> string -> ?number:int -> int -> int -> unit
isolate_sequences ic file_prefix minlen maxlen tries to read a sequence from the given in_channel, and isolate sequences of minimal length minlen and maximal length maxlen. Each sequence is printed to a [file_prefix]-n.seq file, where n grows from 0.
number : can be given to indicate a maximum number of files to create.
val isolate_sequences_gen : Pervasives.in_channel ->
(Cgr_types.sequence -> unit) -> ?number:int -> int -> int -> unit
isolate_sequences_gen ic f_seq minlen maxlen does the same as Seq.isolate_sequences but calls f_seq on each sequence found instead of printing it to a file.
val read_length : Pervasives.in_channel -> int
read_length ic computes the length of the sequence read from the given in_channel. It counts only the valid nucleotides.
val invert : Cgr_types.sequence -> Cgr_types.sequence
Return the inverted sequence from the given one.
val add_reversed_inverted : Cgr_types.sequence -> Cgr_types.sequence
Add reversed inverted sequence to the given one.
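The sketch below illustrates one plausible reading of these two functions. It assumes that "inverted" means the nucleotide complement (A with T, C with G) and that the reversed inverted sequence is appended after the original; neither point is confirmed by the documentation, and the types are hypothetical stand-ins for those in Cgr_types.

```ocaml
(* Hypothetical stand-ins for Cgr_types.nucleotide and Cgr_types.sequence. *)
type nucleotide = A | C | G | T
type sequence = nucleotide array

(* Assumption: inverting a nucleotide means taking its complement. *)
let complement = function A -> T | T -> A | C -> G | G -> C

(* Complement every nucleotide, keeping the order. *)
let invert (s : sequence) : sequence = Array.map complement s

(* Append the reverse complement of s to s itself. *)
let add_reversed_inverted (s : sequence) : sequence =
  let n = Array.length s in
  let rev_inv = Array.init n (fun i -> complement s.(n - 1 - i)) in
  Array.append s rev_inv
```

Under these assumptions, the sequence AAC becomes AACGTT, twice the original length.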
val test_sequences_files : string list -> bool
test_sequences_files files returns true if all files contain valid sequences, or else false.
val count : Cgr_types.sequence -> int array
count seq returns an array with the number of occurrences of each nucleotide.
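As an illustration, counting can be done in a single pass over the sequence. The index order A, C, G, T used below is an assumption; the documentation does not say which array cell corresponds to which nucleotide, and the nucleotide type is a hypothetical stand-in for Cgr_types.nucleotide.

```ocaml
(* Hypothetical stand-in for Cgr_types.nucleotide. *)
type nucleotide = A | C | G | T

(* Assumed cell order of the result array: A, C, G, T. *)
let index = function A -> 0 | C -> 1 | G -> 2 | T -> 3

(* Return a 4-cell array holding the number of occurrences of each nucleotide. *)
let count (seq : nucleotide array) : int array =
  let counts = Array.make 4 0 in
  Array.iter (fun n -> let i = index n in counts.(i) <- counts.(i) + 1) seq;
  counts
```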
module IID: sig .. end
IID sequences.
module Markov: sig .. end
Markov sequences.
module Auto_regressive: 
functor (P : Cgr_types.Cenac_param) -> sig .. end
Auto-regressive sequences and Durbin-Watson test.
val init_cgr_next_method : ?seq:Cgr_types.sequence -> unit -> unit
Initialize Global.cgr_next_method from its current value (add the missing information).
Raises Failure if the needed missing information is not given (i.e. no seq for Cgr_types.Cgr_next_given_weight).
val binary_of_seq : Pervasives.out_channel -> Cgr_types.sequence -> unit
binary_of_seq out seq outputs to the given channel the binary representation of the given seq (except the last nucleotides if the length of the sequence is not a multiple of 4).
val seq_of_binary : Pervasives.in_channel -> Cgr_types.sequence
seq_of_binary ic reads characters from the given channel and determines from each character the 4 nucleotides it contains (each nucleotide is coded on two bits).
Raises Invalid_argument if the sequence is too long to be stored in an array.
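The two-bits-per-nucleotide encoding can be sketched on in-memory strings as follows. The functions pack and unpack are hypothetical names, not part of the module; the bit codes (A=0, C=1, G=2, T=3) and the placement of the first nucleotide in the high-order bits are assumptions, since the documentation only states that each character holds 4 nucleotides of two bits each.

```ocaml
(* Hypothetical stand-in for Cgr_types.nucleotide. *)
type nucleotide = A | C | G | T

(* Assumed 2-bit codes; the library may use a different assignment. *)
let code = function A -> 0 | C -> 1 | G -> 2 | T -> 3
let decode = function 0 -> A | 1 -> C | 2 -> G | _ -> T

(* Pack each group of 4 nucleotides into one character, first nucleotide
   in the high-order bits. Trailing nucleotides that do not fill a whole
   group are dropped, as described for binary_of_seq. *)
let pack (seq : nucleotide array) : string =
  let nbytes = Array.length seq / 4 in
  String.init nbytes (fun b ->
    let byte = ref 0 in
    for k = 0 to 3 do
      byte := (!byte lsl 2) lor code seq.(4 * b + k)
    done;
    Char.chr !byte)

(* Recover 4 nucleotides from every character of the packed string. *)
let unpack (s : string) : nucleotide array =
  Array.init (4 * String.length s) (fun i ->
    let byte = Char.code s.[i / 4] in
    let shift = 2 * (3 - i mod 4) in
    decode ((byte lsr shift) land 3))
```

In this sketch Array.init raises Invalid_argument when the requested size exceeds Sys.max_array_length, which is the same kind of failure described above for sequences too long to be stored in an array.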