cluster
A cluster is an ordered set of hits related to a model which satisfy the model distance constraints.
cluster API reference
cluster
- class macsypy.cluster.Cluster(hits: list[CoreHit | ModelHit], model, hit_weights)
Handle a group of collocated hits related to a model.
- __contains__(m_hit: ModelHit) -> bool
- Parameters:
m_hit – The hit to test
- Returns:
True if the hit is in the cluster hits, False otherwise
- __init__(hits: list[CoreHit | ModelHit], model, hit_weights) -> None
- Parameters:
hits – the hits constituting this cluster
model – the model associated with this cluster
hit_weights – the weights of the hits, used to compute the score
- __weakref__
list of weak references to the object (if defined)
- _check_replicon_consistency() -> None
- Raises:
MacsypyError – if the hits of the cluster are not all related to the same replicon
- fulfilled_function(*genes: ModelGene | str) -> frozenset[str]
- Parameters:
genes – The genes which must be tested.
- Returns:
the common functions between genes and this cluster.
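As a rough illustration of the return value, the sketch below intersects a set of cluster functions with the names of the genes being tested. It works on plain strings with a hypothetical helper, common_functions, rather than the macsypy classes; how ModelGene objects are mapped to function names is an assumption not covered here.

```python
def common_functions(cluster_functions: frozenset[str], *genes: str) -> frozenset[str]:
    """Hypothetical stand-in: functions shared by the cluster and the tested genes."""
    return cluster_functions & frozenset(genes)


# Functions encoded by an imaginary cluster.
cluster_functions = frozenset({"a", "b"})

# Only 'b' is both tested and present in the cluster.
shared = common_functions(cluster_functions, "b", "c")
print(sorted(shared))  # ['b']
```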
- property functions: frozenset[str]
- Returns:
The set of functions encoded by this cluster. Here, function means the gene name or, for exchangeable genes, the name of the reference gene. For instance, given the model:
<model vers="2.0">
    <gene a presence="mandatory"/>
    <gene b presence="accessory">
        <exchangeable>
            <gene c />
        </exchangeable>
    </gene>
</model>
the functions for a cluster corresponding to this model will be {'a', 'b'}.
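The mapping from genes to functions can be sketched with toy objects. ToyGene and cluster_functions below are hypothetical names used only for illustration, not part of macsypy; the sketch assumes an exchangeable gene simply carries the name of its reference gene.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyGene:
    """Hypothetical gene: `reference` is set only for exchangeable genes."""
    name: str
    reference: Optional[str] = None

    @property
    def function(self) -> str:
        # An exchangeable gene fulfils the function of its reference gene.
        return self.reference if self.reference is not None else self.name


def cluster_functions(genes: list[ToyGene]) -> frozenset[str]:
    """Collect the distinct functions encoded by the genes of a cluster."""
    return frozenset(g.function for g in genes)


# Hits on 'a' and on 'c', where 'c' is an exchangeable of 'b'.
genes = [ToyGene("a"), ToyGene("c", reference="b")]
funcs = cluster_functions(genes)
print(sorted(funcs))  # ['a', 'b']
```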
- property hit_weights: HitWeight
- Returns:
the weights assigned to the hits, used to compute the score
- property loner: bool
- Returns:
True if this cluster is made only of hits representing the same gene and this gene is tagged as loner. False otherwise, that is, if the cluster:
contains several hits coding for different genes
contains one hit but the gene is not tagged as loner (max_gene_required = 1)
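The rule above can be sketched as follows. ToyHit and is_loner_cluster are hypothetical stand-ins, not the macsypy implementation, and returning False for an empty list of hits is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyHit:
    """Hypothetical hit: the gene it matches and whether that gene is a loner."""
    gene_name: str
    is_loner: bool


def is_loner_cluster(hits: list[ToyHit]) -> bool:
    """True if all hits represent the same gene and that gene is tagged as loner."""
    if not hits:
        return False  # assumption: an empty cluster is not a loner
    same_gene = len({h.gene_name for h in hits}) == 1
    return same_gene and hits[0].is_loner


print(is_loner_cluster([ToyHit("a", True)]))                       # True
print(is_loner_cluster([ToyHit("a", True), ToyHit("b", False)]))   # False
print(is_loner_cluster([ToyHit("a", False)]))                      # False
```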
- merge(cluster: Cluster, before: bool = False) -> None
Merge the given cluster into this one (in place).
- Parameters:
cluster – the cluster to merge into this one
before (bool) – If False, the hits of cluster are appended after the hits of this one; otherwise they are inserted before the hits of this one.
- Raises:
MacsypyError – if the two clusters do not have the same model
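The effect of the before flag can be illustrated with a minimal toy class. ToyCluster is hypothetical: hits and models are plain strings, and ValueError stands in for MacsypyError.

```python
class ToyCluster:
    """Hypothetical cluster holding an ordered list of hit names."""

    def __init__(self, hits: list[str], model: str) -> None:
        self.hits = list(hits)
        self.model = model

    def merge(self, other: "ToyCluster", before: bool = False) -> None:
        """Merge the hits of `other` into this cluster, in place."""
        if other.model != self.model:
            # Stand-in for MacsypyError in this sketch.
            raise ValueError("cannot merge clusters built from different models")
        if before:
            self.hits[:0] = other.hits   # insert before the existing hits
        else:
            self.hits.extend(other.hits)  # append after the existing hits


c1 = ToyCluster(["h3", "h4"], "model_A")
c1.merge(ToyCluster(["h5"], "model_A"))
c1.merge(ToyCluster(["h1", "h2"], "model_A"), before=True)
print(c1.hits)  # ['h1', 'h2', 'h3', 'h4', 'h5']
```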
- property multi_system: bool
- Returns:
True if this cluster is made of only one hit representing a multi_system gene. False otherwise, that is, if the cluster:
contains several hits
contains one hit but the gene is not tagged as multi_system
- replace(old: ModelHit, new: ModelHit) -> None
Replace the hit old with new in this cluster (in place).
- Parameters:
old – the hit to replace
new – the new hit
- Returns:
None
- property replicon_name: str
- Returns:
The name of the replicon where this cluster is located
- Return type:
str
- property score: float
- Returns:
The score for this cluster
build_clusters
- macsypy.cluster.build_clusters(hits: list[ModelHit], rep_info: RepliconInfo, model: Model, hit_weights: HitWeight) -> tuple[list[macsypy.cluster.Cluster], dict[str, macsypy.hit.Loner | macsypy.hit.LonerMultiSystem]]
From a list of filtered hits and the replicon information (topology, length), build all the lists of hits that satisfy the constraints:
max_gene_inter_space
loner
multi_system
A cluster is created for each list of hits that satisfies them. A cluster contains at least two hits separated by at most max_gene_inter_space, except for loner genes, which are allowed to be alone in a cluster.
- Parameters:
hits – list of filtered hits
rep_info – the replicon to analyse
model – the model to study
hit_weights – the hit weight needed to compute the cluster score
- Returns:
the list of regular clusters and the special clusters (loners not in a cluster, and multi systems)
- Return type:
tuple with 2 elements:
true_clusters: a list of Cluster objects
true_loners: a dict {str function: macsypy.hit.Loner | macsypy.hit.LonerMultiSystem object}
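The grouping idea behind build_clusters can be sketched on toy data. This is not the macsypy implementation: ToyHit and toy_build_clusters are hypothetical, and the sketch ignores replicon topology, multi_system genes and hit weights. It also assumes that the spacing between two hits is the number of genes lying strictly between them, and that the first isolated loner found for a function is the one kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyHit:
    """Hypothetical hit: gene name, rank along the replicon, loner flag."""
    gene_name: str
    position: int
    is_loner: bool = False


def toy_build_clusters(
    hits: list[ToyHit], max_gene_inter_space: int
) -> tuple[list[list[ToyHit]], dict[str, ToyHit]]:
    """Group neighbouring hits; keep isolated loners in a separate dict."""
    clusters: list[list[ToyHit]] = []
    loners: dict[str, ToyHit] = {}
    group: list[ToyHit] = []

    def close_group() -> None:
        if len(group) >= 2:
            clusters.append(list(group))          # regular cluster
        elif len(group) == 1 and group[0].is_loner:
            # assumption: keep the first isolated loner seen per function
            loners.setdefault(group[0].gene_name, group[0])
        group.clear()                             # lone non-loner hits are dropped

    for hit in sorted(hits, key=lambda h: h.position):
        # assumption: spacing = number of genes strictly between the two hits
        if group and hit.position - group[-1].position - 1 > max_gene_inter_space:
            close_group()
        group.append(hit)
    close_group()
    return clusters, loners


hits = [
    ToyHit("a", 10),
    ToyHit("b", 12),
    ToyHit("c", 40, is_loner=True),
    ToyHit("d", 80),
]
true_clusters, true_loners = toy_build_clusters(hits, max_gene_inter_space=5)
print([[h.gene_name for h in c] for c in true_clusters])  # [['a', 'b']]
print(sorted(true_loners))                                # ['c']
```

In this toy run, 'a' and 'b' are close enough to form a cluster, 'c' is kept as an isolated loner, and 'd' is discarded because it is alone and not a loner.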