I am trying to solve an exercise that asks for a function with a start and a stop, where the start should occur at the first "ATG" and the stop at "TAA", "TAG", or "TGA".
With the help of the comments, I was able to adjust the code a little, but the stop is limited to whatever I put in the s.find(). I’d like help getting it to look for any one of the three stop codons proposed in the exercise.
Here is the statement of the exercise that prompted the question:
Write a function contacodon in Python that receives a sequence of letters representing nucleotides (e.g., A, C, G, T), checks whether the sequence is valid (i.e., contains only As, Cs, Gs, and Ts), and returns the number of occurrences of each codon, counting from the start codon "ATG" up to one of the stop codons "TAA", "TAG", "TGA".
Input example:
contacodon("AGCGATCGAGATGAGCATCGCATCGCGGACTACCGCGCGCGCGCGCGGGAGATGAGCATCGACGACTCGACTAG")
Output for the input above:
{
'ATG': 1,
'AGC': 1,
'ATC': 1,
'GCA': 1,
'TCG': 1,
'CGG': 1,
'ACT': 1,
'ACC': 1,
'GCG': 2,
'CGC': 2,
'GGG': 1,
'AGA': 1,
'TGA': 1
}
My code:
def verificar(s):
    s = s.upper()
    for ent in s:
        if ent not in "ACGT":
            return False
    return True

while True:
    s = str(input("Enter the sequence: \n")).upper()
    if verificar(s):
        break
    print("Invalid sequence")

count = {}
for i in range(s.find('ATG'), s.rfind('TAA') + 1, 3):
    codon = s[i:i+3]
    if codon in count:
        count[codon] += 1
    else:
        count[codon] = 1
print('\n', count, '\n')
Do you have the full statement of the exercise? The explanation is vague.
– Augusto Vasques
The example sequence in the statement does not have a length that is a multiple of three. Is that right? If so, the approach you took (walking through the string three characters at a time) will not work.
– Luiz Felipe
It is a deliberate error; the sequences that will be tested will have lengths that are multiples of 3. The one in the statement is just an example of the input and of what has to come out, from the start "ATG" to the stop at "TAA", "TAG", or "TGA".
– Marcelo Bueno
I’d like to help, but there are a lot of errors in your code; I’m not going to rewrite it from scratch, and I’d have to point them all out.
– Augusto Vasques
If it’s intentional, Marcelo, then, as I said, the approach of walking through the string three by three with range is completely invalid. You’ll probably have to think of something else (and in that case, we won’t do it for you here, as that’s not the purpose of the site).
– Luiz Felipe
OK! Thank you anyway for your attention; I will think of some other solution for the exercise.
– Marcelo Bueno
Use str.find() to locate the index of the first occurrence of ATG.
– Augusto Vasques
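A minimal sketch of that suggestion, added for illustration and not part of the original comment: str.find() returns the index of the first match, or -1 when there is none, so the -1 case is worth checking before the result is used as a range start. The helper name first_start is hypothetical.

```python
def first_start(seq, start_codon="ATG"):
    """Return the index of the first start codon, or None when there is none."""
    index = seq.find(start_codon)  # str.find returns -1 instead of raising
    return None if index == -1 else index


print(first_start("AGCGATCGAGATGAGCATCG"))  # 10
print(first_start("CCCGGG"))                # None
```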
Thanks, Augusto Vasques. I will read the documentation and redo the code.
– Marcelo Bueno
Your logic is almost correct; only, instead of starting your range at zero, you should start from the index of the first ATG codon (which you can find using the .find() method of your input string, as @Augustovasques commented). One more thing: in range there is no need to calculate the exact end of the sequence with len(s)-len(s)%3, since you take 3 characters with slices, and slices never raise IndexError.
– jfaccioni
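To illustrate the point about slices (a small sketch with a made-up sequence, not code from the thread): a slice that runs past the end of the string simply comes back shorter, so the loop can run up to len(s) without computing an exact end.

```python
s = "CCATGAAATG"            # made-up sequence, 10 characters
start = s.find("ATG")       # 2

# Step three characters at a time from the first ATG.
codons = [s[i:i + 3] for i in range(start, len(s), 3)]

print(codons)    # ['ATG', 'AAA', 'TG']
print(s[8:11])   # 'TG' -- the slice is cut short, no IndexError
```

The trailing piece can be shorter than three characters, so it may need to be skipped when counting codons.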
@Jfaccioni he can take a slice s[s.find("ATG"):] and split it into three-character portions with this function: https://answall.com/a/496160/137387
– Augusto Vasques
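A rough sketch of that idea follows. The generator chunks is a hypothetical stand-in written for this example; it is not necessarily the function from the linked answer.

```python
def chunks(seq, size=3):
    """Yield consecutive pieces of seq, each `size` long (the last may be shorter)."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]


s = "GGATGAGCTAGCC"          # made-up sequence
tail = s[s.find("ATG"):]     # everything from the first ATG onwards
pieces = list(chunks(tail))

print(pieces)  # ['ATG', 'AGC', 'TAG', 'CC']
```

One caveat: when "ATG" is absent, s.find() returns -1 and s[-1:] is the last character of the string, so that case should be handled before slicing.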
With the tips given, I was able to tweak the code a little using this loop: for i in range(s.find('ATG'),s.find('TAA')+1,3): which prints the desired range. However, the stop can occur in three cases: 'TAA', 'TAG', and 'TGA'. I would like some guidance on how to handle that, either here or as a pointer to the documentation.
– Marcelo Bueno
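One possible way to stop at any of the three codons, sketched under a few assumptions: rather than locating the stop with s.find(), walk the reading frame from the first "ATG" and break as soon as the current codon is in a set of stop codons. Raising ValueError for an invalid sequence and returning an empty dictionary when there is no "ATG" are choices made for this sketch; the statement does not specify either.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def contacodon(seq):
    """Count codons from the first ATG up to and including the first in-frame stop codon."""
    seq = seq.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("invalid sequence")  # assumption: statement leaves this open

    counts = {}
    start = seq.find("ATG")
    if start == -1:
        return counts  # assumption: no start codon means nothing to count

    # len(seq) - 2 keeps only full three-letter codons in the loop.
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        counts[codon] = counts.get(codon, 0) + 1
        if codon in STOP_CODONS:
            break  # the stop codon is counted, then the walk ends
    return counts


print(contacodon("CCATGAAATAGGGG"))  # {'ATG': 1, 'AAA': 1, 'TAG': 1}
```

Because the membership test happens inside the frame walk, a stop codon that appears out of frame (straddling two codons) is ignored, which s.find('TAA') cannot guarantee.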
Luiz Felipe, I reformulated the question.
– Marcelo Bueno