Convert string into a list where each element in the list is 3 characters from my initial string

1

I would like to know how to turn a string into a list in which each element is a 3-character piece of the original string. For example, if the original string is "AAGGGTTGG", I want a list like ["AAG","GGT","TGG"].

I tried the following:

seq = "AAGGGTTGG"
i = 0
codon_lista = []
for i in range(len(seq)):
    codon = seq[i:i+3]
    codon_lista.append(codon)
    i = i + 3

However the result of this is something like:

["AAG", "AGG", "GGG", "GGT", "GTT", "TTG", "TGG", "GG", "G"]

3 answers

3


The statement i = i + 3 inside the for loop has no effect, because at the start of each iteration i is reassigned to the next value produced by range. Example:

for i in range(5):
    print(i)
    i = 100

This code prints the numbers 0 to 4. Even though I set i to 100, on the next iteration it is overwritten with the next value from range. So adding 3 to i inside the loop accomplishes nothing.
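To make the overwriting visible, here is a rough sketch of what the for loop does behind the scenes, written out with iter() and next(). The names iterator and seen are only illustrative, and this is a simplified picture of the loop, not its exact implementation.

```python
# Roughly what "for i in range(5):" does behind the scenes.
iterator = iter(range(5))
seen = []
while True:
    try:
        i = next(iterator)  # i is rebound here on every pass
    except StopIteration:
        break
    seen.append(i)
    i = 100  # discarded: the next call to next() does not look at i

print(seen)  # [0, 1, 2, 3, 4]
```

The iterator keeps its own position, so nothing assigned to i in the loop body can change which value comes next.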

The solution is to use a range with a step of 3:

seq = "AAGGGTTGG"
codon_lista = []
for i in range(0, len(seq), 3):
    codon_lista.append(seq[i:i+3])

print(codon_lista)

Also note that there is no need to initialize i = 0 outside the loop. The result is:

['AAG', 'GGT', 'TGG']


You can also use a list comprehension, which is more succinct and Pythonic:

seq = "AAGGGTTGG"
codon_lista = [seq[i:i + 3] for i in range(0, len(seq), 3)]
print(codon_lista)
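If the length of the string is not a multiple of 3, slicing simply returns a shorter last piece. A minimal sketch of a reusable version follows; split_chunks, size and drop_incomplete are hypothetical names chosen for illustration, and whether to keep or drop the incomplete piece depends on what you need.

```python
def split_chunks(seq, size=3, drop_incomplete=False):
    """Split seq into consecutive pieces of length size (hypothetical helper)."""
    chunks = [seq[i:i + size] for i in range(0, len(seq), size)]
    # Slicing never raises past the end, so the last piece may be shorter.
    if drop_incomplete and chunks and len(chunks[-1]) < size:
        chunks.pop()
    return chunks


print(split_chunks("AAGGGTTGG"))                       # ['AAG', 'GGT', 'TGG']
print(split_chunks("AAGGGTTG"))                        # ['AAG', 'GGT', 'TG']
print(split_chunks("AAGGGTTG", drop_incomplete=True))  # ['AAG', 'GGT']
```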

0

range() takes up to three arguments: a start value, a stop value and a step. In short: for i in range(val_inicial, val_final, iterador). i starts at val_inicial and stops before reaching val_final, i.e., it covers the half-open interval [val_inicial, val_final).

By default, the step is 1.

>>> for i in range(0, 5):
...     print(i)
...
0
1
2
3
4

That’s the same as writing for i in range(0, 5, 1):.

To iterate in steps of 3, we can do:

>>> for i in range(0, 5, 3):
...     print(i)
...
0
3

In your case, just switch to for i in range(0, len(seq), 3):.

A negative step also works, which counts downward:

>>> for i in range(5, 0, -1):
...     print(i)
...
5
4
3
2
1
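A quick way to check what a given range produces, without writing a loop, is to turn it into a list. The sketch below repeats the three ranges shown above and then the one suggested for the question; starts is just an illustrative name.

```python
seq = "AAGGGTTGG"

print(list(range(0, 5)))      # [0, 1, 2, 3, 4]
print(list(range(0, 5, 3)))   # [0, 3]
print(list(range(5, 0, -1)))  # [5, 4, 3, 2, 1]

# Start index of each 3-character piece of seq
starts = list(range(0, len(seq), 3))
print(starts)  # [0, 3, 6]
```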

0

Here is an implementation in JavaScript that may serve as inspiration. It is very simple: it walks through the string index by index and, at every interval (3 in your example), pushes the substring that starts there onto another array.

var seq = "AAGGGTTGG";
var intervalo = 3;

var codon_lista = [];
// Visit every index, including the last one
for (var i = 0; i < seq.length; i++) {
    if (i % intervalo === 0) {
        codon_lista.push(seq.slice(i, i + intervalo));
    }
}
console.log(codon_lista);
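Since the question is about Python, the same index-by-index idea could be sketched there as follows. This is only a rough port of the approach above, reusing its variable names and looping over every index of the string; stepping the range by 3 directly does less work, since this version visits every index only to skip two out of three.

```python
seq = "AAGGGTTGG"
intervalo = 3

codon_lista = []
for i in range(len(seq)):
    # Only indices that are multiples of the interval start a new piece
    if i % intervalo == 0:
        codon_lista.append(seq[i:i + intervalo])

print(codon_lista)  # ['AAG', 'GGT', 'TGG']
```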
