Ordering a list of strings in python

Question

Asked 7 years, 7 months ago

Viewed 916 times

Folks, I need to sort a list whose elements are themselves lists, each containing exactly one string and one number. The problem is that I'm not getting the desired result. As an example, suppose I have the list below:

a = [
    ['c2sp1s5', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0],
    ['c2sp1s1', 0]
]

and I want it ordered like this:

a = [
    ['c2sp1s1', 0],
    ['c2sp1s5', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0]
]

I need the list sorted as in the example above so that I can compare it with another list that is already sorted that way and extract the result I want. If my list ends up in a different order, such as the one shown further down, I will get an incorrect result.

More precisely, I need both lists in the same format, with each inner list's string at the same position in both, so that I can use the value at index 1 of the inner lists to extract my result.

I can't get a list ordered like that using sorted(a, key=itemgetter(0)), because it gives me the following list:

a = [
    ['c2sp1s1', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0],
    ['c2sp1s5', 0]
]

Is there a practical way to do this, or will I have to implement the sorting by hand?

Ricardo, you overlooked the part of my comment asking for all the ordering rules. Until then, the result is in fact the expected one, unless you explain in detail why it is not. Will the strings always begin with 'c2sp1s'? If so, should the sort consider only the trailing characters? If not, what are the possible values? Why not build the list with zero-padded values, such as 'c2sp1s001'?

– Woss

2017/12/13 at 21:22

@Andersoncarloswoss I tried to explain in as much detail as possible why that ordering was not the desired one

– Ricardo Mendes

2017/12/13 at 21:43

1 answer

by Woss • **73,416** points · Answer 1 · 2017-12-13T21:04:05+00:00

Since the question is still not perfectly clear, and apparently the author himself has not been able to explain it, this answer treats the strings as having no particular format, with one well-defined condition: wherever a string contains an integer value, the sort should compare those characters by their numeric value rather than as text. This implies, for example, that the string c2sp1s5 should come before the string c10sp1s5, because the strings contain the numbers 2 and 10, and 2 is less than 10.

To implement this logic, I will create a function called magic which, as the name suggests, works some magic on the sort key. The function receives a string and splits it at every numeric value it finds, producing a list of strings, some containing only text and others only digits. For example, the input c2sp1s5 yields the list ['c', '2', 'sp', '1', 's', '5'], while the input c2sp1s11 yields the list ['c', '2', 'sp', '1', 's', '11'] (because both inputs end in digits, re.split also appends a trailing empty string to each list; it is omitted here and does not change the comparison). Comparing these two lists as they are would bring back the original problem: the elements are compared one by one and, as text, '11' is still less than '5'. So, before comparing, the numeric pieces must be converted to integers, giving the lists ['c', 2, 'sp', 1, 's', 5] and ['c', 2, 'sp', 1, 's', 11]. When these are compared, the last elements are the integers 5 and 11, and 5 is less than 11.
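As a minimal sketch of the comparison problem described above (the variable names are only illustrative, and the sample string is taken from the question), the snippet below contrasts text and integer comparison and shows what re.split returns when the pattern contains a capturing group:

```python
import re

# Strings are compared character by character, so '11' sorts before '5'
as_text = '11' < '5'
print(as_text)   # True

# Integers are compared by value
as_int = 11 < 5
print(as_int)    # False

# With a capturing group, re.split keeps the digit runs in the result.
# The input ends in digits, so the last element is an empty string.
parts = re.split(r'(\d+)', 'c2sp1s11')
print(parts)     # ['c', '2', 'sp', '1', 's', '11', '']
```

Text pieces always land at even indexes and digit runs at odd indexes, so after conversion two keys are only ever compared text with text and integer with integer.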

The code would look like this:

import re


def magic(value):
    parts = re.split(r'(\d+)', value)
    return [int(part) if part.isdigit() else part for part in parts]

The first line of the function splits the input at the numeric values, and the second returns a list in which the numeric pieces have been converted to integers. Since the function expects a single string, applying it to the example from the question requires specifying which string should be used as the sort key. In this case, it is the string at index 0, so we do:

import re


def magic(value):
    parts = re.split(r'(\d+)', value)
    return [int(part) if part.isdigit() else part for part in parts]


a = [
    ['c2sp1s5', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0],
    ['c2sp1s1', 0]
]

print(sorted(a, key=lambda v: magic(v[0])))

See it working on Ideone | Repl.it

Which produces the result:

[
    ['c2sp1s1', 0], 
    ['c2sp1s5', 0], 
    ['c2sp1s10', 1], 
    ['c2sp1s11', 0]
]
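As a further sketch, the same idea can be checked against the c2sp1s5 and c10sp1s5 case mentioned at the start of the answer, and then used for the position-by-position comparison the question describes. The helper name natural_key, the reference list, and its values are hypothetical and exist only for illustration; the question does not show the second list.

```python
import re


def natural_key(text):
    """Hypothetical helper with the same logic as magic()."""
    return [int(p) if p.isdigit() else p for p in re.split(r'(\d+)', text)]


# Numbers inside the string are compared by value: 2 comes before 10
names = ['c10sp1s5', 'c2sp1s5', 'c2sp1s10']
ordered = sorted(names, key=natural_key)
print(ordered)  # ['c2sp1s5', 'c2sp1s10', 'c10sp1s5']

# Sort the question's list in place, keyed on the string at index 0
a = [['c2sp1s5', 0], ['c2sp1s10', 1], ['c2sp1s11', 0], ['c2sp1s1', 0]]
a.sort(key=lambda item: natural_key(item[0]))

# Hypothetical reference list, assumed to be in the same order already
reference = [['c2sp1s1', 1], ['c2sp1s5', 0], ['c2sp1s10', 1], ['c2sp1s11', 1]]

# Walk both lists in step and collect the names whose values differ
differences = [name for (name, value), (_, ref_value) in zip(a, reference)
               if value != ref_value]
print(differences)  # ['c2sp1s1', 'c2sp1s11']
```

Using a.sort instead of sorted changes the list in place rather than returning a new one; either works with the same key.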