Generate the same hash value regardless of the order of the inputs

3

I need to generate a "hash" from two strings in such a way that the same hash is obtained regardless of the order of the two strings:

1234+4321  = asdfghjkl
4321+1234  = asdfghjkl

Is this possible with PHP?

  • 2

    The input would be two strings and the output a hash? You could always put the smaller one before the larger one (comparing them lexicographically), or alternatively hash each of them and then combine the results with a xor or +.

  • 1

    @mgibsonbr Yes, two strings that in theory would be just numbers but could be words too. Could you explain your idea in an answer?

  • A hash is not meant for that purpose; it returns the same value for a given input, which is why it is used for file integrity checking and so on.

  • 1

    Yes, I can. Just clarify one thing: are you thinking of a hash-table hash or a cryptographic hash?

  • @Wellingtonsilvaribeiro I understand; I mentioned a hash because it scrambles the initial data.

  • @mgibsonbr I'm thinking of a regular hash, not for cryptographic use but only to scramble the initial data, like md5 or sha1; I don't mind which.

  • @user3163662 I don't think you understood: a hash can be used in a hash table to index data (i.e. if you want to put the strings in a collection and quickly check whether or not they are in it, and/or map them to some other data, you use that type of hash). Or a hash can be used, as pointed out by Wellington, to map its input to a single string, for the purpose of ensuring integrity or perhaps uniqueness. That is usually done with another type of hash, such as MD5, SHA, etc.

  • @mgibsonbr I got a bit confused, but what I want is to join the strings A + B so that, regardless of whether I present them as A+B or B+A, the code scrambled with sha1/md5 is the same.

  • 1

    @user3163662 Ok, I’ll write an answer then. It’s just that at first I thought you wanted something like this or this.

2 answers

3


I see two options: 1) compare the two strings and concatenate them so that the "smaller" one (in lexicographic order) always comes first, then hash the concatenated string; 2) hash each string independently, then combine the results using some commutative operation (such as a xor). Examples:

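function hashPar1($a, $b) {
    // Ensure the lexicographically smaller string always comes first.
    if (strcmp($a, $b) > 0) {
        $temp = $a;
        $a = $b;
        $b = $temp;
    }
    // Hash the concatenation, which is now the same for (a, b) and (b, a).
    return hash("sha256", $a . $b);
}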

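function hashPar2($a, $b) {
    // Hash each string independently (hex-encoded digests of equal length).
    $ha = hash("sha256", $a);
    $hb = hash("sha256", $b);
    $ret = "";
    // XOR the digests character by character; XOR is commutative,
    // so the order of $a and $b does not matter.
    for ($i = 0; $i < strlen($ha); $i++) {
        $ret .= chr(ord($ha[$i]) ^ ord($hb[$i]));
    }
    // The result is a raw byte string and may contain non-printable characters.
    return $ret;
}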

Note: I suggest the second option, as the first method has the disadvantage of a larger number of collisions, because concatenation discards the boundary between the two strings. For example, the pairs fo+obar, f+oobar and fooba+r would all produce the same hash, since each of them concatenates to foobar (but not foo+bar or foob+ar, which are reordered to barfoo and arfoob respectively). This problem does not occur in the second method, since different strings have totally different hashes [with high probability].
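The boundary ambiguity of the first option can be reproduced in a few lines. The following is only a minimal Python sketch, not part of the original answer: the function names `hash_pair_sorted` and `hash_pair_sorted_prefixed` are hypothetical, and the length-prefix variant is one assumed way of keeping the boundary between the two strings, not something the answer proposes.

```python
import hashlib


def hash_pair_sorted(a: str, b: str) -> str:
    """Sketch of option 1: sort the pair, concatenate, hash."""
    first, second = sorted((a, b))
    return hashlib.sha256((first + second).encode("utf-8")).hexdigest()


def hash_pair_sorted_prefixed(a: str, b: str) -> str:
    """Assumed variant: prefix each part with its byte length before hashing."""
    parts = sorted(s.encode("utf-8") for s in (a, b))
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))  # fixed-width length prefix
        h.update(part)
    return h.hexdigest()


# Both variants are independent of the argument order.
assert hash_pair_sorted("1234", "4321") == hash_pair_sorted("4321", "1234")
assert hash_pair_sorted_prefixed("1234", "4321") == hash_pair_sorted_prefixed("4321", "1234")

# Plain concatenation cannot tell these pairs apart: both become "foobar".
assert hash_pair_sorted("fo", "obar") == hash_pair_sorted("f", "oobar")

# With length prefixes the two pairs hash differently.
assert hash_pair_sorted_prefixed("fo", "obar") != hash_pair_sorted_prefixed("f", "oobar")
```

The length prefix works because the hashed bytes now encode where the first string ends, so two different pairs can no longer produce the same input to the hash function.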

Update: I just read a comment saying that in PHP you can xor two strings directly, without breaking them into characters and converting to and from numbers. If this is correct (not tested), then the code of option 2 can be simplified to:

$resultado = hash("sha256", $string1) ^ hash("sha256", $string2);
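For comparison, the same hash-then-xor idea can be sketched in Python. This is an illustration under stated assumptions rather than a translation of the PHP code: the name `hash_pair_xor` is hypothetical, and the sketch xors the raw digest bytes and returns them hex-encoded so that the result is printable.

```python
import hashlib


def hash_pair_xor(a: str, b: str) -> str:
    """Hash each string on its own, then combine the digests with XOR."""
    da = hashlib.sha256(a.encode("utf-8")).digest()
    db = hashlib.sha256(b.encode("utf-8")).digest()
    # XOR is commutative, so swapping a and b gives the same bytes.
    combined = bytes(x ^ y for x, y in zip(da, db))
    return combined.hex()


assert hash_pair_xor("1234", "4321") == hash_pair_xor("4321", "1234")
assert hash_pair_xor("fo", "obar") != hash_pair_xor("f", "oobar")
```

One property worth keeping in mind with xor: when both strings are identical the two digests cancel out, so every pair of equal strings maps to the same all-zero value.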

0

Sort the characters of the string and then apply the hash function. See the code below:

>>> a = '1234+4321'
>>> b = '4321+1234'
>>> a
'1234+4321'
>>> b
'4321+1234'
>>> sorted_a = ''.join(sorted(a))
>>> sorted_a
'+11223344'
>>> sorted_b = ''.join(sorted(b))
>>> sorted_b
'+11223344'
>>> hash(sorted_a)
6594644838925616234
>>> hash(sorted_b)
6594644838925616234

Note that the above code works for letters and numbers. The procedure used was the following:

  1. Variables 'a' and 'b' were created.
  2. The characters of 'a' and 'b' were sorted, so that the results are the same if the strings have the same letters.
  3. Their hashes were calculated. If they have the same letters, they will have the same hash value.

Note: the original question does not specify the language, so I did it in Python.
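One caveat about the session above: Python's built-in hash() for strings is salted per interpreter process by default (unless PYTHONHASHSEED is fixed), so the integer shown will generally differ from one run to the next. If the value needs to be stored or compared across runs, a digest from hashlib is stable. A minimal sketch of the same sort-then-hash procedure, where `stable_sorted_hash` is a hypothetical name:

```python
import hashlib


def stable_sorted_hash(s: str) -> str:
    """Sort the characters of s, then return a SHA-256 hex digest of the result."""
    canonical = "".join(sorted(s))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same characters in a different order give the same digest.
assert stable_sorted_hash("1234+4321") == stable_sorted_hash("4321+1234")
```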

  • From what I understand, what the OP wants is for two specific strings to map to the same value regardless of their order, not for any strings with the same characters to do so. Your code would produce the same hash for the strings 1234, 1243, 1324, 1342, 1423, 1432, 2134, ...

  • 1

    I understood it differently: any strings with the same letters must produce the same hash value.
