"Deep Copy" without importing the "copy" module?

7

Is there any other way to perform a "deep copy" of Python objects without using the deepcopy() function from the standard library's copy module?

Could the code below be written differently?

import copy

class Xpto:
    pass

lst = [ Xpto(), Xpto() ]
# Named lst_copy so the variable does not shadow the copy module
lst_copy = copy.deepcopy(lst)

print( "Original: %s = %s" % (id(lst),[ id(i) for i in lst ]))
print( "Copy    : %s = %s" % (id(lst_copy),[ id(i) for i in lst_copy ]))

Possible output:

Original: 140635532213960 = [140635657043304, 140635657043248]
Copy    : 140635532235400 = [140635657043528, 140635657043472]
  • 4

    Have you looked at this library's source code to see how it is done?

  • 1

    And why do you want to do it without the copy module? Many of these standard library modules are already used internally by Python itself, so you pay no penalty for importing copy.

  • @jsbueno: You're right! The copy module is part of the Python standard library. However, the intent of the question is purely didactic, and perhaps a good answer can broaden the whole community's horizons on the subject and on the internal mechanisms of the language.

2 answers

5


The copy module and its main functions copy.copy and copy.deepcopy are very useful, and should be used without restrictions whenever you need their functionalities.

An interesting thing about both copy and deepcopy is that they do more than copy structures made of dictionaries, lists, numbers and strings (that is, the kind of data we serialize as JSON in modern applications): they actually make copies of arbitrary Python objects, including instances of user-created classes.

And this involves a series of "corner cases". Roughly speaking, an arbitrary Python object can be copied by explicitly calling the __new__ method of the object's class, and then assigning a copy of the first object's __dict__ to the __dict__ of the newly created instance:
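To make the difference concrete, here is a minimal sketch using a hypothetical Point class (not part of the question) that holds a mutable attribute. It only illustrates how copy.copy and copy.deepcopy treat an instance of a user-created class:

```python
import copy

class Point:
    """Hypothetical user-defined class with a mutable attribute."""
    def __init__(self, coords):
        self.coords = coords

original = Point([1, 2])

shallow = copy.copy(original)      # new Point, but the same coords list
deep = copy.deepcopy(original)     # new Point and a new coords list

original.coords.append(3)

print(shallow.coords)  # [1, 2, 3] - the list is shared with the original
print(deep.coords)     # [1, 2]    - the deep copy is independent
```

Both calls return a new Point instance; only the deep copy also duplicates the list stored in the attribute.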

class MinhaClasse:
    ...

m = MinhaClasse()
m.atributo = 10

# __new__ is a static method, so the class has to be passed explicitly
m1 = m.__class__.__new__(m.__class__)
m1.__dict__ = m.__dict__.copy()

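Okay, m1 is now a copy of m with "depth 2": the instance and its __dict__ are new objects, but the attribute values themselves are still shared with m.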
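A small sketch of what this kind of copy does and does not give you. The Tracked class and its init_calls counter are hypothetical, added only to show that __new__ skips __init__ and that the copied __dict__ still points at the same attribute values:

```python
class Tracked:
    """Hypothetical class whose __init__ records how often it ran."""
    init_calls = 0

    def __init__(self):
        Tracked.init_calls += 1
        self.items = []

a = Tracked()
a.items.append("x")

cls = type(a)
b = cls.__new__(cls)             # allocates the instance, skips __init__
b.__dict__ = a.__dict__.copy()   # new dict, same values inside

print(Tracked.init_calls)  # 1 - __init__ ran only for a

b.items.append("y")        # mutates the list shared by a and b
print(a.items)             # ['x', 'y']

b.items = []               # rebinding the attribute affects only b
print(a.items)             # ['x', 'y']
```

So to get a real deep copy, each value in __dict__ would have to be copied recursively as well.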

That said, when we think about deepcopy, what we usually have at hand is just a data structure like the one I mentioned above: an arbitrary set of primitives inside dictionaries and lists. A version of deepcopy that works only for this case can be written recursively: whenever it finds a list or dictionary it calls itself on each element; otherwise it reuses the value as is, since those primitives are immutable:

def mycopy(data):
    if isinstance(data, dict):
        result = {}
        for key, value in data.items():
            result[key] = mycopy(value)
    elif isinstance(data, list):
        result = []
        for item in data:
            result.append(mycopy(item))
    elif isinstance(data, (int, float, type(None), str, bool)):
        result = data
    else:
        raise ValueError("Unrecognized type for mycopy function")
    return result

This "mycopy" function is all that is required to create deep copies of JSON-like structures. It can be improved to handle arbitrary Python data structures containing tuples and sets, but a lot would still be missing to make it work with arbitrary containers, for example custom classes.

In short, since the code for a simple "deepcopy" can be instructive, I included mine here, but as noted in the comments, the recommendation is really to use copy and deepcopy.
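One example of what is still missing: a structure that refers to itself makes the plain recursive version recurse until Python raises RecursionError. The sketch below is a hypothetical variant, mycopy_memo, that keeps a dictionary of containers it has already copied, keyed by id(). The standard copy.deepcopy accepts a similar memo dictionary, but this is only an illustration of the idea, not a description of its implementation:

```python
def mycopy_memo(data, memo=None):
    """Like mycopy, but remembers containers that were already copied."""
    if memo is None:
        memo = {}
    if id(data) in memo:
        # Already copied (or being copied): reuse that copy
        return memo[id(data)]
    if isinstance(data, dict):
        result = {}
        memo[id(data)] = result      # register before recursing
        for key, value in data.items():
            result[key] = mycopy_memo(value, memo)
    elif isinstance(data, list):
        result = []
        memo[id(data)] = result      # register before recursing
        for item in data:
            result.append(mycopy_memo(item, memo))
    elif isinstance(data, (int, float, type(None), str, bool)):
        result = data
    else:
        raise ValueError("Unrecognized type for mycopy_memo function")
    return result

loop = [1, 2]
loop.append(loop)            # the list now contains itself

clone = mycopy_memo(loop)
print(clone is not loop)     # True
print(clone[2] is clone)     # True - the cycle points at the copy

shared = [0]
pair = mycopy_memo([shared, shared])
print(pair[0] is pair[1])    # True - shared references stay shared
```

Registering the new container in memo before descending into its items is what lets the cycle resolve to the copy instead of recursing forever.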

  • Thank you, your reply was of great value!

2

Thanks to @jsbueno's answer, I was able to better understand the complexity of the subject I'm dealing with.

After many tests and different approaches, I finally arrived at the following code:

def my_deepcopy(data):
    if isinstance(data, dict):
        result = {}
        for key, value in data.items():
            result[key] = my_deepcopy(value)

        assert id(result) != id(data)

    elif isinstance(data, list):
        result = []
        for item in data:
            result.append(my_deepcopy(item))

        assert id(result) != id(data)

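    elif isinstance(data, tuple):
        aux = []
        for item in data:
            aux.append(my_deepcopy(item))
        result = tuple(aux)

        # No identity assert here: CPython reuses a single object for the
        # empty tuple, so tuple([]) can legitimately be the same object as data.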

    elif isinstance(data, (int, float, complex, type(None), str, bool )):
        result = data

    elif isinstance(data, set ):
        result = set(data)

        assert id(result) != id(data)

    elif isinstance(data, bytearray ):
        result = bytearray(data)

        assert id(result) != id(data)

    elif hasattr( data, '__name__' ):
        result = data

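    elif hasattr( data, '__dict__' ):
        # Arbitrary instance: create it without calling __init__ (which
        # may require arguments), then deep-copy its attributes one by one.
        result = data.__class__.__new__(data.__class__)

        aux = {}
        for k, v in data.__dict__.items():
            # Immutable values come back as the same object, so no
            # identity assert on the individual attributes.
            aux[k] = my_deepcopy(v)

        result.__dict__ = aux

        assert id(result) != id(data)

    else:
        # Reached for objects without a __dict__ (e.g. bytes)
        raise ValueError("unexpected type")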

    return result

# Function
def FooBar():
    return "FooBar"

# Class
class Xpto:
    pass

# Test data
lst_obj = [ 0, 1.1, 'foo', 'bar' ]
dict_obj = { 'zero' : 0, 'pi' : 3.1415, 'desc' : 'foobar' }
list_list_obj = [ [1,2,3], [4,5,6], [7,8,9] ]
tuple_list_obj = [ (-1,-1), (0,-1,0), (-1,0), (0,0,0,0) ]
dict_list_obj = [ {'zero' : 0}, {'pi' : 3.1415}, {'desc' : 'foobar'} ]
list_set_obj = [ set([1,2,3]), set([1,2,3])]
list_bytearray_obj = [ bytearray([1,2,3]), bytearray([1,2,3])  ]
list_func_obj = [ FooBar, FooBar ]
list_arbitrary_obj = [ Xpto(), Xpto() ]

# Testing
my_deepcopy( lst_obj )
my_deepcopy( dict_obj )
my_deepcopy( list_list_obj )
my_deepcopy( tuple_list_obj )
my_deepcopy( dict_list_obj )
my_deepcopy( list_set_obj )
my_deepcopy( list_bytearray_obj )
my_deepcopy( list_func_obj )
my_deepcopy( list_arbitrary_obj )
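The calls above discard their results, so they only show that no exception is raised. A minimal sketch of a stricter check follows. The helper check_independent is hypothetical, and copy.deepcopy is used here only as a stand-in so the block runs on its own; the result of my_deepcopy could be passed in its place:

```python
import copy

def check_independent(original, duplicate):
    """Recursively assert that no mutable container is shared."""
    mutable = (list, dict, set, bytearray)
    if isinstance(original, mutable):
        assert original is not duplicate, "mutable object is shared"
        assert original == duplicate, "values differ"
    if isinstance(original, dict):
        for key in original:
            check_independent(original[key], duplicate[key])
    elif isinstance(original, (list, tuple)):
        assert len(original) == len(duplicate)
        for a, b in zip(original, duplicate):
            check_independent(a, b)

data = [{'zero': 0}, [1, 2, 3], ({'pi': 3.1415}, [4, 5]), bytearray([1, 2, 3])]

# A deep copy passes silently
check_independent(data, copy.deepcopy(data))

# A shallow copy is caught, because the inner dict is the same object
try:
    check_independent(data, list(data))
    shared_detected = False
except AssertionError:
    shared_detected = True

print(shared_detected)  # True
```

The helper does not look inside sets or custom class instances, so it is a starting point rather than a complete test.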
