How do PHP arrays work internally?

PHP handles arrays differently from other languages; apparently a hash table is used to associate keys with values. How do arrays work internally in the core of the language?

  • Your post came in very handy together with this: https://answall.com/questions/331276/como-o-foreach-do-php-function/332160?noredirect=1#comment676248_332160 So far I have not reached a conclusion (partly because I have been in a rush).

1 answer

PHP has what is called an associative array, and internally it really is a form of hash table. It can therefore have sparse keys and keys of different data types, and almost all basic operations are O(1), just like a plain array. Strictly speaking, that complexity is typically O(1) and can degrade to O(n) in the worst case, but in practice it never comes close to that. The slot index is computed from the key by a hash function, either the language's standard one or one specialized for its primitive types.
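To make the "hash the key, then find the slot" step concrete, here is a minimal sketch in C. It assumes a "times 33" string hash (the classic DJBX33A constants 5381 and 33) and a table whose size is a power of two; the names `sketch_hash` and `sketch_slot` are hypothetical and are not PHP's own functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Times 33" string hash (DJBX33A). The seed 5381 and the multiplier 33
   are the classic constants, assumed here for illustration only. */
static uint32_t sketch_hash(const char *key, size_t len)
{
    uint32_t h = 5381u;
    for (size_t i = 0; i < len; i++) {
        h = h * 33u + (unsigned char)key[i];
    }
    return h;
}

/* Map a hash (or an integer key used directly as the hash) to a slot.
   table_size must be a power of two so that table_size - 1 works as a
   bit mask, which is the role a field like nTableMask would play. */
static uint32_t sketch_slot(uint32_t h, uint32_t table_size)
{
    assert(table_size != 0 && (table_size & (table_size - 1u)) == 0);
    return h & (table_size - 1u);
}
```

Calling `sketch_slot(sketch_hash("a", 1), 8)` picks one of eight slots in constant time regardless of how many elements are stored, which is where the typical O(1) comes from. The O(n) worst case appears only when many keys land in the same slot.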

In fact, almost everything in PHP's general data structures and memory organization is managed through hash tables, much like in JavaScript, although this is less visible and less uniform than in JS, where every object is built on top of a hash table.

There is an extra mechanism to preserve order, since a pure hash table cannot present its data in any specific order, so there is an additional storage cost for ordered traversal to work. Here is the structure as it currently stands (unless it has already changed); it has pointers that maintain insertion order through a linked list of Buckets:

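typedef struct _hashtable {
    uint nTableSize;            /* number of slots in arBuckets */
    uint nTableMask;            /* nTableSize - 1; masks a hash into a slot index */
    uint nNumOfElements;        /* number of stored elements */
    ulong nNextFreeElement;     /* next free integer key for appends */
    Bucket *pInternalPointer;   /* current position for iteration */
    Bucket *pListHead;          /* first element in insertion order */
    Bucket *pListTail;          /* last element in insertion order */
    Bucket **arBuckets;         /* slot array; each slot heads a collision chain */
    dtor_func_t pDestructor;    /* called on an element when it is removed */
    zend_bool persistent;       /* allocate outside the per-request allocator */
    unsigned char nApplyCount;  /* recursion depth counter */
    zend_bool bApplyProtection; /* guard against recursion */
#if ZEND_DEBUG
    int inconsistent;
#endif
} HashTable;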
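The idea of keeping insertion order next to the hash slots can be sketched as follows. This is a deliberately simplified illustration, not PHP's code: it handles integer keys only, does not check for duplicates, uses a fixed number of slots, and links the order list in one direction only, whereas the structures shown here are doubly linked. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_SLOTS 8u  /* power of two, so SKETCH_SLOTS - 1 is a mask */

/* Hypothetical, simplified bucket: integer keys only. */
typedef struct sketch_bucket {
    unsigned long h;                   /* integer key, used directly as hash */
    int value;
    struct sketch_bucket *chain_next;  /* next bucket in the same slot */
    struct sketch_bucket *list_next;   /* next bucket in insertion order */
} sketch_bucket;

typedef struct {
    sketch_bucket *slots[SKETCH_SLOTS]; /* collision chains */
    sketch_bucket *list_head;           /* first inserted element */
    sketch_bucket *list_tail;           /* last inserted element */
} sketch_table;

/* Insert without duplicate checks; returns 0 on success, -1 if out of memory. */
static int sketch_insert(sketch_table *t, unsigned long key, int value)
{
    assert(t != NULL);
    sketch_bucket *b = malloc(sizeof *b);
    if (b == NULL) return -1;
    b->h = key;
    b->value = value;
    b->list_next = NULL;

    /* Hash side: push onto the front of the slot's collision chain. */
    unsigned long slot = key & (SKETCH_SLOTS - 1u);
    b->chain_next = t->slots[slot];
    t->slots[slot] = b;

    /* Order side: append to the tail of the insertion-order list. */
    if (t->list_tail != NULL) t->list_tail->list_next = b;
    else t->list_head = b;
    t->list_tail = b;
    return 0;
}

/* Lookup goes through the hash slot, not through the order list. */
static const sketch_bucket *sketch_find(const sketch_table *t, unsigned long key)
{
    const sketch_bucket *b = t->slots[key & (SKETCH_SLOTS - 1u)];
    for (; b != NULL; b = b->chain_next)
        if (b->h == key) return b;
    return NULL;
}

/* Copy up to max keys into out, in insertion order; returns the count. */
static size_t sketch_keys_in_order(const sketch_table *t, unsigned long *out, size_t max)
{
    size_t n = 0;
    for (const sketch_bucket *b = t->list_head; b != NULL && n < max; b = b->list_next)
        out[n++] = b->h;
    return n;
}

/* Free every bucket by walking the order list, then reset the table. */
static void sketch_free(sketch_table *t)
{
    sketch_bucket *b = t->list_head;
    while (b != NULL) {
        sketch_bucket *next = b->list_next;
        free(b);
        b = next;
    }
    for (size_t i = 0; i < SKETCH_SLOTS; i++) t->slots[i] = NULL;
    t->list_head = t->list_tail = NULL;
}
```

Inserting keys 42, 7 and 15 into a zero-initialized `sketch_table` and then calling `sketch_keys_in_order` yields them in that same order, even though 7 and 15 share a slot. The price is the extra pointer stored in every bucket.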

And here is the bucket, which shows how inefficient this is (the amount of waste is almost unbelievable: on a 64-bit architecture each entry occupies 72 bytes in the table alone, and each element carries an additional cost close to that):

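typedef struct bucket {
    ulong h;                    /* hash value, or the key itself for integer keys */
    uint nKeyLength;            /* string key length; 0 for integer keys */
    void *pData;                /* pointer to the stored data */
    void *pDataPtr;             /* holds the data inline when it is itself a pointer */
    struct bucket *pListNext;   /* next element in insertion order */
    struct bucket *pListLast;   /* previous element in insertion order */
    struct bucket *pNext;       /* next bucket in the same slot's chain */
    struct bucket *pLast;       /* previous bucket in the same slot's chain */
    const char *arKey;          /* string key; NULL for integer keys */
} Bucket;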
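The 72-byte figure can be checked with a little arithmetic: 8 bytes for h, 4 for nKeyLength, 4 of padding so the next pointer is aligned, and 7 pointers of 8 bytes each, giving 8 + 4 + 4 + 56 = 72. A small sketch that lets the compiler confirm this, assuming `ulong` is `unsigned long` and `uint` is `unsigned int` (the `sketch_*` names are stand-ins, not PHP's typedefs), and assuming a platform where both `unsigned long` and pointers are 8 bytes:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins for PHP's own ulong and uint typedefs. */
typedef unsigned long sketch_ulong;
typedef unsigned int sketch_uint;

/* Same fields, in the same order, as the Bucket shown above. */
typedef struct sketch_bucket_layout {
    sketch_ulong h;
    sketch_uint nKeyLength;
    void *pData;
    void *pDataPtr;
    struct sketch_bucket_layout *pListNext;
    struct sketch_bucket_layout *pListLast;
    struct sketch_bucket_layout *pNext;
    struct sketch_bucket_layout *pLast;
    const char *arKey;
} sketch_bucket_layout;

/* Total size of one bucket on the current platform. */
static size_t sketch_bucket_size(void)
{
    return sizeof(sketch_bucket_layout);
}

/* Bytes of padding the compiler inserts between nKeyLength and pData. */
static size_t sketch_bucket_padding(void)
{
    size_t end_of_len = offsetof(sketch_bucket_layout, nKeyLength) + sizeof(sketch_uint);
    assert(offsetof(sketch_bucket_layout, pData) >= end_of_len);
    return offsetof(sketch_bucket_layout, pData) - end_of_len;
}
```

On a platform with 4-byte pointers and 4-byte `unsigned long` the same layout comes out much smaller, which is the point of the 32-bit versus 64-bit comparison.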

In general, values are allocated very inefficiently in PHP, and arrays are no exception, although there has been an optimization effort recently.

For example, I found that every value is stored like this:

typedef union _zvalue_value {
    long lval;  /* long value */
    double dval;  /* double value */
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;  /* hash table value */
    zend_object_value obj;
} zvalue_value;

I put it on GitHub for future reference.

On 64-bit this structure takes 16 bytes, because len is stored here rather than next to the string data itself. A union is as large as its largest member, so any value other than a string leaves 8 bytes unused. The structure was clearly designed for 32-bit, where nothing is wasted. As a result, running PHP on 64-bit can use almost twice as much memory as on 32-bit, for no gain. Maybe one day they will sort it out.
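The size of the union can be checked the same way. The sketch below replaces `HashTable *` with `void *` and uses an assumed stand-in for `zend_object_value` (an integer handle plus a pointer); both substitutions are assumptions made only so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for zend_object_value: an integer handle plus a pointer. */
typedef struct {
    unsigned int handle;
    const void *handlers;
} sketch_object_value;

/* Same members as the union shown above, with stand-in types. */
typedef union {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
    void *ht;                 /* stands in for HashTable * */
    sketch_object_value obj;
} sketch_zvalue_value;

/* Size of the whole union on the current platform. */
static size_t sketch_value_size(void)
{
    return sizeof(sketch_zvalue_value);
}

/* Bytes of the union left unused when it holds a plain long. */
static size_t sketch_unused_for_long(void)
{
    assert(sizeof(sketch_zvalue_value) >= sizeof(long));
    return sizeof(sketch_zvalue_value) - sizeof(long);
}
```

Where `long` and pointers are both 8 bytes, this reports 16 bytes for the union and 8 unused bytes for an integer. Note that with this stand-in the obj member also occupies 16 bytes, so moving len elsewhere might not shrink the union by itself; that depends on the real definition of `zend_object_value`.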

It is also hard to go into much more detail, precisely because the internal workings are an implementation detail, and nothing prevents them from changing to something quite different.

And if all of this is unfamiliar to you, I suggest studying a bit more computer science, especially data structures. A little C helps in understanding this code from PHP's internals.
