Friends, I have a CSV file with 5k lines of purchase transactions. Each purchase has an id. When several purchases are made in the same lot, their ids all start with the same numerical sequence, and a number near the end of the id identifies the individual purchase within the batch report. Example:
A person bought 5 items:
000034200100
000034200200
000034200300
000034200400
000034200500
If these sequences came in this order it would be wonderful, but it turns out they are scattered throughout this 5k-line file.
How can I group these lots so that everything ends up together as in my example? I want to do this in Python.
I thought about clustering, but I don’t know if it’s a good idea.
Do you already have any code written for this? Can you include an example of the complete data that needs to be organized? Would the result be another .csv?
– Daniel
I don’t have any code ready yet. I received this yesterday from a customer; it is basically a CSV with columns for id, description, origin, destination, value, fare, etc. The output can be a CSV or an XLSX file.
– Bene
What is the CSV’s structure? The way you asked the question makes it hard to understand the context. I suggest editing the question and including a fragment of the CSV to 'clear' things up a little; it’s kind of obscure. :-)
– Sidon
The CSV is comma-separated. Opened in Excel, the first column is id, followed by description, quantity, origin, destination, value, tariff... It is a common columnar structure, nothing very different from conventional tables.
– Bene
The file has many columns, so pasting a piece of it here would look bad, but basically it is a normal table without any unusual structure.
– Bene
From your description we cannot understand how these sequences are "scattered". If you want, upload the CSV (or part of it) to any storage service (Google Drive, Expirebox, or File Town) and put the link here, and I will try to help. I’m here having fun with Python. :-)
– Sidon
Okay, I will provide it. But when I said scattered, I meant that in the id column these numbers (as I described) do not come in sequence, i.e., the purchases of a lot do not appear with their ids in order; they come out of order. If someone buys 5 items, the ids don’t come in an ordered sequence...
– Bene
No, I can’t understand it that way. I need to know the "structure" of the CSV; as it stands it is very obscure. If you put 2 or 3 lines of the CSV in the message with an example and an explanation, I would probably understand.
– Sidon
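Going only by the ids shown in the question, one possible approach is to treat the leading characters of each id as the lot key, bucket the rows by that key, and sort each bucket by the full id. The sketch below is a rough illustration, not a tested solution for the real file: the prefix length of 9, the column names `id` and `description`, and the sample rows are all assumptions, since the actual CSV was never posted. Clustering should not be needed, because the lot is determined exactly by the prefix.

```python
import csv
import io
from collections import defaultdict

# Assumption: the first 9 characters of the id identify the lot
# (e.g. "000034200" in "000034200100"). Adjust to the real layout.
PREFIX_LEN = 9


def group_by_batch(rows, id_column="id", prefix_len=PREFIX_LEN):
    """Return {lot_prefix: [rows of that lot, sorted by full id]}."""
    batches = defaultdict(list)
    for row in rows:
        batches[row[id_column][:prefix_len]].append(row)
    for batch_rows in batches.values():
        batch_rows.sort(key=lambda r: r[id_column])
    # Sort the lots themselves by prefix for a stable output order.
    return dict(sorted(batches.items()))


# Tiny in-memory stand-in for the real file (hypothetical columns and values).
sample = io.StringIO(
    "id,description\n"
    "000034200300,item C\n"
    "000051700100,item X\n"
    "000034200100,item A\n"
    "000051700200,item Y\n"
    "000034200200,item B\n"
)

# csv.DictReader keeps every field as a string, so leading zeros survive.
rows = list(csv.DictReader(sample))
grouped = group_by_batch(rows)

# Write the rows back out, lot by lot.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "description"])
writer.writeheader()
for batch_rows in grouped.values():
    writer.writerows(batch_rows)
print(out.getvalue())
```

For the real data, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. If every id has the same width, as in the example, simply sorting the whole file by the id column (as text, not as a number) already puts each lot together; the explicit grouping is only useful when each lot needs to be handled separately.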