Converting epoch timestamps in file names to dates


Dear all,

I have a list of files (140 thousand) whose names contain a date as a Unix epoch timestamp. I need to rename each file so that its name shows the actual date. Example: in 1475279740.15044_xxx.xxx.Stats the epoch timestamp is 1475279740, which converts to 2016-09-30, so the file becomes 2016-09-30_xxx.xxx.Stats.

I have one list with the timestamped file names and another file with the list of already converted names, both in txt. What I still need to do is rename/move each timestamped file to its converted name.

I imagine having two for loops, one that reads the list of timestamped names and another that reads the list of converted names, and then just renaming/moving with a simple mv command.

To test, I created these two for loops, but only the second variable changes in sequence; the first stays static.

Here is the example code:

 for x in $(cat timestamp.txt)
 do
     for y in $(cat timestamp-conv.txt)
     do
         echo $x convertido para $y
     done
 done

Expected code output:

  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-24
  1474566212 convertido para 2016-09-25
  1474566212 convertido para 2016-09-25

The two lists line up: each line of the timestamp list corresponds to the same line of the list of converted names.

I have tried in many ways, without success!

Can you help me?

  • And the time, won’t you need it? It’s possible that two files have the same date, so what would differentiate them would be the time.

  • The time isn't needed, only the date.

  • Is the list with the file names in timestamp.txt? One per line? You want to copy the file 1475279740.15044_xxx.xxx.stats to the file 2016-09-30_xxx.xxx.stats, for example, and print a report at the end of all converted files, is that it?

  • I have a directory with 140,000 log files. Each log file is timestamped with the day it was saved, so among these 140,000 files the September ones are mixed in with the others. We only looked at the script that handles this on a daily basis after almost two months, so we will have to filter each file by the date in its timestamp, converting the timestamp to a real date and saving the file in a directory for that date. So I need to know the day of every file to make this conversion: from 1475279740.15044_xxx.xxx.Stats to 2016-09-30_xxx.xxx.Stats. I don't know if that's clear; if not, I'm sorry!

1 answer


I believe this script can do the job:

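#!/bin/bash

# Open timestamp.txt for reading on file descriptor 3.
exec 3< timestamp.txt

# Read one file name per line from descriptor 3.
while read -r arq <&3; do
    # Text before the first "_": the epoch timestamp.
    epoch=$(echo "$arq" | awk '{ print $1 }' FS="_")
    # Text between the first and second "_": the rest of the name.
    filenameend=$(echo "$arq" | awk '{ print $2 }' FS="_")
    # GNU date: convert seconds since the epoch to YYYY-MM-DD.
    date=$(date --date="@$epoch" +%Y-%m-%d)
    mv "${arq}" "${date}_${filenameend}" && echo "${arq} convertido para ${date}_${filenameend}"
done

# Close file descriptor 3.
exec 3<&-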
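As a rough variant of the script above, the sketch below uses bash parameter expansion instead of awk, so the part of the name after the first "_" is kept whole even if it contains more underscores, and it avoids overwriting when two files fall on the same day. The helper names new_name and rename_list are hypothetical, the clash suffix (the original timestamp) is only an assumption about how one might tell the files apart, and GNU date is assumed.

```shell
#!/bin/bash
# Sketch only: new_name and rename_list are hypothetical helpers,
# not part of the original answer. Assumes GNU date.

# Print the date-based name for one timestamped file name.
new_name() {
    local arq=$1
    local epoch=${arq%%_*}   # text before the first "_"
    local rest=${arq#*_}     # everything after the first "_"
    local day
    # Drop the fractional part (.15044) before converting.
    day=$(date --date="@${epoch%%.*}" +%Y-%m-%d) || return 1
    printf '%s_%s\n' "$day" "$rest"
}

# Rename every existing file named in the list (one name per line).
rename_list() {
    local list=$1 arq target
    while IFS= read -r arq; do
        [ -e "$arq" ] || continue
        target=$(new_name "$arq") || continue
        if [ -e "$target" ]; then
            # Assumed tie-breaker: append the original timestamp.
            target="${target}.${arq%%_*}"
        fi
        mv -- "$arq" "$target" && echo "$arq convertido para $target"
    done < "$list"
}
```

Calling rename_list timestamp.txt from inside the log directory would process the whole list one line at a time. The resulting date depends on the machine's time zone, just as in the original script.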
  • You said you don't need to differentiate by time, so you'll lose files if there is more than one on the same day: the last one on the list will overwrite the previous one(s). Be careful with that!

  • Hello Thiago. Actually, once I'm sure it is working, I will add a string at the end of each converted file name to differentiate them; that's why I was using echo. What interests me is the content of each file, not its name. I still have one question: you use the file timestamp.txt in your while loop. Where does the other file, with the converted names, come in? Or does your code convert the names on the fly, without needing a second file? Thank you for your interest in helping.

  • It already renames the original file to the new name, with the date formatted as you wanted. There is no need for another file to control the name conversion.

  • Great Thiago, I will test your code. Big hug!

  • Ok! Then leave feedback here with an upvote, or post any questions you have, to help others who have the same problem.

  • Hi Thiago, it worked; I tested it and it behaved as expected. Thank you! While I'm at it, I have a few questions about your code, for my own learning. Does FS="_" serve as a delimiter? And what is <&3 for?

  • FS="_" sets the field delimiter for the awk command, as you guessed! So when we pipe the output of echo $arq to awk, it splits the line into strings at the defined delimiter. Then { print $2 } refers to the second string, that is, the one right after the first "_" (separator).

  • The command exec 3< timestamp.txt creates a new file descriptor and attaches the file timestamp.txt to it. This relates to your requirement of processing a large number of lines from timestamp.txt. Without it, we would load the entire file into memory, which could hurt your script's performance. With a file descriptor, the lines can be read and processed one at a time, which is what happens in the command read arq <&3. There are other solutions; this is just one of them. Others involve the xargs command.

  • Some Linux commands may also refuse a very large list of arguments and may need a solution like this, using xargs or file descriptors. I hope I've answered your questions!

  • Ahh, that's right, Thiago. Sometimes I run into "Argument list too long" and have to resort to tricks to make things work, but I never knew of a proper answer/solution for this case. Is FS="_" the same as using cut with "_" as the delimiter? With your explanation it's already clearer. Thanks and big hug.

  • Yes, the FS is the delimiter ;-)
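The file-descriptor mechanism discussed in these comments also covers the approach the question started from. The nested for loops pair each timestamp with every converted name, whereas two descriptors can walk both lists in lockstep. The sketch below is only an illustration: pair_lists is a hypothetical helper name, and descriptors 3 and 4 are an arbitrary choice.

```shell
#!/bin/bash
# Sketch only: pair_lists is a hypothetical helper name.
# Reads two lists in lockstep, one line from each per iteration.

pair_lists() {
    local x y
    # Descriptor 3 reads the first list, descriptor 4 the second.
    # The loop stops as soon as either list runs out of lines.
    while IFS= read -r x <&3 && IFS= read -r y <&4; do
        echo "$x convertido para $y"
    done 3< "$1" 4< "$2"
}
```

A call such as pair_lists timestamp.txt timestamp-conv.txt prints one line per pair. Replacing the echo with mv -- "$x" "$y" would do the rename, provided both lists hold full file names in the same order.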
