Handling dates in AWK scripts

Asked

Viewed 479 times

3

CONTEXT: I have a CSV file from which I want to extract and process some columns. I realized that AWK would be the perfect tool for this and everything was fine until I had to deal with timestamps - for example 2008-07-31T21:42:52.667

PROBLEM 1: I need to calculate the amount of days that have passed between a base date (say 2008-07-31) and all the timestamps in the first column of the input file.

DETAIL: I know I can do difference operations in case I can use the command date of BASH, because with the following command I get the amount of seconds passed since the base date of the date system:

date -d"2008-07-31" +%s #RESULTADO: 1217473200s

This way my problem can be reduced to the following:

PROBLEM 2: How to execute a bash command from within AWK?

2 answers

3

OK. Here’s an answer to PROBLEM 2, which has already solved my problem in general, but perhaps has some other more legal solution.

I can execute a command bash in GAWK using the following construction:

STRING_COMANDO | getline RESULTADO_VAR_NOME

So I wrote the following script to get the first column of a file - which has a timestamp - and calculate the difference of the base date in seconds.

#!/usr/bin/gawk -f
BEGIN {
  base_date="2008-07-31"
  #Comando para obter a quantidade de segundos da data base
  "date -d\""base_date"\" +%s" | getline first_day
  print "BASE: " base_date " -> " first_day
  #Variáveis utilizadas para evitar execuções BASH repitidas
  #Só ajudou pois sei que meu arquivo só tem datas em sequência crescente
  now_date="2008-07-31"
  now_day=first_day
}
{
  #Crio uma variável temp = [DATA,HORA]
  split($1,temp,"T")
  #Só invoco o BASH novamente se a data mudou
  if(temp[1] != now_date){
    now_date=temp[1]
    "date -d\""now_date"\" +%s" | getline now_day
  } 
  print now_date", " now_day", "now_day-first_day
}

1

In relation to date arithmetic I present below a hypothese of this time with perl

Assuming the F file looks like this:

timestamp | legume | preço
2008-07-31T21:42:52.667 | batatas | 30
2008-08-31T21:42:52.667 | cebolas | 40

for demonstration I will add a first column with the days until Christmas

perl -MDate::Simple=date -pe '
  print /(\d{4}-\d\d-\d\d)(?=T)/ ? date("2016-12-25")-date($1) : "","|"' F

gives:

|timestamp | legume | preço
3069|2008-07-31T21:42:52.667 | batatas | 30
3038|2008-08-31T21:42:52.667 | cebolas | 40

Browser other questions tagged

You are not signed in. Login or sign up in order to post.