Mass collection of information from files

1

I intend to collect the ctime, atime, mtime and crtime of a large number of files.

I have assembled the following script as a partial solution:

sudo debugfs -R 'stat <1055890>' /dev/sda1|awk -F': ' -v c='' -v a="" -v m="" 'BEGIN {} $1==" ctime" {c=$2} $1==" atime" {a=$2} $1==" mtime" {m=$2} $1=="crtime" {print c, a, m, $2}'

debugfs 1.42.13 (17-May-2015) 0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018 0x5b05601f:111ab67c -- Wed May 23 09:35:43 2018 0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018 0x5ade9510:c7eb0e9c -- Mon Apr 23 23:23:12 2018

I intend to capture the information as follows:

Mon Apr 23 23:23:12 2018, Wed May 23 09:35:43 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018

These will be, respectively, the ctime, atime, mtime and crtime columns of a future CSV file.

How can I process the variables so that only the text after " -- " is kept?

3 answers

1

You can use an awk script to solve your problem (script.awk):

BEGIN {
    OFS = ",";
    FS = " -- ";

    print "ctime,atime,mtime,crtime"
}
{
    for( i = 0; i < 4; i++ ){
        split( $(i+2), a, " 0x" );
        b[i] = a[1];
    }

    print b[0], b[1], b[2], b[3];
}

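Usage (note that the script expects all four timestamps on a single input line, as in the output shown in the question):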

debugfs -R 'stat <1055890>' /dev/sda1 | awk -f script.awk > saida.csv

All in one line:

debugfs -R 'stat <1055890>' /dev/sda1 | awk 'BEGIN{OFS=",";FS=" -- ";print "ctime,atime,mtime,crtime"}{for(i=0;i<4;i++){split($(i+2),a," 0x");b[i]=a[1]};print b[0],b[1],b[2],b[3];}' > saida.csv

Output (saida.csv):

ctime,atime,mtime,crtime
Mon Apr 23 23:23:12 2018,Wed May 23 09:35:43 2018,Mon Apr 23 23:23:12 2018,Mon Apr 23 23:23:12 2018
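
If the input is the raw, multi-line output of debugfs stat rather than a single line, a variant along the following lines may be easier to work with. This is only a sketch: the function name extract_times is hypothetical, and it assumes each timestamp line has the form "ctime: 0x... -- date" shown in the question.

```shell
# extract_times (hypothetical name): reads the output of
# `debugfs -R 'stat <inode>'` on stdin and prints one CSV row
# in the order ctime,atime,mtime,crtime.
extract_times() {
    awk -F' -- ' '
        # With " -- " as separator, $1 is e.g. " ctime: 0x5ade9510:c7eb0e9c"
        # and $2 is the human-readable date.
        $1 ~ /^ *ctime:/  { c = $2 }
        $1 ~ /^ *atime:/  { a = $2 }
        $1 ~ /^ *mtime:/  { m = $2 }
        # Assumption: crtime is the last of the four lines, so print here.
        $1 ~ /^ *crtime:/ { print c "," a "," m "," $2 }
    '
}
```

It could then be called as: sudo debugfs -R 'stat <1055890>' /dev/sda1 | extract_times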

1

Summarizing my answer, the full command would be:

sudo debugfs -R 'stat <1055890>' /dev/sda1 | 
awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}' | 
sed -r 's/^ //;s/  +/, /g'

Full answer:

With awk it is not strictly necessary to assign your own variables, since the positional fields already let you print only what you want.

So, after setting the field separator to the string "--" with the -F flag, a single print of the desired fields is enough. In the example below I also run a gsub to remove the hexadecimal values:

[stat...] | awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}'

> Mon Apr 23 23:23:12 2018    Wed May 23 09:35:43 2018    Mon Apr 23 23:23:12 2018    Mon Apr 23 23:23:12 2018

Finally, to remove the unwanted spaces and add commas between the fields, a quick sed:

sed -r 's/^ //;s/  +/, /g'

The full command, according to the question, would be:

sudo debugfs -R 'stat <1055890>' /dev/sda1 | 
awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}' | 
sed -r 's/^ //;s/  +/, /g'

Mon Apr 23 23:23:12 2018, Wed May 23 09:35:43 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018
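
As a possible alternative to the awk | sed pair, the hexadecimal value can be made part of the field separator itself, so that one awk call is enough. This is a sketch, not part of the original answer; the function name one_pass_csv is hypothetical. It has one side benefit: the sed step above turns any run of two or more spaces into a comma, which would also split a date that happens to contain two consecutive spaces (for example a space-padded day of month), whereas this version leaves the date text untouched.

```shell
# one_pass_csv (hypothetical name): reads the single-line output shown in
# the question on stdin and prints the four dates separated by ", ".
one_pass_csv() {
    # The separator is an optional "0x..." token (with any spaces before it)
    # followed by " -- ". The line starts with a separator, so $1 is empty
    # and the dates are in $2..$5.
    awk -F' *(0x[0-9a-f:]+)? -- ' -v OFS=', ' '{ print $2, $3, $4, $5 }'
}
```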


For performance, if you intend to stat many files, I suggest saving the full output of all the stat executions in one file and then running this awk | sed once, passing the stats file as input to awk:

awk -F'--' '{gsub(/0x[a-z0-9:]+/, ""); print $2, $3, $4, $5}' arquivo-de-stats | 
sed -r 's/^ //;s/  +/, /g'
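
The batching idea could look roughly like the sketch below. The function names build_stat_commands and stats_to_csv and the file names inodes.txt and commands.txt are hypothetical. The sketch works on the raw stat output and assumes that each file's block starts with an "Inode: <number>" line and ends with its crtime line; check this against the output of your debugfs version.

```shell
# build_stat_commands (hypothetical name): reads inode numbers, one per
# line, on stdin and prints one debugfs "stat" command per inode.
build_stat_commands() {
    while read -r inode; do
        printf 'stat <%s>\n' "$inode"
    done
}

# stats_to_csv (hypothetical name): reads concatenated stat output on
# stdin and prints one CSV row per inode.
stats_to_csv() {
    awk '
        # Keep only the text after " -- " on a timestamp line.
        function date_part(line) { sub(/^.* -- /, "", line); return line }
        BEGIN        { print "inode,ctime,atime,mtime,crtime" }
        /^Inode:/    { inode = $2 }
        /^ *ctime:/  { c = date_part($0) }
        /^ *atime:/  { a = date_part($0) }
        /^ *mtime:/  { m = date_part($0) }
        /^ *crtime:/ { print inode "," c "," a "," m "," date_part($0) }
    '
}

# Possible usage (placeholder file names; debugfs -f reads commands from a file):
#   build_stat_commands < inodes.txt > commands.txt
#   sudo debugfs -f commands.txt /dev/sda1 > arquivo-de-stats
#   stats_to_csv < arquivo-de-stats > saida.csv
```

This way debugfs and awk are each started once, instead of once per file.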

0

With awk, you can set the string "-- " as the delimiter.

A simple script using your command as an example would be:

#!/bin/bash

var=$(sudo debugfs -R 'stat <1055890>' /dev/sda1|awk -F': ' -v c='' -v a="" -v m="" 'BEGIN {} $1==" ctime" {c=$2} $1==" atime" {a=$2} $1==" mtime" {m=$2} $1=="crtime" {print c, a, m, $2}')

ctime=$(echo $var | awk -F"-- " '{print $2}'| rev | cut -d" " -f3-7 | rev)
atime=$(echo $var | awk -F"-- " '{print $3}'| rev | cut -d" " -f3-7 | rev)
mtime=$(echo $var | awk -F"-- " '{print $4}'| rev | cut -d" " -f3-7 | rev)
crtime=$(echo $var | awk -F"-- " '{print $5}')
echo "$ctime", "$atime", "$mtime", "$crtime"

Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018

Explanation:

In awk, the -F flag indicates which delimiter will be used, and print indicates which column will be displayed. Since each field produced by this split still ends with an unwanted hexadecimal value, cut is used together with rev to remove that last column and keep only the date. Note: crtime has no trailing column, so the cut can be left out of its filter.
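
As a possible alternative to the rev | cut | rev step, the trailing hexadecimal token can be removed with a single sed substitution. This is a sketch rather than the approach used in this answer, and the function name strip_hex is hypothetical. Because the pattern requires a literal "0x", the same filter can be applied to all four fields, including crtime, which has no trailing token.

```shell
# strip_hex (hypothetical name): prints its argument with a trailing
# " 0x..." token (and any trailing spaces) removed, if one is present.
strip_hex() {
    printf '%s\n' "$1" | sed 's/ *0x[0-9a-f:]* *$//'
}
```

For example, strip_hex "Mon Apr 23 23:23:12 2018 0x5b05601f:111ab67c " prints only the date.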

If you are going to run the command many times, I recommend wrapping this solution in a function:

function parse_date {
  ctime=$(echo $1 | awk -F"-- " '{print $2}'| rev | cut -d" " -f3-7 | rev)
  atime=$(echo $1 | awk -F"-- " '{print $3}'| rev | cut -d" " -f3-7 | rev)
  mtime=$(echo $1 | awk -F"-- " '{print $4}'| rev | cut -d" " -f3-7 | rev)
  crtime=$(echo $1 | awk -F"-- " '{print $5}')

  echo "$ctime", "$atime", "$mtime", "$crtime"
}

Then you would call the function by passing the result of your command as argument:

parse_date "$var"

Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018, Mon Apr 23 23:23:12 2018
