Extract information from a text file with shell script and regular expressions

I want to write a shell script that extracts emoticons from a text file, for example ;), :), :3, :( and xD, and also counts the emoticons in each sentence. A sentence is considered positive if the number of positive emoticons exceeds 2.5 times the number of negative emoticons, if any are present. I wrote the code below, which counts the positive ones. How can I finish the script?

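#!/usr/bin/ksh
file="${1}"

x=0    # sentence counter; set once before the loop, not between `while` and `do`
while IFS= read -r line
do
    x=$((x + 1))

    # split the line into words and count the words containing "POSITIVO"
    qtd=$(printf '%s\n' "$line" | tr ' ' '\n' | grep -c "POSITIVO")
    printf 'SENTENÇA %d\t - POSITIVO - %d\n' "$x" "$qtd"
done <"$file"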

1 answer

I’m not sure I fully understand your request.

The example below calculates the polarity of a whole file.

It is written for bash, but with small changes it should also work in ksh.

#!/bin/bash
pos=$(grep -oE ':\)|;\)|:D' "$1" | wc -l)     ## add other positive emoticons here
neg=$(grep -oE ':\('        "$1" | wc -l)     ## add other negative emoticons here

# pos > 2.5 * neg, written with integers only: pos * 2 > neg * 5
if (( pos * 2 > neg * 5 )); then
   echo "Positivo"
else
   echo "Negativo"
fi
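
As a rough sketch of what the "small changes" for ksh or plain sh might look like, the same test can be wrapped in functions. This assumes a grep that supports -o (GNU and BSD grep do, but POSIX does not require it). The function names count_matches and polarity are made up for this illustration. Because shell arithmetic is integer-only, "more than 2.5 times" is written as pos * 2 > neg * 5.

```shell
#!/bin/sh
# count_matches PATTERN FILE (hypothetical helper):
# print the number of matches, not matching lines, of the ERE PATTERN in FILE.
count_matches() {
    grep -o -E -e "$1" -- "$2" | wc -l | tr -d ' '
}

# polarity FILE (hypothetical helper):
# print "Positivo" if positives exceed 2.5 times the negatives, else "Negativo".
polarity() {
    pos=$(count_matches ':\)|;\)|:D' "$1")   # add other positive emoticons here
    neg=$(count_matches ':\(' "$1")          # add other negative emoticons here
    # integer form of pos > 2.5 * neg
    if [ $((pos * 2)) -gt $((neg * 5)) ]; then
        echo "Positivo"
    else
        echo "Negativo"
    fi
}
```

After saving these functions in a script, calling polarity "$1" at the end would classify the file given as the first argument.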

UPDATE: to get the polarity of each line, we can, for example, combine this with the loop proposed by the OP:

#!/bin/bash

while IFS= read -r line
do
  pos=$(grep -oE ':\)|;\)|:D' <<<"$line" | wc -l)   ## add other positive emoticons here
  neg=$(grep -oE ':\('        <<<"$line" | wc -l)   ## add other negative emoticons here

  # print the line followed by its positive and negative counts
  printf '%s %d %d\n' "$line" "$pos" "$neg"
  if (( pos * 2 > neg * 5 )); then
     echo "Positivo"
  else
     echo "Negativo"
  fi

done <"$1"
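
The question also asks for a count of the emoticons in each sentence. A minimal sketch of one way to do that is to tally every match per line with sort and uniq -c. The emoticon list is taken from the examples in the question, and the names EMOTICONS and emoticon_counts are hypothetical. It again assumes a grep with -o support.

```shell
#!/bin/sh
# Emoticons from the question's examples, as one ERE (assumed list; extend as needed).
EMOTICONS=':\)|;\)|:3|:\(|xD|:D'

# emoticon_counts FILE (hypothetical helper):
# for each line, print a "SENTENCE n" header followed by "emoticon count" rows.
emoticon_counts() {
    n=0
    # the "|| [ -n ... ]" part also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        printf 'SENTENCE %d\n' "$n"
        printf '%s\n' "$line" |
            grep -o -E -e "$EMOTICONS" |
            LC_ALL=C sort | uniq -c |
            awk '{ printf "  %s %s\n", $2, $1 }'   # normalise uniq -c spacing
    done < "$1"
}
```

Lines without any emoticon print only their header, since grep produces no output for them.
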
  • Almost. I just wanted a loop that reads a text file line by line and identifies whether each sentence is positive or negative.
