Capture the WHERE clause of an SQL query

Asked

Viewed 212 times

5

I need to capture only the clause where several queries to analyze the filters used. For example:

select "DIM_1"."col1",
"DIM_2"."col2" ,
"DIM_3"."col3" , 
"DIM_4"."col4" , 
"FAT_1"."col5"     
from "FAT_1",
 "DIM_1",
 "DIM_2",
 "DIM_3",
 "DIM_4"
where  "DIM_1"."col1" IS NOT NULL 
AND "DIM_2"."col2" LIKE ('SUCCESS')
AND "DIM_3"."col3" BETWEEN 20161213 AND 20161222
AND "DIM_4"."col4" > 0

I created a list with SQL, and then tried applying regular expressions to extract the Where part, but unsuccessfully, follows below what I tried:

`for line in sql:`
    `if re.search(r'[where]\W',line):`
        `where.append(line)`

Unfortunately I could not extract only the part of Where, can tell me what mistake I made and how to fix?

  • 1

    Follows a regex that returns the Where clause: (?:WHERE\s|ORDER BY\s|GROUP BY\s|\Z)(.*). For it to work, the command needs to be all in one line. See if this helps you

  • Regex to get the Where clause when the sql command is idented: (?:WHERE\s|ORDER BY\s|GROUP BY\s|\Z)([^\n]*\n+)+(.*)

3 answers

3


I think what you want is to get the where.

First let’s check what can come after the command where.

Post-following commands of where

Second postgres.
Seeing only the most common

  • GROUP BY
  • HAVING
  • ORDER BY
  • LIMIT
  • OFFSET
  • "NOTHING" - It could just be where without any subsequent command.

REGEX

  • Pattern : (?<=where)(.*?)((ORDER BY|GROUP BY|HAVING|LIMIT|OFFSET|$).*)
  • flags : si

Explanation

  • (?<=where) - ensures that what we are researching comes after where
  • (.*?) - all that follows will be the clauses.
  • ((ORDER BY|GROUP BY|HAVING|LIMIT|OFFSET|$).*) - ensures that it will end in one of the commands or at the end($).
  • Flag : s - says that the .(dot) should include \n in research.
  • Flag : i - case-insensitive - allows searching uppercase or minuscule.

Examples

0

I don’t know Python, I don’t know how you do right to assign a select for a variable because of the quotes, but here is an example I found in the documentation applied to your case.

According to your case, I believe that it is not working because it is checking line by line, ie, will find the where on the line of where, but in the next.

Now if you take the group that corresponds to the regular expression, then I believe it works.

   import re

    text = 'select "DIM_1"."col1",
    "DIM_2"."col2" ,
    "DIM_3"."col3" , 
    "DIM_4"."col4" , 
    "FAT_1"."col5"     
    from "FAT_1",
     "DIM_1",
     "DIM_2",
     "DIM_3",
     "DIM_4"
    where  "DIM_1"."col1" IS NOT NULL 
    AND "DIM_2"."col2" LIKE ('SUCCESS')
    AND "DIM_3"."col3" BETWEEN 20161213 AND 20161222
    AND "DIM_4"."col4" > 0'

    m = re.search('where.*', text)
    if m:
        found = m.group(1)

If someone wants to edit for code correction, or if they don’t sense, comment that delete the answer.

Source: 7.2. re - Regular Expression Operations

  • Thanks David, I’ll try. Thank you very much!

0

select = 'select * from tabela where campo = 1 and and campo2 = 2'
print (select[select.find('where'):])  # com o where incluso

or

 print (select[select.find('where') + 5:]) # ou para excluir a palavra where

Browser other questions tagged

You are not signed in. Login or sign up in order to post.