SQL Server - Return only the searched word

Asked

Viewed 90 times

1

I have a problem where the Cód. below returns not only the searched words, but tb words that have this word in their composition.

Ex: The term is Rio, but also finds terms as salary.

In the link below, you can see my first question on the subject. SQL Server - calculate number of word occurrences in a column, per record

DECLARE @conteudo_dcm TABLE (DCM_id INT IDENTITY(1,1), DCM_conteudo VARCHAR(100))

INSERT INTO @conteudo_dcm VALUES ('Rio e a RIO vai se repetir então teremos 2 :D')
INSERT INTO @conteudo_dcm VALUES ('um Rio apenas... então teremos 1 salario :D')
INSERT INTO @conteudo_dcm VALUES ('tres pra este caso. Rio, Rio, Rio XD.. então 3.. blz?')

declare @palavra varchar(30) = 'Rio',
        @cont int,
        @result int,
        @palavraLike varchar(30) = '%Rio%'
SELECT
    [DCM_conteudo],
    (LEN([DCM_conteudo]) - LEN(REPLACE([DCM_conteudo], @palavra, ''))) /
            LEN(@palavra)
FROM @conteudo_dcm
where upper([DCM_conteudo]) like @palavraLike-- and [DCM_id] = 1
  • It seems to me that what you need is: 1) if it is the beginning of the field then 'river'; 2) if it is the end of the field then ' river' and if it is in the middle of the field then 'river'. It may be that the ' character is replaced by some punctuation character. I don’t know if in SQL-Server it is possible to use regular expressions that would certainly make your life easier.

1 answer

2


You can create a function that counts the word occurrences you want:

CREATE FUNCTION contar_ocorrencias(@palavra VARCHAR(30),
                                   @frase   VARCHAR(100))
RETURNS INT
BEGIN
  DECLARE @inteiro INTEGER;
  DECLARE @inicio  INTEGER;
  DECLARE @meio    INTEGER;
  DECLARE @fim     INTEGER;
  DECLARE @tamanho INTEGER;
  DECLARE @total   INTEGER;
  DECLARE @posicao INTEGER;

  SET @tamanho = LEN(REPLACE(@palavra, ' ', '.'));
  SET @total   = LEN(REPLACE(@frase, ' ', '.'));

  WHILE 1 = 1
  BEGIN
    DECLARE @copia   VARCHAR(100);

    SET @copia = @frase;
    SET @inteiro = PATINDEX(@palavra, @frase);

    if @inteiro > 0
    BEGIN
      RETURN 1;
    END;

    SET @inicio  = PATINDEX(@palavra + '[^a-zA-Z]%', @frase);

    IF @inicio > 0
    BEGIN
      SET @frase = SUBSTRING(@frase, @tamanho + 1, LEN(REPLACE(@frase, ' ', '.')) - @tamanho + 1);
    END;

    SET @meio    = PATINDEX('%[^a-zA-Z]' + @palavra + '[^a-zA-Z]%', @frase);

    IF @meio > 0
    BEGIN
      SET @frase = SUBSTRING(@frase, 1, @meio) + SUBSTRING(@frase, @meio + @tamanho + 1, LEN(REPLACE(@frase, ' ', '.')) - @meio + @tamanho);
    END;

    SET @fim     = PATINDEX('%[^a-zA-Z]' + @palavra, @frase);

    IF @fim > 0
    BEGIN
      SET @frase = SUBSTRING(@frase, @fim, LEN(REPLACE(@frase, ' ', '.')) - @tamanho + 1);
    END;

    IF @copia = @frase
    BEGIN
      BREAK;
    END;
  END;

  RETURN (@total - LEN(REPLACE(@frase, ' ', '.'))) / @tamanho;
END;
GO

And to use:

DECLARE @conteudo_dcm TABLE (
  DCM_id INT IDENTITY(1,1),
  DCM_conteudo VARCHAR(100)
);

INSERT INTO @conteudo_dcm VALUES ('Rio e a RIO vai se repetir então teremos 2 :D');
INSERT INTO @conteudo_dcm VALUES ('um Rio apenas... então teremos 1 salario :D');
INSERT INTO @conteudo_dcm VALUES ('tres pra este caso. Rio, Rio, Rio XD.. então 3.. blz?');

SELECT DCM_conteudo,
       dbo.contar_ocorrencias('rio', DCM_conteudo)
  FROM @conteudo_dcm
  • Thank you! A tip for anyone using the database fields like nvarchar(max): Change the type of the @phrase variable, and all variables that are like "integer", change to "bigint" !

Browser other questions tagged

You are not signed in. Login or sign up in order to post.