PHP hashtag system

I’m developing a hashtag system in PHP, and my question is this:

How can I SELECT posts based on their content?

For example, the content saved in the database is:

Hi, are you all right? #bomdia

In that case, would the SELECT be something like SELECT * FROM msgs WHERE conteudo LIKE '%#bomdia%'?

I have seen several articles where, when saving the post in the database, all the detected hashtags are saved in a separate column in the table, for example "hashtags".

What’s the best way to do that?

  • It is best to extract each hashtag and store it linked to the post. I just don’t quite understand your question. Is the problem identifying the hashtags in the text?

  • The question is about the best way to do it. Or do both approaches give the same result?

  • It is best to keep a separate field just for hashtags, because users will be able to run more precisely filtered searches. That is how I would do it.

  • I see, so I think I’ll do it that way. Thank you!

1 answer

As mentioned in the comments, the ideal approach is to separate the tags from the message. I would go a little further and create a table just for tags, taking into account that a message may have several tags or none:

[Entity-relationship diagram: tables mensagens, mensagens_tags and tags]

Note that I added a UNIQUE index on the tag name to avoid duplicates and speed up searches.
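As a rough sketch of that structure, the three tables could be created as below. The table and column names are taken from the search query in this answer; the column types, the composite primary key and the in-memory SQLite connection are assumptions made only so the example runs on its own.

```php
<?php
// Assumption: in-memory SQLite via PDO, chosen only to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// One row per message.
$pdo->exec('CREATE TABLE mensagens (
    id_mensagem INTEGER PRIMARY KEY,
    mensagem    TEXT NOT NULL
)');

// One row per distinct tag; UNIQUE prevents duplicates and gives an index for searches.
$pdo->exec('CREATE TABLE tags (
    id_tag INTEGER PRIMARY KEY,
    nome   TEXT NOT NULL UNIQUE
)');

// Link table: a message may have zero, one or many tags.
$pdo->exec('CREATE TABLE mensagens_tags (
    mensagens_id_mensagem INTEGER NOT NULL REFERENCES mensagens (id_mensagem),
    tags_id_tag           INTEGER NOT NULL REFERENCES tags (id_tag),
    PRIMARY KEY (mensagens_id_mensagem, tags_id_tag)
)');
```

On MySQL the same layout would normally use AUTO_INCREMENT integer keys and a VARCHAR column for nome; only the DDL syntax changes.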

To extract the tags from the message, use preg_match_all(), saving only the tag text in the database, without the hash sign (#):

<?php

class Mensagem {

    protected $mensagem;
    protected $tags = [];

    public function __construct($mensagem)
    {
        $this->mensagem = $mensagem;
        $this->extractTags($mensagem);
    }

    private function extractTags($mensagem)
    {
        // Matches tags such as #dia #feliz #chateado
        // Does not match special characters: #so-pt is captured as "so"
        $pattern = '/#(\w+)/';

        // Alternative that accepts other characters:
        // just add them inside the brackets
        //$pattern = '/#([\w-]+)/';

        preg_match_all($pattern, $mensagem, $tags);

        // Use the array of groups captured by the parentheses
        $this->tags = $tags[1];
    }

    public function getMensagem()
    {
        return $this->mensagem;
    }

    public function getTags()
    {
        return $this->tags;
    }

}

To use the class:

$mensagem = "Partiu #ferias #praia #feliz #so-pt";

$msg = new Mensagem($mensagem);

var_dump($msg);

// Output:

object(Mensagem)#1 (2) {
  ["mensagem":protected]=>
  string(35) "Partiu #ferias #praia #feliz #so-pt"
  ["tags":protected]=>
  array(4) {
    [0]=>
    string(6) "ferias"
    [1]=>
    string(5) "praia"
    [2]=>
    string(5) "feliz"
    [3]=>
    string(2) "so"
  }
}

For the search use the following SELECT:

SELECT mensagem FROM mensagens
JOIN mensagens_tags ON mensagens_id_mensagem = id_mensagem
JOIN tags ON tags_id_tag = id_tag
WHERE nome = 'tag';

Example database on SQL Fiddle.

The next steps are to build the routine that persists the tags in the database (inserting the new tags and keeping the existing ones) and the search feature itself. I leave those to you.
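One possible shape for that persistence routine, offered only as a sketch: the helper name salvarTags is hypothetical, the table and column names follow the search query above, and it assumes a PDO connection in exception mode and a message row that has already been inserted.

```php
<?php
/**
 * Hypothetical helper: links a message to its tags, creating only the
 * tags that do not exist yet and reusing the ones already stored.
 * Assumes $pdo uses PDO::ERRMODE_EXCEPTION.
 */
function salvarTags(PDO $pdo, int $idMensagem, array $tags): void
{
    $buscar  = $pdo->prepare('SELECT id_tag FROM tags WHERE nome = ?');
    $inserir = $pdo->prepare('INSERT INTO tags (nome) VALUES (?)');
    $ligar   = $pdo->prepare(
        'INSERT INTO mensagens_tags (mensagens_id_mensagem, tags_id_tag) VALUES (?, ?)'
    );

    $pdo->beginTransaction();
    try {
        // array_unique: the same tag may appear twice in one message.
        foreach (array_unique($tags) as $nome) {
            $buscar->execute([$nome]);
            $idTag = $buscar->fetchColumn();
            $buscar->closeCursor();

            if ($idTag === false) {
                // Tag not stored yet: insert it and take the new id.
                $inserir->execute([$nome]);
                $idTag = $pdo->lastInsertId();
            }

            $ligar->execute([$idMensagem, $idTag]);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        // Undo partial work so a message is never left half-tagged.
        $pdo->rollBack();
        throw $e;
    }
}
```

It would be called with the array returned by getTags(), for example salvarTags($pdo, $idMensagem, $msg->getTags()). Whether tags should be lowercased before storing, so that #Praia and #praia count as the same tag, is a design decision this sketch leaves open.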

  • Thank you very much for the more-than-complete answer, but I have some questions. You said you would create a table just for tags, but I don’t see why: would it be a table only for registering words? As for the class Mensagem, how can it return that output when there is no return in __construct, only in the methods getMensagem and getTags, which are not even used? Thank you, and sorry for my ignorance if that is the case.

  • That is because of var_dump, a special function for dumping variables, including objects. To print the values with echo, use getMensagem and getTags.

  • I get it, thank you!

  • As for the separate table, the idea is normalization, so that we don’t have duplicated tags, for example. In the long run it will also make searching easier.

  • And if, in the future, I want to count the most used tags (similar to Twitter’s trending topics), what would be the best way to go? Thanks again.

  • Starting from the posted query, just add a COUNT() and GROUP BY the tag name, without the WHERE clause.

  • I see, thank you very much! If I run into problems I’ll come back here to ask for help. I’m still a little confused about some things, mainly the use of JOIN, since this is the first time I’m using it, but I believe I’ll understand it better in practice.

  • Igor, the idea is that you create new questions. The Stack Overflow model is different from a forum’s. More info on the [tour].

  • I’ll do it. Thank you.

  • There are other questions here that can help you. Search a little, and if you can’t find the answer, ask a new question.
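The counting approach suggested in the comments (COUNT() with GROUP BY on the tag name, no WHERE) could be sketched roughly as follows. The function name tagsMaisUsadas, the ORDER BY and the LIMIT are assumptions added for illustration, and the join to mensagens is dropped because only the link table is needed to count uses.

```php
<?php
/**
 * Hypothetical helper: returns the most used tags with their usage counts,
 * most frequent first. Assumes the tags / mensagens_tags tables from the answer.
 */
function tagsMaisUsadas(PDO $pdo, int $limite = 10): array
{
    $stmt = $pdo->prepare(
        'SELECT nome, COUNT(*) AS total
         FROM tags
         JOIN mensagens_tags ON tags_id_tag = id_tag
         GROUP BY nome
         ORDER BY total DESC, nome
         LIMIT :limite'
    );
    // Bind as integer so the LIMIT value is not quoted as a string.
    $stmt->bindValue(':limite', $limite, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Each returned row has the keys nome and total. Tags that were never linked to a message do not appear, since the inner join only keeps tags with at least one use.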
