Mysql has no function that allows us to split a string in multiple lines, so the work becomes a little complex:
INSERT INTO press_tags (press_id, tag_id)
SELECT
press.press_id,
SUBSTRING_INDEX(SUBSTRING_INDEX(press.tag_id, ',', n.n), ',', -1) tag_id
FROM press
CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a,
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(press.tag_id) - LENGTH(REPLACE(press.tag_id, ',', '')))
ORDER BY press_id, tag_id
Explanation
To sub-consultation with a pseudonym of n will generate in real time a sequence of numbers from 1 to 100, in this particular case, using [UNION ALL][4]
and CROSS JOIN
.
In the SELECT
experior, no SUBSTRING_INDEX()
inside, let’s get everything down to the umpteenth element in a list.
The SUBSTRING_INDEX()
will extract the most direct portion after the last delimiter, thus being able to receive the information of the nth element.
CROSS JOIN
allows us to produce a set of lines that is a Cartesian product (100 lines in n, and all rows in the table press
).
The condition in the clause WHERE
will filter all unnecessary lines from the result set.
This query will divide up to 100 tag_id
for each entry in the source table. For the case at hand it is sufficient, but if necessary, you can adjust the sub-queries.
Common scene
A common scenario with values separated by a delimiter in a column is the insertion of the combination valor
+ delimitador
, resulting in something like:
valor;valor;valor;
Where the presence of the last delimiter, with the above solution, will generate a blank entry in the target table:
┌──────┬────────────┬──────────────┐
│ id | press_id │ tag_id │
├──────┼────────────┼──────────────┤
│ 1 │ 1 │ │
├──────┼────────────┼──────────────┤
│ 2 │ 1 │ 2 │
├──────┼────────────┼──────────────┤
│ 3 │ 1 │ 3 │
└──────┴────────────┴──────────────┘
This problem can also be reflected in a type error:
Incorrect integer value: '' for column 'tag_id'
To avoid this, we can make use of another SELECT
to select the fields where the tag_id
not be ''
or NULL
:
INSERT INTO press_tags (press_id, tag_id)
SELECT
result.press_id,
result.tag_id
FROM (
# consulta aqui
) AS result
WHERE result.tag_id > ''
Which results in:
INSERT INTO press_tags (press_id, tag_id)
SELECT
result.press_id,
result.tag_id
FROM (
SELECT
press.press_id,
SUBSTRING_INDEX(SUBSTRING_INDEX(press.tag_id, ',', n.n), ',', -1) tag_id
FROM press
CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a,
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(press.tag_id) - LENGTH(REPLACE(press.tag_id, ',', '')))
ORDER BY press_id, tag_id
) AS result
WHERE result.tag_id > ''
Much of this solution comes from the answers in SOEN in this question.
+1 This method is simple to implement and flexible for many scenarios.
– Zuul