How to make a "rbind" in tables of an SQL base from R?

Asked

Viewed 441 times

6

I would like to "merge" two tables of a Sqlite "database" from the R -- and save it to a new table within the same database. Below, I send a minimum reproducible code:

    install.packages("sqldf",dependencies=T)
    install.packages("RSQLite",dependencies=T)
    library(sqldf)

    A <- data.frame(var1 = 1:5, var2=55:59)
    B <- data.frame(var1 = 11:15, var2=155:159)

    drv <- dbDriver("SQLite") 
    con <- dbConnect(drv, "basequalquer.db") 
    dbWriteTable(con, "TabelaA", A) 
    dbWriteTable(con, "TabelaB", B) 

The result I desire is this one:

    dbGetQuery(con, 'SELECT * FROM TabelaA UNION ALL SELECT * FROM TabelaB' )

       row_names var1 var2
    1          1    1   55
    2          2    2   56
    3          3    3   57
    4          4    4   58
    5          5    5   59
    6          1   11  155
    7          2   12  156
    8          3   13  157
    9          4   14  158
    10         5   15  159

I can do a query to get it, but I don’t know how to save it directly in the database (that is, without having to store it in a data.frame and then do dbWriteTable)

  • I think when you install sqldf Rsqlite comes as a gift as dependency. The sqldf cat jump is that you do not need to use dbWriteTable, when there is a data.frame X and you use sqldf("SELECT * FROM X") df X is automatically copied to Sqlite.

2 answers

7

Let me add my 2 cents: Carlos' solution is correct, BUT in this solution you will copy each entry of the two tables into a new table and they will no longer be related, that is, any changes in the original tables will not be reflected in this new table created in the merge.

One feature that Sqlite3 and other database management systems have, is VIEW (http://www.sqlite.org/lang_createview.html) which allows you to create the same merge as before, only the table is sensitive to changes in the original table. If you enter or remove records from the Table or Table, when consulting the VIEW the changes will be present there as well.

library(sqldf)

   A <- data.frame(var1 = 1:5, var2=55:59)
   B <- data.frame(var1 = 11:15, var2=155:159)

   drv <- dbDriver("SQLite") 
   con <- dbConnect(drv, "basequalquer.db") 
   dbWriteTable(con, "TabelaA", A) 
   dbWriteTable(con, "TabelaB", B) 

   ## Criando a VIEW
   dbGetQuery(con, 'CREATE VIEW
                    merge
                 AS
                    SELECT * FROM TabelaA 
                        UNION ALL 
                    SELECT * FROM TabelaB')

   ## Fazendo uma consulta na VIEW
   dbGetQuery(con, 'SELECT * FROM merge')

 row_names var1 var2
1          1    1   55
2          2    2   56
3          3    3   57
4          4    4   58
5          5    5   59
6          1   11  155
7          2   12  156
8          3   13  157
9          4   14  158
10         5   15  159

   ## Removendo uma linha da TabelaA
   dbGetQuery(con, 'DELETE FROM TabelaA WHERE row_names = 3')

   ## Consultando a VIEW novamente
   ## veja que o registro sumiu da VIEW também!
   dbGetQuery(con, 'SELECT * FROM merge')

 row_names var1 var2
1         1    1   55
2         2    2   56
3         4    4   58
4         5    5   59
5         1   11  155
6         2   12  156
7         3   13  157
8         4   14  158
9         5   15  159
  • Very interesting, @Flavio! But what if I want a third way: how to merge Tabelaa and Tabelab, making them now cease to exist (to occupy less memory). That is, how to stack the two, without creating copy? I know I could delete the original tables later. But if I only delete them later, for a few moments the originals and the copy will exist simultaneously. If the database is really too large, I imagine situations where that might not be feasible.

  • @Rogeriojb, it would be the case to use Carlos Cinelli’s command without UNION ALL, drop the table, and run an INSERT INTO Tabelac SELECT FROM Tabelab. But if you’re messing with databases so big you can’t do UNION ALL, it’s past time to trade R+Sqlite for Python+Nosql :)

  • Or use ff: http://www.r-project.org/conferences/useR-2007/program/presentations/adler.pdf

  • Hello @Lucass, thanks for the tip. ff I know. But I’m really having doubts on Sqlite+R... But I didn’t get your tip. How do I use CREATE TABLE without UNION ALL for this case?

  • Do in stages. First CREATE TABLE by inserting the data from table A, without UNION ALL. After that, table A is deleted. Then INSERT INTO SELECT * FROM Table B is deleted, and then table B is deleted. It is the only way to minimize replication without using Views (which would be my preferred approach).

5


It should not be the only way to do, but try putting the command CREATE TABLE in the query itself:

dbGetQuery(con, 'CREATE TABLE TabelaC AS
                 SELECT * FROM TabelaA 
                 UNION ALL 
                 SELECT * FROM TabelaB' )

Testing whether the table was created:

dbGetQuery(con, "select * from Tabelac")
   row_names var1 var2
1          1    1   55
2          2    2   56
3          3    3   57
4          4    4   58
5          5    5   59
6          1   11  155
7          2   12  156
8          3   13  157
9          4   14  158
10         5   15  159
  • 2

    Simple as that... thanks, Carlos

Browser other questions tagged

You are not signed in. Login or sign up in order to post.