How to save an Excel spreadsheet from R without blank lines?

I’m working with a large database that has a few rows with blank IDs, and I want to save only the rows with a non-blank ID to an Excel file.

Since the IDs are in the first column of my matrix, here is what I’m doing:

k <- which(is.na(Dados[,1]))
if (length(k) > 0) Dados <- Dados[-k,]
library(XLConnect)
wb <- loadWorkbook("Pasta\\Nome.xls")
writeWorksheet(wb,data=Dados,sheet=1,header=TRUE)
saveWorkbook(wb)

I don’t know why, but the resulting spreadsheet contains all the content of the Dados matrix plus as many blank lines as there were blank IDs.

For example, if the Dados matrix had 1000 rows and 5 blank IDs, I end up with a matrix of 995 rows, but the resulting Excel spreadsheet gets the 995 rows of the Dados matrix plus 5 blank lines at the end!

Since I need to send an xls (not xlsx) spreadsheet without blank lines, how can I fix this?

Thank you very much in advance.

  • Is the variable id a string, a number...? Providing part of the database with dput(head(dados)) would help clarify.

  • If id is NA, then you can filter your data frame before saving to Excel. https://answall.com/questions/87730/comorremover-linha-que-tem-missing/87872#87872 . Or use na.omit()

  • That works if id is NA. It may be that id == "", for example. Since the OP did not specify what a blank id means, we cannot guess.

  • Here goes: ID is a numeric column with some blank cells. The first two lines of the code I posted operate on the data.frame. And the problem was even stranger, because as a test I replaced the filter in the first two lines with k <- which(Dados[,2]==5) and Dados <- Dados[k,], and the result was an Excel spreadsheet with the rows that have 5 in the second column at the top and all the other rows below.

  • But the Dados matrix was built from commands and operations on other matrices. As a test, I applied is.na at an earlier step and it worked; the problem was solved. Anyway, thank you so much for the suggestions.

  • For the record: Blank ID means NA.

1 answer

You should first clean the data frame and only then export it. There are different ways to do this; here are four.

  • Using the package dplyr
library(dplyr)
nome_seu_dataframe <- nome_seu_dataframe %>%
  dplyr::filter(!is.na(id))
  • Using the package tidyr
library(tidyr)
nome_seu_dataframe <- nome_seu_dataframe %>%
  tidyr::drop_na(id)
  • Using the package data.table
library(data.table)
nome_seu_dataframe <- data.table::data.table(nome_seu_dataframe)
nome_seu_dataframe <- nome_seu_dataframe[!is.na(id)]

When you use the data.table package, your data frame is converted to class data.table. If you need to go back to a data frame, run:

nome_seu_dataframe <- as.data.frame(nome_seu_dataframe)
  • Using the built-in function na.omit() (note that it drops rows with an NA in any column, not only in id)
nome_seu_dataframe <- nome_seu_dataframe %>% 
  na.omit()

Note that I used the pipe %>% in most of the options. It is an operator from the magrittr package that dplyr re-exports, but you do not need it to remove the NA values.

For example:

library(tidyr)
nome_seu_dataframe <- tidyr::drop_na(nome_seu_dataframe, id)
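
If you would rather not load any package at all, the same filter can be sketched in base R. The data frame exemplo and its columns below are made up for illustration; only is.na() and [ ] indexing are used.

```r
# Hypothetical data frame, for illustration only
exemplo <- data.frame(id = c(1, NA, 3, NA, 5),
                      valor = c("a", "b", "c", "d", "e"))

# Keep only the rows whose id is not NA
exemplo_limpo <- exemplo[!is.na(exemplo$id), ]

nrow(exemplo_limpo)  # 3
```

Logical indexing like this also behaves correctly when there are no NA values, so no length check is needed beforehand.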

Example with toy data.

library(dplyr)
set.seed(1)
df <- data.frame(id = c(seq(1,9,1), rep(NA, 4), 
                        seq(10,20,1),rep(NA, 3)),
                 V = rnorm(n = 27))

df2 <- df %>%
  dplyr::filter(!is.na(id))

df3 <- df %>% 
  na.omit()

library(tidyr)
df4 <- df %>% 
  tidyr::drop_na(id)

df5 <- tidyr::drop_na(df, id)

library(data.table)
df6 <- data.table::data.table(df)
df6 <- df6[!is.na(id)]
df6 <- as.data.frame(df6)

Note that df2, df3, df4, df5 and df6 contain the same 20 rows (row names and attributes may differ).
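
The results agree in this toy example only because V has no missing values. As a minimal sketch in base R, using a made-up data frame teste, this is how na.omit() and a filter on id alone can diverge:

```r
# Hypothetical data frame with an NA in id and another NA in V
teste <- data.frame(id = c(1, 2, NA, 4),
                    V  = c(0.5, NA, 1.2, 2.0))

so_id <- teste[!is.na(teste$id), ]  # drops only the row with a missing id
tudo  <- na.omit(teste)             # drops every row that has any NA

nrow(so_id)  # 3
nrow(tudo)   # 2
```

If only blank IDs should be removed, a filter on id (or tidyr::drop_na(id)) is the safer choice.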