How to Capture a File Extension in R

4

I am having difficulty capturing the extension of imported files.

I would like to extract the file extension and store it in a variable.

arquivo <- "dados/Inscritos.xls"
extensao <- ?

3 answers

3

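You can use a regular expression through the stringr package:

  • The parentheses mark the part of the text to extract (str_extract returns the whole match, so they are optional here)
  • \w+ matches one or more word characters (letters, digits, or underscore)
  • $ anchors the match at the end of the string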
    library(stringr)
    arquivo <- "dados/Inscritos.xls"
    extensao <- str_extract(arquivo, '(\\w+)$')
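
One caveat with this pattern: \w+$ does not require a dot, so a file without an extension returns the tail of its name instead. The sketch below applies the same pattern in base R (so it runs without stringr) and adds a variant that insists on a preceding dot. The file names and the helper extensao_com_ponto are made up for illustration.

```r
# Hypothetical file names, used only to illustrate the edge cases
arquivos <- c("dados/Inscritos.xls", "dados/README", "dados/backup.tar.gz")

# Same pattern as above, applied with base R instead of stringr
ultima_palavra <- regmatches(arquivos, regexpr("\\w+$", arquivos))
print(ultima_palavra)   # "README" comes back even though it is not an extension

# Hypothetical helper: accept only word characters that follow a literal dot,
# and return NA when the name has no such suffix
extensao_com_ponto <- function(x) {
  tem_ext <- grepl("\\.\\w+$", x)
  ifelse(tem_ext, sub("^.*\\.(\\w+)$", "\\1", x), NA_character_)
}
print(extensao_com_ponto(arquivos))
```

Whether a missing extension should be NA or an empty string is a design choice; NA is used here only to make the difference visible.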

3

This function uses basename to strip the directories from the file names. It then checks which file names contain a dot. Finally, it extracts only what lies between the last dot "." and the end of each file name.

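extensao <- function(x){
  x <- basename(x)                          # drop the directory part
  y <- character(length(x))                 # default: "" for names without a dot
  i <- grep("\\.", x)                       # indices of names that contain a dot
  y[i] <- sub("^.*\\.(.*$)", "\\1", x[i])   # keep what follows the last dot
  y
}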

extensao(arquivo)
#[1] "xls"

fls <- list.files(full.names = TRUE)
fls <- fls[!file.info(fls)$isdir]
extensao(fls)
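
To see why each step of the function matters, here is a small check on a few edge cases. The paths are hypothetical, and the function is repeated only so that the snippet runs on its own.

```r
# Function from the answer above, repeated so the example is self-contained
extensao <- function(x){
  x <- basename(x)
  y <- character(length(x))
  i <- grep("\\.", x)
  y[i] <- sub("^.*\\.(.*$)", "\\1", x[i])
  y
}

# Hypothetical paths chosen to exercise each step
caminhos <- c("dados/Inscritos.xls",   # ordinary case
              "dados.v2/README",       # dot in the directory, none in the file name
              "dados/backup.tar.gz")   # more than one dot in the file name
resultado <- extensao(caminhos)
print(resultado)
#[1] "xls" ""    "gz"
```

Without the basename step, the dot in "dados.v2" would be picked up and "v2/README" would be returned as the extension of the second path.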

2

With base R, you can use the strsplit function:

arquivo <- "dados/Inscritos.xls"
extensao <- unlist(strsplit(arquivo, split = "\\."))[2]

Because strsplit returns a list, you need to convert it to a vector with unlist and then take the second element of that vector.

  • I thought of this solution too, but a file name can always contain more than one dot.

  • True. I didn’t realize it.
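
A possible way around the problem raised in the comments is to take the last piece of the split rather than the second. The sketch below assumes a hypothetical file name with two dots. It also shows file_ext from the tools package, which ships with R, as an alternative.

```r
arquivo <- "dados/backup.tar.gz"   # hypothetical name with two dots

# Split only the file name, so dots in directory names are ignored
partes <- unlist(strsplit(basename(arquivo), split = "\\."))
print(partes[2])                   # "tar": the second piece, not the extension

# Take the last piece instead; note that a name with no dot at all
# would return the whole file name here
ext_ultima <- tail(partes, 1)
print(ext_ultima)                  # "gz"

# The tools package, distributed with R, has a helper for this
ext_tools <- tools::file_ext(arquivo)
print(ext_tools)                   # "gz"
```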
