Question about a conditional (if) in a function

I have a set of a little over 300 spreadsheets, and I need to write a function with 3 arguments: the directory where the spreadsheets are stored, the variable to be analyzed, and the files we want to analyze (by number).

In this case there are two variables of interest: sulfate concentration and nitrate concentration.

I managed to write the function with two parameters, as two separate functions: one returns the mean of sulfate and the other returns the mean of nitrate.

Here is the code:

pollutant_sulfate<-function(directory, ID = 1:332) {
    files_list <- list.files(directory, full.names=TRUE)
    data <- data.frame()
    for (i in ID) {
            data <- rbind(data, read.csv(files_list[i]))
    }
    subset_sulfate<- subset(data$sulfate, data$sulfate > 0)
    mean (subset_sulfate)
}

pollutant_nitrate<-function(directory, ID = 1:332) {
    files_list <- list.files(directory, full.names=TRUE)
    data <- data.frame()
    for (i in ID) {
            data <- rbind(data, read.csv(files_list[i]))
    }
    subset_nitrate<- subset(data$nitrate, data$nitrate > 0)
    mean (subset_nitrate)
}

Now I'm having difficulty with the third argument of the function, the one that determines which variable I want to analyze (sulfate or nitrate). I thought of using an if conditional. I wrote code that contains errors, and I can't understand what the problem is. Here is the code in question:

mean_pollutant1<-function(directory, pollutant, ID=1:332){
    files_list <- list.files(directory, full.names=TRUE)
    data <- data.frame()
    for (i in ID) {
        data <- rbind(data, read.csv(files_list[i]))
    }
    if (pollutant == sulfate){
        subset_sulfate<- subset(data$sulfate, data$sulfate > 0)
        mean (subset_sulfate)   
    }
    if (pollutant == nitrate){
        subset_nitrate<- subset(data$nitrate, data$nitrate > 0)
        mean (subset_nitrate)
    }
}

When I try to call the function I get this error message:

mean_pollutant1("specdata", sulfate, 1:2)
Error in mean_pollutant1("specdata", sulfate, 1:2): object 'sulfate' not found

Can someone help me get around the problem?

  • I think you have to use quotes in the if: if (pollutant == "sulfate") ... if (pollutant == "nitrate"), and the same when calling the function: mean_pollutant1("specdata", "sulfate", 1:2)

  • I guess you haven't created the variables sulfate and nitrate, if they are meant to be variables. Or, as Daniel said, if they are strings, quote them.

  • Guys, problem partially solved! Thank you very much! I really do have to pass it as a string. When I call the function with "nitrate" I get the correct answer. With "sulfate", however, which is the first if, the function's response is a blank line. There is no error message, just the empty command line. Any suggestions?

  • That means there is no sulfate > 0 in the spreadsheets. In principle the code is correct. Or else you made a spelling error when writing 'sulfate'. Note that if you pass pollutant = 'x' you will also get an empty line.

  • There are values greater than 0 for sulfate in the spreadsheets. And any string other than "nitrate" results in a blank line.

  • This happens because with any other string the function does not enter any of the ifs and so returns nothing. Doesn't the spreadsheet have an extra space in the column name or something? Because apparently the problem is that it is not entering any if when you use the string "sulfate".

  • Guys, a friend helped me. I can't explain why, but we replaced the second if with an else if and it worked! God knows why two ifs in a row were causing the problem! Thank you to everyone who helped me!

  • @Danielfalbel I think you could post your first comment as an answer. José Ferraz, it always helps to include a small sample of the data so that others can reproduce your error. http://meta.pt.stackoverflow.com/questions/824/como-crea-um-exemplo-m%C3%Adnimo-reproduces%C3%Advel-em-r/825#825

1 answer

The problem with your code is that you are making a comparison with objects that do not exist.

In if (pollutant == sulfate), the object sulfate does not exist. You can solve the problem in two ways:

  1. Create the object before the if, assigning it the string as its value, i.e. put sulfate <- 'sulfate' before the if.
  2. Compare directly with the string 'sulfate': if (pollutant == 'sulfate').

You must do this for both ifs.
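
The difference between a quoted string and a bare name can be seen in a minimal sketch. The helper check_pollutant below is hypothetical and only for illustration; it is not part of the code in the question, and the sketch assumes no object called sulfate exists in the session.

```r
# Hypothetical helper, for illustration only: it compares the argument
# against string literals rather than bare object names.
check_pollutant <- function(pollutant) {
    if (pollutant == "sulfate") {
        return("sulfate branch")
    }
    if (pollutant == "nitrate") {
        return("nitrate branch")
    }
    "no branch matched"
}

# A quoted string is an ordinary character value, so the comparison works.
res <- check_pollutant("sulfate")

# An unquoted name is looked up as an object. Assuming no object named
# sulfate exists, evaluating it raises an error, which tryCatch captures
# here as a message instead of stopping the script.
msg <- tryCatch(check_pollutant(sulfate),
                error = function(e) conditionMessage(e))
```

Here res holds the value from the first branch, while msg holds the text of the "object not found" error from the second call.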

The other problem with your code is that it is missing a return. In R, a function can return a value without an explicit return, but only if that value is the result of the last expression the function evaluates.

In your case, the last command of the function is

if (pollutant == 'nitrate'){
    subset_nitrate<- subset(data$nitrate, data$nitrate > 0)
    mean (subset_nitrate)
}

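Hence any string other than 'nitrate' will return NULL. That includes 'sulfate', because the mean computed in the first if is discarded once the second if is evaluated. To solve this, just put a return() in each if. The correct function would therefore be: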

mean_pollutant1<-function(directory, pollutant, ID=1:332){
    files_list <- list.files(directory, full.names=TRUE)
    data <- data.frame()
    for (i in ID) {
        data <- rbind(data, read.csv(files_list[i]))
    }
    if (pollutant == 'sulfate'){
        subset_sulfate<- subset(data$sulfate, data$sulfate > 0)
        return(mean(subset_sulfate))
    }
    if (pollutant == 'nitrate'){
        subset_nitrate<- subset(data$nitrate, data$nitrate > 0)
        return(mean (subset_nitrate))
    }
}
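
The effect of the missing return can be reproduced without any data files. The two toy functions below are hypothetical, written only to illustrate the behavior described above:

```r
# Hypothetical toy functions, for illustration only.
# Without return(): the function's value is the value of its last
# expression, which here is the second if.
without_return <- function(pollutant) {
    if (pollutant == "sulfate") {
        1
    }
    if (pollutant == "nitrate") {
        2
    }
}

# With return(): the function exits as soon as a branch matches.
with_return <- function(pollutant) {
    if (pollutant == "sulfate") {
        return(1)
    }
    if (pollutant == "nitrate") {
        return(2)
    }
}

# For "sulfate", the second if is FALSE and has no else, so its value,
# and therefore the function's value, is NULL.
lost <- without_return("sulfate")
kept <- with_return("sulfate")
```

An if whose condition is FALSE and that has no else yields an invisible NULL, which is why the console showed only a blank line rather than an error.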

Of course there are other ways to avoid having to use return, but this is the one that changes your original code the least.
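
One such alternative, sketched under the assumption that the data has already been read into a data frame: select the column by its name with [[ ]], so that no if is needed at all. The helper mean_positive and the toy data frame are hypothetical and only for illustration.

```r
# Hypothetical helper, for illustration only. `data` is an already-loaded
# data frame and `pollutant` is a column name given as a string.
mean_positive <- function(data, pollutant) {
    values <- data[[pollutant]]
    # which() keeps only positions where the comparison is TRUE,
    # so NA values are excluded along with non-positive ones.
    mean(values[which(values > 0)])
}

# Toy data standing in for the combined CSV files.
toy <- data.frame(sulfate = c(1, 3, NA, -1),
                  nitrate = c(2, 4, 6, NA))

sulfate_mean <- mean_positive(toy, "sulfate")
nitrate_mean <- mean_positive(toy, "nitrate")
```

Because the indexing expression is the last one in the function, its value is returned without an explicit return.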
