I have a set of just over 300 worksheets, and we have to create a function with 3 arguments: the directory where the worksheets are, the variable that will be analyzed, and the number of files that we want to analyze.
In this case we have two variables of interest: sulfate concentration and nitrate concentration.
I was able to write the function with two parameters, as two separate functions: one returns the mean of sulfate and the other returns the mean of nitrate.
Here is the code:
pollutant_sulfate <- function(directory, ID = 1:332) {
  files_list <- list.files(directory, full.names = TRUE)
  data <- data.frame()
  # Stack the selected files into a single data frame
  for (i in ID) {
    data <- rbind(data, read.csv(files_list[i]))
  }
  # Keep only the positive sulfate values and average them
  subset_sulfate <- subset(data$sulfate, data$sulfate > 0)
  mean(subset_sulfate)
}
pollutant_nitrate <- function(directory, ID = 1:332) {
  files_list <- list.files(directory, full.names = TRUE)
  data <- data.frame()
  # Stack the selected files into a single data frame
  for (i in ID) {
    data <- rbind(data, read.csv(files_list[i]))
  }
  # Keep only the positive nitrate values and average them
  subset_nitrate <- subset(data$nitrate, data$nitrate > 0)
  mean(subset_nitrate)
}
Now I'm having difficulty with the third argument of the function, the one that determines which variable I want to analyze (sulfate or nitrate). I thought of using an if conditional. I wrote code that contains errors, and I can't understand what the problem is. Here is the code in question:
mean_pollutant1 <- function(directory, pollutant, ID = 1:332) {
  files_list <- list.files(directory, full.names = TRUE)
  data <- data.frame()
  for (i in ID) {
    data <- rbind(data, read.csv(files_list[i]))
  }
  if (pollutant == sulfate) {
    subset_sulfate <- subset(data$sulfate, data$sulfate > 0)
    mean(subset_sulfate)
  }
  if (pollutant == nitrate) {
    subset_nitrate <- subset(data$nitrate, data$nitrate > 0)
    mean(subset_nitrate)
  }
}
When I try to call the function I get an error message:
mean_pollutant1("specdata", sulfate, 1:2)
Error in mean_pollutant1("specdata", sulfate, 1:2) : object 'sulfate' not found
Can someone help me get around the problem?
I think you have to put the values in quotes in the if conditions:
if (pollutant == "sulfate") ... if (pollutant == "nitrate")
and do the same when calling the function: mean_pollutant1("specdata", "sulfate", 1:2)
– Daniel Falbel
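To illustrate the suggestion above, here is a minimal sketch of comparing the argument against a quoted string. The helper name check_pollutant is hypothetical and not part of the question's code; it only shows that the value has to be passed and compared as a character string.

```r
# Hypothetical helper (not from the question): shows string comparison only.
check_pollutant <- function(pollutant) {
  if (pollutant == "sulfate") {
    "matched sulfate"
  } else {
    "no match"
  }
}

# The quoted string is passed as a value and compared as text.
check_pollutant("sulfate")

# Without quotes, R looks for a variable named sulfate. If none exists,
# a call such as check_pollutant(sulfate) fails with "object 'sulfate' not found".
```

The same applies inside the function body: an unquoted sulfate in the condition is looked up as a variable, not treated as text.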
I guess you haven't created the variables sulfate and nitrate, if they are meant to be variables. Or, as Daniel said, if they are strings, quote them. – Guill
Guys, problem partially solved! Thank you very much! I really do have to pass it as a string. However, when I call the function with "nitrate" I get the correct answer, but with "sulfate", which is the first if, the function returns a blank line. There is no error message, just the empty command line. Any suggestions?
– José Ferraz Neto
That means there is no sulfate > 0 in the spreadsheets. In principle the code is correct. Or else you made a spelling error when writing 'sulfate'. Note that if you pass pollutant = 'x' you will also get an empty line. – Daniel Falbel
There are values greater than 0 for sulfate in the spreadsheet. And any string other than "nitrate" results in a blank line.
– José Ferraz Neto
This happens because with any other string the function does not enter either of the ifs and so returns nothing. Doesn't the spreadsheet have an extra space in the column name or something? Because apparently the problem is that it is not entering any if when you use the string "sulfate".
– Daniel Falbel
Guys, a friend helped me. I can't explain why, but we replaced the second if with an else if and it worked! God knows why two ifs in a row were causing the problem! Thank you to everyone who helped me!
– José Ferraz Neto
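A likely explanation for the blank line, offered as a sketch and not as something confirmed in the thread: an R function returns the value of the last expression it evaluates. With two separate ifs, the last expression is always the second if, and an if whose condition is FALSE and has no else yields an invisible NULL, so nothing is printed. The function names two_ifs and chained below are hypothetical and stand in for the question's function without reading any files.

```r
# Hypothetical stand-in for the version with two separate ifs.
two_ifs <- function(pollutant) {
  if (pollutant == "sulfate") {
    "sulfate branch"   # computed, then discarded
  }
  if (pollutant == "nitrate") {
    "nitrate branch"   # this if is the last expression, so its value is returned
  }
}

# Hypothetical stand-in for the version using else if: the whole chain
# is a single expression, so the matching branch's value is returned.
chained <- function(pollutant) {
  if (pollutant == "sulfate") {
    "sulfate branch"
  } else if (pollutant == "nitrate") {
    "nitrate branch"
  }
}

is.null(two_ifs("sulfate"))  # TRUE: the second if returned invisible NULL
chained("sulfate")           # "sulfate branch"
```

An alternative that keeps two separate ifs would be to wrap each result in return(), which exits the function as soon as a branch matches.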
@Danielfalbel I think you could post your first comment as an answer. José Ferraz, it is always helpful to include a small sample of the data so that others can reproduce your error. http://meta.pt.stackoverflow.com/questions/824/como-crea-um-exemplo-m%C3%Adnimo-reproduces%C3%Advel-em-r/825#825
– Carlos Cinelli