The phrase quoted above from SOen is correct. According to John Chambers, creator of S (the language R is based on) and a member of the R Core Team:
Everything that exists in R is an object.
Everything that happens in R is a function call.
This creates the curious situation that a function is itself an object.
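To make "a function is itself an object" concrete, here is a minimal sketch in Python, the other language used in this answer, where functions are also ordinary objects. The names `double`, `apply_twice` and `alias` are hypothetical and exist only for illustration.

```python
def double(x):
    """An ordinary function; the name `double` is just bound to a function object."""
    return 2 * x


def apply_twice(f, value):
    """Receive a function as an argument, like any other object, and call it twice."""
    return f(f(value))


alias = double                     # bind the same function object to a second name

print(isinstance(double, object))  # True: the function is an object
print(alias is double)             # True: both names refer to the same object
print(apply_twice(double, 3))      # 12
```

The same holds in R: a function can be stored in a variable and passed to another function, which is what makes tools like `sapply()` possible.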
Compare how a data frame is built in pandas (Python, object-oriented):
import pandas as pd
df = pd.DataFrame({'A': 1.,
                   'B': pd.Timestamp('20130102'),
                   'C': pd.Categorical(["test", "train", "test", "train"])})
df
     A          B      C
0  1.0 2013-01-02   test
1  1.0 2013-01-02  train
2  1.0 2013-01-02   test
3  1.0 2013-01-02  train
and in R (base package, more functional):
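# C is an explicit factor: since R 4.0, data.frame() no longer converts strings to factors by default
df <- data.frame(A = 1,
                 B = as.Date("2013-01-02"),
                 C = factor(c("test", "train", "test", "train")))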
df
  A          B     C
1 1 2013-01-02  test
2 1 2013-01-02 train
3 1 2013-01-02  test
4 1 2013-01-02 train
Now that we have our df in pandas and in base R, we can see the difference between the functional and object-oriented approaches when, for example, checking the type of data stored in each column.
In the object-oriented approach, the object itself carries an attribute (sometimes a method) that lets us do this.
df.dtypes
A           float64
B    datetime64[ns]
C          category
dtype: object
In the more functional approach, this information is not obtained from the object itself but from a function. That is, the "method" does not "live" inside the object; it lives outside it and is independent of it.
class(df)
[1] "data.frame"
It turns out that the function class() only gives us the counterpart of the last piece of information that pandas returned (the last line): it describes the object as a whole, not its columns. In R, to see the class of each column, we must apply the function to each column separately. This is done with Map() (or, more commonly, sapply()), a higher-order function of the kind typically found in functional languages. So we have
sapply(df, class) # or Map(class, df)
        A         B         C
"numeric"    "Date"  "factor"
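The contrast between the two styles can be sketched in plain Python, without pandas. This is only an illustration: `Table`, its `dtypes` property and `column_class` are hypothetical names, not the pandas or R API, and the type of a column is simplified to the type of its first value.

```python
class Table:
    """A toy table: a dict mapping column names to lists of values."""

    def __init__(self, columns):
        self.columns = columns

    @property
    def dtypes(self):
        # Object-oriented style: the table itself knows how to answer.
        return {name: type(values[0]).__name__
                for name, values in self.columns.items()}


def column_class(values):
    # Functional style: a free function that knows nothing about Table.
    return type(values[0]).__name__


table = Table({"A": [1.0, 1.0], "C": ["test", "train"]})

# Ask the object (compare df.dtypes).
oo_result = table.dtypes

# Map a function over the columns (compare sapply(df, class)).
fn_result = dict(zip(table.columns, map(column_class, table.columns.values())))

print(oo_result)
print(fn_result)
```

Both calls produce the same answer. What differs is where the logic lives: inside the object in the first case, in an independent function applied column by column in the second.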
Vectorization in R means that operations are applied element by element to whole vectors and that, through the recycling rule, R can even combine two vectors of different lengths. Thus there is no need to write a loop, for example, to add a vector of 5 numbers to a vector of 1 or 2 numbers.
1:5 + 1
[1] 2 3 4 5 6
1:5 + 1:2
[1] 2 4 4 6 6
Warning message:
In 1:5 + 1:2 :
longer object length is not a multiple of shorter object length
As the warning in the second example above shows, this vectorization has some pitfalls. To keep this answer from growing any longer, I recommend reading this answer.
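R's behaviour in the two examples above, repeating the shorter vector until it matches the longer one (usually called recycling), can be imitated in plain Python. This is a rough sketch and not how R implements it internally. `recycle_add` is a hypothetical name, and returning an empty list for an empty input is an assumption made here to keep the sketch simple.

```python
import warnings
from itertools import cycle, islice


def recycle_add(x, y):
    """Add two sequences element-wise, repeating the shorter one as needed."""
    x, y = list(x), list(y)
    if not x or not y:
        # Assumption of this sketch: an empty input gives an empty result
        # (this also avoids a division by zero below).
        return []
    n = max(len(x), len(y))
    if n % min(len(x), len(y)) != 0:
        # Same situation in which R emits its warning.
        warnings.warn("longer object length is not a multiple of shorter object length")
    # cycle() repeats a sequence forever; islice() cuts it at n elements.
    return [a + b for a, b in zip(islice(cycle(x), n), islice(cycle(y), n))]


print(recycle_add(range(1, 6), [1]))     # [2, 3, 4, 5, 6]
print(recycle_add(range(1, 6), [1, 2]))  # [2, 4, 4, 6, 6], with a warning
```

The second call shows the pitfall: the shorter vector is cut off in the middle of a repetition, so the result is computed anyway and only a warning signals that something may be wrong.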
Can you describe what is missing in the answer?
– Tomás Barcellos
Your answer is excellent, @Tomás. However, I would like more people to be able to contribute knowledge (theoretical, not opinion-based) about other programming languages to this answer. Something like describing the similarities and differences between encapsulation in Java and in R, and the same for polymorphism in these two languages. I know this is not within the scope of the question, but if it should be, I can edit it, or feel free to do so yourself. I thought about creating a new question for this purpose, but decided to offer the bounty instead. Thank you for your attention.
– neves
I am asking so that I can add to the answer. I like the idea of more people helping to discuss this topic with the care it deserves. I just want to be one of them :P
– Tomás Barcellos