In my case I have two data frames:
> head(Trecho)
Xt Yt Zt
1 -75.56468 1.642710 0
2 -74.56469 1.639634 0
3 -73.56469 1.636557 0
4 -72.56470 1.633480 0
5 -71.56470 1.630403 0
6 -70.56471 1.627326 0
> head(TrechoSim)
Xs Ys Zs
1 -71.7856 -0.509196 0
2 -71.7856 -0.509196 0
3 -71.7856 -0.509196 0
4 -71.7856 -0.509196 0
5 -71.7856 -0.509196 0
6 -71.7856 -0.509196 0
The data frame Trecho has approximately 5 thousand rows and TrechoSim has 20 thousand rows. Similar to PROCV (VLOOKUP) in Excel, I need to fetch, for each Xt, the nearest value of Xs (in Excel I use TRUE, and it returns the first value closest to Xt). There is no fixed tolerance for this proximity. I need every row of Trecho paired with its closest value in TrechoSim.
I tried difference_inner_join, but it returns NA values in some rows.
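For reference, a nearest-value lookup with no distance threshold can be sketched in base R. This is a minimal illustration, not a definitive solution: the toy values below are made up, only the column names (Xt, Xs and so on) come from the question, and `nearest_idx`, `matched`, `result` and `dist` are hypothetical names introduced here.

```r
# Toy data reusing the column names from the question (values are illustrative).
Trecho <- data.frame(Xt = c(-75.56, -73.10, -70.56),
                     Yt = c(1.6427, 1.6366, 1.6273),
                     Zt = 0)
TrechoSim <- data.frame(Xs = c(-76.0, -74.9, -73.0, -71.0, -70.5),
                        Ys = c(-0.51, -0.50, -0.49, -0.48, -0.47),
                        Zs = 0)

# For each Xt, find the TrechoSim row whose Xs has the smallest absolute difference.
# which.min() returns the first minimum, so ties go to the earliest row.
nearest_idx <- vapply(
  Trecho$Xt,
  function(x) which.min(abs(TrechoSim$Xs - x)),
  integer(1)
)

# Pair every Trecho row with its nearest TrechoSim row; no row is left unmatched.
matched <- TrechoSim[nearest_idx, ]
rownames(matched) <- NULL
result <- cbind(Trecho, matched)

# Keep the distance so that poor matches can be inspected afterwards.
result$dist <- abs(result$Xt - result$Xs)
print(result)
```

Because every row takes its own minimum, no tolerance has to be chosen and no NA can appear. Note that Excel's approximate match (TRUE) on a sorted column returns the largest value less than or equal to the lookup value rather than the strictly nearest one; if that exact behavior is wanted, `findInterval()` on a sorted Xs would be the closer analogue.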
Thanks,
https://answall.com/a/124326/6036
– Daniel Falbel
@Danielfalbel that solution is for IDENTICAL lookup values. In my case the values are only similar: none will be identical, or very few if any.
– Aurenice Figueira
Sorry, I read it too quickly! Maybe this is what you are after: https://github.com/dgrtwo/fuzzyjoin
– Daniel Falbel
@Danielfalbel I already tested this. It requires a maximum distance, and my distance varies. Temp <- difference_left_join(Trecho, TrechoSim, by = c(Xt = "Xs"), max_dist = .1). And some values return NA.
– Aurenice Figueira
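The NA rows reported above are consistent with how a distance-capped left join usually behaves: a left row with no candidate within max_dist is kept, but its right-hand columns are left empty. The base-R sketch below does not call fuzzyjoin; it only checks, on made-up values, which rows would have no partner within a fixed threshold. The names `nearest_dist` and `unmatched` are hypothetical.

```r
# Illustrative values only; max_dist mirrors the 0.1 used in the comment above.
Xt <- c(-75.56, -73.05, -72.00)
Xs <- c(-76.0, -74.9, -73.0, -71.0, -70.5)
max_dist <- 0.1

# Distance from each Xt to its closest Xs.
nearest_dist <- vapply(Xt, function(x) min(abs(Xs - x)), numeric(1))

# Rows whose closest candidate is farther than max_dist have no partner
# under a fixed threshold, so a left join would fill them with NA.
unmatched <- nearest_dist > max_dist
print(data.frame(Xt, nearest_dist, unmatched))
```

If the spacing of Xs varies along the track, any single max_dist small enough to avoid multiple matches is likely to leave some rows unmatched, which is why taking the per-row minimum may fit this problem better than a fixed tolerance.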
How should multiple matches be handled for the purposes of this problem? E.g., in the sample data, row 5 of Trecho could be mapped to any of the first six rows of TrechoSim. Is it acceptable for two entries of Trecho to be associated with the same entry of TrechoSim?
– Erikson K.
@Eriksonk. Disregard these initial values: the data come from a simulated car, which is stationary at that point. I did it in Excel and it worked with PROCV.
– Aurenice Figueira
I think the simplest solution will be to use the fuzzyjoin library. Could you update the question to include the code you used with difference_inner_join? It would also be good to update the question title to make it clear that you want a fuzzy match.
– rafa.pereira