1
How do I find the index where a substring begins, starting the search from a given index of the string?
In C++, for example, the method std::string::find
accepts an offset (index) at which the search should begin.
Is there something similar in Rust?
3
I don’t know if this fits what you call "idiomatic," but let’s see...
Assuming we want to find the index of a substring starting from a given offset, we could do this:
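    let string = "ABC ABC ABC";
    let offset = 3;
    // Note: `offset` counts chars, while `find` returns a byte index.
    // The two agree here only because the text is ASCII.
    let idx = string
        .chars()
        .skip(offset)
        .collect::<String>()
        .find("ABC")
        .map(|n| n + offset);
    // idx == Some(4)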
See it working on the Rust Playground.
It works as follows:
1. Create an iterator over the characters of the string (using chars).
2. From the iterator created by chars, skip the number of elements corresponding to offset (chaining skip).
3. Collect the remaining characters into a new String (using collect::<String>).
4. Call find on that new string to get the index of the searched substring. This method returns an Option<usize>, which means it returns None when the substring is not found.
5. The index returned by find is relative to the new string, so offset has to be added back. For this, we use the map method, implemented by Option, which transforms the value in the Some case according to the function passed. In the None case, the mapping is skipped and map simply returns None.
This may not look very efficient, since collect builds a new String, but given how strict Rust is about its so-called zero-cost abstractions, I suppose several optimizations are applied during the intermediate phases of compilation. However, I have not run benchmarks to confirm this hypothesis.
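As a rough sketch of the steps above, the chain can be wrapped in a function to see both the Some and the None outcomes. The name find_after_skip is hypothetical (it is not part of the standard library), and the function inherits the caveat that offset counts chars while find reports a byte index, so it is only meaningful when the skipped prefix is ASCII.

```rust
/// Hypothetical helper wrapping the chain from the answer.
/// Caveat: `offset` counts chars, but `find` returns a byte index, so the
/// result is only reliable when the skipped prefix is pure ASCII.
fn find_after_skip(haystack: &str, needle: &str, offset: usize) -> Option<usize> {
    haystack
        .chars()              // iterate over the chars of the string
        .skip(offset)         // drop the first `offset` chars
        .collect::<String>()  // build a new String from the remainder (allocates)
        .find(needle)         // Option<usize>: index inside the new String
        .map(|n| n + offset)  // shift the index back; `None` passes through unchanged
}
```

With the string from the answer, find_after_skip("ABC ABC ABC", "ABC", 3) gives Some(4), and an offset of 9 leaves only "BC" to search, so the result is None.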
Thank you very much. One question, though: if I call find on a slice, as in
string[offset..].find
, would that avoid the allocation? – suriyel
You will not have the allocation, but keep in mind that with this notation you are no longer indexing over the characters of the string, but over the bytes of its UTF-8 encoding. Some characters take 4 bytes, for example, while á takes 2 bytes and a takes 1 byte. Also, when slicing, you must make sure the range falls on valid boundaries in terms of bytes... Anyway, it’s a hell of a complication. So, in this case, I think it’s worth using the chars method (to ensure that we are indexing by characters, not bytes). – Luiz Felipe
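For reference, here is a minimal sketch of the slice-based variant discussed in these comments. The name find_from is hypothetical, and the offset is assumed to be a byte offset. It uses str::get instead of string[offset..], so an offset that is out of range or falls in the middle of a multi-byte character yields None rather than a panic.

```rust
/// Hypothetical helper: search for `needle` in `haystack`, starting at the
/// byte position `offset`. Returns a byte index into `haystack`.
/// Yields `None` when there is no match, or when `offset` is out of range
/// or does not fall on a char boundary.
fn find_from(haystack: &str, needle: &str, offset: usize) -> Option<usize> {
    haystack
        .get(offset..)?       // checked slicing: no allocation, no panic
        .find(needle)         // index relative to the slice
        .map(|n| n + offset)  // convert back to an index into `haystack`
}
```

Here both the offset and the result are measured in bytes, so they stay consistent with each other even for non-ASCII text. For example, find_from("áABC", "ABC", 1) is None because byte 1 lies inside á, while an offset of 2 gives Some(2).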
See this playground for a drastic example that makes the "problem" easier to understand. See sections § 4.3 and § 8.2 of the Book to learn more. There is also the matter of Unicode normalization, which can affect even
chars
... – Luiz Felipe
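To make the byte/char distinction in these comments a bit more concrete, the following sketch measures a string both ways and converts a char offset into the byte offset that slicing expects. Both function names are hypothetical helpers written for illustration.

```rust
/// Hypothetical helper: returns (length in bytes, number of chars).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Hypothetical helper: converts an offset counted in chars into the
/// corresponding byte offset, which is always a valid slicing boundary.
/// An offset equal to the number of chars maps to `s.len()`.
fn char_to_byte_offset(s: &str, char_offset: usize) -> Option<usize> {
    s.char_indices()
        .map(|(byte_idx, _)| byte_idx)    // byte position where each char starts
        .chain(std::iter::once(s.len()))  // also allow the position just past the end
        .nth(char_offset)
}
```

For instance, byte_and_char_len("á") is (2, 1) for the precomposed character U+00E1, whereas the decomposed form "e\u{301}" gives (3, 2): chars counts Unicode scalar values, so normalization changes the count even though both forms render as a single accented letter.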
This and this discussion may also be useful for more information on indexing string slices.
– Luiz Felipe
@suriyel, I took your comment as an opportunity to dig a little deeper into this subject. I will add one more reference here, to another good article I found: Unicode String Models.
– Luiz Felipe