How to extract a file extension in JavaScript?

20

There are some cases where I need to capture a file extension (that is, from a string containing the file path) in order to validate the extension via JavaScript.

For example, from location:

var path = window.location.pathname;
// /foo/bar.html

In this case I want to get html.

There is also the case of handling the extension of a link element, for example:

var path = document.querySelector('link').href;
// http://algum.site/css/estilo.css

In this case it should return css.

What is the best way to get this extension?

Consider the following URL cases:

'foo/bar.ext'     // 'ext'
'foo.bar/zaz.ext' // 'ext'
'foo.bar/sem_ext' // ''
'.sem_ext'        // ''
'foo/.sem_ext'    // ''

4 answers

22


A practical way to get the file extension is:

var ext = path.split('.').pop();

Here split divides the path into an array, and pop removes and returns the last element of that array, which is exactly the extension I'm looking for.
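A quick check against some of the paths from the question shows where this one-liner falls short. This is only a minimal sketch; `naiveExt` is a hypothetical name used for illustration:

```javascript
// Hypothetical helper: the one-liner above wrapped in a function.
function naiveExt(path) {
    return path.split('.').pop();
}

console.log(naiveExt('foo/bar.ext'));     // 'ext'
console.log(naiveExt('foo.bar/sem_ext')); // 'bar/sem_ext' (dot in a folder name)
console.log(naiveExt('.sem_ext'));        // 'sem_ext' (hidden file, not an extension)
```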

A more accurate version would be:

// Take only the last part, since there may be folders with a . in the path
var ext = path.split('/').pop();
// if there is no extension, return empty; if there is one, return the extension
// (.htaccess is a hidden file, not an extension)
ext = ext.indexOf('.') < 1 ? '' : ext.split('.').pop();
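To check this version against the cases listed in the question, it can be wrapped in a function. This is a sketch only; `extFromPath` and `cases` are names I made up for the example:

```javascript
// Hypothetical wrapper name; the body is the two-step version above.
function extFromPath(path) {
    var last = path.split('/').pop(); // keep only the last path segment
    return last.indexOf('.') < 1 ? '' : last.split('.').pop();
}

// The five paths from the question.
var cases = ['foo/bar.ext', 'foo.bar/zaz.ext', 'foo.bar/sem_ext', '.sem_ext', 'foo/.sem_ext'];
cases.forEach(function (p) {
    console.log(JSON.stringify(p), '->', JSON.stringify(extFromPath(p)));
});
```

The first two paths give "ext" and the other three give an empty string.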

But it is also possible to do it with lastIndexOf and some bitwise arithmetic to get better performance, for example:

function ext(path) {
    var idx = (~-path.lastIndexOf(".") >>> 0) + 2;
    return path.substr((path.lastIndexOf("/") - idx > -3 ? -1 >>> 0 : idx));
}

In this second solution, I used the concept presented by bfavaretto, but in a slightly more performant way.

Explaining the second solution

First we find the position of the last dot. Since we will pass it to substr, it is important to know that when substr receives a start index greater than the length of the string, it returns an empty string.

So we apply the - operator to negate the value.

Then the ~ operator inverts every bit (e.g. 0010 becomes 1101). Together, ~-x evaluates to x - 1, so the result is negative exactly when the path starts with . (index 0) or contains no . at all (index -1).

The >>> 0 shift reinterprets the value as an unsigned 32-bit integer: a negative value becomes 2^32 plus that value (a very large number), while a positive value is left unchanged.

Then we add 2 to compensate for the ~- operations, which gives the index just after the dot.

The next line has a conditional: if the last / comes after the last dot, or the dot immediately follows the / (which is why the difference is compared with -3), the value is invalid, so we apply the same logic and pass a very large number (-1 >>> 0) to substr, which makes it return an empty string.
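The intermediate values are easier to follow when the expression is split into steps. The sketch below is for illustration only; `dotIndexSteps` is a hypothetical helper and is not part of the solution:

```javascript
// Same expression as in ext(), split into named steps for illustration.
function dotIndexSteps(path) {
    var dot = path.lastIndexOf('.'); // -1 when there is no dot
    var minusOne = ~-dot;            // ~(-dot) equals dot - 1
    var unsigned = minusOne >>> 0;   // negatives wrap around to 2^32 + value
    return { dot: dot, minusOne: minusOne, unsigned: unsigned, idx: unsigned + 2 };
}

console.log(dotIndexSteps('foo/bar.ext')); // dot 7, minusOne 6, unsigned 6, idx 8
console.log(dotIndexSteps('.sem_ext'));    // dot 0, minusOne -1, unsigned 4294967295, idx 4294967297
console.log(dotIndexSteps('sem_ext'));     // dot -1, minusOne -2, unsigned 4294967294, idx 4294967296
console.log(-1 >>> 0);                     // 4294967295, the "very large number" passed to substr
```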

  • The explanation got a bit long. I think I could open separate questions and answers about these operations here on Stack Overflow, since they should interest other programmers.

  • 1

    We could, but I think the length is fine, and your answer is quite complete now! If I could, I would upvote again :)

  • The size of the answer is not necessarily a problem. I prefer a long but complete answer to a short one that leaves information out.

9

The logic to extract the extension with these requirements is to isolate the last part of the path, and check:

  • If it is blank, starts with a dot, or does not contain a dot: return ''.
  • Otherwise: return what comes after the last dot.

You can split the path into arrays, as demonstrated by Gabriel Gartz. It's the simplest way.

An option that does not involve arrays, only string manipulation, usually performs better. It uses lastIndexOf to break up the path:

function ext(path) {
    var final = path.substr(path.lastIndexOf('/')+1);
    var separador = final.lastIndexOf('.');
    return separador <= 0 ? '' : final.substr(separador + 1);
}

  • +1, very nice performance test and everything else you did :)

6

I like the regular expression approach:

function getExtension(path) {
  var r = /\.([^./]+)$/.exec(path);
  return r && r[1] || '';
}

The regular expression looks for a dot followed by one or more characters that are neither a dot nor a slash. The $ at the end of the expression requires the match to end at the end of the string.

If this pattern is found in the string, the function returns the first capture group, i.e., the characters following the last dot. Otherwise, it returns an empty string.
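Note that, as written, the pattern also matches a name that begins with the dot, so getExtension('.sem_ext') returns 'sem_ext' rather than ''. If hidden files should yield an empty string, as in the question's cases, one possible variant requires a non-slash character before the final dot. This is my own sketch and `getExtensionStrict` is a hypothetical name:

```javascript
// Variant of the pattern above (an assumption, not the original answer's regex):
// [^/] requires one non-slash character before the final dot, which rules out
// names that start with the dot, such as '.sem_ext' or 'foo/.sem_ext'.
function getExtensionStrict(path) {
    var r = /[^/]\.([^./]+)$/.exec(path);
    return r ? r[1] : '';
}

console.log(getExtensionStrict('foo/bar.ext'));  // 'ext'
console.log(getExtensionStrict('.sem_ext'));     // ''
console.log(getExtensionStrict('foo/.sem_ext')); // ''
```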


  • It would be nice for you to return an empty string, so that the function respects the single responsibility rule: string in, string out. To do that, add || '' at the end of the return. It's nice to use a regexp too. +1

  • But it is the least performant solution of all: http://jsperf.com/lastindexof-versus-split/3

  • In my Chrome it is the fastest of all: http://imgur.com/Ja65hcS

  • I find it clearer for the method to return null when the target of its search was not found, for two reasons: it is a relatively common pattern, especially in JS, and it makes the result easier to use in boolean expressions.

  • Well, if the caller checks the length property and the result may be null, they always have to test twice to make sure it is a string: res && res.length > 0. And if they want to test it as a boolean, an empty string works just the same, since !'' === !null is true. Finally there is performance: when you change the type of a variable, a new space in memory is allocated internally for the value that replaces the previous one, consuming more resources. But JavaScript gives you the freedom for this to be just a point of view.

3

Since you want to take into account files without an extension, as in the case of ".htaccess", it requires a little more code:

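// Split the last path segment on dots
var segments = window.location.pathname.split('/');
var filename = segments[segments.length - 1].split('.');
// No extension when the name starts with a dot or contains no dot at all
var ext = filename[0].length > 0 && filename.length > 1 ? filename[filename.length - 1] : '';
alert(ext);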
