Regex may not be the best solution here. Instead, you could simply split the data and handle each piece separately. Assuming each record is on its own line, a "skeleton" of the general idea would be:
$dados = <<<DADOS
30140083|MG 13407061|07309482697|12/01/86|EMILLY L V GOMES|4984012024889591|10/2018|557
3283005|165055881|07309480805|11/18/16|AILTON B TEGON|5549320054855241|6/2018|177
3691100|17714445|07309395883|01/07/67|MARGARETE P SAN|319150122924465|12/2023|8671
DADOS;
// for each line of the data (\R matches any line break: \n, \r\n or \r)
foreach (preg_split('/\R/', $dados) as $line) {
    // split the fields (assuming the separator is always "|")
    $partes = explode('|', $line);
    // assuming the fields are always in the same positions
    $numero = $partes[5];
    $cvv = $partes[7];
    // --------- validations ---------
    $tamanho_cvv = null; // stays null if no brand matches
    // example: Diners Club (list every prefix the card number can start with)
    $tmp = substr($numero, 0, 3);
    if ($tmp == '300' || $tmp == '301' /* || etc... */) {
        // card is Diners
        $tamanho_cvv = 4;
    } else if ($numero[0] == '4') {
        // card is Visa
        $tamanho_cvv = 3;
    } // etc. (add the specific conditions for each brand, adjusting the CVV length for each case)
    // check the CVV
    if (strlen($cvv) == $tamanho_cvv) {
        // CVV length ok
    } else {
        // CVV with the wrong length
    }
}
Of course this can be improved; it is just an initial idea. But once you have the card number and the CVV, you can validate them any way you like. For example:
function getTamanhoCvv($numero_cartao) {
    $digito = intval($numero_cartao[0]);
    if (4 <= $digito && $digito <= 6)
        return 3;
    if ($digito == 3)
        return 4;
    // add more cases if necessary
    // if execution gets here, raise an error (invalid number?) or return some default value, whichever makes more sense in your case
}
function diners($numero_cartao) {
    $tmp = intval(substr($numero_cartao, 0, 3));
    if (300 <= $tmp && $tmp <= 305)
        return TRUE;
    $tmp = intval(substr($numero_cartao, 0, 2));
    return $tmp == 36 || $tmp == 38 || $tmp == 39;
}
function getBandeira($numero_cartao) {
    if (diners($numero_cartao)) {
        return 'Diners Club';
    }
    // test the other brands in a similar way...
}
// ...
$numero = $partes[5]; // get the card number as already explained
$cvv = $partes[7]; // get the CVV as already explained
$tamanho_cvv = getTamanhoCvv($numero);
$bandeira = getBandeira($numero);
// check the CVV
if (strlen($cvv) == $tamanho_cvv) {
    // CVV length ok
} else {
    // CVV with the wrong length
}
// etc...
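To see the pieces working together, here is a minimal runnable sketch that combines the line splitting with getTamanhoCvv. The helper validarLinha and the shape of the array it returns are hypothetical names introduced only for this illustration, and the prefix and CVV-length rules are just the ones used in this answer, not an authoritative list.

```php
<?php
// Sketch only: the prefixes and CVV lengths are the ones assumed in this
// answer; adjust them to the real rules of each brand.
function getTamanhoCvv(string $numero_cartao): ?int {
    $digito = intval(substr($numero_cartao, 0, 1));
    if (4 <= $digito && $digito <= 6) {
        return 3;
    }
    if ($digito == 3) {
        return 4;
    }
    return null; // unknown prefix: the caller decides what to do
}

// Hypothetical helper: parses one "|"-separated record and reports whether
// the CVV length matches the one expected for the card number.
function validarLinha(string $line): array {
    $partes = explode('|', trim($line));
    // assumes the card number is field 5 and the CVV is field 7
    $numero = $partes[5] ?? '';
    $cvv = $partes[7] ?? '';
    $tamanho_cvv = getTamanhoCvv($numero);
    return [
        'numero' => $numero,
        'cvv' => $cvv,
        'ok' => $tamanho_cvv !== null && strlen($cvv) === $tamanho_cvv,
    ];
}

$dados = <<<DADOS
30140083|MG 13407061|07309482697|12/01/86|EMILLY L V GOMES|4984012024889591|10/2018|557
3283005|165055881|07309480805|11/18/16|AILTON B TEGON|5549320054855241|6/2018|177
3691100|17714445|07309395883|01/07/67|MARGARETE P SAN|319150122924465|12/2023|8671
DADOS;

// \R matches \n, \r\n and \r, so the split does not depend on the file's line endings
$resultados = array_map('validarLinha', preg_split('/\R/', trim($dados)));
foreach ($resultados as $r) {
    echo $r['numero'], ' ', $r['cvv'], ' ', $r['ok'] ? 'ok' : 'wrong length', "\n";
}
```

Returning an array per record, instead of echoing inside the loop, makes it easier to plug in further checks later without touching the parsing.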
To handle cases where a date comes right after the CVV, just do something like:
// check whether there is a date after the CVV
if (preg_match('#(\d{3,4})\d{2}/\d{2}/\d{4}#', $cvv, $match)) {
    $cvv = $match[1];
}
Since the idea is not to validate the date (only to see whether there is something that looks like "dd/mm/yyyy"), just check for it and, if it is there, remove it from the CVV value.
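If the expected CVV length is already known from the card number, the same check can be made stricter. The sketch below is one possible variation, not part of the original approach: limparCvv is a hypothetical helper name, and it assumes the date, when present, is glued directly to the CVV with nothing else around it.

```php
<?php
// Hypothetical helper: removes a trailing dd/mm/yyyy-looking date from the
// CVV field. $tamanho_cvv is the CVV length expected for the brand
// (for example, the value returned by getTamanhoCvv).
function limparCvv(string $cvv, int $tamanho_cvv): string {
    // anchored with ^ and $ so the whole field must be "CVV + date"
    $regex = '#^(\d{' . $tamanho_cvv . '})\d{2}/\d{2}/\d{4}$#';
    if (preg_match($regex, $cvv, $match)) {
        return $match[1];
    }
    return $cvv; // no date found: leave the value untouched
}

echo limparCvv('55701/01/2021', 3), "\n";  // 557
echo limparCvv('557', 3), "\n";            // 557
echo limparCvv('867101/01/2021', 4), "\n"; // 8671
```

Fixing the number of digits avoids relying on backtracking to decide between 3 and 4 digits, at the cost of needing the brand first.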
Do you really want to use regex?
I don't recommend it because, as you can see, there are many different rules. Although a single regex is possible, it would be huge and hard to understand and maintain. The code above is, in my opinion, simpler, not only to write but also to understand and modify: to add new rules, card brands, etc., just extend the algorithms, create new validation functions, and so on. It ends up better organized and, in the end, more worthwhile.
Just to give you an idea of what it would look like, here is a "simplified" version with only 4 card brands:
$regex = '#
(?: (?P<diners> 30[0-5]\d{12,13} | 3[689]\d{13,14} ) \| [^|]+ \| (?P<cvv_diners>\d{4}) ) |
(?: (?P<amex> 3[47]\d{14} ) \| [^|]+ \| (?P<cvv_amex>\d{4}) ) |
(?: (?P<mastercard> [25]\d{15} ) \| [^|]+ \| (?P<cvv_mastercard>\d{3}) ) |
(?: (?P<visa> 4\d{15} ) \| [^|]+ \| (?P<cvv_visa>\d{3}) )
#x';
$dados = <<<DADOS
30140083|MG 13407061|07309482697|12/01/86|EMILLY L V GOMES|4984012024889591|10/2018|557
3283005|165055881|07309480805|11/18/16|AILTON B TEGON|5549320054855241|6/2018|177
3691100|17714445|07309395883|01/07/67|MARGARETE P SAN|319150122924465|12/2023|8671
DADOS;
$bandeiras = ['diners', 'amex', 'mastercard', 'visa'];
if (preg_match_all($regex, $dados, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        foreach ($bandeiras as $b) { // check whether the named groups were matched
            if (!empty($m[$b])) { // if matched, fill in the information
                $numero_cartao = $m[$b];
                $cvv = $m["cvv_$b"];
                $bandeira = $b;
                break; // once a brand is found, there is no need to look for the others
            }
        }
        echo "card $bandeira: $numero_cartao, $cvv\n";
    }
}
I used the x modifier, which lets you write the regex across several lines and with spaces, so that it is a little less painful to read (imagine if it were all on one line with no spaces).
Then I use alternation (the | character, which means "or") to cover the various possibilities.
For example, for Diners, I check whether it starts with 30[0-5] (the digits 3 and zero, followed by a digit from zero to 5), followed by 12 or 13 digits (\d{12,13}), or starts with 3[689], which matches 36, 38 or 39, followed by 13 or 14 digits.
Then I match \|[^|]+\|, which is the | character (escaped with \ so it is not confused with the alternation), followed by one or more characters other than |, followed by another |, and finally I capture the CVV (which in this case has 4 digits).
After all this there is an alternation (the | at the end of the line), and then another regex for American Express (3[47] matches 34 or 37, followed by 14 digits; adjust the values to whatever you need, I don't know if there can be fewer, for example), and so on (each brand has its own rules, including the CVV length).
For each piece of information I use named groups. For example, (?P<visa> 4\d{15} ) indicates that if the regex finds a match for 4\d{15} (the digit 4 followed by 15 digits), the result will be in the group called "visa". The same goes for the CVV: each one has a name, so just check whether the group is filled in to know whether that information was found. And since I'm using alternation, only one of the brands will be matched at a time, so I just need to go through the groups until I find one that is filled in.
But note that I only included four brands. For each additional one you would have to add another specific regex, and is it really worth it? For me it is already complicated enough, and adding more lines will only make it worse. I still prefer the first option (handle each line separately, split the data, create separate functions for each brand, etc). Regex is cool, but it is not always the best solution.
For example, to handle cases where there may be a date after the CVV, simply add (?:\d{2}/\d{2}/\d{4})? after each CVV group (the ? indicates that this whole section is optional). It would look even worse than it already does:
$regex = '#
(?: (?P<diners> 30[0-5]\d{12,13} | 3[689]\d{13,14} ) \| [^|]+ \| (?P<cvv_diners>\d{4}) (?:\d{2}/\d{2}/\d{4})? ) |
(?: (?P<amex> 3[47]\d{14} ) \| [^|]+ \| (?P<cvv_amex>\d{4}) (?:\d{2}/\d{2}/\d{4})? ) |
(?: (?P<mastercard> [25]\d{15} ) \| [^|]+ \| (?P<cvv_mastercard>\d{3}) (?:\d{2}/\d{2}/\d{4})? ) |
(?: (?P<visa> 4\d{15} ) \| [^|]+ \| (?P<cvv_visa>\d{3}) (?:\d{2}/\d{2}/\d{4})? )
#x';
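As a quick sanity check of the optional date group, the sketch below runs only the Visa alternative from the regex above against two inputs: the plain record and the "CVV followed by a date" variation mentioned in the comments. The variable names $entradas and $encontrados are hypothetical, introduced just for this test.

```php
<?php
// Only the Visa alternative, to keep the check short.
$regex = '#
  (?P<visa> 4\d{15} ) \| [^|]+ \| (?P<cvv_visa>\d{3}) (?:\d{2}/\d{2}/\d{4})?
#x';

$entradas = [
    '4984012024889591|10/2018|557',
    '4984012024889591|10/2018|55701/01/2021',
];

$encontrados = [];
foreach ($entradas as $entrada) {
    if (preg_match($regex, $entrada, $m)) {
        $encontrados[] = $m['cvv_visa'];
        echo $m['visa'], ' ', $m['cvv_visa'], "\n";
    }
}
```

In both cases the captured CVV is 557. Note that, because the regex is not anchored at the end, it would also accept a 3-digit CVV followed by unrelated digits; whether that matters depends on how strict the validation needs to be.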
If the idea is just to grab the numbers, wouldn't it be enough to add {3,4} at the end of the regex? /(\d{15,16})[^<](\d{1,2})[^<](\d{2,4})[^<](\d{3,4})/
– Weslley Araújo
@Weslleyaraújo no. I actually tried it that way, but the format in which I receive the data can vary. For example, in "MG 13407061|0730948269", 13407061 sits next to the | delimiter (which, by the way, may be a different character), and the CVV may be followed by other numbers, such as a date. Something like "4984012024889591|10/2018|55701/01/2021". If the pattern contains {3,4}, it will try to grab as many digits as possible, but a Visa card, which starts with 4, has only 3 digits in its security code (CVV).
– gleisin-dev
Consequently it would capture "4984012024889591|10/2018|5570", which is invalid.
– gleisin-dev
This example with a date after the CVV is not in the question. I suggest you edit it and include all the possible variations, otherwise any answer will be incomplete... Also, do you just want to validate, or do you want to extract each piece of information separately (the card number, the CVV, etc)?
– hkotsubo
I edited it, and I will add more as I keep trying.
– gleisin-dev
@hkotsubo I necessarily need to obtain them separately, but as parts of a single group. In a for or foreach I need to pick up each group (card, month, year, cvv) and add them to an Array Object, because I do all four validations: 1st, whether the card is valid (whether it conforms to the validation algorithm); 2nd, whether the card's expiration date has not passed (is not earlier than the current month or year); and whether the CVV matches the brand (be it Amex, Visa, Master, Hyper, Elo, etc...)
– gleisin-dev
I hope that this data is fictitious
– Eduardo Bissi
@Eduardobissi they are for illustration only; the card numbers are real but invalid, and do not belong to any "physical" or "legal" person.
– gleisin-dev