TL;DR - (Workaround) use Comments:
select t1.nr_prpt_vcld as nr_prpt_vcld,
(select t2.nrcvn_srvc
from tabela_base t2
where t2.nr_prpt_vcld = t1.nr_prpt_vcld
order by t2.ult_acl_atv desc
limit 1
) as nrcvn_srvc,
(select t2.cd_cli_srvc
from tabela_base t2
where t2.nr_prpt_vcld = t1.nr_prpt_vcld
order by t2.ult_acl_atv desc
limit 1
) as cd_cli_srvc,
(max(t1.ult_acl_atv)) as ult_acl_atv
from tabela_base t1
group by t1.nr_prpt_vcld
Example in Sqlfiddle.
I have little familiarity with SAS, but from what I understand, it complies with SQL standards. I will respond in a more didactic way, so that you can identify where you might be missing and what to do to fix:
Overview of SQL queries
Abstractly, one can say that a query performs the following steps, in that order:
Discover the table(s) to be consulted by making the applicable junctions;
Ex.: from tabela_base
nr_prpt_vcld nrcvn_srvc cd_cli_srvc ult_acl_atv foo
23157 140594 504217876 30/12/2013 bar
23157 147914 510003171 01/01/2014 bar
23157 153120 511284856 02/01/2014 bar
27175 140676 203940185 01/07/2014 bar
27175 141805 931007593 02/07/2014 bar
27175 150291 510955695 03/07/2014 bar
29350 137242 508122116 16/02/2014 bar
29350 145502 30139405 17/02/2014 bar
99999 999999 999999999 18/02/2014 baz
Filters returned lines as per condition where
;
Ex.: where foo = bar
nr_prpt_vcld nrcvn_srvc cd_cli_srvc ult_acl_atv foo
23157 140594 504217876 31/12/2013 bar
23157 147914 510003171 01/01/2014 bar
23157 153120 511284856 02/01/2014 bar
27175 140676 203940185 01/07/2014 bar
27175 141805 931007593 02/07/2014 bar
27175 150291 510955695 03/07/2014 bar
29350 137242 508122116 16/02/2014 bar
29350 145502 30139405 17/02/2014 bar
----- ------ -------- ---------- baz (X)
Group according to the criterion of the clause group by
;
Ex.: group by no_prpt_vcld
nr_prpt_vcld nrcvn_srvc cd_cli_srvc ult_acl_atv foo tupla de agrupamento
23157 140594 504217876 31/12/2013 bar \
23157 147914 510003171 01/01/2014 bar (23157)
23157 153120 511284856 02/01/2014 bar /
27175 140676 203940185 01/07/2014 bar \
27175 141805 931007593 02/07/2014 bar (27175)
27175 150291 510955695 03/07/2014 bar /
29350 137242 508122116 16/02/2014 bar \(29350)
29350 145502 30139405 17/02/2014 bar /
"Flatten" the groups, transforming each one into a single row, adding columns according to the aggregation functions
Ex.: (max(ult_acl_atv)) as ult_acl_atv
nr_prpt_vcld nrcvn_srvc cd_cli_srvc ult_acl_atv foo
23157 [Achatado] [Achatado] 02/01/2014 [Achatado]
27175 [Achatado] [Achatado] 03/07/2014 [Achatado]
29350 [Achatado] [Achatado] 17/02/2014 [Achatado]
Filter again, now the "flattened" lines (not the original ones)
Ex.: having ult_acl_atv < '06/06/2014'
(imprecise syntax)
nr_prpt_vcld nrcvn_srvc cd_cli_srvc ult_acl_atv foo
23157 [Achatado] [Achatado] 02/01/2014 [Achatado]
----- ---------- ---------- 03/07/2014 ---------- (X)
29350 [Achatado] [Achatado] 17/02/2014 [Achatado]
Choose which columns to return; you can only use grouped columns (i.e. that were part of the group by
or aggregations of other columns (i.e. that had some value defined after "flattening").
Ex.: select nr_prpt_vcld, (...) as ult_acl_atv
nr_prpt_vcld ult_acl_atv
23157 02/01/2014
29350 17/02/2014
How this applies to your case
As you can see, you can’t add the columns nrcvn_srvc
and cd_cli_srvc
because they were "flattened" by taking several lines from the table and turning into one. The most you can do is aggregate them as well, if that’s what you want. Ex.:
select nr_prpt_vcld,
(min(nrcvn_srvc)) as nrcvn_srvc,
(sum(cd_cli_srvc)) as cd_cli_srvc,
(max(ult_acl_atv)) as ult_acl_atv format ddmmyy10.
nr_prpt_vcld nrcvn_srvc cd_cli_srvc ult_acl_atv
23157 140594 1525505903 02/01/2014
29350 137242 538261521 17/02/2014
Or, add one (or more) of them into the group - but by doing so, the grouping will be different. Ex.:
group by nr_prpt_vcld, nrcvn_srvc
nr_prpt_vcld nrcvn_srvc cd_cli_srvc ult_acl_atv foo tupla de agrupamento
23157 140594 [Achatado] 31/12/2013 [Achatado] => (23157,140594)
23157 147914 [Achatado] 01/01/2014 [Achatado] => (23157,147914)
23157 153120 [Achatado] 02/01/2014 [Achatado] => (23157,153120)
27175 140676 [Achatado] 01/07/2014 [Achatado] => (27175,140676)
27175 141805 [Achatado] 02/07/2014 [Achatado] => (27175,141805)
27175 150291 [Achatado] 03/07/2014 [Achatado] => (27175,150291)
29350 137242 [Achatado] 16/02/2014 [Achatado] => (29350,137242)
29350 145502 [Achatado] 17/02/2014 [Achatado] => (29350,145502)
etc. If you add other fields in the clause group by
, he will group by tuple fields - each different combination of the same will generate a grouping, which will later be flattened and may have its aggregate values.
Normally, a query such as the one you made would generate an error in SQL. But apparently, what SAS did was implicitly add the other fields in the condition of grouping, so that its query was made as if the group by
were like this:
group by nr_prpt_vcld, nrcvn_srvc, cd_cli_srvc
So that only lines that had these 3 values identical would be part of the same grouping. If there is no more than one line with these conditions, the result is as if it had not grouped anything, simply returned each line separately...
Workaround
I don’t know if what you want to do is possible within the clause itself group by
: obtain the value of a column corresponding to the maximum of another column. A workaround would be to use two Queries, one for each column; in each of them, take the value of the correct column corresponding to the maximum value of the date, for a specific code:
select t2.nrcvn_srvc
from tabela_base t2
where t2.nr_prpt_vcld = ### um código fixo ###
order by t2.ult_acl_atv desc
limit 1
This is done in the section select
- when the grouping has already been done and the line is being set up with the results. Rows are sorted by date (so the first will be the most recent), and the number of results is limited to 1
, so that only the largest of them will be returned. You take the field you want and use it as the column value:
select t1.nr_prpt_vcld as nr_prpt_vcld,
(select ...) as nrcvn_srvc,
(select ...) as cd_cli_srvc,
(max(t1.ult_acl_atv)) as ult_acl_atv
Final result (note that I gave a nickname to the original table):
select t1.nr_prpt_vcld as nr_prpt_vcld,
(select t2.nrcvn_srvc
from tabela_base t2
where t2.nr_prpt_vcld = t1.nr_prpt_vcld
order by t2.ult_acl_atv desc
limit 1
) as nrcvn_srvc,
(select t2.cd_cli_srvc
from tabela_base t2
where t2.nr_prpt_vcld = t1.nr_prpt_vcld
order by t2.ult_acl_atv desc
limit 1
) as cd_cli_srvc,
(max(t1.ult_acl_atv)) as ult_acl_atv
from tabela_base t1
group by t1.nr_prpt_vcld
It seems that he implicitly added the new fields in the
group by
, instead of returning an error (which, I believe, would be what a normal BD SQL query would do). By the way, that was the result obtained, right? What would be the expected result?– mgibsonbr
Tip: Edit your question by adding the relevant details, don’t try to post as a comment. If you are having difficulty formatting, select the text and use the "Code Sample" button (which looks like this: { }), and it will be formatted as code (in gray background, with monospaced font).
– mgibsonbr