Error when using If parole

Asked

Viewed 113 times

0

I am trying to analyze the data from the following table. This is a Trades record just to make it clear.

DATAFRAME:

inserir a descrição da imagem aqui

def lista_preco():
    
    
    #ainda falta resolver os trades depois de 19h
    
    lista_dias = df['DATE'].unique().tolist()
    lista = df['pair'].unique().tolist()
    lista_buy = []
    lista_sell = []
    
    
    for dias in lista_dias:
        for tickers in lista:
            if (df['pair']== tickers and df['type'] == 'sell'):

                mask1 = (df['pair']== tickers and df['type'] == 'sell')

                df2 = df[mask1]
                df2['price_vol'] = df2['price'] * df2['vol']
                preco = df2['price_vol'].sum(axis = 0) / df2['vol'].sum(axis = 0)
                lista_sell.append([dias,tickers,preco,"LONG"])

            if (df['pair']== tickers and df['type'] == 'buy'):

                mask2 = df['pair']== tickers and df['type'] == 'buy'

                df2 = df[mask2]
                df2['price_vol'] = df2['price'] * df2['vol']
                preco = df2['price_vol'].sum(axis = 0) / df2['vol'].sum(axis = 0)
                lista_buy.append([dias, tickers, preco, "SHORT"])

    return lista_buy, lista_sell

ERROR:


ValueError                                Traceback (most recent call last)
<ipython-input-16-d301c71c9e88> in <module>
----> 1 lista_preco()

<ipython-input-15-8ac3ff545e1d> in lista_preco()
     12     for dias in lista_dias:
     13         for tickers in lista:
---> 14             if (df['pair']== tickers and df['type'] == 'sell'):
     15 
     16                 mask1 = (df['pair']== tickers and df['type'] == 'sell')

~\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1477     def __nonzero__(self):
   1478         raise ValueError(
-> 1479             f"The truth value of a {type(self).__name__} is ambiguous. "
   1480             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1481         )

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

  • It would help if you could provide CSV or a sample of it to test, and say what you want as a final result

  • What I want to do with this data is to calculate the average price (I already know how it does). But the point is that I need to calculate the average price for purchase transactions ("buy" = "LONG") and sale("sell" = "SHORT") separately for the same asset. In addition, I also need to record the date on which the operation took place and the volume("vol") in a list. type : [[DATA,PAIR,PRICE,"LONG"], ...]

  • You can provide csv pf, only a few lines serve

  • csv is in the following format: DATE;time;pair;type;ordertype;price;cost;fee;vol;margin;Misc 30/12/2020;15:11:05;ABEV3;buy;market;10;1000;12;300;0;0 30/12/2020;15:10:08;limit MRV3;;buy;20;15500;9;200;0;0 30/12/2020;15:09:11;PETR4;;limit sell;30;4516;8;100;0;0 30/12/2020;13:46:53;ABEV3;buy;market;40;45415;10;500;0;0 30/12/2020;13:20:07;TAEE11;buy;limit;10;5465;13;600;0;0

  • Is the average price per day for each of the types? Or the total average price for each of the types?

  • Average price per day for each asset with distinction between buy and sell. Here is how to calculate the average price: https://ajuda.easynvest.com.br/hc/pt-br/articles/360049317813-Como-realizar-o-c%C3%A1lculo-do-Pre%C3%A7o-M%C3%A9dio-

  • Checks pf if the answer solves the problem

Show 2 more comments

1 answer

2


I think you’re complicating more than is really necessary.

Having:

import pandas as pd
from io import StringIO

csv = StringIO("""DATE;time;pair;type;ordertype;price;cost;fee;vol;margin;misc
30/12/2020;15:11:05;ABEV3;buy;market;10;1000;12;300;0;0
30/12/2020;15:11:05;ABEV3;sell;market;15;1000;12;300;0;0
30/12/2020;15:10:08;MRV3;buy;limit;20;15500;9;200;0;0
31/12/2020;15:09:11;PETR4;sell;limit;30;4516;8;100;0;0
30/12/2020;13:46:53;ABEV3;buy;market;40;45415;10;500;0;0
31/12/2020;13:20:07;TAEE11;buy;limit;10;5465;13;600;0;0
31/12/2020;13:20:07;TAEE11;sell;limit;10;5465;13;600;0;0
""")

df = pd.read_csv(csv, sep=";")

Add the column total_price:

df['total_price'] = df['price'] * df['vol']

df:

         DATE      time    pair  type ordertype  price   cost  fee  vol  margin  misc  total_price
0  30/12/2020  15:11:05   ABEV3   buy    market     10   1000   12  300       0     0         3000
1  30/12/2020  15:11:05   ABEV3  sell    market     15   1000   12  300       0     0         4500
2  30/12/2020  15:10:08    MRV3   buy     limit     20  15500    9  200       0     0         4000
3  31/12/2020  15:09:11   PETR4  sell     limit     30   4516    8  100       0     0         3000
4  30/12/2020  13:46:53   ABEV3   buy    market     40  45415   10  500       0     0        20000
5  31/12/2020  13:20:07  TAEE11   buy     limit     10   5465   13  600       0     0         6000
6  31/12/2020  13:20:07  TAEE11  sell     limit     10   5465   13  600       0     0         6000

We can start by grouping and calculate the average price per asset per type and day, assuming that 'active' is the pair, and make a sum() of total_price and vol and average.

df_tmp = df.groupby(['pair','type', 'DATE'], as_index=False).agg({'total_price': 'sum', 'vol': 'sum'})
df_tmp['mean_price'] = df_tmp['total_price'] / df_tmp['vol']
df_tmp.drop(['vol', 'total_price'], axis=1, inplace=True) # drop das colunas visto que ja nao precisamos

df_tmp:

     pair  type        DATE  mean_price
0   ABEV3   buy  30/12/2020       28.75
1   ABEV3  sell  30/12/2020       15.00
2    MRV3   buy  30/12/2020       20.00
3   PETR4  sell  31/12/2020       30.00
4  TAEE11   buy  31/12/2020       10.00
5  TAEE11  sell  31/12/2020       10.00

As for the second theme commented we can change the values of the type column, since we will not use them for this problem, with "SHORT" or "LONG" depending on the requested condition:

df_tmp['type'] = df_tmp['type'].apply(lambda t: "SHORT" if t == 'sell' else 'LONG')

df_tmp:

     pair   type        DATE  mean_price
0   ABEV3   LONG  30/12/2020       28.75
1   ABEV3  SHORT  30/12/2020       15.00
2    MRV3   LONG  30/12/2020       20.00
3   PETR4  SHORT  31/12/2020       30.00
4  TAEE11   LONG  31/12/2020       10.00
5  TAEE11  SHORT  31/12/2020       10.00

And finally, convert them to the requested list:

lista = df_tmp.values.tolist()

list:

[['ABEV3', 'LONG', '30/12/2020', 28.75], ['ABEV3', 'SHORT', '30/12/2020', 15.0], ['MRV3', 'LONG', '30/12/2020', 20.0], ['PETR4', 'SHORT', '31/12/2020', 30.0], ['TAEE11', 'LONG', '31/12/2020', 10.0], ['TAEE11', 'SHORT', '31/12/2020', 10.0]]
  • Oi miguel, the average price in this case series the "mean_price"/volume traded. It is an average price weighted by volume.

  • What do you mean? For example, for the first line ABEV3 in which the mean_price is 11500, will 11500/300? Can you explain this calculation with an example from the original df pf

  • exactly that!

  • Pedro, I edited the answer, confers pf is the intended

  • I think you’re adding something up. The correct result for ABEV3 would be 28.75. line 1 + line 2 = (10 x 300) + (40 x 500) = 23000/(300+500) = 28,75

  • Pedro, reply edited, confers pf

Show 1 more comment

Browser other questions tagged

You are not signed in. Login or sign up in order to post.