How to store results of statistical calculations (mean, correlation) for later use in graphs?

[![Example of the Relevance Matrix][1]][1]

I want to create a chart known as a "Relevance Matrix". The idea is to place each KPI (performance indicator) on the x-axis, where the KPI is the correlation [corr()] of a metric with the average General Satisfaction [SG]. This is the code I have so far:

    y = df['SG'].mean()
    x = df['Compra_Futura']
    z = df['Recomendação']
    plt.scatter(x, y, s=10, c='green')
    plt.scatter(z, y, s=20, c='blue')
    plt.xlabel('Grau de Importância (r)')
    plt.ylabel('Satisfação Geral (pts)')
    plt.title('Matriz de Relevância')
    plt.show()

However, it raises a ValueError:

    ValueError                                Traceback (most recent call last)
    <ipython-input-14-099a22cf1d51> in <module>()
          4 x = df['Compra_Futura']
          5 z = df['Recomendação']
    ----> 6 plt.scatter(x, y, s=10, c='green')
          7 plt.scatter(z, y, s=20, c='blue')
          8 plt.xlabel('Grau de Importância (r)')
    ~\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, **kwargs)
       4241         y = np.ma.ravel(y)
       4242         if x.size != y.size:
    -> 4243             raise ValueError("x and y must be the same size")
       4244
       4245         if s is None:

    ValueError: x and y must be the same size


  [1]: https://i.stack.imgur.com/1XURD.png
  • The error says that x and y are not the same size.

  • I know that; what I am looking for is how to fix it.

  • You are trying to plot every element of the Compra_Futura column against the mean of the SG column. The mean is a single value, while the other column contains one value per row, and that is the cause of the error. To fix it, make both have the same size. I suggest debugging your code and checking what each variable holds.

  • I have already found the solution: store the results of the calculations in a list, and then use them when drawing the graph. As soon as I finish the code, I will post the result here. Thank you!
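The debugging step suggested in the comments can be sketched as follows. This is only an illustration: the data values are made up, and only the column names `SG` and `Compra_Futura` come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "SG": [7, 8, 9, 6],
    "Compra_Futura": [6, 8, 9, 5],
})

x = df["Compra_Futura"]  # a Series with one value per row
y = df["SG"].mean()      # a single scalar

# plt.scatter requires x and y to have the same number of elements.
x_size = np.size(x)
y_size = np.size(y)
print(x_size, y_size)    # prints "4 1"
```

Because the two sizes differ, `plt.scatter(x, y)` raises the `ValueError` shown in the traceback.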

1 answer

To give y the required size, build it with a list comprehension as in the code below, assuming you want to fix y at the mean of the SG column:

    y = [df['SG'].mean() for k in range(0, len(df['SG']))]

NOTE: This fixes the error shown, but from a statistical point of view, if what you want is the correlation between the variables, it is probably not appropriate to correlate a variable with the mean of SG. Correlate it with the individual values of SG instead.
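One possible way to follow that note and keep the results for later plotting is sketched below. It rests on several assumptions that are not in the question: the data values are invented, the y-axis is taken to be each metric's own mean score, and `results`, `metrics` and `plot_relevance_matrix` are illustrative names.

```python
import pandas as pd

# Hypothetical survey data; only the column names come from the question.
df = pd.DataFrame({
    "SG": [7, 8, 9, 6, 10, 8],
    "Compra_Futura": [6, 8, 9, 5, 10, 7],
    "Recomendação": [7, 7, 10, 6, 9, 9],
})

metrics = ["Compra_Futura", "Recomendação"]

# Store one row per metric: its correlation with the individual SG values
# (not with the SG mean) and its own mean score (an assumed choice of y).
results = pd.DataFrame(
    {
        "r": [df[m].corr(df["SG"]) for m in metrics],
        "mean": [df[m].mean() for m in metrics],
    },
    index=metrics,
)


def plot_relevance_matrix(results):
    """Draw one labelled point per metric from the stored results."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(results["r"], results["mean"])
    for name, row in results.iterrows():
        ax.annotate(name, (row["r"], row["mean"]))
    ax.set_xlabel("Grau de Importância (r)")
    ax.set_ylabel("Satisfação Geral (pts)")
    ax.set_title("Matriz de Relevância")
    return ax
```

Calling `plot_relevance_matrix(results)` followed by `plt.show()` draws the chart. Since `results` has exactly one x value and one y value per metric, the size mismatch from the question cannot occur.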
