Iramuteq has a very common analysis, the Descending Hierarchical Classification (CHD), also known as Reinert's method.
Here is an example of it: https://www.youtube.com/watch?v=H9xliY7Zy40
It groups similar texts together, creating new "sections" (similar to factor analysis).
- How can I do the same in R?
- Is there an English name for it?
Here is an example:
The analysis begins with a textual corpus:
library(quanteda)
dfm1 <- dfm(data_corpus_irishbudget2010)
and from it (I imagine) the CHD is produced, with the following main results:
It classifies the corpus/text into different sections (classes), similar to the factors in factor analysis. For example, 30.4% of the text was classified in Class 4 and 28.4% in Class 3 (another subject). The classes have a practical interpretation, based on the most frequent words that appear in each class, which are presented in another very common table below.
Each class is named according to the words/attributes that appear in it. For example, Class 4 is about nutritional aspects, because the words that appear together are basically foods.
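To make the interpretation step above concrete, here is a minimal base R sketch of how class shares and characteristic words could be computed once each text segment has a class. It is not Iramuteq's implementation: the toy segments, the `classes` vector, and the helper `term_chi2` are all hypothetical, and the signed chi-square (positive when a term is over-represented in the class) is an assumption about how such a table can be ranked.

```r
# Toy data: six short text segments, each with a hypothetical class label.
segments <- c(
  "rice beans meat salad",
  "fruit salad rice juice",
  "meat beans fruit",
  "doctor hospital exam",
  "hospital nurse doctor medicine",
  "exam medicine nurse"
)
classes <- c(1, 1, 1, 2, 2, 2)  # assumed output of some classification step

# Share of segments per class (the percentage reported for each class).
class_share <- round(100 * prop.table(table(classes)), 1)

# Binary segment-by-term presence matrix.
tokens_list <- strsplit(segments, " ")
vocab <- sort(unique(unlist(tokens_list)))
presence <- t(sapply(tokens_list, function(tk) as.integer(vocab %in% tk)))
colnames(presence) <- vocab

# Signed chi-square of a term against membership in one class, from the
# 2x2 table (term present/absent x segment in class/not in class).
term_chi2 <- function(term, cls) {
  in_class <- classes == cls
  has_term <- presence[, term] == 1
  tab <- table(factor(has_term, levels = c(FALSE, TRUE)),
               factor(in_class, levels = c(FALSE, TRUE)))
  test <- suppressWarnings(chisq.test(tab, correct = FALSE))
  # Positive when the term occurs in the class more often than expected.
  sign(tab["TRUE", "TRUE"] - test$expected["TRUE", "TRUE"]) * test$statistic[[1]]
}

# Terms ranked by association with class 1; the top ones would name the class.
chi2_class1 <- sort(sapply(vocab, term_chi2, cls = 1), decreasing = TRUE)
print(class_share)
print(round(chi2_class1, 2))
```

With these toy segments the food words rank at the top for class 1, which is the kind of evidence used to label a class as "nutritional aspects".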
Does it need to be this hierarchical classification or can it be another method? About Hierarchical Clustering in R, see ?hclust. – Tomás Barcellos
The thing is that I’m going to apply the technique to data that was collected on a lace table, that is, text. I even thought of calculating some distance from the dfm that is generated in text mining and then applying hclust, but it’s not the same. This technique, as I understand it, groups the texts into sections and then reports which words are associated with each section based on the chi-square statistic. – Guilherme Parreira
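For reference, the distance-plus-hclust idea mentioned in this comment could look like the base R sketch below. As the comment says, it is not the same as Reinert's method: hclust is agglomerative, while CHD splits the corpus top-down. The toy documents, the cosine distance, and the "average" linkage are all assumptions chosen for illustration.

```r
# Toy documents (hypothetical); in practice the matrix would come from a dfm.
docs <- c(
  d1 = "rice beans meat salad rice",
  d2 = "salad fruit rice beans",
  d3 = "meat fruit beans salad",
  d4 = "doctor hospital exam nurse",
  d5 = "hospital nurse medicine doctor",
  d6 = "exam medicine doctor hospital"
)

# Document-term count matrix built by hand (rows = documents, cols = terms).
tokens_list <- strsplit(docs, " ")
vocab <- sort(unique(unlist(tokens_list)))
dtm <- t(vapply(tokens_list,
                function(tk) as.numeric(table(factor(tk, levels = vocab))),
                numeric(length(vocab))))
colnames(dtm) <- vocab

# Cosine distance between documents.
row_norm <- sqrt(rowSums(dtm^2))
cosine_sim <- (dtm %*% t(dtm)) / (row_norm %o% row_norm)
doc_dist <- as.dist(1 - cosine_sim)

# Agglomerative hierarchical clustering, then cut the tree into two groups.
tree <- hclust(doc_dist, method = "average")
groups <- cutree(tree, k = 2)
print(groups)
```

The resulting groups can then be passed to a chi-square test per word to see which terms characterise each group, which is the part of the Iramuteq output this approach does not produce by itself.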
Try to improve your question by offering a reproducible example and, if possible, the expected result.
– Tomás Barcellos
I edited it, @Tomásbarcellos, but I don’t know much more than that.
– Guilherme Parreira
OP, I’m not very familiar with this area, but I’ve used a text classification algorithm that does something similar to what you’re asking. Take a look at this chapter here.
– JdeMello
Thank you @Jdemello, I took your advice and managed to finish the analysis with this type of modeling. Searching by the name in the question title, I found nothing in R.
– Guilherme Parreira
A package for this should show up soon: https://juba.github.io/rainette/
– bbiasi