In fact, what you want is impossible. What happens is that you are creating a terminal node "DET -> lista_det.txt", so the parser will look for the literal token lista_det.txt as the terminal of the non-terminal Det instead of reading the words in the list. Try creating a .cfg or .fcfg file with the separate elements and then loading it from a script; it will be easier.
For example:
I create a file called Tester.fcfg, containing some grammar rules and lexical items annotated with features, and a script called x.py.
My script contains:
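from nltk import parse
# Read a sentence (or a single word) from the user
sent = input('Type a sentence or word: ')
# Load the feature grammar; the name must match the file Tester.fcfg exactly
# on case-sensitive systems. trace=2 prints the chart parser's steps.
cp = parse.load_parser('Tester.fcfg', trace=2)
tokens = sent.split()
trees = cp.parse(tokens)
# Print and draw every parse tree that was found
for tree in trees:
    print(tree)
    tree.draw()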
And in the file Tester.fcfg:
## Grammar rules ##
Sentence -> SD[AGR=?a] SV[AGR=?a]
Sentence -> SD[AGR=?a]
Sentence -> SV[AGR=?a]
Sentence -> Nome
Sentence -> Verbo
Sentence -> PP[AGR=?a]
Sentence -> Pro[AGR=?a]
Sentence -> Pro[AGR=?a] SV[AGR=?a]
Sentence -> P[AGR=?a]
Sentence -> P[AGR=?a] N[AGR=?a] | P N
Sentence -> VBar
Sentence -> SD SV
SN[AGR=?a] -> SD[AGR=?a] | N[AGR=?a] | SD[AGR=?a] PP[AGR=?a] | N[AGR=?a]
SD[AGR=?a] -> Det[AGR=?a] N[AGR=?a] | Det[AGR=?a] | PP[AGR=?a] N[AGR=?a] | Det N
PP[AGR=?a] -> P[AGR=?a] SN[AGR=?a]
SV[AGR=?a] -> V[AGR=?a] SN[AGR=?a] | V[AGR=?a] PP[AGR=?a] SN[AGR=?a] | VBar
VBar -> Pro[AGR=?a] SV[AGR=?a] | Pro[AGR=?a] V[AGR=?a]
Nome -> N
Verbo -> V
## Lexical features ##
Det[AGR=[NUM='sg', GND='f'],CAT =[Cat='Artigo']] -> 'a' | 'da' | 'na'
Det[AGR=[NUM='pl', GND='f'], CAT =[Cat='Artigo']] -> 'as' | 'nas'
Det[AGR=[NUM='sg', GND='m'], CAT =[Cat='Artigo']]-> 'o' | 'de' | 'no' | 'um'
Det[AGR=[NUM='pl', GND='m'], CAT =[Cat='Artigo']]-> 'os' | 'nos'
Pro[AGR=[NUM='sg', GND='m', PERS='3']]-> 'ele'
Pro[AGR=[NUM='sg', GND='m', PERS='1']]-> 'eu'
P[AGR=[NUM='sg', GND='m', PERS='3'], CAT =[Cat= 'Pronome', SubCat= Demonstrativo]] -> 'este' | 'aquele' | 'esse'
P[AGR=[NUM='pl', GND='m', PERS='3']] -> 'estes' | 'aqueles' | 'esses'
P[AGR=[NUM='sg', GND='f', PERS='3']] -> 'esta' | 'aquela' | 'essa'
P[AGR=[NUM='pl', GND='f', PERS='3']] -> 'estas' | 'aquelas' | 'essas'
N[AGR=[NUM='sg', GND='f'], CAT =[Cat='Substantivo', SubCAT='Comum']] -> 'biblioteca' | 'doutora' | 'leoa' | 'livraria' | 'professora' | 'lavadeira' | 'aluna' | 'madre' | 'menina' | 'mae' | 'mulher' | 'dentista' | 'juiza'
N[AGR=[NUM='pl', GND='f'], CAT =[Cat='Substantivo', SubCAT='Comum']]-> 'doutoras' | 'meninas' | 'mulheres' | 'juizas' | 'bola' | 'pata'
N[AGR=[NUM='sg', GND='m'],CAT =[Cat='Substantivo', SubCAT='Comum']] -> 'menino' | 'homem' | 'juiz' | 'doutor' | 'professor' | 'livro' | 'carro' | 'jogador'
N[AGR=[NUM='sg', GND='m'], SEMANTICA=[ ANI='animal']]-> 'pato' | 'cachorro' | 'gato'
N[AGR=[NUM='sg', GND='m'],CAT =['Substantivo Proprio'], SEMANTICA=[ ANI='humano']]-> 'Pedro' | 'Carlos' | 'Henrique'
N[AGR=[NUM='sg', GND='f'], CAT =['Substantivo Proprio'], SEMANTICA=[ ANI='humano']]-> 'Maria' | 'Veronica' | 'Lara' | 'Carla'
N[AGR=[NUM='pl', GND='m']] -> 'meninos' | 'homens' | 'livros' | 'carros'
N[AGR=[NUM='sg', GND='n']] -> 'estudante' | 'piloto' | 'presidente' | 'jornalista' | 'jogadora' | 'jornal'
N[AGR=[NUM='pl', GND='n']] -> 'estudantes' | 'pilotos' | 'presidentes' | 'jornalistas'
V[AGR=[NUM='sg'], CAT =['Verbo'], CP=['presente do indicativo']] -> 'comprar' | 'compra' | 'comprou' | 'pegar' | 'pegou' | 'ler' | 'leu' | 'ama' | 'amo' | 'amar' | 'jogar' | 'entrou' | 'amor'
V[AGR=[NUM='sg'], CAT =[Cat='Verbo', SubCat = ' Ligacao e adicao'], CP=['presente do indicativo']] -> 'e'
Note that the script loads both the lexical items and the grammar rules from that same file. The real question is which linguistic model you are following (here, features organized as AVMs [attribute-value matrices]) and what kind of computational implementation you want.
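To make the role of the AGR feature more concrete, here is a minimal sketch that assumes NLTK is installed. The mini-grammar is a hypothetical reduction of the Tester.fcfg rules, and the helper name accepts is mine, not part of NLTK. It only illustrates that a shared variable such as ?a forces the determiner and the noun to agree.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# Hypothetical mini-grammar, reduced from the Tester.fcfg rules for illustration.
MINI_GRAMMAR = """
Sentence -> SD[AGR=?a]
SD[AGR=?a] -> Det[AGR=?a] N[AGR=?a]
Det[AGR=[NUM='sg', GND='m']] -> 'o'
Det[AGR=[NUM='pl', GND='m']] -> 'os'
N[AGR=[NUM='sg', GND='m']] -> 'menino'
N[AGR=[NUM='pl', GND='m']] -> 'meninos'
"""

parser = FeatureChartParser(FeatureGrammar.fromstring(MINI_GRAMMAR))


def accepts(sentence):
    """Return True if the mini-grammar yields at least one parse tree."""
    return any(True for _ in parser.parse(sentence.split()))


print(accepts('o menino'))    # determiner and noun are both singular
print(accepts('os meninos'))  # determiner and noun are both plural
print(accepts('o meninos'))   # singular determiner with a plural noun
```

With this grammar, the first two phrases should be accepted and the third rejected, because ?a cannot be bound to a singular and a plural AGR value at the same time.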
I don't know if that is exactly it, but from what I have seen, you are trying to build, on top of a corpus, ways of tagging and parsing it. See the NLTK documentation, plus some books, for more help.
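If the word lists really have to stay in separate files, one possible workaround is to generate the lexical rules from them and join them to the hand-written rules before loading the grammar. The sketch below uses only plain Python. The function names are hypothetical, and it assumes one word per line in each list file and no apostrophes inside the words. In-memory lists stand in for lista_det.txt and lista_n.txt.

```python
def lexical_rule(nonterminal, words):
    """Build one grammar line such as: DET -> 'o' | 'a'."""
    quoted = " | ".join("'{}'".format(word) for word in words)
    return "{} -> {}".format(nonterminal, quoted)


def read_word_list(path):
    """Read one word per line, skipping blank lines (assumed file layout)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def build_grammar(phrase_rules, word_lists):
    """Join hand-written phrase rules with generated lexical rules."""
    lines = list(phrase_rules)
    for nonterminal, words in word_lists.items():
        lines.append(lexical_rule(nonterminal, words))
    return "\n".join(lines)


# With real files, read_word_list('lista_det.txt') would supply these lists.
grammar_text = build_grammar(
    ["NP -> DET N"],
    {"DET": ["o", "a"], "N": ["menino", "menina"]},
)
print(grammar_text)
```

The resulting text can then be saved to a .cfg file or passed to nltk.CFG.fromstring. This sketch produces a plain CFG only; features such as AGR would still have to be added to each entry by some other means.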
Can you explain more clearly what you mean by "grammar"? What is "opening a text", and what is "my non-terminal"?
– Sidon
Thanks for your patience, I edited my question. I don't express myself very well... Open a text: the input, a text that will be parsed. Pass a grammar: write rules for the program to find in the text (e.g. NP -> DET N must find all DET N sequences in the text). Non-terminals: DET -> lista_det.txt, N -> lista_n.txt, V -> lista.txt
– pitanga
Sorry, I still don't understand. What is the main problem? What can't you do? Is the problem reading the files lista_det.txt, lista_n.txt and lista.txt? Is that it?
– Sidon
A grammar is defined by non-terminals (N, DET, V...) and terminals ('o', 'home', 'is'...). The terminals are my lists, but it is not possible to place a list inside a grammar (I need these lists because there are many entries and they will certainly grow, so it is hard to put them in a script). I need to read these files so I can apply the grammar rules to a text (e.g. NP -> DET N will find all DET N sequences in the text). Sorry, don't bother. The truth is I'm a beginner and what I need to do is more complex, I think. Thank you so much for trying to help!! :)
– pitanga