Is there a way to use Natural Language Processing (NLP) to transform strings into integers?

Asked

Viewed 78 times

0

I saw some examples of codes to transform strings such as "four" into an integer 4, but it was always quite manual. Have some more automatic way to do this using NLP?

One thing I noticed when I did the test below was that the spacyrecognizes both "four" and "4" as numbers, which is a good start, but is there any way to use this to make the transformation from one type to the other? I don’t know much about natural language processing, so I don’t know exactly what the name of the technique that would be responsible for this.

nlp = spacy.load("pt")

doc = nlp("quatro vídeos")
for token in doc:
    print(token.text, token.pos_, token.dep_)
# retorna 
# quatro NUM nummod
# vídeos SYM ROOT

doc = nlp("4 vídeos")
for token in doc:
    print(token.text, token.pos_, token.dep_)
# retorna 
# 4 NUM nummod
# vídeos SYM ROOT

Thank you!

  • A "more robust" and "more consistent" form may have - but "more automatic" I doubt - NLP and other techniques will involve cofiguration and parameterizations that will always be more complex than a dictionary match {'zero': 0, 'um': 1, ...} The reading of larger numbers written by exception ("two thousand one hundred and seventy-three"), I think yes, can be done in a simpler (automatic) way with NLP than with a common heuristica only with if and dicionários. But for a single digit, it certainly won’t be the simplest option (although it might be better for several other reasons)

  • got it. What I did to solve my problem was to create a dictionary, but I would have to be endlessly typing to cover all the possible cases. You can give me an example of these more consistent techniques that actually use NLP?

1 answer

0

I do not believe that NLP is used to solve this problem. The main uses of NLP are for text translation, text generation, caption generation for images... Currently, to solve these problems we use Recurrent Neural Networks (RNN), which are deep neural networks that have a high cost to be trained.

I believe that the simplest way to solve this problem is to actually use a dictionary. In this link there are several examples of python code that solve the problem. You will only need to translate the words from the dictionary into English.

I hope to have helped, in case you have any questions about NLP applications you may ask :)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.