What strategy can identify a correct answer without an exact string comparison?

Viewed 316 times

0

I have a C# program that works like a questionnaire, but instead of multiple-choice alternatives, some questions require the user to type a response. The answer to each question is stored in a database, and at the moment the program only accepts an answer that contains the exact string saved in the database.

How could I change that? I would like to keep using the same answers from the database, but I need a margin of tolerance for the typed answers.

  • You can use fuzzy-matching algorithms, for example Levenshtein distance. For something more robust: http://lucenenet.apache.org/

2 answers

2

The algorithm @Bruno mentioned in the comments (Levenshtein distance) is a good way to measure the similarity of two strings. There is another, slightly more robust one called Damerau-Levenshtein, which also counts the transposition of two adjacent characters as a single edit - that is, it takes some simple typing mistakes into account.
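
As a rough illustration of the difference between the two, here is a minimal sketch. It is written in Python only for brevity (the same loops port directly to C#), the function names are made up for this example, and `osa_distance` implements the restricted "optimal string alignment" variant of Damerau-Levenshtein rather than the unrestricted algorithm.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def osa_distance(a: str, b: str) -> int:
    """Like levenshtein, but swapping two adjacent characters costs 1 edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            # Adjacent transposition, e.g. "er" typed as "re".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

For the typo "Hejlsbreg" against "Hejlsberg", `levenshtein` counts two substitutions, while `osa_distance` counts a single transposition.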

But I suggest rethinking the questionnaire design.

Fuzzy search and string-similarity calculations lead to a poor user experience in this case. Let’s say we use Levenshtein’s algorithm and decide that the answer given by the user may differ from the answer in the database by at most 10 characters.

What if my answer differs by 11 characters? Is it necessarily wrong? Why is an answer that differs by 10 characters correct, while mine is not?

Also, these algorithms only tell us how many characters are different - they don’t tell us which ones, or what they mean. I can add 15 characters to an answer without changing its meaning - but I can also add just one comma and radically change its meaning.
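
A small sketch makes both objections concrete. It is in Python for brevity; `MAX_DISTANCE`, `is_accepted` and the sample answers are hypothetical and only mirror the 10-character cutoff used as an example above.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


MAX_DISTANCE = 10  # hypothetical cutoff, as in the example above


def is_accepted(typed: str, stored: str, max_distance: int = MAX_DISTANCE) -> bool:
    """Accept the typed answer if it is within max_distance edits of the stored one."""
    return levenshtein(typed, stored) <= max_distance


stored = "Anders Hejlsberg"

# Same meaning, 7 extra characters: accepted.
print(is_accepted("It was Anders Hejlsberg", stored))
# Same meaning, 15 extra characters: rejected by the arbitrary cutoff.
print(is_accepted("I think it was Anders Hejlsberg", stored))
# Opposite meaning, only 4 extra characters: accepted.
print(is_accepted("Not Anders Hejlsberg", stored))
# One comma changes the meaning, yet the distance is just 1.
print(levenshtein("No price too high", "No, price too high"))
```

The threshold accepts or rejects purely on edit count, so the second answer fails while the third one passes.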

It is for these reasons that most computerised questionnaires are multiple choice - and questionnaires with open answers are usually analyzed manually by a human being.

0

  • If the question is "Who was one of the designers of the C# language?" and I reply "Alligator anta cobra Anders(space)(space)(space)Hejlsberg duck", is this correct?

  • ? @Marcelouchimura

  • If the answer the OP refers to is ABC and someone answers XYAXBC, is that answer right or wrong, in your opinion? And what if someone answers with a synonym for ABC, say K?

  • As I said, it depends on the project. I gave some ideas. The answer to 'What is the capital of Brazil?' can be 'Brasília', 'Brasilia', 'BRASILIA', 'brasilia', etc. Personally, I think any of these answers should be accepted, and for that one can use simple tools such as stripping accents and converting to lowercase. It depends on what kind of question we’re dealing with.

  • If one replies "Salvador at first, then Rio de Janeiro, and currently Bvrasi1ia", is it incorrect? Because, just as those who ask do not know what to expect from an answer, the person who answers does not know exactly what to answer.

  • I’m not the one designing the system, but in my opinion this answer would be wrong, because the question asks what the capital of Brazil is. Besides, it is certainly not Bvrasi1ia. I still do not understand your point.
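
The "simple tools" mentioned in the comments above (stripping accents and converting to lowercase) could look roughly like the sketch below. It is in Python for brevity, `normalize` and `same_answer` are hypothetical names, and collapsing repeated whitespace is an extra assumption not stated in the thread.

```python
import unicodedata


def normalize(answer: str) -> str:
    """Strip accents, ignore case and collapse runs of whitespace."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", answer)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # casefold() is a more aggressive lower(); split/join trims and collapses spaces.
    return " ".join(without_accents.casefold().split())


def same_answer(typed: str, stored: str) -> bool:
    """Compare two answers after normalization; no fuzzy matching involved."""
    return normalize(typed) == normalize(stored)
```

With this, 'Brasília', 'Brasilia' and 'BRASILIA' all compare equal to each other, while 'Bvrasi1ia' still does not match, because normalization alone forgives no typos.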
