Posts by Thyago Oliveira Pereira • 49 points
5 posts
- 1 vote • 1 answer • 85 views
Q: Delete words from a file
I have a CSV file and this list of words: https://gist.github.com/alopes/5358189 The CSV file has 3 columns (Text, user.name and Class) and about 100K rows. I need to delete…
- 1 vote • 2 answers • 2008 views
A: How to pre-process a text for the application in the Weka classification algorithms in Java?
So, man, I’m doing something similar and ran into the same problem. I collected the tweets with Python and saved them to a JSON file, but when I tried to read the JSON in Weka, it was not recognized. I…
- 0 votes • 1 answer • 128 views
Q: Problem with a stopwords list in Weka
Hello, I have the following problem: I’m trying to apply a custom stopword list to a Weka filter and it gives me the following error: The list is a txt file I got from this site:…
weka • asked by Thyago Oliveira Pereira • 49 points
- 0 votes • 1 answer • 108 views
Q: Delete single quotes, double quotes, commas, line breaks and records with the same value in a field in MongoDB
I have a MongoDB collection of tweets. Each record has a field called text. I need to delete records that have the same value in this field, and also remove single quotes, double quotes, commas and…
mongodb • asked by Thyago Oliveira Pereira • 49 points
- 2 votes • 2 answers • 1499 views
Q: Remove all line breaks from just one column of a CSV file on Linux
I have a CSV file with more than 500k lines and need to efficiently remove line breaks from just one column, as well as delete all the links it contains. A snippet of the file:…