I have an executable that needs to sweep a giant database. Since the processing time would be unviable in a single thread, and following the idea of this question, I decided to create an executable that runs N tasks in parallel... so basically I always do the same thing, just for different time intervals:
static void Main(string[] args)
{
    var dataInicio = DateTime.Now.AddDays(-35);
    var dataFim = DateTime.Now;
    var listaTasks = new List<Task>();

    // Walk the interval one day at a time, in whichever direction it runs
    while (dataInicio > dataFim ? dataInicio >= dataFim : dataInicio <= dataFim)
    {
        var d = dataInicio.Date;

        var meuBll = new MeuBll();
        listaTasks.Add(Task.Run(() =>
        {
            meuBll.Foo(d);
        }).ContinueWith(x => meuBll.Dispose())
          .ContinueWith(x => GC.Collect()));

        var meuBllNovo = new MeuBll();
        listaTasks.Add(Task.Run(() =>
        {
            meuBllNovo.Boo(d);
        }).ContinueWith(x => meuBllNovo.Dispose())
          .ContinueWith(x => GC.Collect()));

        // Two tasks per day, five days per batch
        if (listaTasks.Count >= 2 * 5)
        {
            Task.WaitAll(listaTasks.ToArray());
            listaTasks.Clear();
        }

        dataInicio = dataInicio > dataFim ? dataInicio.AddDays(-1) : dataInicio.AddDays(1);
    }

    Task.WaitAll(listaTasks.ToArray());
}
So basically I chose to run, for example, a 10-day interval at once, as you can see in if (listaTasks.Count >= 2 * 5). The problem is that my methods Foo and Boo each open several connections to a single Oracle database...
public class MeuBll : IDisposable
{
    private readonly OracleDal _oracleDal = new OracleDal();

    // Fields referenced below (types assumed for this excerpt)
    private List<MyObject> _listaSucesso;
    private List<MyObject> _listaFalhas;
    private Hashtable _hash;

    public void Foo(DateTime data)
    {
        var t1 = Task.Run(() =>
        {
            _listaSucesso = _oracleDal.ObterListaSucesso();
        });
        var t2 = Task.Run(() =>
        {
            _listaFalhas = _oracleDal.ObterListaFalhas();
        });
        t1.Wait();
        t2.Wait();

        var listaDeMerge = new List<MyObject>();
        foreach (var sucesso in _listaSucesso)
        {
            // some logic to add objects to a "merge list"
        }

        if (listaDeMerge.Any())
            Task.Run(() => _oracleDal.MergearLista(listaDeMerge)).Wait();
    }

    public void Dispose()
    {
        if (_hash != null)
            _hash.Clear();
        _hash = null;
    }
}
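As written, my Dispose only nulls the hash and never touches the database resources. A version that actually released them might look like the sketch below; it assumes OracleDal holds the connection and itself implements IDisposable, which is not guaranteed in my current code:

    public void Dispose()
    {
        if (_hash != null)
        {
            _hash.Clear();
            _hash = null;
        }

        // Actually release the underlying connection instead of
        // waiting for the GC to finalize it (assumes OracleDal : IDisposable)
        if (_oracleDal != null)
            _oracleDal.Dispose();
    }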
And my method of merging:
public void MergearLista(List<MyObject> listaMerge)
{
    using (var conn = new OracleConnection(Connection.ConnectionString))
    {
        if (conn.State != ConnectionState.Open)
            conn.Open();

        using (var oCommand = conn.CreateCommand())
        {
            oCommand.CommandType = CommandType.Text;
            // {0} receives the formatted date; the format argument is omitted in this excerpt
            oCommand.CommandText = string.Format(@"
                MERGE INTO MyTable dgn
                USING (select id from another_table where field = :xPe) d
                   ON (TO_CHAR(dateHappen, 'DDMMYYYY') = {0} and id = :xId)
                 WHEN MATCHED THEN
                      UPDATE SET OK = :xOk, dateHappen = SYSDATE
                 WHEN NOT MATCHED THEN
                      INSERT (fields....)
                      VALUES (values...)");

            oCommand.BindByName = true;
            oCommand.ArrayBindCount = listaMerge.Count;
            oCommand.Parameters.Add(":xId", OracleDbType.Int32,
                listaMerge.Select(c => Convert.ToInt32(c.Id)).ToArray(), ParameterDirection.Input);
            oCommand.Parameters.Add(":xPe", OracleDbType.Varchar2,
                listaMerge.Select(c => Convert.ToString(c.Xpe)).ToArray(), ParameterDirection.Input);
            oCommand.ExecuteNonQuery();
        }
    }
}
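One thing I have been considering (not in the code yet) is wrapping the array-bound MERGE in an explicit transaction, so a batch either fully applies or rolls back instead of leaving half-merged days behind. A minimal sketch, assuming the same command setup as above:

    using (var conn = new OracleConnection(Connection.ConnectionString))
    {
        conn.Open();
        using (var transaction = conn.BeginTransaction())
        using (var oCommand = conn.CreateCommand())
        {
            // ... same CommandText, BindByName, ArrayBindCount and
            //     parameters as in MergearLista above ...
            try
            {
                oCommand.ExecuteNonQuery();
                transaction.Commit();
            }
            catch (OracleException)
            {
                transaction.Rollback(); // no half-applied batch left behind
                throw;
            }
        }
    }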
The problem is that each day of data takes around 2 hours to process... and there is a daily routine that backs up the database, taking it offline for about 10 minutes, which makes my executable throw some Exception or simply lock some table...
What do I do today? I stop the executable by hand and start it again, skipping the days it has already processed... Also, many connections simply stay open... so I also kill those sessions by hand...
How do I fix this? Is there a way to force all open connections to simply close? To force Dispose on them?
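What I have in mind is something like the sketch below: a retry around the backup window, plus ODP.NET's static OracleConnection.ClearAllPools(), which flushes the pooled connections so a dead one is not reused after the outage. ProcessarDiaComRetry is a hypothetical helper, and the retry count and delay are arbitrary assumptions:

    using System;
    using System.Threading;
    using Oracle.ManagedDataAccess.Client;

    // Hypothetical wrapper around one day of processing
    static void ProcessarDiaComRetry(DateTime dia, int tentativas = 3)
    {
        for (var i = 1; ; i++)
        {
            try
            {
                using (var meuBll = new MeuBll())
                    meuBll.Foo(dia);
                return;
            }
            catch (OracleException) when (i < tentativas)
            {
                // The database may be down for the ~10 minute backup:
                // drop every pooled connection and wait before retrying
                OracleConnection.ClearAllPools();
                Thread.Sleep(TimeSpan.FromMinutes(10));
            }
        }
    }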
I have not read your question yet, but one thing caught my eye and I am telling you to read this first of all: http://answall.com/q/110854/101
– Maniero
I’ll read it! But the thing is, I was running out of memory due to the large amount of data... I saw no alternative but to force the GC once I no longer needed my objects...
– Marllon Nasser
If you’re running out of memory, there are two ways out: add more memory (which doesn’t always solve it) or change the algorithm. I don’t know what the problem is yet, but probably the code as a whole has to be rethought. The problem is not the garbage collection, it is the garbage being produced. Do not treat the symptom, cure the disease.
– Maniero
@bigown, yes... it was running out of memory, then I improved the algorithm and it apparently got better... but by then I had already implemented the forced GC... and did not touch it again. The problem is calling Dispose on all the open connections...
– Marllon Nasser
About dispose(): http://answall.com/q/22284/101, http://answall.com/q/102609/101 and http://answall.com/q/98341/101. I did not understand its function in this code. Did you try writing this algorithm without Task first, getting it to work, and only then applying Task? That is almost always the right way to go, except, maybe, when you have mastered the use of Task. There are several code snippets that don’t make any sense. Hence I conclude that if even the simple things confuse you, imagine the complicated part.
– Maniero
Its purpose is to set all objects to null so that the GC is forced to clean them up... The code already exists today without Task, it just takes too long to process.
– Marllon Nasser
But that is exactly what it cannot do. That alone may already be part of the problem. If you profile the application you will probably find that what is killing it is the GC working like crazy. I have very little experience with code using Task, but I’ve never seen it used this way. I’m not saying it’s wrong, I can’t say that, but it seems strange to me. Hard to understand, at the very least. I don’t even know if it’s worth having the tasks at all (just speculating). Do you at least know where the bottleneck is? It could be the DB.
– Maniero
From what I understand, what exists today has all these problems. Clean them up and then think about Task. You must have read this: http://answall.com/q/1946/101. I think your case is a wrong algorithm. Task can even improve things, but it will not solve anything. Maybe when you find the solution, Task will be unnecessary. Almost all performance problems are algorithmic, not a lack of concurrency. Start by finding out, in detail, where you are slow.
– Maniero
@bigown, the problem is definitely in the database... basically I deal with tables with more than 50 million records, where joins (most of the time on VARCHAR columns) with other giant tables are often necessary... It is really a well-known problem... we need to, and will, remodel the database... But until then, this algorithm is what has helped... the biggest problem with it is controlling all the connections... I guess there’s no way to do that :(
– Marllon Nasser
If you do it right it might drop to a few minutes, even with a single thread.
– Maniero
Man, bigown... you are a guy who knows a lot, and I have great respect and admiration for the work you do for this community... But I am of the opinion that there will always be a better way to do things... and sometimes we need to work with what we have, especially when we are "fighting" against time... tight deadlines, etc... and this is a large application that we were simply parachuted into in the middle of development; it is difficult to think about refactoring... (in my case, it would be refactoring the refactoring)... Basically I need to sweep the entire database "by yesterday". It is complicated, heh.
– Marllon Nasser
Yeah, sure, but a lot of times the deadline becomes a problem precisely because you’re fighting with the wrong thing. My final-year college project was solving a problem that was so slow it took about 80 days to complete. A team had been working on it for 6 months and could not solve it. I got it down to 1.5 minutes (on current computers it would be something around 1 second). I’m saying this not to claim that I’m good (I don’t think I am); what I did was actually quite simple, and it took me 2 days, most of that spent understanding what they had done. I mean that often solving the problem takes less work than working around it.
– Maniero
I understand... but I don’t know, I don’t know... I think if the database were structured in another way, more than half of the problems would be solved... but as I said, it is complicated... this is what the lack of a DBA leads to... heh.
– Marllon Nasser