Java Multithreaded delete on same sets of table

I had to cleanup database ( few tables with given condition , where columns for conditions are always same ) e.g.

delete from table1 where date < given_date1 and id = given_id
delete from table2 where date < given_date2 and id = given_id

Where given_id and givendate relation varies on both table by table and id by id.

The actual delete condition is not always where date < givendate , I just wrote for example, so say one id has got 300 days of data, and other of 500 days of data, the where condition is allowed to delete oldes 10 days of data where 10 is a variable, based on user input, so at one iteration all nodes are processed with deleting oldest 10 days of data and thus query changes for each id, but the fact is that it would be on same sets of table

earlier that script was written in as sql script and doing its operation but was taking time, Now I have implemented a multithreaded java application where the new code looks like

for(i=0; i < idcount ; i++)
{
   //launch new thread and against that thread call
   delete(date,currentid);
}

function delete(date,id)
{
    delete from table1 where date < given_date and id = given_id
    delete from table2 where date < given_date and id = given_id

}

after implementing this I found deadlock on sql table, which was solved by indexing the tables, but still its not fast as it is supposed to be, If I have 500 threads they are all launched one after other, and obviously running on same sets of table. and sql is not actually executing in parallel on each table ?

When I monitor my java.exe and sqlserver.exe, its not busy at all ? I hope it is supposed to be.

Could anyone tell me what could be best approach to implement multithreaded delete on same sets of table, so that I can bump up the thread and do deletion in parallel and consume available resources

Answers


If all the actions are delete on a given id the I would just do a delete on each table doing all the ids at once.

e.g.

delete from table1 where date < given_date and id in (given_id1, given_id2 ..... )

If there are lots of given_ids the first insert them into a temporary table then execute each delete by joining the table to have deletions with the temporary table

Also if trying to use multiple threads then the improvement is really only expected if you act on a table in a thread so there will not be contention in the database.


Ignoring the problem you created...

Why not use the IN statement?

delete from table1 where date < given_date and id IN (id1, id2, id3, ...)

Update based on clarification: Based on the explanation in the comments, my guess is that you don't have good indexes and every delete statement is resulting in a table scan. Each table scan locks the table and thus the database can only process one statement at a time. Index the date and id columns along with any other column used in the where clause of your delete statement.


In my personal experience, I make a class to manage my queries and the communication with the database. I use a thread pool to manage my threads and simply have the threads make calls to my static database manager. The manager should have a synchronized method in it that acquires a lock() on to the database connection. The threads will then be able to access the database and their actions won't conflict with each other.


If you dont care about making all command in one transaction unit so put the delete in its own transaction (small one).


Need Your Help

Luabind Function using std::string& Reference with pure_out_value policy not possible?

c++ stl boost lua luabind

I'am trying to return a string from a function but it doesn't compile.

How to pair and connect a device programmatically using bluetooth in Android?

android bluetooth

Well there are plenty of question already have been asked on StackOverflow about how to paired &amp; connect a remote bluetooth device with android. I have tried all the of them, haven't find any p...