Opened 6 years ago

Closed 5 years ago

#6062 closed defect (fixed)

DRM4G is unstable when thousands of jobs are submitted

Reported by: antonio
Owned by: carlos
Priority: blocker
Milestone: DRM4G-2.6.2
Component: DRM4G
Keywords: drm4g, scheduler
Cc: carlos, minondoa

Description

I have created the following job template:

date.job
EXECUTABLE=/bin/date

and submitted a job with 10000 tasks:

$ drm4g job submit --ntasks 10000 date.job

After all the tasks have been created, gwd becomes quite unstable and gives socket timeouts.

Also, the MAD appears to be broken.

After restarting (drm4g restart), some jobs are submitted, but then it stops executing them.

And the worst part is the log:

-rw-r--r--   1 antonio antonio           0 dic  4 21:03 drm4g_commands.log
-rw-r--r--   1 antonio antonio           0 dic  4 21:03 drm4g_communicator.log
-rw-r--r--   1 antonio antonio           0 dic  4 21:03 drm4g_configure.log
-rw-r--r--   1 antonio antonio          75 dic  4 23:22 drm4g_em.log
-rw-r--r--   1 antonio antonio           0 dic  4 21:03 drm4g_im.log
-rw-r--r--   1 antonio antonio           0 dic  4 21:03 drm4g_manager.log
-rw-r--r--   1 antonio antonio         576 dic  4 23:22 drm4g_tm.log
-rw-r--r--   1 antonio antonio      240275 dic  4 23:23 globus-gw.log
-rw-rw-r--   1 antonio antonio     2445086 dic  5 15:28 gwd.log
-rw-rw-r--   1 antonio antonio           4 dic  4 23:22 gwd.pid
-rw-r--r--   1 antonio antonio          15 dic  4 23:22 gwd.port
-rw-r--r--   1 antonio antonio 26253341636 dic  5 15:28 sched.log

sched.log alone is about 26 GB (26253341636 bytes).
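
A minimal sketch, not part of DRM4G, of how a runaway log like this could be spotted early. The function names, the log directory argument and the size limit are all assumptions made for illustration; nothing here reflects how DRM4G itself manages its logs.

```python
import os


def find_oversized_logs(log_dir, limit_bytes):
    """Return (name, size) pairs for regular files in log_dir that are
    larger than limit_bytes, biggest first.

    log_dir and limit_bytes are hypothetical parameters: pass whatever
    directory holds the DRM4G logs and whatever limit suits the disk.
    """
    oversized = []
    for entry in os.scandir(log_dir):
        if entry.is_file():
            size = entry.stat().st_size
            if size > limit_bytes:
                oversized.append((entry.name, size))
    return sorted(oversized, key=lambda item: item[1], reverse=True)


def to_gb(size_bytes):
    """Convert a byte count to decimal gigabytes (1 GB = 10**9 bytes)."""
    return size_bytes / 1e9
```

Applied to the listing above with a limit of 1 GB (10**9 bytes), only sched.log would be reported, and to_gb(26253341636) gives roughly 26.25.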

Change History (2)

comment:1 Changed 6 years ago by minondoa

  • Owner changed from minondoa to carlos
  • Status changed from new to assigned

The solution is being worked on in the branch DRM4G_scheduler #6063

comment:2 Changed 5 years ago by minondoa

  • Keywords drm4g scheduler added
  • Milestone changed from DRM4G-X.X.X to DRM4G-2.6.2
  • Resolution set to fixed
  • Status changed from assigned to closed