slurmd
Section: Slurm Daemon (8)
Updated: Slurm Daemon
- Used with configless to set an alternate AuthInfo parameter to be used to establish communication with slurmctld before the configuration file has been retrieved. (E.g., to specify an alternate MUNGE socket location.)
- Report node rebooted when daemon restarted. Used for testing purposes.
- Clear system locks as needed. This may be required if slurmd terminated abnormally.
- -C
- Print the actual hardware configuration (not the configuration from the slurm.conf file) and exit. The output uses the same format as slurm.conf uses to describe a node's configuration, plus the node's uptime.
- --conf <node parameters>
- Used in conjunction with the -Z option to override or define additional parameters of a dynamic node, using the same syntax and parameters used to define nodes in slurm.conf. Specifying any of CPUs, Boards, SocketsPerBoard, CoresPerSocket or ThreadsPerCore will override the defaults defined by the -C option. NodeName and Port are not supported.
For example, if slurmd -C reports
  NodeName=node1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=31848
the following --conf specifications will generate the corresponding node definitions:
--conf "Gres=gpu:2"
  NodeName=node1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=31848 Gres=gpu:2
--conf "RealMemory=30000"
  NodeName=node1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=30000
--conf "CPUs=16"
  NodeName=node1 CPUs=16 RealMemory=31848
--conf "CPUs=16 RealMemory=30000 Gres=gpu:2"
  NodeName=node1 CPUs=16 RealMemory=30000 Gres=gpu:2
- --conf-server <host>[:<port>]
- Comma-separated list of controllers, the first being the primary slurmctld. A port can (optionally) be specified for each controller. These hosts are where the slurmd will fetch the configuration from when running in "configless" mode.
- -d <file>
- Specify the fully qualified pathname to the slurmstepd program to be used for shepherding user job steps. This can be useful for testing purposes.
- Run slurmd in the foreground. Error and debug messages will be copied to stderr.
- --extra <arbitrary string>
- Set "extra" data on node startup. If this is a JSON string and SchedulerParameters=extra_constraints is set in slurm.conf, then jobs may use the --extra option to filter based on this "extra" data.
- -f <file>
- Read configuration from the specified file. See NOTES below.
- Start this node as a Dynamic Future node. It will try to match a node definition with a state of FUTURE, optionally using the specified feature to match the node definition.
- Print Generic RESource (GRES) configuration (based upon slurm.conf GRES merged with gres.conf contents for this node) and exit.
- Help; print a brief summary of command options.
- --instance-id <cloud instance id>
- Set cloud instance ID on node startup.
- --instance-type <cloud instance type>
- Set cloud instance type on node startup.
- -L <file>
- Write log messages to the specified file.
- -M
- Lock slurmd pages into system memory using mlockall(2) to disable paging of the slurmd process. This may help in cases where nodes are marked DOWN during periods of heavy swap activity. If the mlockall(2) system call is not available, an error will be printed to the log and slurmd will continue as normal. It is suggested to set LaunchParameters=slurmstepd_memlock in slurm.conf(5) when setting -M.
- -n <value>
- Set the daemon's nice value to the specified value, typically a negative number. Also note the PropagatePrioProcess configuration parameter.
- -N <nodename>
- Run the daemon with the given nodename. Used to emulate a larger system with more than one slurmd daemon per node. Requires that Slurm be built using the --enable-multiple-slurmd configure option.
- Change the working directory of slurmd to the SlurmdLogFile path if possible, or to SlurmdSpoolDir otherwise. If both fail, it will fall back to /var/tmp.
- -v
- Verbose operation. Multiple -v's increase verbosity.
- -V, --version
- Print version information and exit.
- -Z
- Start this node as a Dynamic Normal node. If no --conf is specified, then the slurmd will register with the same hardware configuration as defined by the -C option.
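As a rough illustration of how the dynamic node options combine, the sketch below assembles a slurmd command line and prints it instead of running it. The helper name dynamic_node_cmd and the sample overrides are hypothetical; only the -Z and --conf options themselves come from the descriptions above.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the slurmd command for a dynamic node.
# $1 is an optional string of node parameters in slurm.conf syntax.
dynamic_node_cmd() {
    if [ -n "$1" ]; then
        # Quote the overrides so they reach slurmd as a single --conf argument.
        printf 'slurmd -Z --conf "%s"\n' "$1"
    else
        # No overrides: the node registers with the hardware that slurmd -C reports.
        printf 'slurmd -Z\n'
    fi
}

dynamic_node_cmd "RealMemory=30000 Gres=gpu:2"
dynamic_node_cmd ""
```

Printing the command first makes it easy to check the quoting of the --conf string before starting the daemon for real.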
- The location of the Slurm configuration file. This is overridden by explicitly naming a configuration file on the command line.
- Specify debug flags for slurmd to use. See DebugFlags in the slurm.conf(5) man page for a full list of flags. The environment variable takes precedence over the setting in the slurm.conf.
- SIGTERM SIGINT
- slurmd will shut down cleanly, waiting for in-progress rollups to finish.
- Reloads the slurm configuration files, similar to 'scontrol reconfigure'.
- Reread the log level from the configs, and then reopen the log file. This should be used when setting up logrotate(8).
- This signal is explicitly ignored.
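For the logrotate(8) case mentioned above, a minimal sketch of a helper that a postrotate script might call is shown here. It assumes the log-reopen signal is SIGUSR2 and that slurmd's pid file path is known at the site; the function name reopen_slurmd_log and the KILL override are hypothetical and exist only to allow a dry run.

```shell
#!/bin/sh
# Hypothetical helper: ask a running slurmd to reopen its log file.
# $1 is the path to slurmd's pid file (site-specific; an assumption here).
# Set KILL=echo to print the signal and pid instead of sending anything.
reopen_slurmd_log() {
    pidfile=$1
    [ -r "$pidfile" ] || { echo "cannot read $pidfile" >&2; return 1; }
    pid=$(cat "$pidfile")
    # Assumption: USR2 is the signal that rereads the log level and reopens the log.
    ${KILL:-kill} -USR2 "$pid"
}
```

A postrotate script could then call something like reopen_slurmd_log /var/run/slurmd.pid, where the path should match whatever SlurmdPidFile is set to locally.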
If you are using configless mode with a login node that runs a lot of client commands, you may consider running slurmd on that machine so it can manage a cached version of the configuration files. Otherwise, each client command will use the DNS record to contact the controller and get the configuration information, which could place additional load on the controller.
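One way to start slurmd on such a login node is to point it at the controllers explicitly with --conf-server. The sketch below only builds and prints the command line; the helper name conf_server_cmd, the hostnames ctl1 and ctl2, and the port 6817 are illustrative assumptions, not values taken from this page.

```shell
#!/bin/sh
# Hypothetical helper: join controller entries into a --conf-server argument.
# Each argument is "host" or "host:port"; the first is the primary slurmctld.
conf_server_cmd() {
    list=""
    for entry in "$@"; do
        # Append with a comma separator, except before the first entry.
        list="${list:+$list,}$entry"
    done
    printf 'slurmd --conf-server %s\n' "$list"
}

conf_server_cmd ctl1:6817 ctl2
```

Keeping the primary controller first matters, since the list is ordered as described for the --conf-server option.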
This file is part of Slurm, a resource management program. For details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
This document was created by man2html using the manual pages.
Time: 20:19:17 GMT, November 20, 2023